Run TGI (Text Generation Inference) by Hugging Face
TGI (Text Generation Inference) is a toolkit for deploying and serving Large Language Models (LLMs). TGI enables high-performance text generation for the most popular open-source LLMs, including Llama, Falcon, StarCoder, BLOOM, GPT-NeoX, and T5.
Deploying TGI on Salad
Container
Hugging Face provides a pre-built Docker image, available from the GitHub Container Registry:
ghcr.io/huggingface/text-generation-inference:1.1.1
To deploy the container on Salad, configure it with the model you want to serve, plus any additional settings. You can pass these options as part of the container's CMD, or set them as environment variables when creating your container group. Here is a complete list of all TGI options.
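To make the two configuration routes concrete, here is a minimal sketch that renders one set of options both as CMD arguments and as environment variables. The helper names are hypothetical, and the model ID is only an illustrative value. It assumes the pattern seen for `--model-id` / `MODEL_ID` (upper-case the flag name and swap dashes for underscores) holds for the options you use; confirm each name against the TGI options list.

```python
# Hypothetical helpers (not part of TGI or Salad) that show one set of
# options expressed as CMD arguments and as environment variables.

def as_cmd_args(options):
    """Render options as launcher flags: {"model-id": "x"} -> ["--model-id", "x"]."""
    args = []
    for name, value in options.items():
        args += [f"--{name}", str(value)]
    return args


def as_env_vars(options):
    """Render options as environment variables: {"model-id": "x"} -> {"MODEL_ID": "x"}.

    Assumes the env var name is the flag name upper-cased with dashes
    replaced by underscores; verify this for each option you rely on.
    """
    return {name.upper().replace("-", "_"): str(value) for name, value in options.items()}


# Illustrative values only; substitute the model you actually want to serve.
options = {"model-id": "bigscience/bloom-560m", "port": 80}

print(as_cmd_args(options))  # ['--model-id', 'bigscience/bloom-560m', '--port', '80']
print(as_env_vars(options))  # {'MODEL_ID': 'bigscience/bloom-560m', 'PORT': '80'}
```

Either form carries the same settings; environment variables are often easier to edit in the portal, while CMD arguments keep the full configuration visible in one line.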

Example TGI Options
Required - Networking Setup
In addition to the options needed to run your model, you must configure TGI to listen on IPv6 so that it is compatible with Salad's networking feature. To do this, set HOSTNAME to ::
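For readers unfamiliar with the notation, `::` is the IPv6 "unspecified" address, the counterpart of `0.0.0.0` in IPv4; a server bound to it listens on all IPv6 interfaces. The short check below uses Python's standard `ipaddress` module to confirm how the value is interpreted. It is only an illustration and is not something you need to run on Salad.

```python
import ipaddress

# Parse the value used for HOSTNAME and inspect how Python classifies it.
addr = ipaddress.ip_address("::")

print(addr.version)         # 6
print(addr.is_unspecified)  # True

# For comparison, the familiar IPv4 wildcard address.
print(ipaddress.ip_address("0.0.0.0").version)  # 4
```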

Salad portal Env Vars
Recommended - Health Probes
Health probes help ensure that your container only receives traffic when it is ready, and that it continues to run as expected. When the TGI container starts, it must download the specified model before it can serve requests. While the model is downloading, the API is unavailable. The simplest health probe checks the /health endpoint: once it responds successfully, the model is ready to serve traffic.
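The same readiness idea can be used from a client or deployment script that needs to wait for the model download to finish. The sketch below polls a health URL with the standard library until it returns 200. The function names, retry count, and delay are assumptions chosen for illustration, not values recommended by TGI or Salad.

```python
import time
import urllib.request


def is_healthy(url, timeout=5.0):
    """Return True only if the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses,
        # so a refused connection or an error status counts as "not ready".
        return False


def wait_until_ready(check, attempts=60, delay=10.0, sleep=time.sleep):
    """Call check() up to `attempts` times, pausing `delay` seconds after each failure."""
    for _ in range(attempts):
        if check():
            return True
        sleep(delay)
    return False


# Dry run with simulated probe results: not ready twice, then ready.
results = iter([False, False, True])
print(wait_until_ready(lambda: next(results), delay=0))  # True
```

Against a real deployment you would pass something like `lambda: is_healthy("http://localhost:80/health")` as the check, replacing the host with wherever your container is reachable.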
Exec Health Probe
The exec health probe runs the given command inside the container. If the command returns an exit code of 0, the container is considered healthy. Any other exit code indicates that the container is not ready yet.
The TGI container does not include curl or wget, so to check the :80/health endpoint we use Python's requests library instead.
python -c "import requests,sys;sys.exit(0 if requests.get('http://localhost:80/health').status_code == 200 else -1)"
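If the one-liner is hard to read or maintain, the same check can be written out as a small script. The version below is a sketch rather than the command Salad uses: the function name and the 5-second timeout are assumptions, and it reports exit code 1 instead of -1 for failures. Like the one-liner, it relies on `requests` being available inside the TGI image.

```python
HEALTH_URL = "http://localhost:80/health"  # same endpoint as the one-liner


def probe_exit_code(get=None, url=HEALTH_URL, timeout=5):
    """Return 0 when /health answers 200, otherwise 1."""
    if get is None:
        # Imported lazily; assumed to be installed in the container,
        # as the original one-liner also depends on it.
        import requests
        get = requests.get
    try:
        response = get(url, timeout=timeout)
    except Exception:
        # Connection refused, timeout, and similar: the API is not up yet.
        return 1
    return 0 if response.status_code == 200 else 1


# To use this as an exec probe, end the script with:
#   import sys; sys.exit(probe_exit_code())
```

The timeout keeps a hung request from stalling the probe, and treating every exception as "not ready" matches the exec probe rule that any non-zero exit code means the container is not yet healthy.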

HTTP Health Probe - Coming Soon