This document outlines the deployment process for a Translation service utilizing the GenAIComps microservice pipeline on an Intel® Xeon® server. This example includes the following sections:
- Translation Quick Start Deployment: Demonstrates how to quickly deploy a Translation service/pipeline on the Intel® Xeon® platform.
- Translation Docker Compose Files: Describes some example deployments and their Docker Compose files.
- Translation Service Configuration: Describes the service and possible configuration changes.
This section describes how to quickly deploy and test the Translation service manually on the Intel® Xeon® platform. The basic steps are:
- Access the Code
- Generate a HuggingFace Access Token
- Configure the Deployment Environment
- Deploy the Service Using Docker Compose
- Check the Deployment Status
- Test the Pipeline
- Cleanup the Deployment
Clone the GenAIExamples repository and access the Translation Intel® Xeon® platform Docker Compose files and supporting scripts:
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/Translation/docker_compose/intel/cpu/xeon/
Check out a released version, such as v1.2:
git checkout v1.2
Some HuggingFace resources, such as certain models, are only accessible with an access token. If you do not already have a HuggingFace access token, create an account by following the steps provided at HuggingFace and then generate a user access token.
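Once you have a token, you can make it available to the deployment by exporting it in your shell before running the configuration script. This is a sketch: the variable name HF_TOKEN is an assumption, so confirm the exact name against set_env.sh (some releases use a different name, such as HUGGINGFACEHUB_API_TOKEN).

```shell
# Export the token before sourcing set_env.sh so it can be picked up
# automatically. HF_TOKEN is an assumed variable name; confirm it in set_env.sh.
export HF_TOKEN="hf_xxxxxxxxxxxxxxxx"   # replace with your real token

# Optional sanity check: the variable should be non-empty.
[ -n "$HF_TOKEN" ] && echo "token is set"
```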
To set up the environment variables for deploying the Translation service, source the set_env.sh script in this directory:
cd ../../../
source set_env.sh
cd intel/cpu/xeon
The set_env.sh script prompts for the required and optional environment variables used to configure the Translation service. If a value is not entered, the script uses a default value. It also generates a .env file defining the desired configuration. Consult the Translation Service Configuration section for information on how service-specific configuration parameters affect deployments.
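Because the configuration lands in a plain KEY=value .env file, you can load it into your current shell to reuse values such as host_ip in later commands. The sketch below writes a minimal example file purely for illustration (the real file generated by set_env.sh contains more variables); the loading pattern is the same for the real .env.

```shell
# For illustration only: a minimal example.env in the same KEY=value
# format that set_env.sh writes (the real .env contains more variables).
cat > example.env <<'EOF'
host_ip=192.168.1.10
EOF

# Load every KEY=value pair into the current shell and export it.
# For the real file, replace example.env with .env.
set -a          # auto-export all variables defined from here on
. ./example.env
set +a

echo "host_ip is: ${host_ip}"
```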
To deploy the Translation service, execute the docker compose up command with the appropriate arguments. For a default deployment, execute:
docker compose up -d
The Translation Docker images should be downloaded automatically from the OPEA registry and deployed on the Intel® Xeon® platform:
[+] Running 6/6
✔ Network xeon_default Created 0.1s
✔ Container tgi-service Healthy 328.1s
✔ Container llm-textgen-server Started 323.5s
✔ Container translation-xeon-backend-server Started 323.7s
✔ Container translation-xeon-ui-server Started 324.0s
✔ Container translation-xeon-nginx-server Started 324.2s
After running docker compose, verify that all of the containers it launched have started:
docker ps -a
For the default deployment, the following 5 containers should be running:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
89a39f7c917f opea/nginx:latest "/docker-entrypoint.…" 7 minutes ago Up About a minute 0.0.0.0:80->80/tcp, :::80->80/tcp translation-xeon-nginx-server
68b8b86a737e opea/translation-ui:latest "docker-entrypoint.s…" 7 minutes ago Up About a minute 0.0.0.0:5173->5173/tcp, :::5173->5173/tcp translation-xeon-ui-server
8400903275b5 opea/translation:latest "python translation.…" 7 minutes ago Up About a minute 0.0.0.0:8888->8888/tcp, :::8888->8888/tcp translation-xeon-backend-server
2da5545cb18c opea/llm-textgen:latest "bash entrypoint.sh" 7 minutes ago Up About a minute 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp llm-textgen-server
dee02c1fb538 ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu "text-generation-lau…" 7 minutes ago Up 7 minutes (healthy) 0.0.0.0:8008->80/tcp, [::]:8008->80/tcp tgi-service
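Beyond docker ps, you can probe the serving endpoint directly. The sketch below assumes the default port mapping shown above (tgi-service exposed on host port 8008); text-generation-inference serves a /health endpoint that returns HTTP 200 once the model is loaded, but verify the port against your compose file.

```shell
# Wait for tgi-service to report healthy before testing the pipeline.
# Assumes the default 8008 host port mapping from compose.yaml.
until curl -sf http://localhost:8008/health > /dev/null; do
  echo "waiting for tgi-service..."
  sleep 5
done
echo "tgi-service is healthy"
```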
Once the Translation service is running, test the pipeline using the following command:
curl http://${host_ip}:8888/v1/translation -H "Content-Type: application/json" -d '{
"language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}'
Note: The value of host_ip was set by the set_env.sh script and can be found in the .env file.
To stop the containers associated with the deployment, execute the following command:
docker compose -f compose.yaml down
[+] Running 6/6
✔ Container translation-xeon-nginx-server Removed 10.4s
✔ Container translation-xeon-ui-server Removed 10.3s
✔ Container translation-xeon-backend-server Removed 10.3s
✔ Container llm-textgen-server Removed 10.3s
✔ Container tgi-service Removed 2.8s
✔ Network xeon_default Removed 0.4s
All of the Translation containers will be stopped and then removed when the down command completes.
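If you also want to reclaim the disk space used by the deployment, docker compose supports additional teardown flags. These are standard Compose options rather than anything specific to this example:

```shell
# Stop and remove containers plus any named volumes declared in compose.yaml.
docker compose -f compose.yaml down --volumes

# Additionally remove the images that were pulled for this deployment.
docker compose -f compose.yaml down --rmi all
```

Note that removing the images means they will be downloaded again on the next docker compose up.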
The compose.yaml file is the default Compose file, which uses TGI as the serving framework.
Service Name | Image Name |
---|---|
tgi-service | ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu |
llm | opea/llm-textgen:latest |
translation-xeon-backend-server | opea/translation:latest |
translation-xeon-ui-server | opea/translation-ui:latest |
translation-xeon-nginx-server | opea/nginx:latest |
The following table provides a comprehensive overview of the services utilized across the various deployments illustrated in the example Docker Compose files. Each row represents a distinct service, detailing the possible images used to enable it and a concise description of its function within the deployment architecture.
Service Name | Possible Image Names | Optional | Description |
---|---|---|---|
tgi-service | ghcr.io/huggingface/text-generation-inference:2.4.0-intel-cpu | No | Specific to the TGI deployment, focuses on text generation inference using Xeon hardware. |
llm | opea/llm-textgen:latest | No | Handles large language model (LLM) tasks. |
translation-xeon-backend-server | opea/translation:latest | No | Serves as the backend for the Translation service, with variations depending on the deployment. |
translation-xeon-ui-server | opea/translation-ui:latest | No | Provides the user interface for the Translation service. |
translation-xeon-nginx-server | opea/nginx:latest | No | Acts as a reverse proxy, managing traffic between the UI and backend services. |
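Service-specific parameters are typically supplied through environment variables consumed by compose.yaml. As a sketch of how a configuration change might look, the snippet below overrides the model served by tgi-service; the variable name LLM_MODEL_ID and the example model are assumptions, so confirm the exact variable name and supported models against set_env.sh and compose.yaml for your release.

```shell
# Hypothetical override of the model served by tgi-service.
# LLM_MODEL_ID is an assumed variable name; confirm it in compose.yaml.
export LLM_MODEL_ID="haoranxu/ALMA-13B"
```

After exporting the override, re-run docker compose up -d so the new value is picked up by the containers.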