<h1 align="center" id="title">Deploy CodeGen on OpenShift Cluster with RHOAI</h1>

## Prerequisites

1. **Red Hat OpenShift Cluster** with a dynamic *StorageClass* to provision *PersistentVolumes* (e.g. **OpenShift Data Foundation**) and the following Operators installed: **Red Hat - Authorino (Technical Preview)**, **Red Hat OpenShift Service Mesh**, **Red Hat OpenShift Serverless** and **Red Hat OpenShift AI**.
2. An exposed image registry to push Docker images to (https://docs.openshift.com/container-platform/4.16/registry/securing-exposing-registry.html).
3. Access to an S3-compatible object storage bucket (e.g. **OpenShift Data Foundation**, **AWS S3**) and the values of the access key, secret access key and S3 endpoint (https://docs.redhat.com/en/documentation/red_hat_openshift_data_foundation/4.16/html/managing_hybrid_and_multicloud_resources/accessing-the-multicloud-object-gateway-with-your-applications_rhodf#accessing-the-multicloud-object-gateway-with-your-applications_rhodf).
4. An account on https://huggingface.co/, access to the model _ise-uiuc/Magicoder-S-DS-6.7B_ (for Xeon) or _meta-llama/CodeLlama-7b-hf_ (for Gaudi) and a token with _Read_ permissions. Update the access token in your repository using the following commands.

On Xeon:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
export HFTOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HFTOKEN}/g" codegen.yaml servingruntime-magicoder.yaml
```

On Gaudi:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
export HFTOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HFTOKEN}/g" codegen.yaml servingruntime-codellama.yaml
```
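
Before moving on, you can optionally sanity-check both prerequisites. This is a minimal sketch: the Hugging Face `whoami-v2` endpoint confirms the token is valid, and the S3 listing assumes the AWS CLI is installed locally and uses the endpoint and keys from prerequisite 3 (placeholder values shown):
```
# Verify the Hugging Face token (should return your account details)
curl -s -H "Authorization: Bearer ${HFTOKEN}" https://huggingface.co/api/whoami-v2

# Verify S3 access (--no-verify-ssl helps with self-signed endpoints)
export AWS_ACCESS_KEY_ID="YourAccessKey"
export AWS_SECRET_ACCESS_KEY="YourSecretKey"
aws s3 ls --endpoint-url "YourS3Endpoint" --no-verify-ssl
```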

## Deploy model in Red Hat OpenShift AI

1. Log in to the OpenShift CLI and run the following commands to create a new serving runtime.

On Xeon:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
oc apply -f servingruntime-magicoder.yaml
```

On Gaudi:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
oc apply -f servingruntime-codellama.yaml
```

Verify that the template has been created with the `oc get template -n redhat-ods-applications` command.

2. Find the route for the **Red Hat OpenShift AI** dashboard with the command below and open it in the browser:
```
oc get routes -A | grep rhods-dashboard
```
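Alternatively, to print just the hostname, you can query the route directly; this assumes the dashboard route keeps its default name and lives in the *redhat-ods-applications* namespace:
```
oc get route rhods-dashboard -n redhat-ods-applications -o jsonpath='{.spec.host}'
```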
3. Go to **Data Science Project** and click **Create data science project**. Fill in the **Name** and click **Create**.
4. Go to the **Workbenches** tab and click **Create workbench**. Fill in the **Name**, under **Notebook image** choose *Standard Data Science*, under **Cluster storage** choose *Create new persistent storage* and change the **Persistent storage size** to 40 GB. Click **Create workbench**.
5. Open the newly created Jupyter notebook and run the following commands to download the model and upload it to S3:
```
%env S3_ENDPOINT=<S3_RGW_ROUTE>
%env S3_ACCESS_KEY=<AWS_ACCESS_KEY_ID>
%env S3_SECRET_KEY=<AWS_SECRET_ACCESS_KEY>
%env HF_TOKEN=<PASTE_HUGGINGFACE_TOKEN>
```
```
!pip install huggingface-hub
```
```
import os
import glob

import boto3
from huggingface_hub import snapshot_download

bucket_name = 'first.bucket'
path = 'models'

s3_endpoint = os.environ.get('S3_ENDPOINT')
s3_accesskey = os.environ.get('S3_ACCESS_KEY')
s3_secretkey = os.environ.get('S3_SECRET_KEY')
hf_token = os.environ.get('HF_TOKEN')

# Connect to the S3-compatible object storage; certificate verification is
# disabled because the endpoint may use a self-signed certificate
session = boto3.session.Session()
s3_resource = session.resource('s3',
                               endpoint_url=s3_endpoint,
                               verify=False,
                               aws_access_key_id=s3_accesskey,
                               aws_secret_access_key=s3_secretkey)
bucket = s3_resource.Bucket(bucket_name)
```
For Xeon download *ise-uiuc/Magicoder-S-DS-6.7B*:
```
snapshot_download("ise-uiuc/Magicoder-S-DS-6.7B", cache_dir='./models', token=hf_token)
```
For Gaudi download *meta-llama/CodeLlama-7b-hf*:
```
snapshot_download("meta-llama/CodeLlama-7b-hf", cache_dir='./models', token=hf_token)
```
Upload the downloaded model to S3:
```
# Upload only the resolved snapshot files, preserving the directory layout
files = (file for file in glob.glob(f'{path}/**/*', recursive=True) if os.path.isfile(file) and "snapshots" in file)
for filename in files:
    s3_name = filename.replace(path, '')
    print(f'Uploading: {filename} to {path}{s3_name}')
    bucket.upload_file(filename, f'{path}{s3_name}')
```
6. Go to your project in the **Red Hat OpenShift AI** dashboard, then to the **Models** tab, and click **Deploy model** under *Single-model serving platform*. Fill in the **Name**, choose the newly created **Serving runtime**: *Text Generation Inference Magicoder-S-DS-6.7B on CPU* (for Xeon) or *Text Generation Inference CodeLlama-7b-hf on Gaudi* (for Gaudi), set **Model framework** to *llm* and change **Model server size** to *Custom*: 16 CPUs and 64 Gi memory. For deployment on Gaudi select the proper **Accelerator**. Check the box in the **Model route** section to create an external route and uncheck the token authentication. Under **Model location** choose *New data connection* and fill in all the required fields for S3 access, with **Bucket** *first.bucket* and **Path** *models*. Click **Deploy**. It takes about 10 minutes for the model to reach the *Loaded* status.\
If it does not reach the *Loaded* status and the revision status has changed to *ProgressDeadlineExceeded* (`oc get revision`), scale the model deployment down to 0 and then back to 1 as shown below and wait about 10 minutes for the deployment.
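
A minimal recovery sequence, with `<model_deployment_name>` taken from `oc get deployment`:
```
oc scale deployment.apps/<model_deployment_name> --replicas=0
oc scale deployment.apps/<model_deployment_name> --replicas=1
```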

## Deploy CodeGen

1. Log in to the OpenShift CLI, go to your project and find the URL for the TGI_LLM_ENDPOINT:
```
oc get service.serving.knative.dev
```
Update the TGI_LLM_ENDPOINT in your repository.
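
Alternatively, the URL can be captured in one step with a jsonpath query against the Knative service; `<model_name>` is a placeholder for the name your deployed model received, and this replaces the manual `export TGI_LLM_ENDPOINT="YourURL"` in the blocks below:
```
export TGI_LLM_ENDPOINT=$(oc get service.serving.knative.dev <model_name> -o jsonpath='{.status.url}')
echo ${TGI_LLM_ENDPOINT}
```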

On Xeon:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
export TGI_LLM_ENDPOINT="YourURL"
sed -i "s#insert-your-tgi-url-here#${TGI_LLM_ENDPOINT}#g" codegen.yaml
```

On Gaudi:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
export TGI_LLM_ENDPOINT="YourURL"
sed -i "s#insert-your-tgi-url-here#${TGI_LLM_ENDPOINT}#g" codegen.yaml
```

2. Build the Docker images locally.

- LLM Docker Image:
```
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
docker build -t opea/llm-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile .
```
- MegaService Docker Image:
```
git clone https://github.com/opea-project/GenAIExamples
cd GenAIExamples/CodeGen
docker build -t opea/codegen:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
```
- UI Docker Image:
```
cd GenAIExamples/CodeGen/ui
docker build -t opea/codegen-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
```
To verify, run the command: `docker images`.
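
For example, to list only the three images built above:
```
docker images | grep -E 'opea/(llm-tgi|codegen|codegen-ui)'
```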

3. Log in to Docker, tag the images and push them to the image registry with the following commands:

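The `<openshift-image-registry_route>` placeholder can be resolved from the default registry route, assuming the registry was exposed with the default route name as described in the documentation linked in the prerequisites:
```
export REGISTRY=$(oc get route default-route -n openshift-image-registry -o jsonpath='{.spec.host}')
echo ${REGISTRY}
```
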
```
docker login -u <user> -p $(oc whoami -t) <openshift-image-registry_route>
docker tag <image_id> <openshift-image-registry_route>/<namespace>/<image_name>:<tag>
docker push <openshift-image-registry_route>/<namespace>/<image_name>:<tag>
```
To verify, run the command: `oc get istag`.

4. Use the *IMAGE REFERENCE* values from the previous step to update the image names in the manifest files.

On Xeon:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
export IMAGE_LLM_TGI="YourImage"
export IMAGE_CODEGEN="YourImage"
export IMAGE_CODEGEN_UI="YourImage"
sed -i "s#insert-your-image-llm-tgi#${IMAGE_LLM_TGI}#g" codegen.yaml
sed -i "s#insert-your-image-codegen#${IMAGE_CODEGEN}#g" codegen.yaml
sed -i "s#insert-your-image-codegen-ui#${IMAGE_CODEGEN_UI}#g" ui-server.yaml
```

On Gaudi:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
export IMAGE_LLM_TGI="YourImage"
export IMAGE_CODEGEN="YourImage"
export IMAGE_CODEGEN_UI="YourImage"
sed -i "s#insert-your-image-llm-tgi#${IMAGE_LLM_TGI}#g" codegen.yaml
sed -i "s#insert-your-image-codegen#${IMAGE_CODEGEN}#g" codegen.yaml
sed -i "s#insert-your-image-codegen-ui#${IMAGE_CODEGEN_UI}#g" ui-server.yaml
```

5. Create the *rhoai-ca-bundle* secret:
```
oc create secret generic rhoai-ca-bundle --from-literal=tls.crt="$(oc extract secret/knative-serving-cert -n istio-system --to=- --keys=tls.crt)"
```

6. Deploy CodeGen with the command:
```
oc apply -f codegen.yaml
```

7. Check the *codegen* route with the command `oc get routes` and update the route in the *ui-server.yaml* file:
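
Assuming the route created by *codegen.yaml* is named *codegen*, the hostname can also be captured directly instead of copying it by hand (this replaces the manual `export CODEGEN_ROUTE="YourCodegenRoute"` in the blocks below):
```
export CODEGEN_ROUTE=$(oc get route codegen -o jsonpath='{.spec.host}')
echo ${CODEGEN_ROUTE}
```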

On Xeon:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
export CODEGEN_ROUTE="YourCodegenRoute"
sed -i "s/insert-your-codegen-route/${CODEGEN_ROUTE}/g" ui-server.yaml
```

On Gaudi:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
export CODEGEN_ROUTE="YourCodegenRoute"
sed -i "s/insert-your-codegen-route/${CODEGEN_ROUTE}/g" ui-server.yaml
```

8. Deploy the UI with the command:
```
oc apply -f ui-server.yaml
```

## Verify Services

Make sure all the pods are running, and restart the codegen-xxxx pod if necessary.

```
oc get pods
curl http://${CODEGEN_ROUTE}/v1/codegen -H "Content-Type: application/json" -d '{
    "messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."
    }'
```
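
The model service runs behind Knative, so the first request after a period of inactivity may take noticeably longer while a pod spins up. A simple retry loop such as this sketch (the poll interval is arbitrary) keeps probing until the service answers with HTTP 200:
```
until curl -s -o /dev/null -w '%{http_code}' http://${CODEGEN_ROUTE}/v1/codegen \
    -H "Content-Type: application/json" -d '{"messages": "hello"}' | grep -q 200; do
  echo "Waiting for the CodeGen service..."; sleep 10
done
```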

## Launch the UI

To access the frontend, find the route for *ui-server* with the command `oc get routes` and open it in the browser.