Commit 5e204d3: Initial commit for CodeGen on OpenShift
1 parent 872e93e

12 files changed: +1604 -0 lines changed

Lines changed: 217 additions & 0 deletions
<h1 align="center" id="title">Deploy CodeGen on OpenShift Cluster with RHOAI</h1>

## Prerequisites

1. **Red Hat OpenShift Cluster** with a dynamic *StorageClass* to provision *PersistentVolumes* (e.g. **OpenShift Data Foundation**) and the following Operators installed: **Red Hat - Authorino (Technical Preview)**, **Red Hat OpenShift Service Mesh**, **Red Hat OpenShift Serverless** and **Red Hat OpenShift AI**.
2. Exposed image registry to push Docker images to (https://docs.openshift.com/container-platform/4.16/registry/securing-exposing-registry.html); a login sketch is shown after this list.
3. Access to an S3-compatible object storage bucket (e.g. **OpenShift Data Foundation**, **AWS S3**) and the values of the access key, secret access key and S3 endpoint (https://docs.redhat.com/en/documentation/red_hat_openshift_data_foundation/4.16/html/managing_hybrid_and_multicloud_resources/accessing-the-multicloud-object-gateway-with-your-applications_rhodf#accessing-the-multicloud-object-gateway-with-your-applications_rhodf).
4. An account on https://huggingface.co/, access to the model _ise-uiuc/Magicoder-S-DS-6.7B_ (for Xeon) or _meta-llama/CodeLlama-7b-hf_ (for Gaudi) and a token with _Read_ permissions. Update the access token in your repository using the following commands.

On Xeon:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
export HFTOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HFTOKEN}/g" codegen.yaml servingruntime-magicoder.yaml
```

On Gaudi:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
export HFTOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HFTOKEN}/g" codegen.yaml servingruntime-codellama.yaml
```
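
For prerequisite 2, a minimal sketch of exposing the internal registry and logging in to it, following the linked OpenShift docs (the resulting hostname varies per cluster):

```
# Expose the internal image registry via a default route (if not already exposed)
oc patch configs.imageregistry.operator.openshift.io/cluster --type merge -p '{"spec":{"defaultRoute":true}}'

# Read back the registry hostname and log in with your OpenShift token
REGISTRY=$(oc get route default-route -n openshift-image-registry -o jsonpath='{.spec.host}')
docker login -u $(oc whoami) -p $(oc whoami -t) $REGISTRY
```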

## Deploy model in Red Hat OpenShift AI

1. Log in to the OpenShift CLI and run the following commands to create a new serving runtime.

On Xeon:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
oc apply -f servingruntime-magicoder.yaml
```

On Gaudi:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
oc apply -f servingruntime-codellama.yaml
```

Verify that the template has been created with the `oc get template -n redhat-ods-applications` command.

2. Find the route for the **Red Hat OpenShift AI** dashboard with the command below and open it in the browser:
```
oc get routes -A | grep rhods-dashboard
```
3. Go to **Data Science Projects** and click **Create data science project**. Fill in the **Name** and click **Create**.
4. Go to the **Workbenches** tab and click **Create workbench**. Fill in the **Name**, under **Notebook image** choose *Standard Data Science*, under **Cluster storage** choose *Create new persistent storage* and change **Persistent storage size** to 40 GB. Click **Create workbench**.
5. Open the newly created Jupyter notebook and run the following commands to download the model and upload it to S3:
```
%env S3_ENDPOINT=<S3_RGW_ROUTE>
%env S3_ACCESS_KEY=<AWS_ACCESS_KEY_ID>
%env S3_SECRET_KEY=<AWS_SECRET_ACCESS_KEY>
%env HF_TOKEN=<PASTE_HUGGINGFACE_TOKEN>
```
```
!pip install huggingface-hub
```
```
import os
import glob

import boto3
from huggingface_hub import snapshot_download

bucket_name = 'first.bucket'
s3_endpoint = os.environ.get('S3_ENDPOINT')
s3_accesskey = os.environ.get('S3_ACCESS_KEY')
s3_secretkey = os.environ.get('S3_SECRET_KEY')
path = 'models'
hf_token = os.environ.get('HF_TOKEN')

# Connect to the S3-compatible endpoint; verify=False skips TLS verification
# for self-signed certificates.
session = boto3.session.Session()
s3_resource = session.resource('s3',
                               endpoint_url=s3_endpoint,
                               verify=False,
                               aws_access_key_id=s3_accesskey,
                               aws_secret_access_key=s3_secretkey)
bucket = s3_resource.Bucket(bucket_name)
```
For Xeon download *ise-uiuc/Magicoder-S-DS-6.7B*:
```
snapshot_download("ise-uiuc/Magicoder-S-DS-6.7B", cache_dir=path, token=hf_token)
```
For Gaudi download *meta-llama/CodeLlama-7b-hf*:
```
snapshot_download("meta-llama/CodeLlama-7b-hf", cache_dir=path, token=hf_token)
```
Upload the downloaded model to S3:
```
# Upload only the resolved snapshot files, mirroring the local layout under 'models/'.
files = (file for file in glob.glob(f'{path}/**/*', recursive=True)
         if os.path.isfile(file) and "snapshots" in file)
for filename in files:
    s3_name = filename.replace(path, '')
    print(f'Uploading: {filename} to {path}{s3_name}')
    bucket.upload_file(filename, f'{path}{s3_name}')
```
6. Go to your project in the **Red Hat OpenShift AI** dashboard, then to the **Models** tab, and click **Deploy model** under *Single-model serving platform*. Fill in the **Name**, choose the newly created **Serving runtime**: *Text Generation Inference Magicoder-S-DS-6.7B on CPU* (for Xeon) or *Text Generation Inference CodeLlama-7b-hf on Gaudi* (for Gaudi), set **Model framework** to *llm* and change **Model server size** to *Custom*: 16 CPUs and 64 Gi memory. For deployment with Gaudi select the proper **Accelerator**. Check the box to create an external route in the **Model route** section and uncheck the token authentication. Under **Model location** choose *New data connection* and fill in all required fields for S3 access, **Bucket**: *first.bucket* and **Path**: *models*. Click **Deploy**. It takes about 10 minutes to reach the *Loaded* status.\
If it does not reach the *Loaded* status and the revision changed its status to "ProgressDeadlineExceeded" (``oc get revision``), scale the model deployment to 0 and then back to 1 with the command ``oc scale deployment.apps/<model_deployment_name> --replicas=1`` and wait about 10 minutes for the deployment.
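
A minimal recovery sequence for that case, keeping the placeholder name from above:

```
# Scale the stuck model deployment down to zero, then back up
oc scale deployment.apps/<model_deployment_name> --replicas=0
oc scale deployment.apps/<model_deployment_name> --replicas=1
# Watch the pods come back (takes about 10 minutes)
oc get pods -w
```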

## Deploy CodeGen

1. Log in to the OpenShift CLI, go to your project and find the URL for the TGI_LLM_ENDPOINT:
```
oc get service.serving.knative.dev
```
Update the TGI_LLM_ENDPOINT in your repository.

On Xeon:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
export TGI_LLM_ENDPOINT="YourURL"
sed -i "s#insert-your-tgi-url-here#${TGI_LLM_ENDPOINT}#g" codegen.yaml
```

On Gaudi:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
export TGI_LLM_ENDPOINT="YourURL"
sed -i "s#insert-your-tgi-url-here#${TGI_LLM_ENDPOINT}#g" codegen.yaml
```
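
If you prefer to capture the endpoint non-interactively, a sketch (`<model_name>` is a placeholder for whatever Knative service name RHOAI assigned to your deployed model):

```
export TGI_LLM_ENDPOINT=$(oc get service.serving.knative.dev <model_name> -o jsonpath='{.status.url}')
echo $TGI_LLM_ENDPOINT
```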

2. Build the Docker images locally.

- LLM Docker Image:
```
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
docker build -t opea/llm-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile .
```
- MegaService Docker Image:
```
git clone https://github.com/opea-project/GenAIExamples
cd GenAIExamples/CodeGen
docker build -t opea/codegen:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
```
- UI Docker Image:
```
cd GenAIExamples/CodeGen/ui
docker build -t opea/codegen-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
```
To verify, run the command: `docker images`.
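
For a quicker check, a filter for just the three freshly built images:

```
docker images | grep -E 'opea/(llm-tgi|codegen|codegen-ui)'
```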

3. Log in to Docker, tag the images and push them to the image registry with the following commands:

```
docker login -u <user> -p $(oc whoami -t) <openshift-image-registry_route>
docker tag <image_id> <openshift-image-registry_route>/<namespace>/<image_name>:<tag>
docker push <openshift-image-registry_route>/<namespace>/<image_name>:<tag>
```
To verify, run the command: `oc get istag`.
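
As a concrete illustration with the three images built in step 2 (the registry hostname below is hypothetical; substitute your own route and project namespace):

```
REGISTRY=default-route-openshift-image-registry.apps.example.com   # hypothetical route
NAMESPACE=<your_project>
for img in llm-tgi codegen codegen-ui; do
  docker tag opea/$img:latest $REGISTRY/$NAMESPACE/$img:latest
  docker push $REGISTRY/$NAMESPACE/$img:latest
done
```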

4. Use the *IMAGE REFERENCE* from the previous step to update the image names in the manifest files.

On Xeon:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
export IMAGE_LLM_TGI="YourImage"
export IMAGE_CODEGEN="YourImage"
export IMAGE_CODEGEN_UI="YourImage"
sed -i "s#insert-your-image-llm-tgi#${IMAGE_LLM_TGI}#g" codegen.yaml
sed -i "s#insert-your-image-codegen#${IMAGE_CODEGEN}#g" codegen.yaml
sed -i "s#insert-your-image-codegen-ui#${IMAGE_CODEGEN_UI}#g" ui-server.yaml
```

On Gaudi:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
export IMAGE_LLM_TGI="YourImage"
export IMAGE_CODEGEN="YourImage"
export IMAGE_CODEGEN_UI="YourImage"
sed -i "s#insert-your-image-llm-tgi#${IMAGE_LLM_TGI}#g" codegen.yaml
sed -i "s#insert-your-image-codegen#${IMAGE_CODEGEN}#g" codegen.yaml
sed -i "s#insert-your-image-codegen-ui#${IMAGE_CODEGEN_UI}#g" ui-server.yaml
```
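
A sketch for reading the full *IMAGE REFERENCE* of a pushed image straight from its image stream tag, assuming it was pushed as *llm-tgi:latest* as in the sketch after step 3 (the field name follows the OpenShift ImageStreamTag API):

```
export IMAGE_LLM_TGI=$(oc get istag llm-tgi:latest -o jsonpath='{.image.dockerImageReference}')
```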

5. Create the *rhoai-ca-bundle* secret:
```
oc create secret generic rhoai-ca-bundle --from-literal=tls.crt="$(oc extract secret/knative-serving-cert -n istio-system --to=- --keys=tls.crt)"
```
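
To sanity-check that the CA certificate landed in the secret, one possible inspection (assumes `openssl` is available locally):

```
oc get secret rhoai-ca-bundle -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -subject -enddate
```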

6. Deploy CodeGen with the command:
```
oc apply -f codegen.yaml
```

7. Check the *codegen* route with the command `oc get routes` and update the route in the *ui-server.yaml* file.

On Xeon:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
export CODEGEN_ROUTE="YourCodegenRoute"
sed -i "s/insert-your-codegen-route/${CODEGEN_ROUTE}/g" ui-server.yaml
```

On Gaudi:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
export CODEGEN_ROUTE="YourCodegenRoute"
sed -i "s/insert-your-codegen-route/${CODEGEN_ROUTE}/g" ui-server.yaml
```
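
The route is named `codegen` (see the Route object in *codegen.yaml* below), so its hostname can also be captured directly:

```
export CODEGEN_ROUTE=$(oc get route codegen -o jsonpath='{.spec.host}')
```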

8. Deploy the UI with the command:
```
oc apply -f ui-server.yaml
```

## Verify Services

Make sure all the pods are running, and restart the codegen-xxxx pod if necessary.

```
oc get pods
curl http://${CODEGEN_ROUTE}/v1/codegen -H "Content-Type: application/json" -d '{
     "messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."
     }'
```

## Launch the UI

To access the frontend, find the route for *ui-server* with the command `oc get routes` and open it in the browser.
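
Assuming the UI route is named *ui-server* as suggested by *ui-server.yaml*, a one-liner to print the frontend URL:

```
echo "http://$(oc get route ui-server -o jsonpath='{.spec.host}')"
```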

Lines changed: 167 additions & 0 deletions
---
# Source: codegen/charts/llm-uservice/charts/tgi/templates/service.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: Service
metadata:
  name: codegen-llm-uservice
  labels:
    helm.sh/chart: llm-uservice-0.1.0
    app.kubernetes.io/name: llm-uservice
    app.kubernetes.io/instance: codegen
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
spec:
  type: ClusterIP
  ports:
    - port: 9000
      targetPort: 9000
      protocol: TCP
      name: llm-uservice
  selector:
    app.kubernetes.io/name: llm-uservice
    app.kubernetes.io/instance: codegen
---
apiVersion: v1
kind: Service
metadata:
  name: codegen
  labels:
    helm.sh/chart: codegen-0.1.0
    app.kubernetes.io/name: codegen
    app.kubernetes.io/instance: codegen
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
spec:
  type: ClusterIP
  ports:
    - port: 7778
      targetPort: 7778
      protocol: TCP
      name: codegen
  selector:
    app.kubernetes.io/name: codegen
    app.kubernetes.io/instance: codegen
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: codegen-llm-uservice
  labels:
    helm.sh/chart: llm-uservice-0.1.0
    app.kubernetes.io/name: llm-uservice
    app.kubernetes.io/instance: codegen
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: llm-uservice
      app.kubernetes.io/instance: codegen
  template:
    metadata:
      labels:
        app.kubernetes.io/name: llm-uservice
        app.kubernetes.io/instance: codegen
    spec:
      securityContext: {}
      containers:
        - name: codegen
          # Append the RHOAI serving CA (mounted from the rhoai-ca-bundle
          # secret) to a writable copy of the system CA bundle, then start
          # the service.
          command:
            - /bin/bash
            - -c
            - |
              cp /usr/lib/ssl/cert.pem /tmp/bundle.crt && \
              cat /rhoai-ca/tls.crt | tee -a '/tmp/bundle.crt' && \
              bash ./entrypoint.sh
          env:
            - name: TGI_LLM_ENDPOINT
              value: "insert-your-tgi-url-here"
            - name: HUGGINGFACEHUB_API_TOKEN
              value: "insert-your-huggingface-token-here"
            - name: PYTHONPATH
              value: /home/user/.local/lib/python3.11/site-packages:/home/user
            - name: HOME
              value: /tmp/home
            - name: SSL_CERT_FILE
              value: /tmp/bundle.crt
          securityContext: {}
          image: "insert-your-image-llm-tgi"
          imagePullPolicy: IfNotPresent
          ports:
            - name: llm-uservice
              containerPort: 9000
              protocol: TCP
          volumeMounts:
            - mountPath: /tmp/home
              name: local-dir
            - mountPath: /rhoai-ca
              name: odh-ca-bundle
          resources: {}
      volumes:
        - emptyDir:
            sizeLimit: 5Gi
          name: local-dir
        # CA bundle created with `oc create secret generic rhoai-ca-bundle ...`
        # in step 5 of the README above.
        - name: odh-ca-bundle
          secret:
            defaultMode: 420
            secretName: rhoai-ca-bundle
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: codegen
  labels:
    helm.sh/chart: codegen-0.1.0
    app.kubernetes.io/name: codegen
    app.kubernetes.io/instance: codegen
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: codegen
      app.kubernetes.io/instance: codegen
  template:
    metadata:
      labels:
        app.kubernetes.io/name: codegen
        app.kubernetes.io/instance: codegen
    spec:
      securityContext: null
      containers:
        - name: codegen
          env:
            - name: LLM_SERVICE_HOST_IP
              value: codegen-llm-uservice
          securityContext: null
          image: "insert-your-image-codegen"
          imagePullPolicy: IfNotPresent
          ports:
            - name: codegen
              containerPort: 7778
              protocol: TCP
          resources: null
---
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  labels:
    app.kubernetes.io/instance: codegen
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: codegen
    app.kubernetes.io/version: 1.0.0
    helm.sh/chart: codegen-0.1.0
  name: codegen
spec:
  port:
    targetPort: codegen
  to:
    kind: Service
    name: codegen
    weight: 100
  wildcardPolicy: None
