CodeGen/openshift-rhoai/manifests/README.md

## Prerequisites

1. **Red Hat OpenShift Cluster** with a dynamic _StorageClass_ to provision _PersistentVolumes_ (e.g. **OpenShift Data Foundation**) and the following installed Operators: **Red Hat - Authorino (Technical Preview)**, **Red Hat OpenShift Service Mesh**, **Red Hat OpenShift Serverless** and **Red Hat OpenShift AI**.
2. Exposed image registry to push docker images to (https://docs.openshift.com/container-platform/4.16/registry/securing-exposing-registry.html).
3. Access to an S3-compatible object storage bucket (e.g. **OpenShift Data Foundation**, **AWS S3**) and the values of the access key, secret access key and S3 endpoint (https://docs.redhat.com/en/documentation/red_hat_openshift_data_foundation/4.16/html/managing_hybrid_and_multicloud_resources/accessing-the-multicloud-object-gateway-with-your-applications_rhodf#accessing-the-multicloud-object-gateway-with-your-applications_rhodf).
4. An account on https://huggingface.co/, access to the model _ise-uiuc/Magicoder-S-DS-6.7B_ (for Xeon) or _meta-llama/CodeLlama-7b-hf_ (for Gaudi) and a token with _Read_ permissions. Update the access token in your repository using the following commands.

On Xeon:

```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
export HFTOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HFTOKEN}/g" codegen.yaml servingruntime-magicoder.yaml
```

On Gaudi:

```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
export HFTOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HFTOKEN}/g" codegen.yaml servingruntime-codellama.yaml
```

1. Log in to the OpenShift CLI and run the following commands to create a new serving runtime.

On Xeon:

```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
oc apply -f servingruntime-magicoder.yaml
```
On Gaudi:

```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
oc apply -f servingruntime-codellama.yaml
```

Verify that the template has been created with the `oc get template -n redhat-ods-applications` command.

2. Find the route for the **Red Hat OpenShift AI** dashboard with the command below and open it in a browser:

```
oc get routes -A | grep rhods-dashboard
```

3. Go to **Data Science Project** and click **Create data science project**. Fill in the **Name** and click **Create**.

4. Go to the **Workbenches** tab and click **Create workbench**. Fill in the **Name**, under **Notebook image** choose _Standard Data Science_, under **Cluster storage** choose _Create new persistent storage_ and change **Persistent storage size** to 40 GB. Click **Create workbench**.

5. Open the newly created Jupyter notebook and run the following commands to download the model and upload it to S3:

```
files = (file for file in glob.glob(f'{path}/**/*', recursive=True) if os.path.isfile(file) and "snapshots" in file)
for filename in files:
    s3_name = filename.replace(path, '')
    print(f'Uploading: {filename} to {path}{s3_name}')
    bucket.upload_file(filename, f'{path}{s3_name}')
```
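
The upload loop above computes each object key by stripping the local download prefix and re-prefixing with the S3 path. A minimal self-contained sketch of that key computation (the helper name is hypothetical, not part of the notebook):

```python
def s3_key_for(filename: str, path: str) -> str:
    # Mirror the notebook's logic: drop the local download prefix,
    # then prepend the S3 target path again to form the object key.
    s3_name = filename.replace(path, '')
    return f'{path}{s3_name}'

# A file downloaded under a local "models" directory keeps its
# relative layout under the "models" prefix in the bucket:
print(s3_key_for('models/snapshots/abc/config.json', 'models'))
```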

6. Go to your project in the **Red Hat OpenShift AI** dashboard, then to the "Models" tab, and click **Deploy model** under _Single-model serving platform_. Fill in the **Name**, choose the newly created **Serving runtime**: _Text Generation Inference Magicoder-S-DS-6.7B on CPU_ (for Xeon) or _Text Generation Inference CodeLlama-7b-hf on Gaudi_ (for Gaudi), set **Model framework** to _llm_, and change **Model server size** to _Custom_: 16 CPUs and 64 Gi memory. For deployment on Gaudi select the proper **Accelerator**. Check the box to create an external route in the **Model route** section and uncheck token authentication. Under **Model location** choose _New data connection_ and fill in all the fields required for S3 access, with **Bucket**: _first.bucket_ and **Path**: _models_. Click **Deploy**. It takes about 10 minutes for the model to reach _Loaded_ status.\
   If it does not reach _Loaded_ status and the revision's status changed to "ProgressDeadlineExceeded" (`oc get revision`), scale the model deployment to 0 and then back to 1 with `oc scale deployment.apps/<model_deployment_name> --replicas=1` and wait about 10 minutes for the deployment.
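
The recovery described above is a two-step scale; as a sketch, with the deployment name left as a placeholder to substitute:

```
oc scale deployment.apps/<model_deployment_name> --replicas=0
oc scale deployment.apps/<model_deployment_name> --replicas=1
```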
## Deploy CodeGen
1. Log in to the OpenShift CLI, go to your project and find the URL for TGI_LLM_ENDPOINT:

```
oc get service.serving.knative.dev
```
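
If preferred, the URL can be captured directly from the Knative service status; a sketch assuming a placeholder service name and that the route is exposed in `.status.url` (as is usual for Knative services):

```
export TGI_LLM_ENDPOINT=$(oc get service.serving.knative.dev <service_name> -o jsonpath='{.status.url}')
```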
Update the TGI_LLM_ENDPOINT in your repository.
On Xeon:

```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
export TGI_LLM_ENDPOINT="YourURL"
sed -i "s#insert-your-tgi-url-here#${TGI_LLM_ENDPOINT}#g" codegen.yaml
```

On Gaudi:

```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
export TGI_LLM_ENDPOINT="YourURL"
sed -i "s#insert-your-tgi-url-here#${TGI_LLM_ENDPOINT}#g" codegen.yaml
```
CodeGen/openshift/manifests/README.md

## Prerequisites

1. **Red Hat OpenShift Cluster** with a dynamic _StorageClass_ to provision _PersistentVolumes_ (e.g. **OpenShift Data Foundation**).
2. Exposed image registry to push docker images to (https://docs.openshift.com/container-platform/4.16/registry/securing-exposing-registry.html).
3. An account on https://huggingface.co/, access to the model _ise-uiuc/Magicoder-S-DS-6.7B_ (for Xeon) or _meta-llama/CodeLlama-7b-hf_ (for Gaudi) and a token with _Read_ permissions. Update the access token in your repository using the following commands.

On Xeon:

```
cd GenAIExamples/CodeGen/openshift/manifests/xeon
export HFTOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HFTOKEN}/g" codegen.yaml
```

On Gaudi:

```
cd GenAIExamples/CodeGen/openshift/manifests/gaudi
export HFTOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HFTOKEN}/g" codegen.yaml
```