<h1 align="center" id="title">Deploy CodeGen on OpenShift Cluster with RHOAI</h1>

## Prerequisites

1. **Red Hat OpenShift Cluster** with a dynamic *StorageClass* to provision *PersistentVolumes* (e.g. **OpenShift Data Foundation**) and the following Operators installed: **Red Hat - Authorino (Technical Preview)**, **Red Hat OpenShift Service Mesh**, **Red Hat OpenShift Serverless** and **Red Hat OpenShift AI**.
2. An exposed image registry to push Docker images to (https://docs.openshift.com/container-platform/4.16/registry/securing-exposing-registry.html).
3. Access to an S3-compatible object storage bucket (e.g. **OpenShift Data Foundation**, **AWS S3**) and the values of the access key, secret access key and S3 endpoint (https://docs.redhat.com/en/documentation/red_hat_openshift_data_foundation/4.16/html/managing_hybrid_and_multicloud_resources/accessing-the-multicloud-object-gateway-with-your-applications_rhodf#accessing-the-multicloud-object-gateway-with-your-applications_rhodf).
4. An account on https://huggingface.co/, access to the model _ise-uiuc/Magicoder-S-DS-6.7B_ (for Xeon) or _meta-llama/CodeLlama-7b-hf_ (for Gaudi) and a token with _Read_ permissions. Update the access token in your repository using the following commands.

On Xeon:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
export HFTOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HFTOKEN}/g" codegen.yaml servingruntime-magicoder.yaml
```

On Gaudi:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
export HFTOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HFTOKEN}/g" codegen.yaml servingruntime-codellama.yaml
```
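
Before moving on, you can optionally sanity-check both prerequisites. This is a minimal sketch: the Hugging Face `whoami-v2` endpoint confirms the token is valid, and the S3 listing assumes the AWS CLI is installed locally and uses the endpoint and keys from prerequisite 3 (placeholder values shown):
```
# Verify the Hugging Face token (should return your account details)
curl -s -H "Authorization: Bearer ${HFTOKEN}" https://huggingface.co/api/whoami-v2

# Verify S3 access (--no-verify-ssl helps with self-signed endpoints)
export AWS_ACCESS_KEY_ID="YourAccessKey"
export AWS_SECRET_ACCESS_KEY="YourSecretKey"
aws s3 ls --endpoint-url "YourS3Endpoint" --no-verify-ssl
```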

## Deploy model in Red Hat OpenShift AI

1. Log in to the OpenShift CLI and run the following commands to create a new serving runtime.

On Xeon:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
oc apply -f servingruntime-magicoder.yaml
```

On Gaudi:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
oc apply -f servingruntime-codellama.yaml
```

Verify that the template has been created with the `oc get template -n redhat-ods-applications` command.

2. Find the route for the **Red Hat OpenShift AI** dashboard with the command below and open it in the browser:
```
oc get routes -A | grep rhods-dashboard
```
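Alternatively, to print just the hostname, you can query the route directly; this assumes the dashboard route keeps its default name and lives in the *redhat-ods-applications* namespace:
```
oc get route rhods-dashboard -n redhat-ods-applications -o jsonpath='{.spec.host}'
```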
3. Go to **Data Science Project** and click **Create data science project**. Fill in the **Name** and click **Create**.
4. Go to the **Workbenches** tab and click **Create workbench**. Fill in the **Name**, under **Notebook image** choose *Standard Data Science*, under **Cluster storage** choose *Create new persistent storage* and change the **Persistent storage size** to 40 GB. Click **Create workbench**.
5. Open the newly created Jupyter notebook and run the following commands to download the model and upload it to S3:
```
%env S3_ENDPOINT=<S3_RGW_ROUTE>
%env S3_ACCESS_KEY=<AWS_ACCESS_KEY_ID>
%env S3_SECRET_KEY=<AWS_SECRET_ACCESS_KEY>
%env HF_TOKEN=<PASTE_HUGGINGFACE_TOKEN>
```
```
!pip install huggingface-hub
```
```
import os
import glob

import boto3
from huggingface_hub import snapshot_download

bucket_name = 'first.bucket'
path = 'models'

s3_endpoint = os.environ.get('S3_ENDPOINT')
s3_accesskey = os.environ.get('S3_ACCESS_KEY')
s3_secretkey = os.environ.get('S3_SECRET_KEY')
hf_token = os.environ.get('HF_TOKEN')

# Connect to the S3-compatible object storage; certificate verification is
# disabled because the endpoint may use a self-signed certificate
session = boto3.session.Session()
s3_resource = session.resource('s3',
                               endpoint_url=s3_endpoint,
                               verify=False,
                               aws_access_key_id=s3_accesskey,
                               aws_secret_access_key=s3_secretkey)
bucket = s3_resource.Bucket(bucket_name)
```
For Xeon download *ise-uiuc/Magicoder-S-DS-6.7B*:
```
snapshot_download("ise-uiuc/Magicoder-S-DS-6.7B", cache_dir='./models', token=hf_token)
```
For Gaudi download *meta-llama/CodeLlama-7b-hf*:
```
snapshot_download("meta-llama/CodeLlama-7b-hf", cache_dir='./models', token=hf_token)
```
Upload the downloaded model to S3:
```
# Upload only the resolved snapshot files, preserving the directory layout
files = (file for file in glob.glob(f'{path}/**/*', recursive=True) if os.path.isfile(file) and "snapshots" in file)
for filename in files:
    s3_name = filename.replace(path, '')
    print(f'Uploading: {filename} to {path}{s3_name}')
    bucket.upload_file(filename, f'{path}{s3_name}')
```
6. Go to your project in the **Red Hat OpenShift AI** dashboard, then to the **Models** tab, and click **Deploy model** under *Single-model serving platform*. Fill in the **Name**, choose the newly created **Serving runtime**: *Text Generation Inference Magicoder-S-DS-6.7B on CPU* (for Xeon) or *Text Generation Inference CodeLlama-7b-hf on Gaudi* (for Gaudi), set **Model framework** to *llm* and change **Model server size** to *Custom*: 16 CPUs and 64 Gi memory. For deployment on Gaudi select the proper **Accelerator**. Check the box in the **Model route** section to create an external route and uncheck the token authentication. Under **Model location** choose *New data connection* and fill in all the required fields for S3 access, with **Bucket** *first.bucket* and **Path** *models*. Click **Deploy**. It takes about 10 minutes for the model to reach the *Loaded* status.\
If it does not reach the *Loaded* status and the revision status has changed to *ProgressDeadlineExceeded* (`oc get revision`), scale the model deployment down to 0 and then back to 1 as shown below and wait about 10 minutes for the deployment.
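
A minimal recovery sequence, with `<model_deployment_name>` taken from `oc get deployment`:
```
oc scale deployment.apps/<model_deployment_name> --replicas=0
oc scale deployment.apps/<model_deployment_name> --replicas=1
```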

## Deploy CodeGen

1. Log in to the OpenShift CLI, go to your project and find the URL for the TGI_LLM_ENDPOINT:
```
oc get service.serving.knative.dev
```
Update the TGI_LLM_ENDPOINT in your repository.
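
Alternatively, the URL can be captured in one step with a jsonpath query against the Knative service; `<model_name>` is a placeholder for the name your deployed model received, and this replaces the manual `export TGI_LLM_ENDPOINT="YourURL"` in the blocks below:
```
export TGI_LLM_ENDPOINT=$(oc get service.serving.knative.dev <model_name> -o jsonpath='{.status.url}')
echo ${TGI_LLM_ENDPOINT}
```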

On Xeon:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
export TGI_LLM_ENDPOINT="YourURL"
sed -i "s#insert-your-tgi-url-here#${TGI_LLM_ENDPOINT}#g" codegen.yaml
```

On Gaudi:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
export TGI_LLM_ENDPOINT="YourURL"
sed -i "s#insert-your-tgi-url-here#${TGI_LLM_ENDPOINT}#g" codegen.yaml
```

2. Build the Docker images locally.

- LLM Docker Image:
```
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
docker build -t opea/llm-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile .
```
- MegaService Docker Image:
```
git clone https://github.com/opea-project/GenAIExamples
cd GenAIExamples/CodeGen
docker build -t opea/codegen:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
```
- UI Docker Image:
```
cd GenAIExamples/CodeGen/ui
docker build -t opea/codegen-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
```
To verify, run the command: `docker images`.
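
For example, to list only the three images built above:
```
docker images | grep -E 'opea/(llm-tgi|codegen|codegen-ui)'
```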

3. Log in to Docker, tag the images and push them to the image registry with the following commands:

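The `<openshift-image-registry_route>` placeholder can be resolved from the default registry route, assuming the registry was exposed with the default route name as described in the documentation linked in the prerequisites:
```
export REGISTRY=$(oc get route default-route -n openshift-image-registry -o jsonpath='{.spec.host}')
echo ${REGISTRY}
```
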
```
docker login -u <user> -p $(oc whoami -t) <openshift-image-registry_route>
docker tag <image_id> <openshift-image-registry_route>/<namespace>/<image_name>:<tag>
docker push <openshift-image-registry_route>/<namespace>/<image_name>:<tag>
```
To verify, run the command: `oc get istag`.

4. Use the *IMAGE REFERENCE* values from the previous step to update the image names in the manifest files.

On Xeon:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
export IMAGE_LLM_TGI="YourImage"
export IMAGE_CODEGEN="YourImage"
export IMAGE_CODEGEN_UI="YourImage"
sed -i "s#insert-your-image-llm-tgi#${IMAGE_LLM_TGI}#g" codegen.yaml
sed -i "s#insert-your-image-codegen#${IMAGE_CODEGEN}#g" codegen.yaml
sed -i "s#insert-your-image-codegen-ui#${IMAGE_CODEGEN_UI}#g" ui-server.yaml
```

On Gaudi:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
export IMAGE_LLM_TGI="YourImage"
export IMAGE_CODEGEN="YourImage"
export IMAGE_CODEGEN_UI="YourImage"
sed -i "s#insert-your-image-llm-tgi#${IMAGE_LLM_TGI}#g" codegen.yaml
sed -i "s#insert-your-image-codegen#${IMAGE_CODEGEN}#g" codegen.yaml
sed -i "s#insert-your-image-codegen-ui#${IMAGE_CODEGEN_UI}#g" ui-server.yaml
```

5. Create the *rhoai-ca-bundle* secret:
```
oc create secret generic rhoai-ca-bundle --from-literal=tls.crt="$(oc extract secret/knative-serving-cert -n istio-system --to=- --keys=tls.crt)"
```

6. Deploy CodeGen with the command:
```
oc apply -f codegen.yaml
```

7. Check the *codegen* route with the command `oc get routes` and update the route in the *ui-server.yaml* file:
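
Assuming the route created by *codegen.yaml* is named *codegen*, the hostname can also be captured directly instead of copying it by hand (this replaces the manual `export CODEGEN_ROUTE="YourCodegenRoute"` in the blocks below):
```
export CODEGEN_ROUTE=$(oc get route codegen -o jsonpath='{.spec.host}')
echo ${CODEGEN_ROUTE}
```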

On Xeon:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
export CODEGEN_ROUTE="YourCodegenRoute"
sed -i "s/insert-your-codegen-route/${CODEGEN_ROUTE}/g" ui-server.yaml
```

On Gaudi:
```
cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
export CODEGEN_ROUTE="YourCodegenRoute"
sed -i "s/insert-your-codegen-route/${CODEGEN_ROUTE}/g" ui-server.yaml
```

8. Deploy the UI with the command:
```
oc apply -f ui-server.yaml
```

## Verify Services

Make sure all the pods are running, and restart the codegen-xxxx pod if necessary.

```
oc get pods
curl http://${CODEGEN_ROUTE}/v1/codegen -H "Content-Type: application/json" -d '{
    "messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."
    }'
```
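
The model service runs behind Knative, so the first request after a period of inactivity may take noticeably longer while a pod spins up. A simple retry loop such as this sketch (the poll interval is arbitrary) keeps probing until the service answers with HTTP 200:
```
until curl -s -o /dev/null -w '%{http_code}' http://${CODEGEN_ROUTE}/v1/codegen \
    -H "Content-Type: application/json" -d '{"messages": "hello"}' | grep -q 200; do
  echo "Waiting for the CodeGen service..."; sleep 10
done
```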

## Launch the UI

To access the frontend, find the route for *ui-server* with the command `oc get routes` and open it in the browser.