Commit 2257489

Add grpc docs and missing OIP docs for some runtimes (#306)

Add open inference protocol and grpc docs

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

1 parent 74693a7 commit 2257489

File tree

12 files changed: +1529 -102 lines


docs/developer/developer.md (+17 -15)

@@ -37,7 +37,8 @@ they can be installed via the [directions here](https://github.com/knative/docs/
 
 * If you already have `Istio` or `Knative` (e.g. from a Kubeflow install) then you don't need to install them explicitly, as long as version dependencies are satisfied.
 
-> **_NOTE:_** Note: On a local environment, when using `minikube` or `kind` as Kubernetes cluster, there has been a reported issue that [knative quickstart](https://knative.dev/docs/install/quickstart-install/) bootstrap does not work as expected. It is recommended to follow the installation manual from knative using [yaml](https://knative.dev/docs/install/yaml-install/) or using [knative operator](https://knative.dev/docs/install/operator/knative-with-operators/) for a better result.
+!!! Note
+    On a local environment, when using `minikube` or `kind` as Kubernetes cluster, there has been a reported issue that [knative quickstart](https://knative.dev/docs/install/quickstart-install/) bootstrap does not work as expected. It is recommended to follow the installation manual from knative using [yaml](https://knative.dev/docs/install/yaml-install/) or using [knative operator](https://knative.dev/docs/install/operator/knative-with-operators/) for a better result.
 
 ### Setup your environment
 
@@ -152,12 +153,12 @@ make deploy
 make deploy
 ```
 
-==**Expected Output**==
-```console
-$ kubectl get pods -n kserve -l control-plane=kserve-controller-manager
-NAME                          READY   STATUS    RESTARTS   AGE
-kserve-controller-manager-0   2/2     Running   0          13m
-```
+!!! success "Expected Output"
+    ```console
+    $ kubectl get pods -n kserve -l control-plane=kserve-controller-manager
+    NAME                          READY   STATUS    RESTARTS   AGE
+    kserve-controller-manager-0   2/2     Running   0          13m
+    ```
 !!! Note
     By default it installs to `kserve` namespace with the published controller manager image from master branch.
 
@@ -177,12 +178,12 @@ make deploy-dev-xgb
 ```
 
 Run the following command to deploy explainer with your local change.
-```
+```bash
 make deploy-dev-alibi
 ```
 
 Run the following command to deploy storage initializer with your local change.
-```
+```bash
 make deploy-dev-storageInitializer
 ```
 
@@ -204,11 +205,12 @@ You should see model serving deployment running under default or your specified
 $ kubectl get pods -n default -l serving.kserve.io/inferenceservice=flower-sample
 ```
 
-==**Expected Output**==
-```
-NAME                                                      READY   STATUS    RESTARTS   AGE
-flower-sample-default-htz8r-deployment-8fd979f9b-w2qbv    3/3     Running   0          10s
-```
+!!! success "Expected Output"
+    ```
+    NAME                                                      READY   STATUS    RESTARTS   AGE
+    flower-sample-default-htz8r-deployment-8fd979f9b-w2qbv    3/3     Running   0          10s
+    ```
+
 ## Running unit/integration tests
 `kserver-controller-manager` has a few integration tests which requires mock apiserver
 and etcd, they get installed along with [`kubebuilder`](https://book.kubebuilder.io/quick-start.html#installation).
@@ -227,7 +229,7 @@ To setup from local code, do:
 3. `make deploy-dev`
 
 Go to `python/kserve` and install kserve python sdk deps
-```
+```bash
 pip3 install -e .[test]
 ```
 Then go to `test/e2e`.
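The developer guide above stops at "go to `test/e2e`"; as a rough illustration of what typically follows, here is a sketch of running the e2e suite with pytest. The directory layout comes from the doc itself, but the pytest invocation and any test selection flags are assumptions, not part of this commit.

```bash
# Illustrative sketch only: run the KServe e2e tests after installing the SDK test deps.
# Exact markers, test module names, and required cluster setup may differ.
cd python/kserve && pip3 install -e .[test]   # install the SDK with test extras (as above)
cd ../../test/e2e                             # e2e tests live here per the doc
pytest -v .                                   # run the suite; narrow the run with -k or -m
```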

docs/modelserving/v1beta1/lightgbm/README.md (+55 -14)

@@ -129,13 +129,13 @@ curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1
 {"predictions": [[0.9, 0.05, 0.05]]}
 ```
 
-## Deploy the model with [Open Inference Protocol](https://github.com/kserve/kserve/tree/master/docs/predict-api/v2)
+## Deploy the model with [Open Inference Protocol](https://github.com/kserve/open-inference-protocol/)
 
 ### Test the model locally
 Once you've got your model serialized `model.bst`, we can then use [KServe LightGBM Server](https://github.com/kserve/kserve/tree/master/python/lgbserver) to create a local model server.
 
 !!! Note
-    This step is optional and just meant for testing, feel free to jump straight to [deploying with InferenceService](#deploy-with-inferenceservice).
+    This step is optional and just meant for testing, feel free to jump straight to [deploying with InferenceService](#deploy-inferenceservice-with-rest-endpoint).
 
 #### Pre-requisites
 
@@ -162,7 +162,7 @@ The `lgbserver` package takes three arguments.
 With the `lgbserver` runtime package installed locally, you should now be ready to start our server as:
 
 ```bash
-python3 lgbserver --model_dir /path/to/model_dir --model_name lightgbm-iris
+python3 lgbserver --model_dir /path/to/model_dir --model_name lightgbm-v2-iris
 ```
 
 ### Deploy InferenceService with REST endpoint
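As a side note, once the local `lgbserver` process from the hunk above is running, it can be smoke-tested before anything is deployed to the cluster. The sketch below assumes the model server listens on HTTP port 8080 (the usual KServe model server default) and that the model name matches the `--model_name` flag; neither detail comes from this diff.

```bash
# Illustrative local smoke test for the lgbserver started above.
# Port 8080 and the Open Inference Protocol paths are assumptions, not part of this commit.
curl http://localhost:8080/v2/health/ready                      # server readiness
curl http://localhost:8080/v2/models/lightgbm-v2-iris/ready     # model readiness
curl -H "Content-Type: application/json" \
     -d @./iris-input-v2.json \
     http://localhost:8080/v2/models/lightgbm-v2-iris/infer     # inference with the payload introduced below
```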
@@ -205,7 +205,7 @@ kubectl apply -f lightgbm-v2.yaml
 You can now test your deployed model by sending a sample request.
 
 Note that this request **needs to follow the [V2 Dataplane protocol](https://github.com/kserve/kserve/tree/master/docs/predict-api/v2)**.
-You can see an example payload below:
+You can see an example payload below. Create a file named `iris-input-v2.json` with the sample input.
 
 ```json
 {
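The hunk context above cuts the `iris-input-v2.json` payload off at its opening brace; for orientation, the request that typically accompanies it looks like the sketch below, assuming `INGRESS_HOST`, `INGRESS_PORT`, and `SERVICE_HOSTNAME` are set as in the earlier REST example. The exact command is not part of this diff.

```bash
# Illustrative: POST the Open Inference Protocol payload to the v2 REST endpoint.
SERVICE_HOSTNAME=$(kubectl get inferenceservice lightgbm-v2-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v \
  -H "Host: ${SERVICE_HOSTNAME}" \
  -H "Content-Type: application/json" \
  -d @./iris-input-v2.json \
  http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/lightgbm-v2-iris/infer
```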
@@ -263,13 +263,35 @@ curl -v \
 ### Create the InferenceService with gRPC endpoint
 Create the inference service yaml and expose the gRPC port, currently only one port is allowed to expose either HTTP or gRPC port and by default HTTP port is exposed.
 
-=== "Yaml"
+!!! Note
+    Currently, KServe only supports exposing either HTTP or gRPC port. By default, HTTP port is exposed.
 
+=== "Serverless"
     ```yaml
     apiVersion: "serving.kserve.io/v1beta1"
     kind: "InferenceService"
     metadata:
-      name: "lightgbm-v2-iris"
+      name: "lightgbm-v2-iris-grpc"
+    spec:
+      predictor:
+        model:
+          modelFormat:
+            name: lightgbm
+          protocolVersion: v2
+          runtime: kserve-lgbserver
+          storageUri: "gs://kfserving-examples/models/lightgbm/v2/iris"
+          ports:
+            - name: h2c # knative expects grpc port name to be 'h2c'
+              protocol: TCP
+              containerPort: 8081
+    ```
+
+=== "RawDeployment"
+    ```yaml
+    apiVersion: "serving.kserve.io/v1beta1"
+    kind: "InferenceService"
+    metadata:
+      name: "lightgbm-v2-iris-grpc"
     spec:
       predictor:
         model:
@@ -279,7 +301,7 @@ Create the inference service yaml and expose the gRPC port, currently only one p
           runtime: kserve-lgbserver
           storageUri: "gs://kfserving-examples/models/lightgbm/v2/iris"
           ports:
-            - name: h2c
+            - name: grpc-port # Istio requires the port name to be in the format <protocol>[-<suffix>]
               protocol: TCP
               containerPort: 8081
     ```
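To tie the two tabs above to the next step: either manifest is applied like any other `InferenceService`. A minimal sketch, assuming it is saved as `lightgbm-v2-grpc.yaml` (the file this commit also updates):

```bash
# Illustrative: create the gRPC InferenceService and wait for it to report Ready.
kubectl apply -f lightgbm-v2-grpc.yaml
kubectl wait --for=condition=Ready inferenceservice/lightgbm-v2-iris-grpc --timeout=300s
kubectl get inferenceservice lightgbm-v2-iris-grpc
```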
@@ -299,22 +321,22 @@ After the gRPC `InferenceService` becomes ready, [grpcurl](https://github.com/fu
 
 ```bash
 # download the proto file
-curl -O https://raw.githubusercontent.com/kserve/kserve/master/docs/predict-api/v2/grpc_predict_v2.proto
+curl -O https://raw.githubusercontent.com/kserve/open-inference-protocol/main/specification/protocol/open_inference_grpc.proto
 
 INPUT_PATH=iris-input-v2-grpc.json
-PROTO_FILE=grpc_predict_v2.proto
-SERVICE_HOSTNAME=$(kubectl get inferenceservice lightgbm-v2-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3)
+PROTO_FILE=open_inference_grpc.proto
+SERVICE_HOSTNAME=$(kubectl get inferenceservice lightgbm-v2-iris-grpc -o jsonpath='{.status.url}' | cut -d "/" -f 3)
 ```
 
-The gRPC APIs follow the KServe [prediction V2 protocol](https://github.com/kserve/kserve/tree/master/docs/predict-api/v2).
-
+[Determine the ingress IP and port](../../../get_started/first_isvc.md#4-determine-the-ingress-ip-and-ports) and set `INGRESS_HOST` and `INGRESS_PORT`. Now, you can use `curl` to send the inference requests.
+The gRPC APIs follow the KServe [prediction V2 protocol / Open Inference Protocol](https://github.com/kserve/kserve/tree/master/docs/predict-api/v2).
 For example, `ServerReady` API can be used to check if the server is ready:
 
 ```bash
 grpcurl \
   -plaintext \
   -proto ${PROTO_FILE} \
-  -authority ${SERVICE_HOSTNAME}" \
+  -authority ${SERVICE_HOSTNAME} \
   ${INGRESS_HOST}:${INGRESS_PORT} \
   inference.GRPCInferenceService.ServerReady
 ```
@@ -326,6 +348,25 @@ grpcurl \
 }
 ```
 
+You can test the deployed model by sending a sample request with the below payload.
+Notice that the input format differs from the one in the previous `REST endpoint` example.
+Prepare the inference input inside the file named `iris-input-v2-grpc.json`.
+```json
+{
+  "model_name": "lightgbm-v2-iris-grpc",
+  "inputs": [
+    {
+      "name": "input-0",
+      "shape": [2, 4],
+      "datatype": "FP32",
+      "contents": {
+        "fp32_contents": [6.8, 2.8, 4.8, 1.4, 6.0, 3.4, 4.5, 1.6]
+      }
+    }
+  ]
+}
+```
+
 `ModelInfer` API takes input following the `ModelInferRequest` schema defined in the `grpc_predict_v2.proto` file. Notice that the input file differs from that used in the previous `curl` example.
 
 ```bash
@@ -364,7 +405,7 @@ grpcurl \
 
 Response contents:
 {
-  "modelName": "lightgbm-v2-iris",
+  "modelName": "lightgbm-v2-iris-grpc",
   "outputs": [
     {
       "name": "predict",

docs/modelserving/v1beta1/lightgbm/iris-input-v2-grpc.json (+1 -1)

@@ -1,5 +1,5 @@
 {
-  "model_name": "lightgbm-v2-iris",
+  "model_name": "lightgbm-v2-iris-grpc",
   "inputs": [
     {
       "name": "input-0",

docs/modelserving/v1beta1/lightgbm/lightgbm-v2-grpc.yaml (+2 -2)

@@ -1,7 +1,7 @@
 apiVersion: "serving.kserve.io/v1beta1"
 kind: "InferenceService"
 metadata:
-  name: "lightgbm-v2-iris"
+  name: "lightgbm-v2-iris-grpc"
 spec:
   predictor:
     model:
@@ -12,4 +12,4 @@ spec:
       ports:
         - name: h2c
           protocol: TCP
-          containerPort: 9000
+          containerPort: 8081

docs/modelserving/v1beta1/paddle/README.md (+264 -12)

Large diffs are not rendered by default.

docs/modelserving/v1beta1/paddle/jay-v2-grpc.json (+13)

Large diffs are not rendered by default.

docs/modelserving/v1beta1/paddle/jay-v2.json (+12)

Large diffs are not rendered by default.
