# Announcing: KServe vx.xx

We are excited to announce the release of KServe x.xx. In this release we made enhancements to the KServe control plane, most notably bringing `RawDeployment` mode to `InferenceGraph`; previously `RawDeployment` existed only for `InferenceService`.

Here is a summary of the key changes:

## KServe Core Inference Enhancements

- Inference Graph enhancements to support `RawDeployment` mode, along with autoscaling configuration right within the `InferenceGraphSpec`.

InferenceGraph `RawDeployment` makes the deployment lightweight by using native Kubernetes resources. See the comparison below.

AutoScaling configuration fields were introduced to support scaling needs in `RawDeployment` mode. These fields are optional and, when set, take effect only when the `serving.kserve.io/autoscalerClass` annotation does not point to `external`. See the following example with the autoscaling fields `minReplicas`, `maxReplicas`, `scaleTarget`, and `scaleMetric`:

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: InferenceGraph
metadata:
  name: graph-with-switch-node
  annotations:
    serving.kserve.io/deploymentMode: "RawDeployment"
spec:
  nodes:
    root:
      routerType: Sequence
      steps:
      - name: "rootStep1"
        nodeName: node1
        dependency: Hard
      - name: "rootStep2"
        serviceName: {{ success_200_isvc_id }}
    node1:
      routerType: Switch
      steps:
      - name: "node1Step1"
        serviceName: {{ error_404_isvc_id }}
        condition: "[@this].#(decision_picker==ERROR)"
        dependency: Hard
  minReplicas: 5
  maxReplicas: 10
  scaleTarget: 50
  scaleMetric: "cpu"
```
For more details please refer to the [issue](https://github.com/kserve/kserve/issues/2454).
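If scaling is instead managed by an external autoscaler, the same annotation can be pointed at `external`, and the autoscaling fields above are then ignored. The snippet below is a minimal sketch of that configuration rather than an excerpt from the release notes: the graph name is a hypothetical placeholder, and the `serviceName` template value is reused from the example above.

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: InferenceGraph
metadata:
  name: graph-with-external-autoscaler  # illustrative name, not from the release notes
  annotations:
    serving.kserve.io/deploymentMode: "RawDeployment"
    # With the autoscaler class set to "external", the minReplicas/maxReplicas/
    # scaleTarget/scaleMetric fields are ignored and scaling is left to an
    # external autoscaler.
    serving.kserve.io/autoscalerClass: "external"
spec:
  nodes:
    root:
      routerType: Sequence
      steps:
      - name: "rootStep1"
        serviceName: {{ success_200_isvc_id }}
```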
-

### Enhanced Python SDK Dependency Management

-
-

### KServe Python Runtimes Improvements

-

### LLM Runtimes

#### TorchServe LLM Runtime

#### vLLM Runtime

## ModelMesh Updates

### Storing Models on Kubernetes Persistent Volumes (PVC)

### Horizontal Pod Autoscaling (HPA)

### Model Metrics, Metrics Dashboard, Payload Event Logging

## What's Changed? :warning:

## Join the community

- Visit our [Website](https://kserve.github.io/website/) or [GitHub](https://github.com/kserve)
- Join the Slack ([#kserve](https://kubeflow.slack.com/?redir=%2Farchives%2FCH6E58LNP))
- Attend our community meeting by subscribing to the [KServe calendar](https://wiki.lfaidata.foundation/display/kserve/calendars).
- View our [community github repository](https://github.com/kserve/community) to learn how to make contributions. We are excited to work with you to make KServe better and promote its adoption!
Thanks to all the contributors who have made commits to the 0.11 release!

The KServe Working Group