When using KServe in `RawDeployment` mode, Knative is not installed. In this mode, if you deploy an `InferenceService`, KServe uses the **Kubernetes Horizontal Pod Autoscaler (HPA)** for autoscaling instead of the **Knative Pod Autoscaler (KPA)**. For more information about KServe's autoscaler, see the [Knative autoscaler section](https://kserve.github.io/website/master/modelserving/v1beta1/torchserve/#knative-autoscaler).
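As a rough illustration of the behaviour described above, a minimal `InferenceService` for `RawDeployment` mode might look like the sketch below. It assumes the `serving.kserve.io/deploymentMode` and `serving.kserve.io/autoscalerClass` annotations; the resource name and `storageUri` are placeholders chosen for the example, not values taken from this change.

```yaml
# Sketch only: name and storageUri are illustrative placeholders.
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris-hpa"
  annotations:
    # Deploy as a plain Kubernetes Deployment, without Knative.
    serving.kserve.io/deploymentMode: RawDeployment
    # Ask KServe to create an HPA for the Deployment.
    serving.kserve.io/autoscalerClass: hpa
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"
```

With a manifest along these lines, `kubectl get hpa` in the same namespace should list an HPA created for the predictor, assuming the cluster behaves as the paragraph above describes.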
 
 
-=== "Old Schema"
+=== "New Schema"
 
 ```yaml
 apiVersion: "serving.kserve.io/v1beta1"
@@ -524,11 +524,13 @@ When using Kserve with the `RawDeployment` mode, Knative is not installed. In th
If you want to control the scaling of the deployment created by a KServe `InferenceService` with an external tool such as [`KEDA`](https://keda.sh/), you can disable KServe's creation of the **HPA** by setting the autoscaler class annotation to **external**.
 
-=== "Old Schema"
+=== "New Schema"
 
 ```yaml
 apiVersion: "serving.kserve.io/v1beta1"
@@ -564,11 +564,13 @@ If you want to control the scaling of the deployment created by KServe inference
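For the external-autoscaler case, the manifest differs from the HPA variant only in the autoscaler class value. The sketch below assumes the annotation key is `serving.kserve.io/autoscalerClass`; the resource name and `storageUri` are again placeholders, not values from this change.

```yaml
# Sketch only: name and storageUri are illustrative placeholders.
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris-external"
  annotations:
    serving.kserve.io/deploymentMode: RawDeployment
    # "external" tells KServe not to create an HPA, leaving scaling
    # to a tool such as KEDA.
    serving.kserve.io/autoscalerClass: external
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"
```

The external tool then needs its own scaling resource pointing at the Deployment that KServe creates. That resource is not part of this change and is left out here.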