When using KServe in `RawDeployment` mode, Knative is not installed. In this mode, if you deploy an `InferenceService`, KServe uses the **Kubernetes Horizontal Pod Autoscaler (HPA)** for autoscaling instead of the **Knative Pod Autoscaler (KPA)**. For more information about KServe's autoscaler, see [the Knative autoscaler section](https://kserve.github.io/website/master/modelserving/v1beta1/torchserve/#knative-autoscaler).
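As a rough illustration of the paragraph above, an `InferenceService` in `RawDeployment` mode that relies on the HPA might look like the sketch below. The resource name, the `storageUri`, and the CPU target are placeholders chosen for illustration, and the annotation keys are the ones KServe documents for raw deployments; check them against the KServe version you run.

```yaml
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris-hpa"                          # hypothetical name
  annotations:
    serving.kserve.io/deploymentMode: RawDeployment # no Knative involved
    serving.kserve.io/autoscalerClass: hpa          # let KServe create an HPA
    serving.kserve.io/metric: cpu                   # assumed scaling metric
    serving.kserve.io/targetUtilizationPercentage: "80"  # assumed target
spec:
  predictor:
    model:                                          # "New Schema" layout
      modelFormat:
        name: sklearn
      storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"  # placeholder model location
```

After applying a manifest like this, `kubectl get hpa` in the same namespace should list the autoscaler that KServe created for the predictor deployment.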
-=== "Old Schema"
+=== "New Schema"
```yaml
apiVersion: "serving.kserve.io/v1beta1"
@@ -519,11 +519,13 @@ When using Kserve with the `RawDeployment` mode, Knative is not installed. In th
If you want to control the scaling of the deployment created by a KServe inference service with an external tool such as [`KEDA`](https://keda.sh/), you can disable KServe's creation of the **HPA** by setting the autoscaler class annotation to **external**.
-=== "Old Schema"
+=== "New Schema"
```yaml
apiVersion: "serving.kserve.io/v1beta1"
@@ -559,11 +559,13 @@ If you want to control the scaling of the deployment created by KServe inference
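For the external-autoscaler case described above, a minimal sketch of the annotation change could look like this. The resource name and `storageUri` are placeholders, and the manifest assumes the same `RawDeployment` setup as the rest of this section; only the `autoscalerClass` value differs from the HPA case.

```yaml
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris-external"                     # hypothetical name
  annotations:
    serving.kserve.io/deploymentMode: RawDeployment
    serving.kserve.io/autoscalerClass: external     # KServe skips HPA creation
spec:
  predictor:
    model:                                          # "New Schema" layout
      modelFormat:
        name: sklearn
      storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"  # placeholder model location
```

With this annotation in place, scaling of the resulting deployment is left entirely to the external tool (for example a KEDA `ScaledObject` that targets the deployment), so that tool has to be configured separately.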