refactor(model-serving): remove controller label propagation and add autoscaling example by WHOIM1205 · Pull Request #877 · volcano-sh/kthena

WHOIM1205 · 2026-04-09T20:04:23Z

Summary

This PR updates the autoscaling example for ModelServing based on the feedback received.

Instead of adding any controller-level logic or fixed labels, the example now relies on user-defined labels in the ModelServing templates.

What was removed

Removed controller-side label propagation logic
Removed the fixed annotation (kthena.io/model-name)
Removed related constants and tests

The idea here is to keep the controller generic and not enforce any specific labeling pattern. Users can define labels as needed.

What was added

Added an autoscaling example:
examples/model-serving/autoscaling-with-keda.yaml

The example demonstrates:

Defining labels in ModelServing templates
Using Prometheus metrics for scaling
Configuring KEDA to scale ModelServing

Fixes & Improvements

Fixed Prometheus query label:
- model_name → model (to match the actual metric)
Replaced hardcoded values with placeholders:
- <model-name>
- <modelserving-name>
- <prometheus-url>
Clarified a few things in comments:
- Metrics come from the router, not from pod labels
- Labels are optional and user-defined
- These are standard pod-template labels (nothing special added in the controller)
Added:
- pollingInterval: 15
- cooldownPeriod: 120 (can be tuned depending on workload)

How it works

User defines labels in the ModelServing template
These labels are part of the pod template and show up on the created pods
The router emits metrics with a model label
KEDA queries Prometheus using that label
The HPA created by KEDA updates spec.replicas on ModelServing
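The flow above can be sketched in Go as a rough illustration, not as what KEDA or the HPA actually run. The metric name `router_requests_total`, the model name `my-model`, the threshold, and the replica bounds are hypothetical placeholders that do not come from this PR. The replica calculation assumes an average-value style target (metric value divided by threshold, rounded up) and ignores HPA tolerance and stabilization windows.

```go
package main

import (
	"fmt"
	"math"
)

// buildQuery returns a PromQL query filtered on the router's `model` label.
// The metric name passed in is a hypothetical placeholder, not a metric
// confirmed by this PR.
func buildQuery(metric, model string) string {
	return fmt.Sprintf(`sum(rate(%s{model=%q}[1m]))`, metric, model)
}

// desiredReplicas approximates the scaling decision for an average-value
// target: ceil(metricValue / threshold), clamped to [minReplicas, maxReplicas].
// This is a simplification that skips tolerance and stabilization.
func desiredReplicas(metricValue, threshold float64, minReplicas, maxReplicas int) int {
	if threshold <= 0 {
		return minReplicas
	}
	n := int(math.Ceil(metricValue / threshold))
	if n < minReplicas {
		return minReplicas
	}
	if n > maxReplicas {
		return maxReplicas
	}
	return n
}

func main() {
	// Hypothetical values: 35 req/s observed, 10 req/s target per replica.
	fmt.Println(buildQuery("router_requests_total", "my-model"))
	fmt.Println(desiredReplicas(35, 10, 1, 8)) // prints 4
}
```

The point of the sketch is that the query filters on the router's `model` label, so pod labels do not participate in the metric lookup at all.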

Notes

No changes to controller behavior
No CRD changes
Fully backward compatible
Autoscaling is optional and user-driven

Context

In an earlier iteration of this PR, I tried adding controller-side label propagation, but based on feedback, that approach was dropped.

The current version keeps things simple and just adds an example using user-defined labels.

Goal

Provide a simple and flexible example of autoscaling with Prometheus + KEDA, without introducing any opinionated controller changes.

gemini-code-assist

Code Review

This pull request introduces the propagation of the 'kthena.io/model-name' annotation from the ModelServing CR to the corresponding pod labels. The implementation includes validation to ensure the annotation value is a valid Kubernetes label and adds comprehensive unit tests to verify the propagation logic and precedence rules. I have reviewed the code and suggest removing the redundant '!exists' check in the label assignment logic, as the pod labels are initialized with a fixed set of keys that do not include this annotation.

gemini-code-assist · 2026-04-09T20:10:51Z

pkg/model-serving-controller/utils/utils.go

+		} else if _, exists := pod.Labels[workloadv1alpha1.ModelNameAnnotationKey]; !exists {
+			pod.Labels[workloadv1alpha1.ModelNameAnnotationKey] = modelName
+		}


The !exists check is redundant here because pod.Labels was just initialized a few lines above (lines 135-142) with a fixed set of keys that does not include ModelNameAnnotationKey. Removing this check simplifies the logic without changing behavior, as any subsequent overrides from the role template are handled later in addPodLabelAndAnnotation.

		} else {
			pod.Labels[workloadv1alpha1.ModelNameAnnotationKey] = modelName
		}
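To illustrate the reviewer's point, here is a minimal standalone sketch, not the controller code itself. The `newPodLabels` helper and its keys are hypothetical stand-ins for the fixed label set the comment refers to, and `modelNameKey` is assumed to hold the same value as `workloadv1alpha1.ModelNameAnnotationKey`. When the map is freshly built without that key, the guarded and unguarded assignments give the same result.

```go
package main

import "fmt"

// modelNameKey stands in for workloadv1alpha1.ModelNameAnnotationKey
// (value assumed from the PR description).
const modelNameKey = "kthena.io/model-name"

// newPodLabels mimics a label map initialized with a fixed set of keys.
// The keys and values here are hypothetical.
func newPodLabels() map[string]string {
	return map[string]string{
		"example.io/serving-name": "demo",
		"example.io/role":         "worker",
	}
}

// withGuard sets the label only if it is absent (the form under review).
func withGuard(modelName string) map[string]string {
	labels := newPodLabels()
	if _, exists := labels[modelNameKey]; !exists {
		labels[modelNameKey] = modelName
	}
	return labels
}

// withoutGuard sets the label unconditionally (the suggested form).
func withoutGuard(modelName string) map[string]string {
	labels := newPodLabels()
	labels[modelNameKey] = modelName
	return labels
}

func main() {
	a, b := withGuard("llama"), withoutGuard("llama")
	fmt.Println(a[modelNameKey] == b[modelNameKey], len(a) == len(b)) // prints "true true"
}
```

The equivalence only holds while the initial key set excludes the model-name key; if that set ever changes, the two forms would diverge.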

FAUST-BENCHOU · 2026-04-10T02:39:21Z

/retest

volcano-sh-bot · 2026-04-10T02:39:45Z

@FAUST-BENCHOU: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

volcano-sh-bot · 2026-04-10T20:42:12Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign yaozengzeng for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

examples/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

FAUST-BENCHOU · 2026-04-13T14:54:28Z

I don't see any removal work, only a YAML example.

Adds a reference YAML showing how users can autoscale ModelServing with KEDA + Prometheus using a router-level metric. Labels are user-defined in the pod templates; no controller-side label propagation is introduced.

Signed-off-by: WHOIM1205 <[email protected]>

WHOIM1205 · 2026-04-13T18:55:06Z

I don't see any removal work, only a YAML example.

Hey @FAUST-BENCHOU, thanks for the feedback.

I've cleaned up the PR a bit: I removed the earlier controller-side label propagation approach and kept it aligned with using user-defined labels only. I also squashed the commits so the diff reflects the final state more clearly.

Right now it just adds the autoscaling example with Prometheus and KEDA, without introducing any controller changes.

Happy to adjust anything further if needed.

LiZhenCheng9527 · 2026-04-15T07:50:31Z


If you’ve done these things, you should update the PR title and description.

Copilot AI review requested due to automatic review settings April 9, 2026 20:04

volcano-sh-bot requested review from git-malu and hzxuzhonghu April 9, 2026 20:04

volcano-sh-bot added the size/L label Apr 9, 2026

WHOIM1205 changed the title ~~propagate kthena.io/model-name annotation from ModelServing to pod la…~~ feat(model-serving): propagate model-name annotation to pod labels for autoscaling Apr 9, 2026

Copilot started reviewing on behalf of WHOIM1205 April 9, 2026 20:05

gemini-code-assist bot reviewed Apr 9, 2026


volcano-sh-bot added size/M and removed size/L labels Apr 10, 2026

WHOIM1205 changed the title ~~feat(model-serving): propagate model-name annotation to pod labels for autoscaling~~ refactor(model-serving): remove controller label propagation and add autoscaling example Apr 10, 2026

WHOIM1205 force-pushed the feat/propagate-model-name-annotation-to-pods branch from 17e10b4 to a0851ac April 10, 2026 20:53

WHOIM1205 force-pushed the feat/propagate-model-name-annotation-to-pods branch from a0851ac to b0641a7 April 13, 2026 18:50

WHOIM1205 wants to merge 1 commit into volcano-sh:main from WHOIM1205:feat/propagate-model-name-annotation-to-pods


4 participants
