refactor(model-serving): remove controller label propagation and add autoscaling example #877

Open
WHOIM1205 wants to merge 1 commit into volcano-sh:main from WHOIM1205:feat/propagate-model-name-annotation-to-pods

WHOIM1205 commented on Apr 9, 2026

Summary

This PR updates the autoscaling example for ModelServing based on the feedback received.

Instead of adding any controller-level logic or fixed labels, the example now relies on user-defined labels in the ModelServing templates.


What was removed

  • Removed controller-side label propagation logic
  • Removed the fixed annotation (kthena.io/model-name)
  • Removed related constants and tests

The idea here is to keep the controller generic and not enforce any specific labeling pattern. Users can define labels as needed.


What was added

  • Added an autoscaling example:
    examples/model-serving/autoscaling-with-keda.yaml

The example demonstrates:

  • Defining labels in ModelServing templates
  • Using Prometheus metrics for scaling
  • Configuring KEDA to scale ModelServing

Fixes & Improvements

  • Fixed Prometheus query label:

    • model_name → model (to match the actual metric)
  • Replaced hardcoded values with placeholders:

    • <model-name>
    • <modelserving-name>
    • <prometheus-url>
  • Clarified a few things in comments:

    • Metrics come from the router, not from pod labels
    • Labels are optional and user-defined
    • These are standard pod-template labels (nothing special added in the controller)
  • Added:

    • pollingInterval: 15
    • cooldownPeriod: 120 (can be tuned depending on workload)
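
As a rough illustration of the placeholder convention above, the following Go sketch fills in the three placeholders before the manifest is applied. The template string is a made-up stand-in (the actual contents of examples/model-serving/autoscaling-with-keda.yaml are not shown in this thread), and the metric name hypothetical_router_requests_total, the function fillPlaceholders, and the sample values are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// fillPlaceholders substitutes the three placeholders named in the PR
// description with concrete values. The function itself is hypothetical
// and is not part of the PR.
func fillPlaceholders(tmpl, modelName, servingName, promURL string) string {
	r := strings.NewReplacer(
		"<model-name>", modelName,
		"<modelserving-name>", servingName,
		"<prometheus-url>", promURL,
	)
	return r.Replace(tmpl)
}

func main() {
	// Stand-in fragment only: the key layout and the metric name are
	// assumptions, not the contents of the example file.
	tmpl := `target: <modelserving-name>
serverAddress: <prometheus-url>
query: sum(rate(hypothetical_router_requests_total{model="<model-name>"}[1m]))`

	fmt.Println(fillPlaceholders(tmpl, "my-model", "my-serving", "http://prometheus.example:9090"))
}
```

Note that the query filters on the model label (not model_name), matching the fix described above.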

How it works

  1. User defines labels in the ModelServing template
  2. These labels are part of the pod template and show up on the created pods
  3. The router emits metrics with a model label
  4. KEDA queries Prometheus using that label
  5. The HPA managed by KEDA updates spec.replicas on ModelServing
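
The last two steps can be sketched as simple arithmetic. The Go snippet below is a simplified model, assuming the metric is treated as a total to be divided by a per-replica target (the average-value style of target); it ignores the HPA's tolerance band and stabilization behavior. The function name desiredReplicas and the numbers used are illustrative assumptions, not values from the example file.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas approximates the replica count an HPA would request for a
// total metric value and a per-replica target, clamped to [minReplicas,
// maxReplicas]. Simplified: no tolerance band, no stabilization window.
func desiredReplicas(metricValue, targetPerReplica float64, minReplicas, maxReplicas int32) int32 {
	if targetPerReplica <= 0 {
		return minReplicas // invalid target: fall back to the floor
	}
	want := math.Ceil(metricValue / targetPerReplica)
	if math.IsNaN(want) || want < float64(minReplicas) {
		return minReplicas
	}
	if want > float64(maxReplicas) {
		return maxReplicas
	}
	return int32(want)
}

func main() {
	// 45 requests/s reported for the model, 10 requests/s per replica,
	// bounds 1..8 replicas: ceil(45/10) = 5.
	fmt.Println(desiredReplicas(45, 10, 1, 8)) // prints 5
}
```

The result is what ends up in spec.replicas; the pod labels from steps 1 and 2 play no part in the calculation, since the metric comes from the router.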

Notes

  • No changes to controller behavior
  • No CRD changes
  • Fully backward compatible
  • Autoscaling is optional and user-driven

Context

In an earlier iteration of this PR, I tried adding controller-side label propagation, but based on feedback, that approach was dropped.

The current version keeps things simple and just adds an example using user-defined labels.


Goal

Provide a simple and flexible example of autoscaling with Prometheus + KEDA, without introducing any opinionated controller changes.

Copilot AI review requested due to automatic review settings April 9, 2026 20:04
WHOIM1205 changed the title from "propagate kthena.io/model-name annotation from ModelServing to pod la…" to "feat(model-serving): propagate model-name annotation to pod labels for autoscaling" on Apr 9, 2026

gemini-code-assist (bot) left a comment

Code Review

This pull request introduces the propagation of the 'kthena.io/model-name' annotation from the ModelServing CR to the corresponding pod labels. The implementation includes validation to ensure the annotation value is a valid Kubernetes label and adds comprehensive unit tests to verify the propagation logic and precedence rules. I have reviewed the code and suggest removing the redundant '!exists' check in the label assignment logic, as the pod labels are initialized with a fixed set of keys that do not include this annotation.
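
For context on the validation mentioned above, here is a minimal standard-library sketch of the Kubernetes label-value rule: at most 63 characters, empty or starting and ending with an alphanumeric character, with '-', '_' and '.' allowed in between. The helper isValidLabelValue is hypothetical; the code in the earlier iteration of this PR is not shown in the thread, and real controller code would more likely call validation.IsValidLabelValue from k8s.io/apimachinery.

```go
package main

import (
	"fmt"
	"regexp"
)

// maxLabelValueLen is the documented upper bound for a label value.
const maxLabelValueLen = 63

// labelValueRE encodes the documented label-value syntax: empty, or
// alphanumeric at both ends with '-', '_' and '.' allowed in between.
var labelValueRE = regexp.MustCompile(`^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$`)

// isValidLabelValue reports whether v can be used as a label value.
// Hypothetical helper for illustration only.
func isValidLabelValue(v string) bool {
	return len(v) <= maxLabelValueLen && labelValueRE.MatchString(v)
}

func main() {
	for _, v := range []string{"llama-3.1_8b", "meta/llama", "-bad", ""} {
		fmt.Printf("%q valid=%v\n", v, isValidLabelValue(v))
	}
}
```

A model name containing '/' (common for hub-style names) fails this check, which is one reason an annotation value cannot always be copied into a label verbatim.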

Comment on lines +156 to +158
		} else if _, exists := pod.Labels[workloadv1alpha1.ModelNameAnnotationKey]; !exists {
			pod.Labels[workloadv1alpha1.ModelNameAnnotationKey] = modelName
		}

medium

The !exists check is redundant here because pod.Labels was just initialized a few lines above (lines 135-142) with a fixed set of keys that does not include ModelNameAnnotationKey. Removing this check simplifies the logic without changing behavior, as any subsequent overrides from the role template are handled later in addPodLabelAndAnnotation.

		} else {
			pod.Labels[workloadv1alpha1.ModelNameAnnotationKey] = modelName
		}
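
The reviewer's point can be checked with a small self-contained sketch. It assumes that workloadv1alpha1.ModelNameAnnotationKey equals "kthena.io/model-name" (the annotation named in the PR description) and uses a placeholder key for the fixed labels, since the initialization at lines 135-142 is not shown in this thread.

```go
package main

import "fmt"

// modelNameKey is assumed to match workloadv1alpha1.ModelNameAnnotationKey.
const modelNameKey = "kthena.io/model-name"

// freshLabels stands in for the fixed label set created just before the
// assignment; the key used here is a placeholder, not the real set.
func freshLabels() map[string]string {
	return map[string]string{"hypothetical/fixed-key": "x"}
}

// withCheck mirrors the original snippet: assign only if the key is absent.
func withCheck(modelName string) map[string]string {
	labels := freshLabels()
	if _, exists := labels[modelNameKey]; !exists {
		labels[modelNameKey] = modelName
	}
	return labels
}

// withoutCheck mirrors the suggested change: assign unconditionally.
func withoutCheck(modelName string) map[string]string {
	labels := freshLabels()
	labels[modelNameKey] = modelName
	return labels
}

// sameLabels reports whether two label maps hold identical entries.
func sameLabels(a, b map[string]string) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if w, ok := b[k]; !ok || w != v {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(sameLabels(withCheck("my-model"), withoutCheck("my-model"))) // prints true
}
```

The two variants only diverge if the freshly built map could already contain the key, which the review says is not the case here.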

@FAUST-BENCHOU
Contributor

/retest

@volcano-sh-bot
Contributor

@FAUST-BENCHOU: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign yaozengzeng for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

WHOIM1205 changed the title from "feat(model-serving): propagate model-name annotation to pod labels for autoscaling" to "refactor(model-serving): remove controller label propagation and add autoscaling example" on Apr 10, 2026
WHOIM1205 force-pushed the feat/propagate-model-name-annotation-to-pods branch from 17e10b4 to a0851ac on April 10, 2026 20:53
@FAUST-BENCHOU
Contributor

I don't see any removal work, only a YAML example.

Adds a reference YAML showing how users can autoscale ModelServing with
KEDA + Prometheus using a router-level metric. Labels are user-defined
in the pod templates; no controller-side label propagation is introduced.

Signed-off-by: WHOIM1205 <[email protected]>
WHOIM1205 force-pushed the feat/propagate-model-name-annotation-to-pods branch from a0851ac to b0641a7 on April 13, 2026 18:50
@WHOIM1205
Contributor Author

"I don't see any removal work, only a YAML example."

Hey @FAUST-BENCHOU, thanks for the feedback.

I've cleaned up the PR a bit: I removed the earlier controller-side label propagation approach and kept it aligned with using user-defined labels only. I also squashed the commits so the diff reflects the final state more clearly.

Right now it just adds the autoscaling example with Prometheus and KEDA, without introducing any controller changes.

Happy to adjust anything further if needed.

@LiZhenCheng9527
Contributor

If you’ve done these things, you should update the PR title and description.


4 participants