Add LWS Labels Extension Plugin for ModelServing Pods #767

Open
WHOIM1205 wants to merge 1 commit into volcano-sh:main from WHOIM1205:feat/lws-labels-plugin

@WHOIM1205
Contributor

What type of PR is this?

/kind enhancement

What this PR does / why we need it:

Pods created by the ModelServing controller currently only include kthena/volcano-specific labels.
When the workload originates from a LeaderWorkerSet (LWS), the standard leaderworkerset.sigs.k8s.io/* labels are missing, which breaks compatibility with the LWS ecosystem (controllers, monitoring, logging).

This PR adds a built-in extension plugin that injects the standard LWS labels during pod creation, ensuring LWS-managed pods can be correctly identified and grouped while remaining fully backward compatible.

Which issue(s) this PR fixes:

Fixes #759

Special notes for your reviewer:

  • Implemented as a built-in extension plugin using the existing plugin framework
  • Labels are injected in OnPodCreate, after pod generation and before the Kubernetes Create() call
  • Label injection is additive only and never overwrites user-defined labels
  • No changes to existing controllers or reconciliation logic
  • Includes table-driven unit tests covering leader/worker pods and edge cases
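
The additive-only behaviour described in the notes above can be illustrated with a minimal sketch. The helper name setIfAbsent comes from the changelog for this PR, but its signature and return value here are assumptions, not the code in the diff.

```go
package main

import "fmt"

// setIfAbsent writes value under key only when key is not already present.
// It reports whether the map was modified. The signature is an assumption
// made for illustration; the helper in the PR may differ.
func setIfAbsent(labels map[string]string, key, value string) bool {
	if _, exists := labels[key]; exists {
		return false
	}
	labels[key] = value
	return true
}

func main() {
	// A pod that already carries a user-defined LWS name label.
	labels := map[string]string{
		"leaderworkerset.sigs.k8s.io/name": "user-override",
	}

	setIfAbsent(labels, "leaderworkerset.sigs.k8s.io/name", "my-lws")
	setIfAbsent(labels, "leaderworkerset.sigs.k8s.io/group-index", "0")

	fmt.Println(labels["leaderworkerset.sigs.k8s.io/name"])        // user-override
	fmt.Println(labels["leaderworkerset.sigs.k8s.io/group-index"]) // 0
}
```

Because the helper never writes over an existing key, running it on a pod that already has labels can only add entries.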

Copilot AI review requested due to automatic review settings February 11, 2026 00:17
@volcano-sh-bot volcano-sh-bot added the kind/enhancement New feature or request label Feb 11, 2026
@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign hzxuzhonghu for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gemini-code-assist

Summary of Changes

Hello @WHOIM1205, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the ModelServing controller's compatibility with the Kubernetes LeaderWorkerSet (LWS) ecosystem. It introduces a new built-in plugin that automatically injects the standard LWS-specific labels into pods created by the ModelServing controller. This ensures that LWS-managed pods are correctly identified and grouped, facilitating integration with LWS-aware tools and controllers, without altering existing controller logic or reconciliation.

Highlights

  • New Built-in Extension Plugin: Introduced a new built-in extension plugin named "lws-labels" to enhance ModelServing functionality.
  • LeaderWorkerSet Label Injection: Enabled the injection of standard leaderworkerset.sigs.k8s.io/* labels into ModelServing Pods, ensuring compatibility with the LWS ecosystem.
  • Non-destructive Label Merging: Implemented label injection in the OnPodCreate hook so that existing user-defined labels are never overwritten; only missing labels are added.
  • No Core Controller Changes: The changes are isolated to the new plugin, with no modifications to existing controllers or reconciliation logic.
  • Comprehensive Unit Testing: Included comprehensive table-driven unit tests covering various scenarios, including leader/worker pods, edge cases, and plugin registration.

Changelog
  • pkg/model-serving-controller/plugins/lws_labels_plugin.go
    • Defined the LWSLabelsPlugin structure and its Name() method.
    • Implemented the OnPodCreate method to derive and inject LWS labels (name, group-index, worker-index, group-key) into the pod's metadata.
    • Added a setIfAbsent helper function to prevent overwriting existing labels.
    • Registered the plugin with the DefaultRegistry during initialization.
    • Declared OnPodReady as a no-op.
  • pkg/model-serving-controller/plugins/lws_labels_plugin_test.go
    • Created TestLWSLabelsPluginOnPodCreate with table-driven tests for various scenarios: entry pods, worker pods, preserving existing labels, and handling nil inputs.
    • Added TestLWSLabelsPluginReadyNoop to confirm OnPodReady is a no-op.
    • Included TestLWSLabelsPluginRegistration to verify the plugin is correctly registered and instantiated.
Activity
  • No specific activity (comments, reviews, or progress updates) has been recorded for this pull request yet.

@WHOIM1205
Contributor Author

/assign @YaoZengzeng

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new built-in plugin, lws-labels, to inject standard LeaderWorkerSet (LWS) labels into pods managed by the ModelServing controller. A security analysis of pkg/model-serving-controller/plugins/lws_labels_plugin.go found no vulnerabilities. The implementation is clean and well-tested, though there is a suggestion to enhance test assertion robustness. Overall, this is a great enhancement for improving compatibility with the LWS ecosystem.

Comment on lines +175 to +182
for key, want := range tt.expectLabels {
	got, ok := pod.Labels[key]
	if !ok {
		t.Errorf("label %s missing", key)
	} else if got != want {
		t.Errorf("label %s = %q, want %q", key, got, want)
	}
}

medium

For more robust test assertions, it's better to compare the entire labels map using reflect.DeepEqual. This ensures that no unexpected labels are added and that the final state of the labels is exactly as expected. The current loop-based check only verifies that a subset of expected labels exist, but it wouldn't catch any extra, erroneously added labels.

Note: You will need to import the reflect package to use this function.

if !reflect.DeepEqual(pod.Labels, tt.expectLabels) {
	t.Errorf("labels mismatch.\nGot:  %v\nWant: %v", pod.Labels, tt.expectLabels)
}

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a built-in ModelServing extension plugin intended to inject standard LeaderWorkerSet (LWS) labels onto pods created by the ModelServing controller, improving compatibility with the LWS ecosystem.

Changes:

  • Added LWSLabelsPlugin to inject leaderworkerset.sigs.k8s.io/* labels during OnPodCreate.
  • Added unit tests covering entry/worker pods, nil-safety, and plugin registration.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| pkg/model-serving-controller/plugins/lws_labels_plugin.go | New built-in plugin that derives and injects 4 standard LWS labels during pod creation. |
| pkg/model-serving-controller/plugins/lws_labels_plugin_test.go | Table-driven tests for label injection behavior and DefaultRegistry registration. |

Comment on lines +46 to +54
func init() {
	DefaultRegistry.Register(LWSLabelsPluginName, NewLWSLabelsPlugin)
}

// NewLWSLabelsPlugin constructs the LWS labels plugin from a PluginSpec.
// This plugin does not require any configuration.
func NewLWSLabelsPlugin(spec workloadv1alpha1.PluginSpec) (Plugin, error) {
	return &LWSLabelsPlugin{name: spec.Name}, nil
}

Copilot AI Feb 11, 2026

As implemented, this plugin only runs when it is explicitly listed in ModelServing.spec.plugins. The current LWS->ModelServing translation (pkg/model-serving-controller/controller/lws_controller.go:constructModelServing) does not add this plugin, so LWS-originated pods will still miss the leaderworkerset.sigs.k8s.io/* labels by default. Consider wiring this in (e.g., have the LWS controller inject a built-in PluginSpec{Name: "lws-labels", Type: BuiltIn} or add defaulting based on the LWS ownerRef), otherwise the PR won't actually fix #759 as described.
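
A rough sketch of the first option (having the LWS controller add the plugin entry itself) might look like the following. PluginSpec here is a simplified local stand-in for workloadv1alpha1.PluginSpec, and ensureLWSLabelsPlugin is a hypothetical helper; neither is taken from the repository.

```go
package main

import "fmt"

// PluginSpec is a simplified stand-in for workloadv1alpha1.PluginSpec.
// The real type may have more fields and a dedicated type for Type.
type PluginSpec struct {
	Name string
	Type string
}

const (
	lwsLabelsPluginName = "lws-labels"
	pluginTypeBuiltIn   = "BuiltIn" // assumed spelling of the built-in type
)

// ensureLWSLabelsPlugin returns plugins with an lws-labels entry appended,
// unless an entry with that name is already present.
func ensureLWSLabelsPlugin(plugins []PluginSpec) []PluginSpec {
	for _, p := range plugins {
		if p.Name == lwsLabelsPluginName {
			return plugins
		}
	}
	return append(plugins, PluginSpec{Name: lwsLabelsPluginName, Type: pluginTypeBuiltIn})
}

func main() {
	// Plugins as they might look on a ModelServing built from an LWS object.
	plugins := ensureLWSLabelsPlugin(nil)
	plugins = ensureLWSLabelsPlugin(plugins) // second call is a no-op

	fmt.Println(len(plugins), plugins[0].Name) // 1 lws-labels
}
```

Making the helper idempotent means it can be called on every reconcile of the translated ModelServing without accumulating duplicate entries.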

Comment on lines +60 to +79
func (p *LWSLabelsPlugin) OnPodCreate(_ context.Context, req *HookRequest) error {
	if req == nil || req.Pod == nil || req.ModelServing == nil {
		return nil
	}

	// Derive label values from the HookRequest context.
	lwsName := req.ModelServing.Name

	// Extract group index from the serving group name (e.g. "my-lws-0" → "0").
	_, groupIndex := utils.GetParentNameAndOrdinal(req.ServingGroup)
	if groupIndex < 0 {
		return fmt.Errorf("cannot extract group index from serving group name %q", req.ServingGroup)
	}
	groupIndexStr := strconv.Itoa(groupIndex)

	// Extract worker index from the pod name (trailing ordinal, e.g. "my-lws-0-default-0-1" → "1").
	_, workerIndex := utils.GetParentNameAndOrdinal(req.Pod.Name)
	if workerIndex < 0 {
		return fmt.Errorf("cannot extract worker index from pod name %q", req.Pod.Name)
	}

Copilot AI Feb 11, 2026

OnPodCreate returns an error if it cannot parse req.ServingGroup or req.Pod.Name. Because plugin errors abort pod creation (createPod returns the error), this makes a best-effort label-injection feature capable of breaking workloads. Prefer treating parsing failures as a no-op (or gating on a known LWS marker/ownerRef) and only injecting labels when the expected naming pattern is present.
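
The no-op approach could be sketched as below. trailingOrdinal is a hypothetical stand-in for utils.GetParentNameAndOrdinal (whose exact behaviour is not shown in this PR), and injectIndexLabels is an illustrative name, so treat this as one possible shape rather than the project's code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

const (
	groupIndexLabel  = "leaderworkerset.sigs.k8s.io/group-index"
	workerIndexLabel = "leaderworkerset.sigs.k8s.io/worker-index"
)

// trailingOrdinal returns the number after the last "-" in name,
// or -1 when name does not end in "-<digits>".
func trailingOrdinal(name string) int {
	i := strings.LastIndex(name, "-")
	if i < 0 || i == len(name)-1 {
		return -1
	}
	suffix := name[i+1:]
	for _, r := range suffix {
		if r < '0' || r > '9' {
			return -1
		}
	}
	n, err := strconv.Atoi(suffix)
	if err != nil {
		return -1
	}
	return n
}

// injectIndexLabels adds the index labels only when both names follow the
// expected pattern. It reports whether labels were injected and never
// returns an error, so unexpected names cannot block pod creation.
func injectIndexLabels(labels map[string]string, servingGroup, podName string) bool {
	groupIndex := trailingOrdinal(servingGroup)
	workerIndex := trailingOrdinal(podName)
	if groupIndex < 0 || workerIndex < 0 {
		return false // unknown naming pattern: leave the pod untouched
	}
	if _, ok := labels[groupIndexLabel]; !ok {
		labels[groupIndexLabel] = strconv.Itoa(groupIndex)
	}
	if _, ok := labels[workerIndexLabel]; !ok {
		labels[workerIndexLabel] = strconv.Itoa(workerIndex)
	}
	return true
}

func main() {
	labels := map[string]string{}
	fmt.Println(injectIndexLabels(labels, "my-lws-0", "my-lws-0-default-0-1")) // true
	fmt.Println(labels[groupIndexLabel], labels[workerIndexLabel])             // 0 1

	untouched := map[string]string{}
	fmt.Println(injectIndexLabels(untouched, "no-ordinal", "pod-x")) // false
	fmt.Println(len(untouched))                                      // 0
}
```

A real hook would probably also log the skipped case so that a naming change does not silently disable label injection.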

	workerIndexStr := strconv.Itoa(workerIndex)

	// Group key uniquely identifies the group within the LWS.
	groupKey := fmt.Sprintf("%s-%s", lwsName, groupIndexStr)

Copilot AI Feb 11, 2026

groupKey is recomputed from ModelServing.Name + parsed group index, but the serving group name is already available as req.ServingGroup (and is what pods/services are keyed off elsewhere). Using req.ServingGroup directly would be more robust (e.g., if the ModelServing name ever diverges from the serving group prefix) and avoids constructing a potentially inconsistent key.

Suggested change
- groupKey := fmt.Sprintf("%s-%s", lwsName, groupIndexStr)
+ groupKey := req.ServingGroup

	LWSLabelName:        "user-override", // preserved
	LWSLabelGroupIndex:  "99",            // preserved
	LWSLabelWorkerIndex: "0",             // injected
	LWSLabelGroupKey:    "my-lws-0",      // injected

Copilot AI Feb 11, 2026

This test case asserts that group-index can be user-overridden ("99") while the plugin still injects group-key based on the parsed serving group name ("my-lws-0"). That produces an internally inconsistent label set (group-key no longer corresponds to group-index), which can break LWS tooling. Either (a) derive group-key from the effective group-index value when present, or (b) skip injecting group-key when group-index is already set to something else, and adjust the expectation accordingly.

Suggested change
- LWSLabelGroupKey: "my-lws-0", // injected
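
Option (b) from the comment above could be sketched as follows. injectGroupLabels is a hypothetical helper written for illustration, and using the serving group name as the group-key value follows the earlier review suggestion rather than the PR's current code.

```go
package main

import (
	"fmt"
	"strconv"
)

const (
	groupIndexLabel = "leaderworkerset.sigs.k8s.io/group-index"
	groupKeyLabel   = "leaderworkerset.sigs.k8s.io/group-key"
)

// injectGroupLabels adds group-index and group-key without overwriting.
// If the pod already carries a group-index that differs from the derived
// one, group-key is not injected, so the two labels never contradict
// each other.
func injectGroupLabels(labels map[string]string, servingGroup string, groupIndex int) {
	derived := strconv.Itoa(groupIndex)
	existing, preset := labels[groupIndexLabel]
	if !preset {
		labels[groupIndexLabel] = derived
	} else if existing != derived {
		return // user-defined group-index wins; skip the derived group-key
	}
	if _, ok := labels[groupKeyLabel]; !ok {
		labels[groupKeyLabel] = servingGroup
	}
}

func main() {
	// Case from the test under review: group-index preset to "99".
	overridden := map[string]string{groupIndexLabel: "99"}
	injectGroupLabels(overridden, "my-lws-0", 0)
	_, hasKey := overridden[groupKeyLabel]
	fmt.Println(overridden[groupIndexLabel], hasKey) // 99 false

	// No preset labels: both values are derived and agree.
	fresh := map[string]string{}
	injectGroupLabels(fresh, "my-lws-0", 0)
	fmt.Println(fresh[groupIndexLabel], fresh[groupKeyLabel]) // 0 my-lws-0
}
```

With this behaviour the test expectation for the override case would contain no group-key entry, which matches the suggested deletion.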

@hzxuzhonghu
Member

/assign @YaoZengzeng

@WHOIM1205
Contributor Author

/assign @YaoZengzeng

@hzxuzhonghu, is there anything I can change in this PR?

Labels

kind/enhancement New feature or request size/L

Development

Successfully merging this pull request may close these issues.

Support native LeaderWorkerSet (LWS) labels in pods generated via LWS API

5 participants