
Conversation

@capri-xiyue
Contributor

@capri-xiyue capri-xiyue commented Nov 7, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:
See #1779
Which issue(s) this PR fixes:

Part of #1779

Does this PR introduce a user-facing change?:

None for existing features that use InferencePool.
Users can, however, run EPP without an InferencePool by passing these args:
```
        - --endpoint-selector
        - "app=vllm-llama3-8b-instruct"
        - --endpoint-target-ports
        - "8000"
```

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Nov 7, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: capri-xiyue
Once this PR has been reviewed and has the lgtm label, please assign kfswain for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@netlify

netlify bot commented Nov 7, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 84b2275
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/691bd91f54ecb200086338bf
😎 Deploy Preview https://deploy-preview-1833--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot requested a review from elevran November 7, 2025 19:47
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 7, 2025
@capri-xiyue capri-xiyue changed the title Enable EPP to support endpoint discovery using pod selector [WIP] Enable EPP to support endpoint discovery using pod selector Nov 7, 2025
@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 7, 2025
@capri-xiyue
Contributor Author

No need to review this yet. So far I have only made the standalone EPP CUJ work without an InferencePool; I still need to fix the e2e and unit tests.

@elevran elevran mentioned this pull request Nov 11, 2025
Contributor

@elevran elevran left a comment

Cursory review to provide initial feedback (realizing this is a work in progress).
The main question I have (and it might be worth addressing in the PR description) is whether a new abstraction/type (EndPointsPool) is needed. A naive/simple solution (which perhaps does not work...) would be to copy the selector and port array into a Go InferencePool object and call datastore.PoolSet(), while disabling the Pool notification/reconciliation so that it does not overwrite the pool with nil.
Ideally, the rest of the code would not care about or depend on the pool's origin (command line or API server).
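
To make the suggestion above concrete, here is a minimal sketch of the idea, not the project's code. `staticPool` and `store` are hypothetical stand-ins for the InferencePool object and the datastore, and this `PoolSet` signature is an assumption for illustration only. A pool seeded from command-line flags is pinned, so a later reconciler update (including nil) does not overwrite it.

```go
package main

import "sync"

// staticPool is a hypothetical stand-in for the selector and ports
// that an InferencePool object would carry.
type staticPool struct {
	Selector    map[string]string
	TargetPorts []int
}

// Matches reports whether podLabels contain every key=value pair
// in the pool's selector.
func (p *staticPool) Matches(podLabels map[string]string) bool {
	for k, v := range p.Selector {
		if podLabels[k] != v {
			return false
		}
	}
	return true
}

// store is a hypothetical stand-in for the EPP datastore.
type store struct {
	mu     sync.RWMutex
	pool   *staticPool
	pinned bool // true once the pool has been set from the command line
}

// PoolSet stores p. After a pool has been set from flags, updates that
// do not come from flags (for example a reconciler passing nil) are ignored.
func (s *store) PoolSet(p *staticPool, fromFlags bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.pinned && !fromFlags {
		return
	}
	s.pool = p
	s.pinned = s.pinned || fromFlags
}

// Pool returns the current pool, or nil if none has been set.
func (s *store) Pool() *staticPool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.pool
}
```

With this shape, callers only ever see `Pool()` and `Matches()`, so they would not need to know whether the pool came from flags or from the API server.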

@capri-xiyue
Contributor Author

assign @ahg-g for early review.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 14, 2025
@capri-xiyue
Contributor Author

assign @kfswain for early review

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 14, 2025
Signed-off-by: Xiyue Yu <[email protected]>
@capri-xiyue capri-xiyue requested a review from ahg-g November 17, 2025 18:43
@capri-xiyue capri-xiyue changed the title [WIP] Enable EPP to support endpoint discovery using pod selector Enable EPP to support endpoint discovery using pod selector Nov 17, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 17, 2025
return nil
}

func strToUniqueIntSlice(s string) ([]int, error) {
Contributor

Is there no existing library for that in k8s?

Contributor Author

I didn't find an existing library in k8s that converts a string of ints into a []int. Since this isn't like a selector, it probably makes sense that k8s doesn't provide it natively.
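
For readers following this thread, a conversion like this needs only the standard library. The sketch below is not the PR's implementation: `parseUniquePorts` is a hypothetical name, and the comma-separated input format and the 1-65535 range check are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseUniquePorts converts a comma-separated list such as "8000,8001"
// into a slice of distinct port numbers, preserving first-seen order.
func parseUniquePorts(s string) ([]int, error) {
	seen := make(map[int]struct{})
	var ports []int
	for _, field := range strings.Split(s, ",") {
		field = strings.TrimSpace(field)
		if field == "" {
			continue // tolerate empty input and trailing commas
		}
		port, err := strconv.Atoi(field)
		if err != nil {
			return nil, fmt.Errorf("invalid port %q: %w", field, err)
		}
		if port < 1 || port > 65535 {
			return nil, fmt.Errorf("port %d out of range 1-65535", port)
		}
		if _, dup := seen[port]; dup {
			continue // drop duplicates silently
		}
		seen[port] = struct{}{}
		ports = append(ports, port)
	}
	return ports, nil
}
```

Calling `parseUniquePorts("8000, 8001,8000")` yields the two ports 8000 and 8001 in that order, while a non-numeric entry returns an error instead of being skipped.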

    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
- name: EPP_NAME
Contributor

why do we need this?

Contributor Author

@capri-xiyue capri-xiyue Nov 18, 2025

To avoid a giant PR, I have kept the metrics collector unchanged in this PR.
Previously the metrics collector needed the InferencePool name; see https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/pkg/epp/metrics/collectors/inference_pool.go#L76.

Since there is no InferencePool in EPP standalone mode, I use the EPP name here.

I will revisit this later to see whether there is a better way to do it, but combining that with this change would make this PR too large.

Contributor

metadata.name is the name of the pod, which will look odd because it includes a hash at the end, e.g. my-epp-xyz-abc if the deployment name is my-epp. Perhaps extract the deployment name in code by dropping the hash suffix?
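
A minimal sketch of the suffix-dropping idea, assuming the pod was created by a Deployment and is therefore named `<deployment>-<replicaset hash>-<pod suffix>`. `deploymentNameFromPod` is a hypothetical helper, not something in the PR, and it would mis-trim pods created another way (a StatefulSet or a bare pod, for example).

```go
package main

import "strings"

// deploymentNameFromPod drops the last two dash-separated segments of a
// pod name, which for Deployment-managed pods are the ReplicaSet hash and
// the per-pod suffix. Names with fewer than three segments are returned
// unchanged because there is nothing safe to trim.
func deploymentNameFromPod(podName string) string {
	parts := strings.Split(podName, "-")
	if len(parts) < 3 {
		return podName
	}
	return strings.Join(parts[:len(parts)-2], "-")
}
```

For the example in the comment above, `deploymentNameFromPod("my-epp-xyz-abc")` returns `my-epp`. Reading the pod's owner references would be more robust than string trimming, at the cost of an extra API lookup.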

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 18, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 18, 2025
@capri-xiyue capri-xiyue requested a review from ahg-g November 18, 2025 02:26

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

4 participants