capri-xiyue
Contributor

What type of PR is this?
/kind documentation

What this PR does / why we need it:
Added a migration guide for the InferencePool v1 API.

Which issue(s) this PR fixes:

Fixes #

Does this PR introduce a user-facing change?:

None

@k8s-ci-robot k8s-ci-robot added kind/documentation Categorizes issue or PR as related to documentation. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 9, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: capri-xiyue
Once this PR has been reviewed and has the lgtm label, please assign nirrozenbaum for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

netlify bot commented Sep 9, 2025

Deploy Preview for gateway-api-inference-extension ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | e934b58 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/gateway-api-inference-extension/deploys/68c0a3231e50b80009cfaf72 |
| 😎 Deploy Preview | https://deploy-preview-1558--gateway-api-inference-extension.netlify.app |

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 9, 2025
@capri-xiyue
Contributor Author

/assign @kfswain

If you are not using Helm, you will need to manually delete all resources associated with your `v1alpha2` deployment. The key is to remove the `HTTPRoute`'s reference to the old `InferencePool` and then delete the `v1alpha2` resources themselves.

1. **Update or Delete the `HTTPRoute`**: Modify the `HTTPRoute` to remove the `backendRef` that points to the `v1alpha2` `InferencePool`.
2. **Delete the `InferencePool` and associated resources**: You must delete the `v1alpha2` `InferencePool`, any `InferenceModel` resources that point to it, and the corresponding Endpoint Picker (EPP) Deployment and Service.
Collaborator

Suggested change
2. **Delete the `InferencePool` and associated resources**: You must delete the `v1alpha2` `InferencePool`, any `InferenceModel` resources that point to it, and the corresponding Endpoint Picker (EPP) Deployment and Service.
2. **Delete the `InferencePool` and associated resources**: You must delete the `v1alpha2` `InferencePool`, any `InferenceModel` (or `InferenceObjective`) resources that point to it, and the corresponding Endpoint Picker (EPP) Deployment and Service.

2. **Delete the `InferencePool` and associated resources**: You must delete the `v1alpha2` `InferencePool`, any `InferenceModel` resources that point to it, and the corresponding Endpoint Picker (EPP) Deployment and Service.
3. **Delete the `v1alpha2` CRDs**: Once all `v1alpha2` custom resources are deleted, you can remove the CRD definitions from your cluster.
```bash
kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v0.3.0/manifests.yaml
```
Collaborator

Consider making the version portion of the URL path configurable.
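
A minimal sketch of what that could look like. The `manifest_url` helper and the `IGW_VERSION` variable are hypothetical names that do not appear in the guide, and the `v0.3.0` default is simply the version in the current command:

```bash
# Hypothetical helper: build the release manifest URL for a given version.
# Falls back to v0.3.0, the version hard-coded in the guide's current command.
manifest_url() {
  version="${1:-v0.3.0}"
  printf '%s/%s/manifests.yaml\n' \
    "https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download" \
    "$version"
}

# Usage (needs cluster access, so it is left commented out here):
#   IGW_VERSION=v0.3.0   # assumed variable name; set to the release you installed
#   kubectl delete -f "$(manifest_url "$IGW_VERSION")"
```

Keeping the version in one place would let readers on a different `v1alpha2` release reuse the same command without editing the URL by hand.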

Curl the endpoint and confirm that it returns a **200** response code.

```bash
IP=$(kubectl get gateway/inference-gateway -o jsonpath='{.status.addresses[0].value}')
```
Collaborator

I think leaving the Gateway name as `inference-gateway` is fine in this case, but I would mention that readers need to substitute their own Gateway name here.
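
For instance, the lookup could take the name as a parameter so that the default still works unchanged. This is only a sketch: `gateway_ip` and `GATEWAY_NAME` are hypothetical names, not part of the guide.

```bash
# Hypothetical helper: look up the address of a Gateway by name.
# Defaults to inference-gateway, the name used in the guide.
gateway_ip() {
  kubectl get "gateway/${1:-inference-gateway}" \
    -o jsonpath='{.status.addresses[0].value}'
}

# Usage (needs cluster access, so it is left commented out here):
#   GATEWAY_NAME=my-gateway   # assumed variable; replace with your Gateway's name
#   IP=$(gateway_ip "$GATEWAY_NAME")
```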

After cleaning up the old resources, you can proceed with a fresh installation of the `v1` Inference Gateway. This involves installing the new `v1` CRDs, creating a new `v1` `InferencePool` and corresponding `InferenceObjective` resources, and creating a new `HTTPRoute` that directs traffic to your new `v1` `InferencePool`.


### 3. Verify the Deployment
Collaborator

This should probably mention that you need to deploy a new EPP image that is compatible with the `v1` API.

@kfswain
Collaborator

kfswain commented Sep 10, 2025

Added comments. The split seems fine, but we need to mention that deploying a new EPP is likely required.

But this is a great addition, thank you!
