Skip to content

Conversation

@Meina-rh
Copy link

πŸ“‘ Description

Fixes #

Additional Information for reviewers

βœ… Checks

  • My code requires changes to the documentation
  • if so, I have updated the documentation as required
  • My code requires tests
  • if so, I have added and/or updated the tests as required
  • All the tests have passed in the CI

How to verify it

Mykola Yurchenko and others added 30 commits November 12, 2025 11:16
- not to update the DPU connection status annotation on pod if pod is
  to be deleted
- return failure if Acl log meter failed to be created

Signed-off-by: Yun Zhou <[email protected]>
 Remove --subresource=status from ovnkube.sh get_node_zone
Fixes issues were introduced by adb1fc8

The core problem is that with the change to move the networkID from
nodes to NADs, the upgrade logic left a gap in time where pods could not
start. This because cluster-manager was responsible for migrating the
networkID from the node->NAD, and in our upgrade strategy, workers
upgrade before control plane nodes. This would leave worker nodes in a
state where new OVNK code was running that was only looking for the
networkID on the NAD, but it had not yet been migrated.

This patch changes the behavior so that any NAD Controller (zone, node,
or cluster manager) will attempt at start up to find NADs that are
missing networkIDs and search nodes for the legacy values. Node and Zone
NAD controllers will fallback to the legacy ID, but will not annotate
the NAD. Cluster manager will also use the legacy ID and update the NAD
with it.

Unit tests added to cover the different scenarios.

Signed-off-by: Tim Rozet <[email protected]>
Fixes NAD Controller syncAll for networkID upgrade from node->NAD
When a namespace/pod/EgressIP label update causes it to move from one EIP to another,
the EIP controller may process the associated EIPs in an order that leads
to incorrect assignment behavior.

Example Scenario:

1. Two EIPs exist:
   * eip1 matches namespace label test: qe
   * eip2 matches namespace label test: dev
2. Namespace ns1 initially has label test: dev and is served by eip2.
3. The label on ns1 is updated from test: dev to test: qe.
4. The EIP controller processes the Namespace update event:
   Step 1: eip1 is processed first but skips assignment since the pod is already
           served by eip2.
    * In reconcileEgressIPNamespace, eip1 is processed first and matches the new
      Namespace object.
    * It invokes addNamespaceEgressIPAssignments β†’ addPodEgressIPAssignments for
      the pod, detect that eip2 (not yet processed) is still serving the pod, adds
      eip1 to podState.standbyEgressIPNames, and returns.
   Step 2: eip2 is processed next, matches the old Namespace object, and deletes
           the pod from the assignment cache.
    * In deleteNamespaceEgressIPAssignment β†’ deletePodEgressIPAssignments, it cleans
      up OVN entries (LRP, SNAT, and address sets) and removes the pod’s status entry
      from podAssignment cache.
As a result, ns1 is no longer assigned to eip1

Fix:

When eip2 is processed in the deletePodEgressIPAssignments method, promote eip1
from standby EgressIP to active.
The same issue might also occur during Pod label updates or EgressIP selector
label updates (Namespace or Pod selector).
Added unit tests to cover Namespace, Pod, and EgressIP label update scenarios
to reproduce the issue and verify the fix.

Signed-off-by: Periyasamy Palanisamy <[email protected]>
Signed-off-by: Surya Seetharaman <[email protected]>
Signed-off-by: Surya Seetharaman <[email protected]>
This commit adds the CR validations tests for the API
against a real kube-api server running in
our CI lanes.

It ensures invalid scenarios are caught and valid ones
are allowed correctly. It's a test to ensure our
CEL and kubebuilder validations hold well.

Signed-off-by: Surya Seetharaman <[email protected]>
With multiple subnets for a network, only the first one was being used
for cluster subnet exclusion.

Signed-off-by: Tim Rozet <[email protected]>
Refresh PID when calling OVS/OVN binaries
> go-controller
 go get k8s.io/[email protected]
 go get k8s.io/[email protected]
 go get k8s.io/[email protected]
 go get k8s.io/[email protected]
 go get k8s.io/[email protected]
 go get sigs.k8s.io/[email protected]
 go get k8s.io/[email protected]
 go get sigs.k8s.io/structured-merge-diff/[email protected]
 go get sigs.k8s.io/[email protected]
 go get k8s.io/[email protected]
 go get sigs.k8s.io/apiserver-network-proxy/[email protected]
 go get k8s.io/kube-openapi
 go get sigs.k8s.io/json
 go get k8s.io/utils
 go get github.com/openshift/api
 go get github.com/openshift/client-go
 go get golang.org/x/[email protected]
 go mod tidy
 go mod vendor

> Update
  1.Update GO version in go.mod and Makefile
  2.Update hack/lint.sh to v2.5.0, which uses β€œgolangci-lint migrate” [Refer to: https://golangci-lint.run/docs/product/migration-guide/ ] .
  3.β€œtypecheck” isn’t a linter and it runs automatically. So we don’t need it anymore in golangci-lint.yml [Refer to: https://golangci-lint.run/docs/welcome/faq/#why-do-you-have-typecheck-errors ]
  4.after golangci-lint version 2, staticcheck has a new feature: quickfix(QF) [https://staticcheck.dev/docs/checks#QF ]. It will affect lint results, so disable all QF checks first.

> Fix
  1.Fix TypeConverter undefined errors in pkg/crd/…/v1/apislyconfiguration/utils.go:
    update these versions in go-controller/hack/update-codegen.sh and run "make codegen"
    GO111MODULE=on go install sigs.k8s.io/controller-tools/cmd/[email protected]
    GO111MODULE=on go install $(printf "k8s.io/code-generator/cmd/%[email protected] " "${BINS[@]}")
  2.Fix staticcheck and testifylint errors:
    11 issues:
    * staticcheck: 7
    * testifylint: 4

Signed-off-by: Meina-rh <[email protected]>
> test/e2e
   go get k8s.io/[email protected]
   go get k8s.io/[email protected]
   go get k8s.io/[email protected]
   go get k8s.io/[email protected]
   go get k8s.io/[email protected]
   go get sigs.k8s.io/[email protected]
   go get k8s.io/[email protected]
   go get k8s.io/[email protected]
   go get k8s.io/[email protected]
   go get k8s.io/[email protected]
   go get k8s.io/[email protected]
   go get k8s.io/[email protected]
   go get k8s.io/[email protected]
   go get k8s.io/kube-openapi
   go get golang.org/x/[email protected]
   go mod tidy

> Fix e2e test failure
> Skip k8s e2e tests
  [sig-network] Services [It] should support named targetPorts that resolve to different ports on different endpoints [sig-network]
  [sig-network] Conntrack [It] should be able to cleanup conntrack entries when UDP service target port changes for a NodePort service [sig-network]

Signed-off-by: Meina-rh <[email protected]>
> test/conformance
   go get k8s.io/[email protected]
   go get k8s.io/[email protected]
   go get k8s.io/kube-openapi
   go get k8s.io/utils
   go get sigs.k8s.io/structured-merge-diff/[email protected]
   go get sigs.k8s.io/[email protected]
   go get sigs.k8s.io/[email protected]
   go mod tidy
   make conformance

> Delete suite.SupportAdminNetworkPolicyEgressInlineCIDRPeers and suite.SupportBaselineAdminNetworkPolicyEgressInlineCIDRPeers

Signed-off-by: Meina-rh <[email protected]>
> Update kubernetes version in the files below:
    github/workflows/test.xml
    contrib/kind-common
    test/scripts/upgrade-ovn.sh
    test/scripts/install-kind.sh
    docs/installation/launching-ovn-kubernetes-on-kind.md
    docs/developer-guide/local_testing_guide.md

> Update golangci-lint and golangci-lint-action in .github/workflows/test.xml

> Fix:
  # golang.org/x/tools/internal/tokeninternal
  Error: /home/runner/go/pkg/mod/golang.org/x/[email protected]/internal/tokeninternal/tokeninternal.go:78:9: invalid array length -delta * delta (constant -256 of type int64)
  this error occurs because some dependencies are incompatible with go1.25:
  1.in "coredns-ocp-dnsnameresolver/operator" CONTROLLER_TOOLS_VERSION=v0.14.0 isn't compatible with go1.25, specify CONTROLLER_TOOLS_VERSION=v0.19.0 in kind-common
  2.in kind-common install_metallb() controller_gen_version = "v0.16.3" isn't compatible with go1.25, upgrade to controller_gen_version = "v0.19.0"

Signed-off-by: Meina-rh <[email protected]>
Comparing or setting a variable of type time.Duration to an integer, the integer is considered to be in nanoseconds

Signed-off-by: Meina-rh <[email protected]>
cleanup primary pUDN connection details annotaton in cmdDel
go was bumped to 1.25 with the k8s bump to 1.34. This was a mistake as
k8s did not bump golang and stayed on 1.24 this time.

Signed-off-by: Jaime CaamaΓ±o Ruiz <[email protected]>
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 20, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Meina-rh
Once this PR has been reviewed and has the lgtm label, please assign kyrtapz for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Meina-rh
Copy link
Author

/retest-required

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 21, 2025

@Meina-rh: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-ovn-upgrade 77540b4 link true /test e2e-aws-ovn-upgrade
ci/prow/e2e-aws-ovn-hypershift 77540b4 link true /test e2e-aws-ovn-hypershift
ci/prow/e2e-gcp-ovn-techpreview 77540b4 link true /test e2e-gcp-ovn-techpreview
ci/prow/security 77540b4 link false /test security
ci/prow/e2e-aws-ovn-windows 77540b4 link true /test e2e-aws-ovn-windows
ci/prow/e2e-aws-ovn-edge-zones 77540b4 link true /test e2e-aws-ovn-edge-zones

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants