@openshift-pr-manager

Automated merge of upstream/master → master.

Mykola Yurchenko and others added 15 commits November 12, 2025 11:16
- Do not update the DPU connection status annotation on a pod if the pod is
  to be deleted
- Return failure if the ACL log meter fails to be created

Signed-off-by: Yun Zhou <[email protected]>
 Remove --subresource=status from ovnkube.sh get_node_zone
Fixes issues that were introduced by adb1fc8

The core problem is that with the change to move the networkID from
nodes to NADs, the upgrade logic left a gap in time where pods could not
start. This is because cluster-manager was responsible for migrating the
networkID from the node->NAD, and in our upgrade strategy, workers
upgrade before control plane nodes. This would leave worker nodes in a
state where new OVNK code was running that was only looking for the
networkID on the NAD, but it had not yet been migrated.

This patch changes the behavior so that any NAD Controller (zone, node,
or cluster manager) will attempt at start up to find NADs that are
missing networkIDs and search nodes for the legacy values. Node and Zone
NAD controllers will fall back to the legacy ID, but will not annotate
the NAD. Cluster manager will also use the legacy ID and update the NAD
with it.

Unit tests added to cover the different scenarios.
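
The startup fallback described above can be sketched roughly as follows. This is a minimal illustration, not the actual OVN-Kubernetes code: the `Role`, `NAD`, and `Node` types and the `resolveNetworkID` function are hypothetical names introduced only for this example, and the real controllers read the values from Kubernetes annotations rather than plain struct fields.

```go
package main

// Role identifies which kind of NAD controller is running (hypothetical type).
type Role int

const (
	RoleNode Role = iota
	RoleZone
	RoleClusterManager
)

// NAD is a simplified stand-in for a NetworkAttachmentDefinition.
// A nil NetworkID means the ID has not yet been migrated onto the NAD.
type NAD struct {
	Network   string
	NetworkID *int
}

// Node is a simplified stand-in for a node that may still carry
// legacy per-network IDs from before the migration.
type Node struct {
	LegacyNetworkIDs map[string]int
}

// resolveNetworkID returns the network ID to use for nad.
// If the NAD already has an ID, that ID wins. Otherwise the nodes are
// searched for a legacy value. Every role uses the legacy value it finds,
// but only the cluster manager writes it back onto the NAD.
func resolveNetworkID(role Role, nad *NAD, nodes []Node) (int, bool) {
	if nad.NetworkID != nil {
		return *nad.NetworkID, true
	}
	for _, n := range nodes {
		legacy, ok := n.LegacyNetworkIDs[nad.Network]
		if !ok {
			continue
		}
		if role == RoleClusterManager {
			id := legacy
			nad.NetworkID = &id // stands in for annotating the NAD
		}
		return legacy, true
	}
	return 0, false
}
```

With a node that still carries a legacy ID for a network, calling `resolveNetworkID` with `RoleNode` or `RoleZone` returns that ID and leaves the NAD untouched, while `RoleClusterManager` returns the same ID and also records it on the NAD.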

Signed-off-by: Tim Rozet <[email protected]>
Fixes NAD Controller syncAll for networkID upgrade from node->NAD
When a namespace/pod/EgressIP label update causes it to move from one EIP to another,
the EIP controller may process the associated EIPs in an order that leads
to incorrect assignment behavior.

Example Scenario:

1. Two EIPs exist:
   * eip1 matches namespace label test: qe
   * eip2 matches namespace label test: dev
2. Namespace ns1 initially has label test: dev and is served by eip2.
3. The label on ns1 is updated from test: dev to test: qe.
4. The EIP controller processes the Namespace update event:
   Step 1: eip1 is processed first but skips assignment since the pod is already
           served by eip2.
    * In reconcileEgressIPNamespace, eip1 is processed first and matches the new
      Namespace object.
    * It invokes addNamespaceEgressIPAssignments → addPodEgressIPAssignments for
      the pod, detects that eip2 (not yet processed) is still serving the pod, adds
      eip1 to podState.standbyEgressIPNames, and returns.
   Step 2: eip2 is processed next, matches the old Namespace object, and deletes
           the pod from the assignment cache.
    * In deleteNamespaceEgressIPAssignment → deletePodEgressIPAssignments, it cleans
      up OVN entries (LRP, SNAT, and address sets) and removes the pod’s status entry
      from podAssignment cache.
As a result, the pods in ns1 are left without an active EgressIP: eip1 was only
recorded as standby and is never assigned.

Fix:

When eip2 is processed in the deletePodEgressIPAssignments method, promote eip1
from standby EgressIP to active.
The same issue might also occur during Pod label updates or EgressIP selector
label updates (Namespace or Pod selector).
Added unit tests to cover Namespace, Pod, and EgressIP label update scenarios
to reproduce the issue and verify the fix.
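
The promotion step can be sketched with a simplified in-memory cache. This is an illustration under stated assumptions, not the controller's real code: `podState`, `assignmentCache`, `addPod`, and `deletePod` are hypothetical names, the OVN programming (LRP, SNAT, address sets) is omitted, and picking the alphabetically first standby name is an arbitrary choice made here for determinism.

```go
package main

import "sort"

// podState is a hypothetical, simplified per-pod cache entry.
type podState struct {
	active  string              // EgressIP currently serving the pod
	standby map[string]struct{} // EgressIPs that match the pod but are not serving it
}

// assignmentCache maps a pod key to its state (hypothetical type).
type assignmentCache map[string]*podState

// addPod records that eip matches the pod. If another EgressIP is already
// serving the pod, eip is only remembered as a standby.
func (c assignmentCache) addPod(pod, eip string) {
	st, ok := c[pod]
	if !ok {
		c[pod] = &podState{active: eip, standby: map[string]struct{}{}}
		return
	}
	if st.active != eip {
		st.standby[eip] = struct{}{}
	}
}

// deletePod removes eip from the pod. When eip was the active EgressIP and a
// standby exists, the standby is promoted instead of dropping the pod entry.
func (c assignmentCache) deletePod(pod, eip string) {
	st, ok := c[pod]
	if !ok {
		return
	}
	if st.active != eip {
		delete(st.standby, eip)
		return
	}
	if len(st.standby) == 0 {
		delete(c, pod)
		return
	}
	names := make([]string, 0, len(st.standby))
	for name := range st.standby {
		names = append(names, name)
	}
	sort.Strings(names)
	st.active = names[0] // promote standby to active
	delete(st.standby, names[0])
}
```

Replaying the scenario above against this sketch: the pod starts on eip2, the label change makes eip1 a standby, and deleting the eip2 assignment leaves the pod served by eip1 rather than by nothing.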

Signed-off-by: Periyasamy Palanisamy <[email protected]>
With multiple subnets for a network, only the first one was being used
for cluster subnet exclusion.

Signed-off-by: Tim Rozet <[email protected]>
Refresh PID when calling OVS/OVN binaries
@openshift-pr-manager
Author

/ok-to-test
/payload 4.21 ci blocking
/payload 4.21 nightly blocking

@openshift-ci-robot
Contributor

@openshift-pr-manager[bot]: This pull request explicitly references no jira issue.

In response to this:

Automated merge of upstream/master → master.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Nov 19, 2025
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 19, 2025
@openshift-ci
Contributor

openshift-ci bot commented Nov 19, 2025

@openshift-pr-manager[bot]: trigger 5 job(s) of type blocking for the ci release of OCP 4.21

  • periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-azure-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.21-e2e-gcp-ovn-upgrade
  • periodic-ci-openshift-hypershift-release-4.21-periodics-e2e-aks
  • periodic-ci-openshift-hypershift-release-4.21-periodics-e2e-aws-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/985871b0-c53f-11f0-95d3-3c26d7486622-0

trigger 13 job(s) of type blocking for the nightly release of OCP 4.21

  • periodic-ci-openshift-release-master-ci-4.21-e2e-aws-upgrade-ovn-single-node
  • periodic-ci-openshift-release-master-nightly-4.21-e2e-aws-ovn-upgrade-fips
  • periodic-ci-openshift-release-master-ci-4.21-e2e-azure-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade
  • periodic-ci-openshift-hypershift-release-4.21-periodics-e2e-aws-ovn-conformance
  • periodic-ci-openshift-release-master-nightly-4.21-e2e-aws-ovn-serial-1of2
  • periodic-ci-openshift-release-master-nightly-4.21-e2e-aws-ovn-serial-2of2
  • periodic-ci-openshift-release-master-ci-4.21-e2e-aws-ovn-techpreview
  • periodic-ci-openshift-release-master-ci-4.21-e2e-aws-ovn-techpreview-serial-1of3
  • periodic-ci-openshift-release-master-ci-4.21-e2e-aws-ovn-techpreview-serial-2of3
  • periodic-ci-openshift-release-master-ci-4.21-e2e-aws-ovn-techpreview-serial-3of3
  • periodic-ci-openshift-release-master-nightly-4.21-e2e-metal-ipi-ovn-bm
  • periodic-ci-openshift-release-master-nightly-4.21-e2e-metal-ipi-ovn-ipv6

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/985871b0-c53f-11f0-95d3-3c26d7486622-1

@openshift-ci
Contributor

openshift-ci bot commented Nov 19, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci bot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Nov 19, 2025
@openshift-ci
Contributor

openshift-ci bot commented Nov 19, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: openshift-pr-manager[bot]
Once this PR has been reviewed and has the lgtm label, please assign knobunc for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@pperiyasamy
Member

/retitle OCPBUGS-62013, OCPBUGS-61742: DownStream Merge [11-19-2025]

@openshift-ci openshift-ci bot changed the title from "NO-JIRA: DownStream Merge [11-19-2025]" to "OCPBUGS-62013, OCPBUGS-61742: DownStream Merge [11-19-2025]" Nov 20, 2025
@openshift-ci-robot openshift-ci-robot added jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Nov 20, 2025
@openshift-ci-robot
Contributor

@openshift-pr-manager[bot]: This pull request references Jira Issue OCPBUGS-62013, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @huiran0826

The bug has been updated to refer to the pull request using the external bug tracker.

This pull request references Jira Issue OCPBUGS-61742, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @huiran0826

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

Automated merge of upstream/master → master.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested a review from huiran0826 November 20, 2025 13:07
@pperiyasamy
Member

/retest

@pperiyasamy
Member

/payload-job periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade
/payload-job periodic-ci-openshift-release-master-ci-4.21-e2e-aws-upgrade-ovn-single-node
/payload-job periodic-ci-openshift-release-master-ci-4.21-e2e-aws-ovn-techpreview-serial-3of3
/payload-job periodic-ci-openshift-release-master-nightly-4.21-e2e-metal-ipi-ovn-bm

@openshift-ci
Contributor

openshift-ci bot commented Nov 20, 2025

@pperiyasamy: trigger 4 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.21-e2e-aws-upgrade-ovn-single-node
  • periodic-ci-openshift-release-master-ci-4.21-e2e-aws-ovn-techpreview-serial-3of3
  • periodic-ci-openshift-release-master-nightly-4.21-e2e-metal-ipi-ovn-bm

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/7cf90190-c612-11f0-9593-c177a4ba179c-0

@openshift-pr-manager openshift-pr-manager bot marked this pull request as ready for review November 20, 2025 17:18
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 20, 2025
@openshift-ci
Contributor

openshift-ci bot commented Nov 20, 2025

@openshift-pr-manager[bot]: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/e2e-aws-ovn-edge-zones | 477c5db | link | true | /test e2e-aws-ovn-edge-zones |
| ci/prow/security | 477c5db | link | false | /test security |
| ci/prow/e2e-aws-ovn-windows | 477c5db | link | true | /test e2e-aws-ovn-windows |
| ci/prow/e2e-aws-ovn-local-gateway | 477c5db | link | true | /test e2e-aws-ovn-local-gateway |
| ci/prow/4.21-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade | 477c5db | link | true | /test 4.21-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade |
| ci/prow/e2e-aws-ovn-hypershift | 477c5db | link | true | /test e2e-aws-ovn-hypershift |
| ci/prow/e2e-metal-ipi-ovn-ipv6 | 477c5db | link | true | /test e2e-metal-ipi-ovn-ipv6 |
| ci/prow/e2e-aws-ovn | 477c5db | link | true | /test e2e-aws-ovn |
| ci/prow/e2e-aws-ovn-local-to-shared-gateway-mode-migration | 477c5db | link | true | /test e2e-aws-ovn-local-to-shared-gateway-mode-migration |
| ci/prow/qe-perfscale-payload-control-plane-6nodes | 477c5db | link | true | /test qe-perfscale-payload-control-plane-6nodes |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@jluhrsen
Contributor

/retest

/payload-job periodic-ci-openshift-release-master-ci-4.21-e2e-aws-upgrade-ovn-single-node

/payload-job periodic-ci-openshift-release-master-nightly-4.21-e2e-metal-ipi-ovn-bm

@openshift-ci
Contributor

openshift-ci bot commented Nov 21, 2025

@jluhrsen: trigger 2 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-ci-4.21-e2e-aws-upgrade-ovn-single-node
  • periodic-ci-openshift-release-master-nightly-4.21-e2e-metal-ipi-ovn-bm

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/70dedc20-c68c-11f0-98cb-d7913ab95c7d-0

@jluhrsen
Contributor

we should /override e2e-aws-ovn-windows

Otherwise, CI is just not super healthy on the failing jobs. It doesn't look like anything to do with this PR, and the failures are not common across the jobs. Trying again:

/retest
/payload-job periodic-ci-openshift-release-master-ci-4.21-e2e-aws-upgrade-ovn-single-node
/payload-job periodic-ci-openshift-release-master-nightly-4.21-e2e-metal-ipi-ovn-bm

@openshift-ci
Contributor

openshift-ci bot commented Nov 22, 2025

@jluhrsen: trigger 2 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-ci-4.21-e2e-aws-upgrade-ovn-single-node
  • periodic-ci-openshift-release-master-nightly-4.21-e2e-metal-ipi-ovn-bm

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ac1aa910-c743-11f0-8023-39e6977616bb-0

@pperiyasamy
Member

/test 4.21-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade

Labels

  • jira/severity-critical: Referenced Jira bug's severity is critical for the branch this PR is targeting.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • ok-to-test: Indicates a non-member PR verified by an org member that is safe to test.

8 participants