OCPBUGS-54766: rename OVNKubernetesNodeOVSOverflowKernel #2779

martinkennelly · 2025-08-19T10:41:25Z

Renamed because, as per Adrian Moreno:

"This alarm is great and we need visibility into these packet drops. Actually, it's already surfacing some customer issues that would otherwise stay undetected. The mild problem, however, is the naming. Technically, there are many possible reasons for the `ovs_vswitchd_dp_flows_lookup_lost` metric to increase, not just an overflow in the netlink socket (as the name of the alarm suggests). In fact, I have written a KB article listing some of them: https://access.redhat.com/articles/7115263. I'm opening this bug for us to consider renaming it as something more accurate (and less scary), e.g. OVNKubernetesNodeOVSDpLostPacket."

The alert name is misleading and may suggest a bug when, in reality, we simply ran out of space to process new flows and therefore dropped packets.
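
To illustrate the point about the metric: a counter-based check only sees that `ovs_vswitchd_dp_flows_lookup_lost` grew, not why it grew. Below is a minimal Python sketch, assuming a hypothetical list of scraped counter samples and a hypothetical threshold. The function names are made up for illustration, and this is not the rule expression used in this repository.

```python
from typing import Sequence


def counter_increase(samples: Sequence[float]) -> float:
    """Total increase of a monotonically increasing counter over a window.

    A drop between consecutive samples is treated as a counter reset,
    so counting resumes from zero at that point.
    """
    total = 0.0
    for prev, curr in zip(samples, samples[1:]):
        # After a reset, the new value itself is the increase since the reset.
        total += curr - prev if curr >= prev else curr
    return total


def should_alert(samples: Sequence[float], threshold: float = 0.0) -> bool:
    """Hypothetical rule: fire when lost lookups grew by more than threshold."""
    return counter_increase(samples) > threshold
```

Nothing in such a check identifies the cause of the increase, which is why a name tied to one specific cause (a netlink socket overflow) can mislead.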

Signed-off-by: Martin Kennelly <[email protected]>

openshift-ci-robot · 2025-08-19T10:41:32Z

@martinkennelly: This pull request references Jira Issue OCPBUGS-54766, which is invalid:

expected the bug to target the "4.20.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

martinkennelly · 2025-08-19T10:41:58Z

/jira refresh

openshift-ci-robot · 2025-08-19T10:42:07Z

@martinkennelly: This pull request references Jira Issue OCPBUGS-54766, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug

bug is open, matching expected state (open)
bug target version (4.20.0) matches configured target version for branch (4.20.0)
bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @anuragthehatter

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

martinkennelly · 2025-08-21T09:34:48Z

/retest

martinkennelly · 2025-08-21T09:37:13Z

Simple change for you, @kyrtapz, to close a normal-priority bug we have.

martinkennelly · 2025-08-21T09:38:26Z

@ahardin-rh do we have docs that reference this alert? They may need updating.

We also have to search for any KCS articles on this alert and update them. I'll do this.

martinkennelly · 2025-08-21T14:37:05Z

/retest

martinkennelly · 2025-08-23T11:19:18Z

/retest

openshift-ci · 2025-08-23T16:40:59Z

@martinkennelly: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn-serial | `85915e8` | link | false | `/test e2e-aws-ovn-serial` |
| ci/prow/4.20-upgrade-from-stable-4.19-e2e-azure-ovn-upgrade | `85915e8` | link | false | `/test 4.20-upgrade-from-stable-4.19-e2e-azure-ovn-upgrade` |
| ci/prow/4.20-upgrade-from-stable-4.19-e2e-aws-ovn-upgrade | `85915e8` | link | false | `/test 4.20-upgrade-from-stable-4.19-e2e-aws-ovn-upgrade` |
| ci/prow/security | `85915e8` | link | false | `/test security` |
| ci/prow/e2e-aws-hypershift-ovn-kubevirt | `85915e8` | link | false | `/test e2e-aws-hypershift-ovn-kubevirt` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

martinkennelly · 2025-08-26T09:20:34Z

Asking Adrian for +1

martinkennelly · 2025-08-26T20:17:12Z

Pinged Adrian, but he's on PTO. Waiting.

amorenoz · 2025-09-01T06:33:24Z

New name looks good to me, thanks.

martinkennelly · 2025-09-01T08:53:14Z

/assign @kyrtapz

kyrtapz · 2025-09-01T09:35:50Z

/lgtm

openshift-ci · 2025-09-01T09:36:30Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kyrtapz, martinkennelly

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [kyrtapz]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-bot · 2025-09-02T09:11:41Z

/jira refresh

The requirements for Jira bugs have changed (Jira issues linked to PRs on main branch need to target different OCP), recalculating validity.

openshift-ci-robot · 2025-09-02T09:12:06Z

@openshift-bot: This pull request references Jira Issue OCPBUGS-54766, which is invalid:

expected the bug to target either version "4.21." or "openshift-4.21.", but it targets "4.20.0" instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

/jira refresh

The requirements for Jira bugs have changed (Jira issues linked to PRs on main branch need to target different OCP), recalculating validity.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

martinkennelly · 2025-09-04T11:21:08Z

@kyrtapz can you override the BGP job? It's unrelated. Thanks.

martinkennelly · 2025-09-11T11:28:39Z

/override ci/prow/e2e-metal-ipi-ovn-dualstack-bgp-local-gw

unrelated

openshift-ci · 2025-09-11T11:29:26Z

@martinkennelly: martinkennelly unauthorized: /override is restricted to Repo administrators, approvers in top level OWNERS file, and the following github teams:openshift: openshift-release-oversight openshift-staff-engineers openshift-sustaining-engineers.

In response to this:

/override ci/prow/e2e-metal-ipi-ovn-dualstack-bgp-local-gw

unrelated

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

martinkennelly · 2025-09-15T11:20:02Z

/test e2e-metal-ipi-ovn-dualstack-bgp-local-gw

Looks like it's passing again.

martinkennelly · 2025-09-29T12:01:36Z

/tide refresh

openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Aug 19, 2025

openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Aug 19, 2025

openshift-ci bot requested review from anuragthehatter, kyrtapz and miheer August 19, 2025 10:42

openshift-ci bot assigned kyrtapz Sep 1, 2025

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 1, 2025

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 1, 2025

openshift-ci-robot added jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. and removed jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Sep 2, 2025
