juzhao commented Sep 1, 2025

See https://issues.redhat.com/browse/OCPBUGS-61088; this bug is used to set NetworkPolicy for in-cluster monitoring.
This PR replaces #2645, which does not include NetworkPolicy settings for alertmanager/prometheus/thanos-querier. Because there is a default-deny network policy for all monitoring pods, CI jobs would fail for those components if their policies were not included.

juzhao commented Sep 1, 2025

/retitle [WIP] OCPBUGS-61088: create networkpolicy settings for in-cluster monitoring

openshift-ci bot changed the title from "add networkpolicy settings for in-cluster monitoring" to "[WIP] OCPBUGS-61088: create networkpolicy settings for in-cluster monitoring" on Sep 1, 2025
openshift-ci-robot added the jira/valid-reference label (indicates that this PR references a valid Jira ticket of any type) on Sep 1, 2025
openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) on Sep 1, 2025
openshift-ci-robot added the jira/invalid-bug label (indicates that a referenced Jira bug is invalid for the branch this PR is targeting) on Sep 1, 2025
@openshift-ci-robot

@juzhao: This pull request references Jira Issue OCPBUGS-61088, which is invalid:

  • expected the bug to target the "4.20.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

juzhao commented Sep 1, 2025

/jira refresh

openshift-ci-robot added the jira/valid-bug label (indicates that a referenced Jira bug is valid for the branch this PR is targeting) and removed the jira/invalid-bug label (indicates that a referenced Jira bug is invalid for the branch this PR is targeting) on Sep 1, 2025
@openshift-ci-robot

@juzhao: This pull request references Jira Issue OCPBUGS-61088, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.20.0) matches configured target version for branch (4.20.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @juzhao

openshift-ci bot commented Sep 1, 2025

@openshift-ci-robot: GitHub didn't allow me to request PR reviews from the following users: juzhao.

Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

],
},
],
egress: [
Contributor

Is there any recommendation for allowing egress traffic only to the API service? It is the only destination that ksm (kube-state-metrics) should try to connect to, but I guess it's not trivial to accommodate both classic (self-managed) OCP and hosted clusters (HyperShift).

Contributor Author

NetworkPolicy applies to pods only, so it seems we cannot define one for the API service. I will check again.

Contributor

We can use pod/namespace selectors in egress rules: https://kubernetes.io/docs/concepts/services-networking/network-policies/#behavior-of-to-and-from-selectors
But yes, it could be tricky to get it right for all the flavors.

Maybe we can create a ticket for this and try to narrow down egress for ksm and others in a follow-up PR. With some manual/payload testing we should be able to get it right (maybe in 4.21).
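
For reference, a minimal sketch of how the `to` selectors from the linked Kubernetes page combine, written in Python as plain manifest dictionaries. It is not taken from this PR. The policy name, the peer namespace `example-apiserver-namespace`, the `app: example-apiserver` label, the `app.kubernetes.io/name: kube-state-metrics` target label, and port 6443 are all assumptions for illustration. Whether such selectors can match the API server at all on self-managed and hosted clusters is the open question in this thread and is not settled here.

```python
import json

# Label that Kubernetes sets automatically on every namespace.
NS_NAME_LABEL = "kubernetes.io/metadata.name"


def peers_and(namespace, pod_labels):
    """One peer holding both selectors: pods matching pod_labels
    that also live in the named namespace (logical AND)."""
    return [{
        "namespaceSelector": {"matchLabels": {NS_NAME_LABEL: namespace}},
        "podSelector": {"matchLabels": pod_labels},
    }]


def peers_or(namespace, pod_labels):
    """Two separate peers: any pod in the named namespace, OR pods
    matching pod_labels in the policy's own namespace."""
    return [
        {"namespaceSelector": {"matchLabels": {NS_NAME_LABEL: namespace}}},
        {"podSelector": {"matchLabels": pod_labels}},
    ]


def egress_policy(name, namespace, target_labels, peers, port):
    """Build a NetworkPolicy manifest that limits egress of the selected pods
    to the given peers on a single TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": target_labels},
            "policyTypes": ["Egress"],
            "egress": [{
                "to": peers,
                "ports": [{"protocol": "TCP", "port": port}],
            }],
        },
    }


# Hypothetical values, for illustration only.
policy = egress_policy(
    name="kube-state-metrics-egress",
    namespace="openshift-monitoring",
    target_labels={"app.kubernetes.io/name": "kube-state-metrics"},
    peers=peers_and("example-apiserver-namespace", {"app": "example-apiserver"}),
    port=6443,
)

if __name__ == "__main__":
    print(json.dumps(policy, indent=2))
```

The difference between `peers_and` and `peers_or` is only whether the two selectors share one list element. That is easy to get wrong when narrowing egress, because the two-element form allows noticeably more traffic.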

juzhao commented Sep 2, 2025

/retest

juzhao commented Sep 3, 2025

/test verify
/test rules
/test generate

juzhao commented Sep 5, 2025

/retest

juzhao commented Sep 23, 2025

/retest

juzhao commented Sep 24, 2025

/retest

juzhao commented Sep 24, 2025

/retest

@danielmellado

go-proxy CI issues are unrelated to the patch

juzhao commented Sep 24, 2025

The e2e CI error is:

Run multi-stage test e2e-agnostic-operator (14s)
{  failed to acquire lease for "aws-2-quota-slice": status 503 Service Unavailable, status code 503}

This seems to be a Prow issue.

juzhao commented Sep 25, 2025

/retest

juzhao commented Sep 25, 2025

/retest-required

juzhao commented Sep 25, 2025

  1. e2e-aws-ovn-techpreview failed
: [sig-instrumentation][Late] Alerts shouldn't exceed the series limit of total series sent via telemetry from each cluster [Suite:openshift/conformance/parallel]
    [
        <*errors.errorString | 0xc0074b8030>{
            s: "promQL query returned unexpected results:\navg_over_time(cluster:telemetry_selected_series:count[1h37m56s]) >= 780\n[\n  {\n    \"metric\": {\n      \"prometheus\": \"openshift-monitoring/k8s\"\n    },\n    \"value\": [\n      1758776352.201,\n      \"788.4744897959183\"\n    ]\n  }\n]",
        },
    ]

same issue as #2681 (comment)

  2. e2e-agnostic-operator
    failed at image_registry_test.go; PR #2681 (OCPBUGS-62109: test: remove image registry e2e tests) will remove these tests
    image_registry_test.go:52: 
        	Error Trace:	/go/src/github.com/openshift/cluster-monitoring-operator/test/e2e/image_registry_test.go:52
        	            				/go/src/github.com/openshift/cluster-monitoring-operator/test/e2e/image_registry_test.go:18
        	Error:      	Not equal: 
        	            	expected: "registry.build05.ci.openshift.org"
        	            	actual  : "quay-proxy.ci.openshift.org"

juzhao commented Sep 25, 2025

tested with PR, no regression issues.
/verified by @juzhao

openshift-ci-robot added the verified label (signifies that the PR passed pre-merge verification criteria) on Sep 25, 2025
@openshift-ci-robot

@juzhao: This PR has been marked as verified by @juzhao.

juzhao commented Sep 25, 2025

/test e2e-aws-ovn-techpreview

juzhao commented Sep 25, 2025

/test e2e-agnostic-operator

juzhao commented Sep 26, 2025

/test e2e-agnostic-operator

juzhao commented Sep 26, 2025

/retest-required

openshift-ci-robot removed the verified label (signifies that the PR passed pre-merge verification criteria) on Sep 28, 2025

juzhao commented Oct 11, 2025

/retest

openshift-ci bot commented Oct 11, 2025

@juzhao: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/versions a6ee40b link false /test versions
ci/prow/okd-scos-e2e-aws-ovn a6ee40b link false /test okd-scos-e2e-aws-ovn

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.