OCPBUGS-43745: Add support for IdleCloseTerminationPolicy (Go http.Client) #1182

frobware
Contributor

@frobware frobware commented Jan 13, 2025

Introduce logic in desiredRouterDeployment to set the environment variable ROUTER_IDLE_CLOSE_ON_RESPONSE when the IdleConnectionTerminationPolicy field in the IngressController spec is set to Deferred. This change enables configuring HAProxy with the idle-close-on-response option for better control over idle connection termination behaviour.
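
As a rough illustration of the mapping described above, here is a minimal Go sketch. It is not the operator's actual code: the `EnvVar` type is a simplified stand-in for the Kubernetes env var type, and `idleCloseEnv` and the constant names are hypothetical. Only the policy values (`Immediate`, `Deferred`), the variable name, and the value `"true"` come from this PR.

```go
package main

import "fmt"

// EnvVar is a simplified stand-in for the Kubernetes container env var type.
type EnvVar struct {
	Name  string
	Value string
}

// Policy values as they appear in spec.idleConnectionTerminationPolicy.
// The constant names are hypothetical.
const (
	policyImmediate = "Immediate"
	policyDeferred  = "Deferred"
)

// idleCloseEnv is a hypothetical helper: it returns the extra env vars the
// router deployment would carry for the given policy. Only Deferred adds
// ROUTER_IDLE_CLOSE_ON_RESPONSE; any other value adds nothing.
func idleCloseEnv(policy string) []EnvVar {
	if policy == policyDeferred {
		return []EnvVar{{Name: "ROUTER_IDLE_CLOSE_ON_RESPONSE", Value: "true"}}
	}
	return nil
}

func main() {
	for _, p := range []string{policyImmediate, policyDeferred} {
		fmt.Printf("%s -> %v\n", p, idleCloseEnv(p))
	}
}
```

Leaving the variable unset for `Immediate`, rather than setting it to `"false"`, matches what the deployment looks like when the policy is not `Deferred`.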

Add an e2e test Test_IdleConnectionTerminationPolicy to verify the behaviour.

Requires:

Enhanced response handlers (`/` and `/healthz`) to include
pod-specific headers (`x-pod-name` and `x-pod-namespace`).

Introduced new environment variables to control HTTP and HTTPS listeners:
- `HTTP2_TEST_SERVER_ENABLE_HTTP_LISTENER`: Enables/disables the HTTP listener.
- `HTTP2_TEST_SERVER_ENABLE_HTTPS_LISTENER`: Enables/disables the HTTPS listener.

Improved error handling to log and terminate if no listeners are
enabled, providing flexibility in determining which listeners to
activate.
Pickup openshift/api#2102

% git show 27316471eb72fe8fcf0d44fb5a0602f698f253dc
commit 27316471eb72fe8fcf0d44fb5a0602f698f253dc
Merge: de9de05a8 b7417509c
Author: openshift-merge-bot[bot] <148852131+openshift-merge-bot[bot]@users.noreply.github.com>
Date:   Wed Dec 18 10:31:50 2024 +0000

    Merge pull request #2102 from frobware/OCPBUGS-43745-idle-close-on-response

    OCPBUGS-43745: Add IdleCloseOnResponse field to IngressControllerSpec

Vendor steps:

$ go mod edit -replace github.com/openshift/api=github.com/openshift/api@27316471eb72fe8fcf0d44fb5a0602f698f253dc
$ go mod tidy
$ go mod vendor
$ make update
@openshift-ci-robot openshift-ci-robot added jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Jan 13, 2025
@openshift-ci-robot
Contributor

@frobware: This pull request references Jira Issue OCPBUGS-43745, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.19.0) matches configured target version for branch (4.19.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @lihongan

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

This is like #1166 but uses the Go http.Client.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@frobware
Contributor Author

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 13, 2025
@frobware
Contributor Author

/testwith openshift/cluster-ingress-operator/master/e2e-aws-operator openshift/router#639

@frobware frobware force-pushed the OCPBUGS-43745-idle-close-on-response-net-http-client branch 2 times, most recently from c2fb259 to ec270a8 Compare January 14, 2025 09:37
@frobware
Contributor Author

/retest

@frobware
Contributor Author

/testwith openshift/cluster-ingress-operator/master/e2e-aws-operator openshift/router#639

@frobware
Contributor Author

/retest-required

1 similar comment
@frobware
Contributor Author

/retest-required

@frobware
Contributor Author

/test e2e-aws-operator

@frobware
Contributor Author

/testwith openshift/cluster-ingress-operator/master/e2e-aws-operator openshift/router#639

@frobware
Contributor Author

/test e2e-aws-operator

@frobware
Contributor Author

/test e2e-azure-operator

@frobware
Contributor Author

/test e2e-gcp-operator

@frobware frobware force-pushed the OCPBUGS-43745-idle-close-on-response-net-http-client branch from ec270a8 to 6879c2f Compare January 15, 2025 19:12
@frobware
Contributor Author

/testwith openshift/cluster-ingress-operator/master/e2e-aws-operator openshift/router#639

@frobware frobware force-pushed the OCPBUGS-43745-idle-close-on-response-net-http-client branch 2 times, most recently from ed35fca to e3a69e3 Compare January 16, 2025 19:27
@frobware
Contributor Author

@frobware: any particular reason to run multi-PR tests?

/testwith openshift/cluster-ingress-operator/master/e2e-aws-operator https://github.com/openshift/router/pull/639

The router's code was merged 2 weeks ago, so it must have been in the CI builds for a while now.

FWIW, I tried using /testwith without an additional PR and that did not launch a job, so I'll just continue to use the already-merged router PR.

@frobware
Contributor Author

/retest-required

Introduce logic in desiredRouterDeployment to set the environment
variable `ROUTER_IDLE_CLOSE_ON_RESPONSE` when the
`IdleConnectionTerminationPolicy` field in the IngressController spec is
set to `Deferred`. This change enables configuring HAProxy with the
`idle-close-on-response` option for better control over idle connection
termination behaviour.
@frobware frobware force-pushed the OCPBUGS-43745-idle-close-on-response-net-http-client branch from aab1312 to 1385b13 Compare January 22, 2025 09:24
@frobware
Contributor Author

@alebedev87 I pushed a commit that splits the test, previously a single parallel test with parallel subtests, into two discrete tests that still run in parallel. This change allows me to reliably piece together the test output when reviewing CI jobs. I'm using https://github.com/frobware/go-test-sift to collate parallel test output from CI logs (and locally). I haven't seen Test_IdleConnectionTerminationPolicy flake in the last few days, so I'd consider this my final iteration.

@frobware
Contributor Author

/testwith openshift/cluster-ingress-operator/master/e2e-gcp-operator openshift/router#639
/testwith openshift/cluster-ingress-operator/master/e2e-azure-operator openshift/router#639
/testwith openshift/cluster-ingress-operator/master/e2e-aws-operator openshift/router#639

@alebedev87
Contributor

/lgtm
/approve

@alebedev87
Contributor

/label acknowledge-critical-fixes-only

@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. acknowledge-critical-fixes-only Indicates if the issuer of the label is OK with the policy. labels Jan 22, 2025
Contributor

openshift-ci bot commented Jan 22, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alebedev87

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 22, 2025
@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 2856e6d and 2 for PR HEAD 1385b13 in total

@frobware
Contributor Author

/retest-required

@frobware
Contributor Author

e2e-aws-ovn-serial and e2e-hypershift are failing in an empty PR: #1185 (comment)

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 2856e6d and 2 for PR HEAD 1385b13 in total

1 similar comment
@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 2856e6d and 2 for PR HEAD 1385b13 in total

@lihongan
Contributor

Did pre-merge testing and it looks good.

##  no "option idle-close-on-response" in default router pod
$ oc -n openshift-ingress-operator get ingresscontroller/default -oyaml | yq .spec.idleConnectionTerminationPolicy
Immediate
$ oc -n openshift-ingress get deployment/router-default -oyaml | grep IDLE
(null)
$ oc -n openshift-ingress exec router-default-595897545-kfczn -- cat haproxy.config | grep idle
(null)

## create custom ingresscontroller and update spec.idleConnectionTerminationPolicy
$ oc -n openshift-ingress-operator get ingresscontroller/test -oyaml | yq .spec.idleConnectionTerminationPolicy
Immediate

$ oc -n openshift-ingress-operator patch ingresscontroller/test --type=merge -p '{"spec":{"idleConnectionTerminationPolicy":"Deferred"}}'
ingresscontroller.operator.openshift.io/test patched

$ oc -n openshift-ingress get deployment/router-test -oyaml | grep IDLE -A1
        - name: ROUTER_IDLE_CLOSE_ON_RESPONSE
          value: "true"

$ oc -n openshift-ingress exec router-test-6c756c4ff8-gdv4s -- cat haproxy.config | grep idle -B4
frontend public
    
  bind :80
  mode http
  option idle-close-on-response
--
frontend fe_sni
  # terminate ssl on edge
  bind unix@/var/lib/haproxy/run/haproxy-sni.sock ssl crt /var/lib/haproxy/router/certs/default.pem crt-list /var/lib/haproxy/conf/cert_config.map accept-proxy no-alpn
  mode http
  option idle-close-on-response
--
frontend fe_no_sni
  # terminate ssl on edge
  bind unix@/var/lib/haproxy/run/haproxy-no-sni.sock ssl crt /var/lib/haproxy/router/certs/default.pem accept-proxy no-alpn
  mode http
  option idle-close-on-response
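
The `grep` check above can also be expressed programmatically. The following Go sketch is a simplified illustration, not part of the operator or router code: `frontendsWithOption` is a hypothetical helper, and the sample configuration in `main` is abbreviated from the output above with a made-up backend section added for contrast. It assumes each section starts with one of the keywords `global`, `defaults`, `frontend`, `backend`, or `listen` at the start of a line.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// frontendsWithOption scans an HAProxy configuration and reports, for each
// frontend section, whether it contains the given option line.
func frontendsWithOption(cfg, option string) map[string]bool {
	result := map[string]bool{}
	current := "" // name of the frontend being scanned, "" when outside one
	sc := bufio.NewScanner(strings.NewReader(cfg))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 {
			continue // blank or whitespace-only line
		}
		switch fields[0] {
		case "frontend":
			current = ""
			if len(fields) > 1 {
				current = fields[1]
				result[current] = false
			}
		case "global", "defaults", "backend", "listen":
			current = "" // a non-frontend section begins
		default:
			// Compare with whitespace normalised so indentation does not matter.
			if current != "" && strings.Join(fields, " ") == option {
				result[current] = true
			}
		}
	}
	return result
}

func main() {
	// Abbreviated sample; the backend section is illustrative only.
	cfg := `
frontend public

  bind :80
  mode http
  option idle-close-on-response

frontend fe_sni
  # terminate ssl on edge
  mode http
  option idle-close-on-response

backend example
  mode http
`
	for name, ok := range frontendsWithOption(cfg, "option idle-close-on-response") {
		fmt.Printf("%s: %v\n", name, ok)
	}
}
```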

/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Jan 23, 2025
@openshift-ci-robot
Contributor

@frobware: This pull request references Jira Issue OCPBUGS-43745, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.19.0) matches configured target version for branch (4.19.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @lihongan

In response to this:

Introduce logic in desiredRouterDeployment to set the environment variable ROUTER_IDLE_CLOSE_ON_RESPONSE when the IdleConnectionTerminationPolicy field in the IngressController spec is set to Deferred. This change enables configuring HAProxy with the idle-close-on-response option for better control over idle connection termination behaviour.

Add an e2e test Test_IdleConnectionTerminationPolicy to verify the behaviour.

Requires:

@frobware
Contributor Author

The e2e-aws-ovn-serial job is failing with "Image signature workflow can push a signed image to openshift registry and verify it" - related links:

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 2856e6d and 2 for PR HEAD 1385b13 in total

Contributor

openshift-ci bot commented Jan 23, 2025

@frobware: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                             Commit   Details  Required  Rerun command
ci/prow/okd-scos-e2e-aws-ovn          1385b13  link     false     /test okd-scos-e2e-aws-ovn
ci/prow/e2e-azure-ovn                 1385b13  link     false     /test e2e-azure-ovn
ci/prow/e2e-aws-operator-techpreview  1385b13  link     false     /test e2e-aws-operator-techpreview

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 2856e6d and 2 for PR HEAD 1385b13 in total

@openshift-merge-bot openshift-merge-bot bot merged commit cba3e44 into openshift:master Jan 24, 2025
19 of 22 checks passed
@openshift-ci-robot
Contributor

@frobware: Jira Issue OCPBUGS-43745: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-43745 has been moved to the MODIFIED state.

In response to this:

Introduce logic in desiredRouterDeployment to set the environment variable ROUTER_IDLE_CLOSE_ON_RESPONSE when the IdleConnectionTerminationPolicy field in the IngressController spec is set to Deferred. This change enables configuring HAProxy with the idle-close-on-response option for better control over idle connection termination behaviour.

Add an e2e test Test_IdleConnectionTerminationPolicy to verify the behaviour.

Requires:

@openshift-bot
Contributor

[ART PR BUILD NOTIFIER]

Distgit: ose-cluster-ingress-operator
This PR has been included in build ose-cluster-ingress-operator-container-v4.19.0-202501240608.p0.gcba3e44.assembly.stream.el9.
All builds following this will include this PR.
