OCPBUGS-37220: Fix DNS for Gateway API on AWS #1108

Open
wants to merge 1 commit into
base: master

candita
Contributor

@candita candita commented Jul 19, 2024

Before this change, a new Gateway API dnsRecord created on AWS was being deleted right after it was published.

pkg/operator/controller/gateway-service-dns/controller.go - add a predicate to ensure only dnsRecords for Gateway API are watched; export the ManagedByIstioLabelKey for use by the general dns controller; add logging to ensureDNSRecordsForGateway and deleteStaleDNSRecordsForGateway

pkg/resources/dnsrecord/dns.go - minor tweak and logging added to EnsureDNSRecord
pkg/dns/aws/dns.go - add logging
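
The predicate mentioned above could look roughly like the following. This is a minimal, stdlib-only sketch and not the code in this PR: it assumes the exported `ManagedByIstioLabelKey` corresponds to the `gateway.istio.io/managed` label visible on the DNSRecord shown later in this conversation, and the helper name `isGatewayDNSRecord` is hypothetical.

```go
package main

import "fmt"

// managedByIstioLabelKey is an assumption: it mirrors the label seen on
// Gateway API DNSRecords in this conversation. The operator's exported
// ManagedByIstioLabelKey constant may hold a different value.
const managedByIstioLabelKey = "gateway.istio.io/managed"

// isGatewayDNSRecord (hypothetical helper) reports whether a set of object
// labels marks the object as belonging to a Gateway API gateway.
func isGatewayDNSRecord(labels map[string]string) bool {
	_, ok := labels[managedByIstioLabelKey]
	return ok
}

func main() {
	gatewayRecord := map[string]string{
		managedByIstioLabelKey:  "openshift.io-gateway-controller",
		"istio.io/gateway-name": "gateway",
	}
	// Hypothetical labels for a record that is not tied to a gateway.
	otherRecord := map[string]string{"example.com/owner": "default"}

	fmt.Println(isGatewayDNSRecord(gatewayRecord))
	fmt.Println(isGatewayDNSRecord(otherRecord))
}
```

In a controller-runtime based controller, a check like this would typically be wrapped with `predicate.NewPredicateFuncs` so that the gateway-service-dns watch only reacts to labeled records, while the general dns controller would use the negated check to skip them.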

@openshift-ci-robot openshift-ci-robot added the jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. label Jul 19, 2024
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 19, 2024
Contributor

openshift-ci bot commented Jul 19, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jul 19, 2024
@openshift-ci-robot
Contributor

@candita: This pull request references Jira Issue OCPBUGS-37220, which is invalid:

  • expected the bug to target the "4.17.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

Before this change, a new Gateway API dnsRecord created on AWS was being deleted right after it was published.

pkg/operator/controller/dns/controller.go - make the general watch for dnsRecords skip Gateway API dnsRecords

pkg/operator/controller/gateway-service-dns/controller.go - add a predicate to ensure only dnsRecords for Gateway API are watched; remove the restriction that watched only Services in the openshift-ingress namespace; export the ManagedByIstioLabelKey for use by the general dns controller; add logging to ensureDNSRecordsForGateway and deleteStaleDNSRecordsForGateway

pkg/resources/dnsrecord/dns.go - minor tweak and logging added to EnsureDNSRecord
pkg/dns/aws/dns.go - add logging

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Jul 19, 2024
@candita candita force-pushed the OCPBUGS-37220-GWAPI-AWS-DNS branch from b04083d to b6dc40b Compare July 19, 2024 19:21
@candita
Contributor Author

candita commented Jul 19, 2024

/test all

@candita
Contributor Author

candita commented Jul 19, 2024

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Jul 19, 2024
@openshift-ci-robot
Contributor

@candita: This pull request references Jira Issue OCPBUGS-37220, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.17.0) matches configured target version for branch (4.17.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @lihongan

In response to this:

/jira refresh


@openshift-ci openshift-ci bot requested a review from lihongan July 19, 2024 19:23
@candita candita force-pushed the OCPBUGS-37220-GWAPI-AWS-DNS branch from b6dc40b to 3101dbc Compare July 20, 2024 00:11
@candita
Contributor Author

candita commented Jul 23, 2024

/test all

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 9, 2024
@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 7, 2024
@candita
Contributor Author

candita commented Nov 25, 2024

/lifecycle-remove stale

@openshift-bot
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 26, 2024
@lihongan
Contributor

/remove-lifecycle rotten

@openshift-ci openshift-ci bot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Dec 26, 2024
@candita candita force-pushed the OCPBUGS-37220-GWAPI-AWS-DNS branch from 3101dbc to 13ccae7 Compare January 6, 2025 21:11
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 6, 2025
Contributor

openshift-ci bot commented Jan 6, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from candita. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@candita
Contributor Author

candita commented Jan 6, 2025

/test all

@candita candita marked this pull request as ready for review January 6, 2025 21:40
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 6, 2025
@openshift-ci openshift-ci bot requested review from frobware and Miciah January 6, 2025 21:41
@candita
Contributor Author

candita commented Jan 7, 2025

level=fatal msg=failed to fetch Cluster Infrastructure Variables: failed to fetch dependency of "Cluster Infrastructure Variables": failed to generate asset "Platform Provisioning Check": metadata.name: Invalid value: "ci-op-qgjbb60q-d9859": record(s) ["api.ci-op-qgjbb60q-d9859.XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX."] already exists in DNS Zone (XXXXXXXXXXXXXXXXXXXXXX/origin-ci-int-gce) and might be in use by another cluster, please remove it to continue
Installer exit with code 1

/test e2e-gcp-ovn

@candita
Contributor Author

candita commented Jan 7, 2025

/retest

@candita
Contributor Author

candita commented Jan 10, 2025

https://issues.redhat.com/browse/HOSTEDCP-2236
/test hypershift-e2e-aks

@bryan-cox
Member

/test hypershift-e2e-aks
VAP flake

@rhamini3

Pre-merge verified, mark as approved

Create a gateway using listeners and a secret

oc create -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: gateway
  namespace: openshift-ingress
spec:
  gatewayClassName: openshift-default
  listeners:
  - name: http
    hostname: "*.$gwapi_domain"
    port: 80
    protocol: HTTP
    allowedRoutes:
      namespaces:
        from: All
  - name: https
    hostname: "*.$gwapi_domain"
    port: 443
    protocol: HTTPS
    tls:
      mode: Terminate
      certificateRefs:
      - name: gwapi-wildcard
    allowedRoutes:
      namespaces:
        from: All
EOF
Check the DNSRecord and confirm that it is published and not automatically deleted:

% oc -n openshift-ingress get dnsrecord -o yaml
apiVersion: v1
items:
- apiVersion: ingress.operator.openshift.io/v1
  kind: DNSRecord
  metadata:
    annotations:
      ingress.operator.openshift.io/target-hosted-zone-id: Z1H1FL5HABSF5
    creationTimestamp: "2025-01-13T17:51:19Z"
    finalizers:
    - operator.openshift.io/ingress-dns
    generation: 1
    labels:
      gateway.istio.io/managed: openshift.io-gateway-controller
      istio.io/gateway-name: gateway
    name: gateway-756bfb4988-wildcard
    namespace: openshift-ingress
    ownerReferences:
    - apiVersion: v1
      kind: Service
      name: gateway-openshift-default
      uid: 2fb04bf5-7dc8-422a-b1f7-564ba9112110
    resourceVersion: "89489"
    uid: 86eefd2e-2e17-4c4e-87aa-338dc806bd26
  spec:
    dnsManagementPolicy: Managed
    dnsName: '*.gwapi.ci-ln-mmjjxht-76ef8.aws-2.ci.openshift.org.'
    recordTTL: 30
    recordType: CNAME
    targets:
    - a2fb04bf57dc8422ab1f7564ba911211-1876664505.us-west-2.elb.amazonaws.com
  status:
    observedGeneration: 1
    zones:
    - conditions:
      - lastTransitionTime: "2025-01-13T17:51:19Z"
        message: The DNS provider succeeded in ensuring the record
        reason: ProviderSuccess
        status: "True"
        type: Published <--
      dnsZone:
        tags:
          Name: ci-ln-mmjjxht-76ef8-j59lj-int
          kubernetes.io/cluster/ci-ln-mmjjxht-76ef8-j59lj: owned
    - conditions:
      - lastTransitionTime: "2025-01-13T17:51:19Z"
        message: The DNS provider succeeded in ensuring the record
        reason: ProviderSuccess
        status: "True"
        type: Published <--
      dnsZone:
        id: Z00287062J1ITQ61DDU2Z
kind: List
metadata:
  resourceVersion: ""
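
The manual check above (every zone reporting `Published` with status `"True"`) could be automated along these lines. This is only a sketch under stated assumptions: the `condition` and `zoneStatus` types are simplified stand-ins for the DNSRecord status fields shown in the output, not the operator's actual API types, and `publishedInAllZones` is a hypothetical helper.

```go
package main

import "fmt"

// condition and zoneStatus are simplified, hypothetical stand-ins for the
// status.zones[].conditions[] fields visible in the DNSRecord output.
type condition struct {
	Type   string
	Status string
}

type zoneStatus struct {
	Conditions []condition
}

// publishedInAllZones returns true only when there is at least one zone and
// every zone carries a Published condition whose status is "True".
func publishedInAllZones(zones []zoneStatus) bool {
	if len(zones) == 0 {
		return false
	}
	for _, z := range zones {
		published := false
		for _, c := range z.Conditions {
			if c.Type == "Published" && c.Status == "True" {
				published = true
				break
			}
		}
		if !published {
			return false
		}
	}
	return true
}

func main() {
	// Mirrors the two zones in the output: one private zone, one public zone.
	zones := []zoneStatus{
		{Conditions: []condition{{Type: "Published", Status: "True"}}},
		{Conditions: []condition{{Type: "Published", Status: "True"}}},
	}
	fmt.Println(publishedInAllZones(zones))
}
```

Since the original bug deleted the record shortly after publishing, a check like this would need to be repeated after a short wait to confirm that the record persists.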

/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Jan 13, 2025
@openshift-ci-robot openshift-ci-robot added jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. and removed jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Jan 13, 2025
@openshift-ci-robot
Contributor

@candita: This pull request references Jira Issue OCPBUGS-37220, which is invalid:

  • expected the bug to target either version "4.19." or "openshift-4.19.", but it targets "4.17.z" instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Before this change, a new Gateway API dnsRecord created on AWS was being deleted right after it was published.

pkg/operator/controller/dns/controller.go - make the general watch for dnsRecords skip Gateway API dnsRecords

pkg/operator/controller/gateway-service-dns/controller.go - add a predicate to ensure only dnsRecords for Gateway API are watched; remove the restriction that watched only Services in the openshift-ingress namespace; export the ManagedByIstioLabelKey for use by the general dns controller; add logging to ensureDNSRecordsForGateway and deleteStaleDNSRecordsForGateway

pkg/resources/dnsrecord/dns.go - minor tweak and logging added to EnsureDNSRecord
pkg/dns/aws/dns.go - add logging


@bryan-cox
Member

HyperShift AKS test is failing due to a CRIO/RHCOS issue. More info here - https://redhat-internal.slack.com/archives/C01CQA76KMX/p1736790655245439?thread_ts=1736785822.924329&cid=C01CQA76KMX.

@candita
Contributor Author

candita commented Jan 16, 2025

/retest

Contributor

openshift-ci bot commented Jan 16, 2025

@candita: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@candita
Contributor Author

candita commented Jan 17, 2025

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Jan 17, 2025
@openshift-ci-robot
Contributor

@candita: This pull request references Jira Issue OCPBUGS-37220, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.19.0) matches configured target version for branch (4.19.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request.

In response to this:

/jira refresh


@Miciah
Contributor

Miciah commented Jan 22, 2025

/assign

@candita
Contributor Author

candita commented Jan 22, 2025

/assign @grzpiotrowski

Contributor

@grzpiotrowski grzpiotrowski left a comment

/lgtm

Just one thing: should we add a commit description/body?

Leaving the /approve to @Miciah.

Also, I think this case would be good to cover in an e2e test.

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jan 31, 2025
Labels
  • jira/severity-moderate: Referenced Jira bug's severity is moderate for the branch this PR is targeting.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • qe-approved: Signifies that QE has signed off on this PR.

9 participants