Contributor

@mzazrivec mzazrivec commented Apr 9, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:

This pull request implements a CRD and a controller for provisioning the complete networking infrastructure required to install a ROSA-HCP cluster in AWS. The proposal for this implementation is described in #5381.

Under the hood, the implementation uses a CloudFormation stack and a static (i.e. not customizable) CloudFormation template from rosa-cli.

This pull request depends on openshift/rosa#2904 (now merged).

Quick howto:

$ export ROSA_NETWORK_NAME=rosa-net-01
$ export AWS_REGION=us-west-2
$ export AVAILABILITY_ZONE_COUNT=2
$ export CIDR_BLOCK=10.0.0.0/16
$ clusterctl generate yaml --from templates/rosa-network.yaml > rosa-net-01.yaml
$ kubectl apply -f rosa-net-01.yaml

To use the ROSANetwork from ROSA control plane:

apiVersion: controlplane.cluster.x-k8s.io/v1beta2
kind: ROSAControlPlane
metadata:
  name: rosa-hcp01-control-plane
  namespace: default
spec:
  rosaNetworkRef:
    name: rosa-net-01

and omit the subnets and availability zones from the control plane spec.
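
The rule above (reference a ROSANetwork, or list subnets and availability zones, but not both) could be checked roughly as follows. This is a minimal sketch and not the PR's actual webhook code: `controlPlaneSpec` and `validateNetworkSource` are hypothetical names, and treating the two options as strictly mutually exclusive is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// controlPlaneSpec is a hypothetical, trimmed-down stand-in for the
// ROSAControlPlane spec; only the fields relevant here are modeled.
type controlPlaneSpec struct {
	RosaNetworkRef    string   // name of the referenced ROSANetwork, "" if unset
	Subnets           []string // explicitly listed subnet IDs
	AvailabilityZones []string // explicitly listed availability zones
}

// validateNetworkSource rejects specs that combine a ROSANetwork reference
// with explicit subnets or availability zones (an assumed rule).
func validateNetworkSource(s controlPlaneSpec) error {
	if s.RosaNetworkRef == "" {
		return nil
	}
	if len(s.Subnets) > 0 || len(s.AvailabilityZones) > 0 {
		return errors.New("rosaNetworkRef is set: remove subnets and availabilityZones from the spec")
	}
	return nil
}

func main() {
	refOnly := controlPlaneSpec{RosaNetworkRef: "rosa-net-01"}
	mixed := controlPlaneSpec{RosaNetworkRef: "rosa-net-01", Subnets: []string{"subnet-aaaa"}}
	fmt.Println(validateNetworkSource(refOnly) == nil)
	fmt.Println(validateNetworkSource(mixed))
}
```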

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Checklist:

  • squashed commits
  • includes documentation
  • includes emoji in title
  • adds unit tests
  • adds or updates e2e tests

Release note:

New API for provisioning network infrastructure for ROSA clusters

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-priority labels Apr 9, 2025
@k8s-ci-robot k8s-ci-robot requested review from faiq and serngawy April 9, 2025 19:27
@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 9, 2025
@k8s-ci-robot
Contributor

Hi @mzazrivec. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

webhookClientConfig:
  # this is "\n" used as a placeholder, otherwise it will be rejected by the apiserver for being blank,
  # but we're going to set it later using the cert-manager (or potentially a patch if not using cert-manager)
  caBundle: Cg==
Contributor

No need to add the caBundle.
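
For context, the placeholder `Cg==` in the snippet under review is the base64 encoding of a single newline, as its YAML comment says. A small check, assuming nothing beyond the Go standard library (`decodeCABundle` is a hypothetical helper name):

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeCABundle decodes a base64-encoded caBundle value as it appears in YAML.
func decodeCABundle(encoded string) ([]byte, error) {
	return base64.StdEncoding.DecodeString(encoded)
}

func main() {
	raw, err := decodeCABundle("Cg==")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", raw) // prints "\n"
}
```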

Resource string `json:"resource"`

// Identifier of the created resource. Will be filled in once the resource is created & ready
ID string `json:"ID"`
Contributor

Suggested change
ID string `json:"ID"`
Id string `json:"id"`

Or resourceId

// CFResource groups information pertaining to a resource created as a part of a cloudformation stack
type CFResource struct {
// Name of the created resource: NATGateway1, VPC, SecurityGroup, ...
Resource string `json:"resource"`
Contributor

Suggested change
Resource string `json:"resource"`
Name string `json:"name"`

OR resourceName

Status string `json:"status"`

// Message pertaining to the status of the resource
Reason string `json:"reason"`
Contributor

Message is better, I guess?

Suggested change
Reason string `json:"reason"`
Message string `json:"message"`
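
To see what the renames suggested in these threads would mean for the serialized status, here is a minimal sketch. The final field names are an assumption (the reviewer offered alternatives such as resourceName and resourceId), `ID` is spelled in the usual Go style instead of the suggested `Id`, and the sample values are illustrative only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CFResource mirrors the struct under review with the reviewer's suggested
// renames applied. The exact final names are an assumption.
type CFResource struct {
	Name    string `json:"name"`    // was Resource / "resource"
	ID      string `json:"id"`      // was ID / "ID"
	Status  string `json:"status"`  // unchanged
	Message string `json:"message"` // was Reason / "reason"
}

// encode renders a CFResource the way it would appear in a serialized status.
func encode(r CFResource) (string, error) {
	b, err := json.Marshal(r)
	return string(b), err
}

func main() {
	out, err := encode(CFResource{Name: "VPC", ID: "vpc-xxxxxxxx", Status: "CREATE_COMPLETE"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

With lowercase tags, every key in the status follows the same lowerCamelCase pattern as `availabilityZone` elsewhere in the API.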

// Availability zone of the subnet pair
AvailabilityZone string `json:"availabilityZone"`

// ID of the public subnet
Contributor

Suggested change
// ID of the public subnet
// Public subnet ID, e.g. subnet-xxxxxxxxxx

main.go Outdated
}
}

// TODO: feature gates?
Contributor

I don't think we need a new feature gate; we can put it under the existing ROSA feature gate.

Contributor Author

Yes. I did not mean a new feature gate here, just the existing rosa FG.

@serngawy
Contributor

You also need to update the ValidatingWebhookConfiguration and MutatingWebhookConfiguration here.

@mzazrivec mzazrivec force-pushed the rosa_network branch 4 times, most recently from 5907fb1 to 24a5950 Compare April 24, 2025 13:20
@mzazrivec mzazrivec force-pushed the rosa_network branch 3 times, most recently from a947563 to a255790 Compare May 19, 2025 13:43
Contributor

@serngawy serngawy left a comment

/ok-to-test

// If no identity is specified, the default identity for this controller will be used.
//
// +optional
IdentityRef *infrav1.AWSIdentityReference `json:"identityRef,omitempty"`
Contributor

I'm not sure we want to expose this option to the end user. We don't do that for ROSAControlPlane, which only uses the default AWS identity. However, we should provide an OCM identityRef.

Contributor Author

Why shouldn't we provide this option to the end user? We need to specify the reference to the AWS secret somehow. Here I'm just reusing existing structures & code.

What do you mean by OCM identity ref? OCM will not be involved here in any way.

Contributor

Well, to use openshift/rosa and establish an OCM client you need OCM authentication. Is that not the case for the ROSANetwork CF stack creation?

Contributor Author

No. No OCM credentials are needed for rosanet, just AWS credentials.

Contributor

@serngawy Are you satisfied with the answers here?

Contributor

@serngawy serngawy Aug 18, 2025

@mzazrivec I do remember that we discussed this, but after checking the ROSANetwork CloudFormation stack template, I see tags such as rosa_hcp_policy and rosa service added here.
I think those tags are used to check for privileges?
I think we have to authenticate the OCM credentials. Even if they are not needed to create the CF stack, the end user must be a valid OCM user.

Contributor Author

@serngawy What does creating a VPC with certain tags have to do with checking OCM credentials?

Contributor

Okay, as we discussed, there is no need for OCM authentication.

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 28, 2025
@mzazrivec mzazrivec force-pushed the rosa_network branch 4 times, most recently from d2534a7 to dcc599d Compare June 9, 2025 08:19
@mzazrivec mzazrivec force-pushed the rosa_network branch 2 times, most recently from 023f99b to 1729dff Compare June 27, 2025 12:31
Comment on lines 158 to 160
 func getSessionName(region string, clusterScoper cloud.SessionMetadata) string {
-	return fmt.Sprintf("%s-%s-%s", region, clusterScoper.InfraClusterName(), clusterScoper.Namespace())
+	return fmt.Sprintf("%s-%s-%s-%s", region, clusterScoper.ControllerName(), clusterScoper.InfraClusterName(), clusterScoper.Namespace())
 }
Member

This changes the session name in all the places already using this (not only ROSA ones). Is this a backward compatible change? (cc. @richardcase @punkwalker)

Contributor

I guess it should be okay: when the capa-manager is updated, all cached sessions will be re-created with the new session name.
Is the cache somehow persisted even when the pod is re-created? Do we need to fail safe when loading a session fails?
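
The effect of the session-name change discussed in this thread can be illustrated with a stand-alone sketch. The two format strings are taken from the diff; `sessionMetadata`, `fakeScope` and the controller names are hypothetical stand-ins, not the provider's real types.

```go
package main

import "fmt"

// sessionMetadata is a cut-down, hypothetical version of the interface used
// in the diff; only the methods the format strings call are included.
type sessionMetadata interface {
	ControllerName() string
	InfraClusterName() string
	Namespace() string
}

// fakeScope is a test double with illustrative values.
type fakeScope struct{ controller, cluster, namespace string }

func (f fakeScope) ControllerName() string   { return f.controller }
func (f fakeScope) InfraClusterName() string { return f.cluster }
func (f fakeScope) Namespace() string        { return f.namespace }

// oldSessionName reproduces the format before the change.
func oldSessionName(region string, s sessionMetadata) string {
	return fmt.Sprintf("%s-%s-%s", region, s.InfraClusterName(), s.Namespace())
}

// newSessionName reproduces the format after the change.
func newSessionName(region string, s sessionMetadata) string {
	return fmt.Sprintf("%s-%s-%s-%s", region, s.ControllerName(), s.InfraClusterName(), s.Namespace())
}

func main() {
	a := fakeScope{"rosacontrolplane", "rosa-hcp01", "default"}
	b := fakeScope{"rosanetwork", "rosa-hcp01", "default"}
	// Old format: two controllers with the same cluster name and namespace share a name.
	fmt.Println(oldSessionName("us-west-2", a) == oldSessionName("us-west-2", b))
	// New format: the controller name keeps the two apart.
	fmt.Println(newSessionName("us-west-2", a) == newSessionName("us-west-2", b))
	fmt.Println(newSessionName("us-west-2", b))
}
```

Because every existing caller gets a different name after the change, entries stored under the old name are simply never looked up again, which is the backward-compatibility point raised above.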

// Is the referenced ROSANetwork ready yet?
if !conditions.IsTrue(rosaNet, expinfrav1.ROSANetworkReadyCondition) {
	rosaScope.Info(fmt.Sprintf("referenced ROSANetwork %s is not ready", rosaNet.Name))
	return ctrl.Result{RequeueAfter: time.Minute}, nil
Member

A minute seems quite a lot here, no?

Contributor Author

It takes about 5 minutes to create the CloudFormation stack, i.e. approximately 5 cycles through the reconciliation loop. I'm fine with making the interval smaller (suggestions welcome), but I'm not sure how it would help or improve the situation.

Member

OK, and do we need to requeue after a delay, or are we watching the resource so that we get a reconciliation event anyway?

Member

@damdo damdo left a comment

/lgtm

Thanks for addressing my comments

/assign @nrb @richardcase

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 2, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 5f64d3fa3341e2ef3426645208e69a234b4147e3

@damdo
Member

damdo commented Oct 6, 2025

Based on the convo at https://kubernetes.slack.com/archives/CD6U2V71N/p1759420550389709

/cherry-pick release-2.9

@k8s-infra-cherrypick-robot

@damdo: once the present PR merges, I will cherry-pick it on top of release-2.9 in a new PR and assign it to you.

In response to this:

Based on the convo at https://kubernetes.slack.com/archives/CD6U2V71N/p1759420550389709

/cherry-pick release-2.9

@richardcase
Member

richardcase commented Oct 7, 2025

/approve

@damdo damdo added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 7, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: richardcase

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@damdo
Member

damdo commented Oct 7, 2025

/cherry-pick release-2.9

@k8s-infra-cherrypick-robot

@damdo: once the present PR merges, I will cherry-pick it on top of release-2.9 in a new PR and assign it to you.

In response to this:

/cherry-pick release-2.9

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 7, 2025
@mzazrivec
Contributor Author

/retest

@damdo
Member

damdo commented Oct 7, 2025

Re-adding LGTM after rebase

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 7, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: ebc14b0e8b4a2ee7c8be29ba1b22cc7b2fbbb814

@k8s-ci-robot k8s-ci-robot merged commit 516f342 into kubernetes-sigs:main Oct 7, 2025
17 checks passed
@k8s-infra-cherrypick-robot

@damdo: #5464 failed to apply on top of branch "release-2.9":

Applying: RosaNetwork: new CRD & reconciler to provision net infra for ROSA-HCP
Using index info to reconstruct a base tree...
M	PROJECT
M	config/crd/bases/controlplane.cluster.x-k8s.io_rosacontrolplanes.yaml
M	config/crd/kustomization.yaml
M	config/rbac/role.yaml
M	controlplane/rosa/api/v1beta2/rosacontrolplane_types.go
M	controlplane/rosa/api/v1beta2/rosacontrolplane_webhook.go
M	controlplane/rosa/api/v1beta2/zz_generated.deepcopy.go
M	controlplane/rosa/controllers/rosacontrolplane_controller.go
M	exp/api/v1beta2/zz_generated.deepcopy.go
M	go.mod
M	main.go
Falling back to patching base and 3-way merge...
Auto-merging main.go
Auto-merging go.mod
CONFLICT (content): Merge conflict in go.mod
Auto-merging exp/api/v1beta2/zz_generated.deepcopy.go
CONFLICT (content): Merge conflict in exp/api/v1beta2/zz_generated.deepcopy.go
Auto-merging controlplane/rosa/controllers/rosacontrolplane_controller.go
CONFLICT (content): Merge conflict in controlplane/rosa/controllers/rosacontrolplane_controller.go
Auto-merging controlplane/rosa/api/v1beta2/zz_generated.deepcopy.go
CONFLICT (content): Merge conflict in controlplane/rosa/api/v1beta2/zz_generated.deepcopy.go
Auto-merging controlplane/rosa/api/v1beta2/rosacontrolplane_webhook.go
CONFLICT (content): Merge conflict in controlplane/rosa/api/v1beta2/rosacontrolplane_webhook.go
Auto-merging controlplane/rosa/api/v1beta2/rosacontrolplane_types.go
CONFLICT (content): Merge conflict in controlplane/rosa/api/v1beta2/rosacontrolplane_types.go
Auto-merging config/rbac/role.yaml
CONFLICT (content): Merge conflict in config/rbac/role.yaml
Auto-merging config/crd/kustomization.yaml
CONFLICT (content): Merge conflict in config/crd/kustomization.yaml
Auto-merging config/crd/bases/controlplane.cluster.x-k8s.io_rosacontrolplanes.yaml
CONFLICT (content): Merge conflict in config/crd/bases/controlplane.cluster.x-k8s.io_rosacontrolplanes.yaml
Auto-merging PROJECT
CONFLICT (content): Merge conflict in PROJECT
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0001 RosaNetwork: new CRD & reconciler to provision net infra for ROSA-HCP

In response to this:

/cherry-pick release-2.9

@serngawy
Contributor

serngawy commented Oct 7, 2025

/cherry-pick release-2.9

@k8s-infra-cherrypick-robot

@serngawy: new pull request created: #5701

In response to this:

/cherry-pick release-2.9

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
