-
Notifications
You must be signed in to change notification settings - Fork 209
OCPBUGS-57646: Unify capitalization when comparing architectures for available updates #1255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: DavidHurta The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Installing a 4.18 cluster (My attempts for a multi 4.21 with expanded logging were futile): Available updates are as expected: Migrate to multi-arch: Wait for the migration to complete... The available updates still only show the current version. Running a local CVO with the PR's changes proceeds to fix the available updates: even with a set |
|
Running the introduced tests without the changes in the The second test is interesting as it highlights that the wrong |
Otherwise, the following condition in cincinnati.go in MultiArch clusters:
```
if desiredArch == string(configv1.ClusterVersionArchitectureMulti) && currentArch != desiredArch {
return current, []configv1.Release{current}, nil, nil
}
```
gets evaluated to:
```
if "Multi" == string(configv1.ClusterVersionArchitectureMulti) && "multi" != "Multi" {
return current, []configv1.Release{current}, nil, nil
}
```
This will cause MultiArch clusters with a set non-empty
`ClusterVersion.Spec.DesiredUpdate.Architecture` field to indefinitely
have available updates set to `[]configv1.Release{current}` because
the `"multi" != "Multi"` logic will always match.
The commit does change the default behaviour of the `getDesiredArchitecture` method. However, the method is only used once in the `syncAvailableUpdates` method and nowhere else. The commit adds the subsequent logic for evaluating a desired architecture to the method itself and implements the `getCurrentArchitecture` method. The goal is to introduces "getters" for such values where the unified capitalization is enforced, and their return values are of the same nature (e.g., "Multi", "amd64", ...).
9b25c87 to
e496f4b
Compare
|
@DavidHurta: This pull request references Jira Issue OCPBUGS-57646, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. In response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@DavidHurta: This pull request references Jira Issue OCPBUGS-57646, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
| currentArch := runtime.GOARCH | ||
|
|
||
| if optr.release.Architecture == configv1.ClusterVersionArchitectureMulti { | ||
| currentArch = "multi" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need this downcased form for Cincinnati, right? Here's Multi returning nothing, while multi returns 4.20.0:
$ curl -s 'https://api.openshift.com/api/upgrades_info/graph?channel=stable-4.20&arch=Multi' | jq .nodes
null
$ curl -s 'https://api.openshift.com/api/upgrades_info/graph?channel=stable-4.20&arch=multi' | jq -c '[.nodes[] | .version]'
["4.20.0"]There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, however, the currentArch variable is only used for the following comparison:
if desiredArch == string(configv1.ClusterVersionArchitectureMulti) && currentArch != desiredArch {
return current, []configv1.Release{current}, nil, nil
}
We use the desiredArch for the query parameter, "which is downcased" before creating the query parameters at:
cluster-version-operator/pkg/cincinnati/cincinnati.go
Lines 90 to 93 in 667bc63
| releaseArch := desiredArch | |
| if desiredArch == string(configv1.ClusterVersionArchitectureMulti) { | |
| releaseArch = "multi" | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to check in multi-arch CI: 4.21 nightly multi CI -> 4.21.0-0.nightly-multi-2025-10-30-034235 -> but both e2e-ovn-serial-aws-multi-a-a-1of2 and e2e-ovn-serial-aws-multi-a-a-2of2 are failing to bootstrap.
Moving back to 4.20.0-0.nightly-multi: 4.20.0-0.nightly-multi-2025-10-30-035942 -> e2e-ovn-serial-aws-multi-a-a-1of2 -> Artifacts -> ... -> e2e artifacts:
$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/periodic-ci-openshift-multiarch-master-nightly-4.20-ocp-e2e-ovn-serial-aws-multi-a-a-1of2/1983747935866720256/artifacts/ocp-e2e-ovn-serial-aws-multi-a-a/openshift-e2e-test/artifacts/e2e.log | grep 'upgrade recommend'
started: 0/4/45 "[Serial][sig-cli] oc adm upgrade recommend When the update service has conditional recommendations runs successfully with conditional recommendations to the --version target [Suite:openshift/conformance/serial]"
passed: (26.9s) 2025-10-30T05:58:16 "[Serial][sig-cli] oc adm upgrade recommend When the update service has conditional recommendations runs successfully with conditional recommendations to the --version target [Suite:openshift/conformance/serial]"
started: 0/24/45 "[Serial][sig-cli] oc adm upgrade recommend When the update service has no recommendations runs successfully [Suite:openshift/conformance/serial]"
passed: (22s) 2025-10-30T06:22:53 "[Serial][sig-cli] oc adm upgrade recommend When the update service has no recommendations runs successfully [Suite:openshift/conformance/serial]"which means this test logic is happy about the CVO showing the recommended update to the sha256:cccc... release. Checking gather-extra pod logs:
$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/periodic-ci-openshift-multiarch-master-nightly-4.20-ocp-e2e-ovn-serial-aws-multi-a-a-1of2/1983747935866720256/artifacts/ocp-e2e-ovn-serial-aws-multi-a-a/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-f89bdbd46-sln5x_cluster-version-operator.log | grep '1030 05:5[78].* \(cincinnati\|availableupdates\)'
I1030 05:57:50.548607 1 availableupdates.go:83] Retrieving available updates again, because more than 2m35.488910548s has elapsed since last attempt at 2025-10-30T05:54:37Z
I1030 05:57:56.533407 1 availableupdates.go:77] Retrieving available updates again, because the channel has changed from "" to "test-channel"
I1030 05:57:56.537343 1 cincinnati.go:114] Using a root CA pool with 0 root CA subjects to request updates from https://api.openshift.com/api/upgrades_info/v1/graph?arch=multi&channel=test-channel&id=021db8b4-7df4-4870-a035-24fc93f70d05&version=4.20.0-0.nightly-multi-2025-10-29-210751
I1030 05:57:56.800389 1 availableupdates.go:398] Update service https://api.openshift.com/api/upgrades_info/v1/graph could not return available updates: VersionNotFound: currently reconciling cluster version 4.20.0-0.nightly-multi-2025-10-29-210751 not found in the "test-channel" channel
I1030 05:57:56.811990 1 availableupdates.go:98] Available updates were recently retrieved, with less than 2m35.488910548s elapsed since 2025-10-30T05:57:56Z, will try later.
I1030 05:57:57.042846 1 availableupdates.go:103] Retrieving available updates again, because the update service has changed from "" to "http://172.30.166.91:8000/graph" from ClusterVersion spec.upstream
I1030 05:57:57.045682 1 cincinnati.go:114] Using a root CA pool with 0 root CA subjects to request updates from http://172.30.166.91:8000/graph?arch=multi&channel=test-channel&id=021db8b4-7df4-4870-a035-24fc93f70d05&version=4.20.0-0.nightly-multi-2025-10-29-210751
I1030 05:57:57.101065 1 availableupdates.go:98] Available updates were recently retrieved, with less than 2m35.488910548s elapsed since 2025-10-30T05:57:57Z, will try later.
I1030 05:58:06.331303 1 availableupdates.go:98] Available updates were recently retrieved, with less than 2m35.488910548s elapsed since 2025-10-30T05:57:57Z, will try later.
I1030 05:58:16.883450 1 availableupdates.go:103] Retrieving available updates again, because the update service has changed from "http://172.30.166.91:8000/graph" to "https://api.openshift.com/api/upgrades_info/v1/graph" from the operator's default update service
I1030 05:58:16.886723 1 cincinnati.go:114] Using a root CA pool with 0 root CA subjects to request updates from https://api.openshift.com/api/upgrades_info/v1/graph?arch=multi&channel=test-channel&id=021db8b4-7df4-4870-a035-24fc93f70d05&version=4.20.0-0.nightly-multi-2025-10-29-210751
I1030 05:58:17.152154 1 availableupdates.go:398] Update service https://api.openshift.com/api/upgrades_info/v1/graph could not return available updates: VersionNotFound: currently reconciling cluster version 4.20.0-0.nightly-multi-2025-10-29-210751 not found in the "test-channel" channel
I1030 05:58:17.164073 1 availableupdates.go:98] Available updates were recently retrieved, with less than 2m35.488910548s elapsed since 2025-10-30T05:58:17Z, will try later.well, I can see arch=multi in there, and it's not talking about VersionNotFound when pulling from the e2e test's 172.30.166.91. Would be nice if it logged this single->multi transition branch. Ah here is cincinnati.go doing the Multi -> multi downcasing, so that's one bit I'd missed earlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, the multi CI is passing because those tests aren't setting spec.desired.architecture: Multi. We should grow some CI that exercises that.
|
@DavidHurta: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/cc |
| } | ||
|
|
||
| func (optr *Operator) getDesiredArchitecture(update *configv1.Update) string { | ||
| if update != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be if update != nil && len(update.Architecture) > 0? Otherwise, we could return an empty string here, and we'd rather fall back to getCurrentArchitecture, because your syncAvailableUpdates change is dropping the old:
if desiredArch == "" {
desiredArch = currentArch
}| } | ||
| } | ||
|
|
||
| func TestSyncAvailableUpdatesMultiArchAfterMigrationDesiredUpdateNil(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: looks like this could be collapsed with TestSyncAvailableUpdatesMultiArchAfterMigration into a single TestSyncAvailableUpdatesMultiArch function that iterated over (test-case name, desired, expected) tuples, to reduce duplication, and make it easy to add a test-case for desired set, but architecture unset.
availableupdates: Unify capitalization when comparing architecturesOtherwise, the following condition in cincinnati.go in MultiArch clusters:
gets evaluated to:
This will cause MultiArch clusters with a set non-empty
ClusterVersion.Spec.DesiredUpdate.Architecturefield to indefinitelyhave available updates set to
[]configv1.Release{current}becausethe
"multi" != "Multi"logic will always match.availableupdates: Refactor getting desired/current architecture