NO-JIRA: overrides-c10s: pin kernel to 6.12.0-71 #7

Open: wants to merge 1 commit into base: main

jcapiitao
Member

This is a workaround for [1]. We have been hitting a QEMU bug since the introduction of the kernel patch [2].
With the newer kernel that does not contain this patch, the tests are working.

Note that this workaround targets our tooling (i.e., how we run our kola tests), but it also covers the use case of an end user booting the image with QEMU.

I'm using my fedorapeople.org namespace to serve the packages; let me know if there is a more proper way.

[1] openshift/os#1818
[2] torvalds/linux@6aa989a

@openshift-ci-robot

@jcapiitao: This pull request explicitly references no jira issue.


Contributor

openshift-ci bot commented May 16, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jcapiitao


#packages:
packages-ppc64le:
Member

I think we usually do a separate file for arch-specific overrides.

Member Author

Hmm, I used the same pattern as in openshift/os@5e54e65.
In the coreos-assembler codebase, the only arch-specific file I see being handled is manifest-lock.overrides.{arch}.yaml. In f-c-c and this repo, I don't see any manifest or overrides files with the arch in the filename, but maybe that was the case in the past.

Member

> I think we usually do a separate file for arch-specific overrides

We do for FCOS, where we actually use manifest-lock.overrides.{arch}.yaml as input to the rpm-ostree compose.

For EL we don't have lockfiles, so we have to resort to putting the NVRs in the packages: definition.
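
As a rough illustration only (this is not the PR's actual diff), pinning by NVR in an arch-specific packages list might look something like the fragment below. Only the version string 6.12.0-71.el10 comes from this PR; the set of kernel subpackages listed is an assumption.

```yaml
# Hypothetical sketch of an NVR pin for EL, where no lockfiles are available.
# Only the version string 6.12.0-71.el10 is taken from this PR; the list of
# kernel subpackages is assumed for illustration.
packages-ppc64le:
  - kernel-6.12.0-71.el10
  - kernel-core-6.12.0-71.el10
  - kernel-modules-6.12.0-71.el10
  - kernel-modules-core-6.12.0-71.el10
```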

Member

But when we do add overrides, this would be the place to do it for c10s.

If we're having the same problem in RHEL then the rhel-10.1 version would be the place for it.

Now that we can do conditional-includes we could consider consolidating those 4 files into one.

# This needs to be removed once https://gitlab.com/qemu-project/qemu/-/issues/2966 is fixed.
[c10s-kernel-6.12.0-71]
name=CentOS Stream 10 - kernel-6.12.0-71.el10
baseurl=https://jcapitao.fedorapeople.org/c10s-kernel-6.12.0-71/
Member

I don't think we'd ever want to pull RPMs from unofficial places, so I don't think we should merge this as is.

I guess the problem is that the CentOS repos don't have older versions of packages in them?

Member

ahh yes:

# These are the official c10s repos. They are slower to update, but contain older
# versions of packages, which is useful when pinning for lack of a "coreos-pool"
# equivalent. When no pinning is needed you may find the compose repo URLs
# defined in c10s.repo are quicker to get new content.

We can just switch to the mirror versions of the repo for now while we wait. Another option is to add a "fast-track" view into that repo just for the kernel-* package so we can continue to get newer versions of everything else.

Member

but I see that over in openshift/os#1818 (comment) it doesn't have every version of the package :(

Member Author

Yes, I tried the compose and mirror repo URLs before hosting the kernel in my own fedorapeople.org space.

@dustymabe
Member

> [2] torvalds/linux@6aa989a

That particular commit is only in the (yet to be released) kernel 6.15.

I think you are saying that our kernels in c10s and RHEL backported that patch and that's why we're seeing the problem here.

If that's the case, what I don't understand is why we're not seeing a similar issue when testing ppc64le in rawhide (that does have 6.15 RC kernels).

@jcapiitao
Member Author

> > [2] torvalds/linux@6aa989a
>
> That particular commit is only in the (yet to be released) kernel 6.15.
>
> I think you are saying that our kernels in c10s and RHEL backported that patch and that's why we're seeing the problem here.

Yes, it has already been backported to the c10s kernel as per [1] (i.e., see "powerpc/pseries/iommu: memory notifier incorrectly adds TCEs for pmemory (Mamatha Inamdar) [RHEL-85949]").

> If that's the case, what I don't understand is why we're not seeing a similar issue when testing ppc64le in rawhide (that does have 6.15 RC kernels).

I was wondering the same thing, so I dug a little further into the troubleshooting and found [2].
So I guess we should move forward by requesting a backport of torvalds/linux@67dfc11982f7 into c10s instead of going backward with this PR.

[1] https://gitlab.com/redhat/centos-stream/rpms/kernel/-/commit/dc7a191ebb6074eb0107fbf255c99b303aff106f
[2] openshift/os#1818 (comment)

@dustymabe
Member

> I was wondering the same thing, so I dug a little further into the troubleshooting and found [2].
> So I guess we should move forward by requesting a backport of torvalds/linux@67dfc11982f7 into c10s instead of going backward with this PR.

Sounds good to me. I guess we should open a RHEL bug for that to make the request?

@jcapiitao
Member Author

> > I was wondering the same thing, so I dug a little further into the troubleshooting and found [2].
> > So I guess we should move forward by requesting a backport of torvalds/linux@67dfc11982f7 into c10s instead of going backward with this PR.
>
> Sounds good to me. I guess we should open a RHEL bug for that to make the request?

I filed a RHEL bug, https://issues.redhat.com/browse/RHEL-92470, asking for the patch to be backported.

@openshift-merge-robot
Contributor

PR needs rebase.
