fix: Enable compose repos during build #48

Merged
merged 1 commit into from
Jan 2, 2025

Conversation

carlwgeorge
Contributor

In recent builds of achillobator we noticed that installing systemd-container downgrades the systemd package already built into the image. This is because the centos-bootc image we're basing on has a newer systemd than what is on the mirror network. I believe this is caused by the centos-bootc image enabling the compose repos during its build.

https://gitlab.com/redhat/centos-stream/containers/bootc/-/blob/c10s/cs.repo

To avoid this, we can enable the same repos used during the centos-bootc build. To avoid stomping on the default repos, we'll change the repo IDs. We'll also disable these repos at the end of the build.
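The approach described above can be sketched as a small generator for the extra `.repo` definitions. This is a minimal illustration, not the change that was merged: the repo IDs `baseos-compose` and `appstream-compose` are the ones named later in this thread, while the `baseurl` values are placeholders (the real ones are in the linked cs.repo) and `render_repo_file` is a hypothetical helper.

```python
import configparser
import io

# Repo IDs come from the thread; the URLs are PLACEHOLDERS, not the real
# compose locations (those are defined in the centos-bootc cs.repo file).
COMPOSE_REPOS = {
    "baseos-compose": "https://example.invalid/compose/BaseOS/$basearch/os/",
    "appstream-compose": "https://example.invalid/compose/AppStream/$basearch/os/",
}


def render_repo_file(repos, enabled):
    """Render dnf-style repo definitions under IDs that differ from the defaults.

    Using distinct IDs leaves the stock `baseos` and `appstream` definitions
    untouched. `enabled=False` gives the state wanted at the end of the build.
    """
    # interpolation=None keeps dnf variables such as $basearch literal.
    parser = configparser.ConfigParser(interpolation=None)
    for repo_id, baseurl in repos.items():
        parser[repo_id] = {
            "name": f"{repo_id} (compose)",
            "baseurl": baseurl,
            "enabled": "1" if enabled else "0",
        }
    buf = io.StringIO()
    parser.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    # Shipped state: definitions present but disabled by default.
    print(render_repo_file(COMPOSE_REPOS, enabled=False))
```

A build would then turn these repos on only for the install step and leave them disabled in the final image.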

@carlwgeorge
Contributor Author

Merging this will avoid errors like this:

 Problem 1: package glibc-all-langpacks-2.39-23.el10.x86_64 from baseos requires glibc = 2.39-23.el10, but none of the providers can be installed
  - cannot install both glibc-2.39-23.el10.x86_64 from baseos and glibc-2.39-31.el10.x86_64 from @System
  - cannot install the best candidate for the job
 Problem 2: cannot install both sssd-common-2.10.0-3.el10.x86_64 from baseos and sssd-common-2.10.1-3.el10.x86_64 from @System
  - package sssd-2.10.0-3.el10.x86_64 from baseos requires sssd-common = 2.10.0-3.el10, but none of the providers can be installed
  - cannot install the best candidate for the job

@castrojo castrojo added this pull request to the merge queue Jan 2, 2025
Merged via the queue into centos-workstation:main with commit bbf6792 Jan 2, 2025
3 checks passed
@cgwalters
Contributor

Yeah, the impedance mismatch between rpm and containers has been a huge pain for everyone for many years. The long term fix I think is rpm-software-management/dnf5#833 (comment)

carlwgeorge added a commit to carlwgeorge/achillobator that referenced this pull request Jan 3, 2025
The ghcr.io/centos-workstation/main image now includes baseos-compose
and appstream-compose repos, which are the compose repos that are
enabled during the quay.io/centos-bootc/centos-bootc build.  They are
disabled by default, but enabling them during the achillobator build
should help prevent the package downgrade issue we were seeing
previously.

Related: centos-workstation/main#48
@carlwgeorge carlwgeorge deleted the compose-repos branch January 3, 2025 08:05
tulilirockz pushed a commit to carlwgeorge/achillobator that referenced this pull request Jan 3, 2025
The ghcr.io/centos-workstation/main image now includes baseos-compose
and appstream-compose repos, which are the compose repos that are
enabled during the quay.io/centos-bootc/centos-bootc build.  They are
disabled by default, but enabling them during the achillobator build
should help prevent the package downgrade issue we were seeing
previously.

Related: centos-workstation/main#48
@cgwalters
Contributor

Yeah in fact, this can only really work if we ship the compose repo in the base image because all we've done here is make the problem less likely - there's still the possibility here that we bump the compose repo in git main, but that image isn't pushed for a bit, so things can still skew.

@cgwalters
Contributor

One thing that's done in RHEL (and now in Fedora with the "archive" repo) is ship multiple versions of rpms by default - but we aren't doing that for CentOS Stream it seems, though I guess one could be cobbled together by us shipping a .repo file with the compose repos.

(Again all this would get much much much saner if we stored RPMs in registries)

@carlwgeorge
Contributor Author

carlwgeorge commented Jan 3, 2025

> Yeah in fact, this can only really work if we ship the compose repo in the base image because all we've done here is make the problem less likely - there's still the possibility here that we bump the compose repo in git main, but that image isn't pushed for a bit, so things can still skew.

With this workaround a newer compose repo in git is not an issue. We don't need to exactly match the same compose to prevent package downgrades, we just need the subpackages present for those strict NVR requirements. The core problem is the centos-bootc image having packages installed that are newer than what is on the mirror network.
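To make the strict-requirement point concrete, here is a toy model using the glibc versions from the error output earlier in this thread. It is only a sketch: `install_outcome` is a hypothetical helper, not how dnf's solver works, and the assumption that the compose repo carries the 2.39-31.el10 subpackage is inferred from the installed glibc rather than stated in the thread.

```python
def install_outcome(installed_parent, available_subpackage_builds):
    """Toy model of a strict `parent = version-release` dependency.

    A subpackage such as glibc-all-langpacks requires the exactly matching
    glibc build, so it installs cleanly only when a subpackage build with the
    same version-release as the installed parent is available.
    """
    if installed_parent in available_subpackage_builds:
        return "ok"
    # Otherwise the solver must downgrade the parent or report a conflict.
    return "conflict-or-downgrade"


# Versions taken from the error message quoted earlier in the thread.
INSTALLED_GLIBC = "2.39-31.el10"   # already in the centos-bootc image
MIRROR_BUILDS = {"2.39-23.el10"}   # what baseos on the mirrors offered

# ASSUMPTION: the compose repo also carries the newer build.
WITH_COMPOSE = MIRROR_BUILDS | {"2.39-31.el10"}

if __name__ == "__main__":
    print(install_outcome(INSTALLED_GLIBC, MIRROR_BUILDS))
    print(install_outcome(INSTALLED_GLIBC, WITH_COMPOSE))
```

The model also shows why an exact compose match is unnecessary: any repo set that contains the matching build satisfies the requirement, even if it contains newer builds too.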

> One thing that's done in RHEL (and now in Fedora with the "archive" repo) is ship multiple versions of rpms by default - but we aren't doing that for CentOS Stream it seems, though I guess one could be cobbled together by us shipping a .repo file with the compose repos.

That's not accurate. CentOS Stream keeps the latest five of every package. For example, on the mirror network right now we have the following systemd packages:

  • systemd-256-13.el10
  • systemd-256-14.el10
  • systemd-256-15.el10
  • systemd-256-16.el10
  • systemd-256-18.el10

The problem isn't a lack of multiple versions, it's that the centos-bootc image seems to be building from composes that have not yet been published to the mirror network. The latest compose that has been pushed to the mirrors is listed here. I think if we had the centos-bootc cs.repo track that same compose (i.e. never get ahead of what is on the mirrors) then it would solve this problem. I can understand the benefit of also doing a build that tracks the latest compose to find potential problems sooner, but ideally that would be in a different tag, not the main stream10 tag.

@cgwalters
Contributor

cgwalters commented Jan 3, 2025

Ah, thanks; I stand corrected.

> I think if we had the centos-bootc cs.repo track that same compose (i.e. never get ahead of what is on the mirrors) then it would solve this problem.

Yes, though it's still racy because presumably that file updates before the mirrors sync, right? It would reduce the race window of course.

(edit: to be clear the race is: compose is queued to mirrors, we build a new bootc base image and publish it, someone does a build that does dnf install and hits an old mirror)

> I can understand the benefit of also doing a build that tracks the latest compose to find potential problems sooner, but ideally that would be in a different tag, not the main stream10 tag.

Right, I moved this to https://gitlab.com/redhat/centos-stream/containers/bootc/-/issues/1174 - should be a relatively straightforward change to our dependabot config there.

@carlwgeorge
Contributor Author

> Yes, though it's still racy because presumably that file updates before the mirrors sync, right? It would reduce the race window of course.

Right, but the key is how drastically it reduces that window. The current compose on the mirrors is from 2024-12-16. That's older than usual due to the holidays, but it's common for the published compose to be a week or two old. I believe the target for the releng team is to push a compose once a week. If centos-bootc:stream10 only includes packages from the compose identified by the COMPOSE_ID file on the mirror sync point, then the window in practice should be between a few hours and a day. By that time enough mirrors will have synced the content to ensure a user won't be directed to a stale mirror. We could reduce this window even further by picking a single mirror that is known to sync regularly and reliably and use that for the COMPOSE_ID check.
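The COMPOSE_ID check could look roughly like the sketch below. It assumes a compose ID embeds its date as eight consecutive digits (for example `CentOS-Stream-10-20241216.0`, matching the 2024-12-16 compose mentioned above); that format, the second ID in the demo, and both helper names are assumptions for illustration. Fetching the COMPOSE_ID file from a mirror is left out.

```python
import re
from datetime import date

# ASSUMPTION: the compose ID contains its date as YYYYMMDD.
COMPOSE_DATE = re.compile(r"(\d{4})(\d{2})(\d{2})")


def compose_date(compose_id):
    """Extract the date embedded in a compose ID string."""
    match = COMPOSE_DATE.search(compose_id)
    if match is None:
        raise ValueError(f"no date found in compose ID: {compose_id!r}")
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def is_ahead_of_mirrors(image_compose_id, mirror_compose_id):
    """True if the image was built from a compose newer than the mirrors'.

    Only the date is compared; any respin suffix after the date is ignored
    in this simplified sketch.
    """
    return compose_date(image_compose_id) > compose_date(mirror_compose_id)


if __name__ == "__main__":
    mirror_id = "CentOS-Stream-10-20241216.0"  # date from the thread, format assumed
    image_id = "CentOS-Stream-10-20250102.0"   # HYPOTHETICAL newer compose
    print(is_ahead_of_mirrors(image_id, mirror_id))
```

A build pipeline could refuse to publish the main tag whenever this check returns true, and reserve composes that are ahead of the mirrors for a separate tag.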

(cc: @asamalik @tdawson)

@cgwalters
Contributor

> then the window in practice should be between a few hours and a day.

Yeah, but with enough users, "a few hours" once a week is a lot of hits to that window.

It's worth noting that none of this is actually specific to bootc images in any way; it happens any time one has skew between an image build and rpms - it can happen with "app" base container images like quay.io/centos/centos:stream10 or also with qcow2 or AMIs too.

On that topic, we should definitely look at how we sync up with how the centos app base image is built and published with how the bootc one is built and published.

Although, for app base images it's much less likely that one ends up trying to install a "split versionlocked" package I think.

Again, though, the only thing that would reliably fix this is to do what I linked earlier - strictly bind the image versions and the rpm versions by default.

@carlwgeorge
Contributor Author

> Yeah, but with enough users, "a few hours" once a week is a lot of hits to that window.

It would still be far fewer hits than the current status quo. Let's not let perfect be the enemy of good and iterative improvement.

> It's worth noting that none of this is actually specific to bootc images in any way; it happens any time one has skew between an image build and rpms - it can happen with "app" base container images like quay.io/centos/centos:stream10 or also with qcow2 or AMIs too.

Sure, but I believe those images are pushed at the same time as composes to the mirror network. That's why the date on the centos:stream10 tag matches the date in the COMPOSE_ID file. Sure it's theoretically possible if you time things just right and get unlucky on mirror selection, but based on the number of complaints I've seen for downgrades in that image (zero) it seems to be exceedingly rare.

> On that topic, we should definitely look at how we sync up with how the centos app base image is built and published with how the bootc one is built and published.

Definitely, set up a meeting with @asamalik and @tdawson and I'm sure they'd be happy to go over it in detail. I don't work on that team but I'd be happy to sit in to verify my understanding of the pipeline.

@asamalik

asamalik commented Jan 9, 2025

Hey there, apologies if I'm missing some essential part of the context, but a question pops into my mind: Why aren't you building from the mirrors content?

The composes (both "production" and "development") are available for sure, but they don't represent the official release. They're not even tested. This is covered in our docs: https://docs.centos.org/centos-stream-docs/release/

@cgwalters
Contributor

@asamalik Thanks for replying, though I think the best venue to discuss that is https://gitlab.com/redhat/centos-stream/containers/bootc/-/issues/1174

@asamalik

asamalik commented Jan 9, 2025

Sure. Just saw my name mentioned here a few times. :) That one feels like it's doing exactly what I'm asking about, so it should be good anyway!
