engineering: build azl3+azl4 RPMs in parallel at task level #699

Open
bfjelds wants to merge 3 commits into main from user/bfjelds/azl4-rpms-parallel-build


bfjelds commented Jun 24, 2026

Summary

Builds the azl3 and azl4 Trident RPMs concurrently within a single ADO job instead of running two sequential release.yml invocations (one per distro). This avoids a second job and the duplicated OneBranch/setup overhead, while overlapping the I/O-bound parts of the two docker builds. The single-artifact dual-distro layout is unchanged (azl3 at the base path, azl4 under azl4/).

In trident-CI, building azl4 alongside azl3 took the amd64 build stage from 14 minutes to 27 minutes. Building the two in parallel reclaims some of that time (down to 21 minutes; it is unclear why the azl4 build takes so much longer).

Why task-level (not job/stage-level)

Today, inside BuildTrident_<arch>, release.yml is invoked twice. Each invocation independently repeats every step: set version, tdnf install, start docker, preview-container download/sed, docker build, and extract. So we pay the setup tax twice and the two docker builds run serially. Splitting into a separate ADO job per distro would add OneBranch/checkout/setup overhead that would likely cancel any speedup. Doing it at the task level captures both wins: run setup once, then overlap the builds.

Changes

  • New scripts/build-rpms-parallel.sh — runs both docker builds as background jobs against one BuildKit daemon, each writing to its own dest dir and log file, with explicit per-distro exit-code checks. Unpacks azl3 to the base artifact dir and azl4 under azl4/.
  • New build-rpms-parallel.yml — run-once shared setup (version, deps, docker, preview-container) followed by a single task invoking the script, then extract-binary.sh per distro.
  • build-source.yml — amd64 and arm64 jobs call the new template once.
  • Removed the now-unused release.yml.
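
As a rough illustration of the background-job pattern described in the first bullet, the sketch below runs two builds concurrently, gives each its own log file, and checks each exit code separately. It is not the contents of scripts/build-rpms-parallel.sh: `build_distro`, `build_both`, and the `FAIL_DISTRO` switch are hypothetical names, and `build_distro` only stands in for the real docker build invocation.

```shell
#!/usr/bin/env bash
set -uo pipefail

# Hypothetical stand-in for one distro's docker build; the real command
# line is not shown in this PR description.
build_distro() {
    local distro="$1"
    echo "building ${distro}"
    # Simulate a failure when FAIL_DISTRO names this distro.
    [ "${FAIL_DISTRO:-}" != "${distro}" ]
}

# Run both builds concurrently, each with its own log file, and return
# non-zero if either one failed.
build_both() {
    local work_dir="$1" rc3=0 rc4=0

    build_distro azl3 >"${work_dir}/azl3.log" 2>&1 &
    local pid3=$!
    build_distro azl4 >"${work_dir}/azl4.log" 2>&1 &
    local pid4=$!

    # Wait on each PID separately so neither exit code is lost.
    wait "${pid3}" || rc3=$?
    wait "${pid4}" || rc4=$?

    # Print the logs one after the other so they do not interleave.
    local distro
    for distro in azl3 azl4; do
        echo "===== ${distro} build log ====="
        cat "${work_dir}/${distro}.log"
    done

    [ "${rc3}" -eq 0 ] && [ "${rc4}" -eq 0 ]
}
```

Calling `build_both "$(mktemp -d)"` returns success only when both simulated builds succeed; a bare `wait` with no PID would return 0 regardless of what the jobs did, which is why each PID is waited on individually.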

Hardening

  • Per-distro dest dirs, image tags (:azl3/:azl4), and tarballs prevent clobbering between the concurrent builds.
  • set -o pipefail + anchored greps (^AZL_IMAGE=) so a failed make azl-version-vars cannot be masked by the grep | cut pipeline.
  • Explicit wait $pid || rc=$? checks fail the task if either build fails; both build logs are printed sequentially for readable diagnostics.
  • Job-scoped $(Agent.TempDirectory) work dir avoids collisions if builds share a host.
  • Confirmed Dockerfile.full has no BuildKit --mount=type=cache, so no cache-race.
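
The pipefail and anchored-grep point can be shown with a small sketch. It assumes that make azl-version-vars prints KEY=value lines, which this description does not confirm; `emit_vars` and `get_var` are hypothetical helpers and the image name is a placeholder.

```shell
#!/usr/bin/env bash
set -o pipefail

# Hypothetical stand-in for `make azl-version-vars`; the KEY=value output
# format is an assumption.
emit_vars() {
    echo "# AZL_IMAGE= is documented here"
    echo "AZL_IMAGE=example.invalid/azl:3.0"
    echo "OTHER_AZL_IMAGE=ignored"
}

# Extract one variable from a producer command. The ^ anchor skips the
# comment line and OTHER_AZL_IMAGE; pipefail makes a failing producer fail
# the whole pipeline even though grep and cut succeed.
get_var() {
    local producer="$1" key="$2"
    "${producer}" | grep "^${key}=" | cut -d= -f2-
}
```

Without `set -o pipefail`, the pipeline's status is that of `cut` alone, so a producer that prints partial output and then fails would still look successful.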

Artifact contract

Unchanged. azl3 RPMs/binary at the base artifact path; azl4 under azl4/. download-staged.yml and all consumer distro-filters are unaffected.
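
A minimal sketch of that layout, assuming each build produces one tarball: `unpack_artifacts` and its argument order are hypothetical, and the real tarball names are not given here.

```shell
#!/usr/bin/env bash
set -uo pipefail

# Unpack the azl3 tarball at the base of the artifact dir and the azl4
# tarball under azl4/, so files with the same name cannot collide.
unpack_artifacts() {
    local azl3_tar="$1" azl4_tar="$2" artifact_dir="$3"
    mkdir -p "${artifact_dir}/azl4"
    tar -xf "${azl3_tar}" -C "${artifact_dir}"
    tar -xf "${azl4_tar}" -C "${artifact_dir}/azl4"
}
```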

Validation

  • YAML lints clean.
  • ⚠️ Needs a real pipeline run / make check-pipelines (Linux + az toolchain) before merge to confirm wiring and measure actual wall-clock vs. the serial baseline. Realistic expectation: solid savings from de-duplicated setup; build overlap depends on agent core count (two concurrent cargo builds share cores, so not a full 2x).

Note

In non-release modes, preview-container.yml uses sed to substitute a single base image into Dockerfile.full, so both distros share that base. This is a pre-existing limitation (already true before this change); release mode is unaffected because AZL_IMAGE is honored per distro.

Replace the two sequential release.yml invocations (one per distro) with a
single build-rpms-parallel.yml template plus scripts/build-rpms-parallel.sh.
Shared setup runs once, then both azl3 and azl4 RPMs build concurrently as
background docker builds in a single task, avoiding a second ADO job and
duplicated OneBranch/setup overhead.

- New scripts/build-rpms-parallel.sh (executable): run both docker builds in
  the background (& ... wait), capture per-distro logs and exit codes, then
  unpack azl3 to the base artifact dir and azl4 under azl4/. Uses set -o
  pipefail and anchored greps so a failed make azl-version-vars cannot be
  masked by the grep/cut pipeline. Per-distro dest dirs and image tags
  prevent clobbering.
- New build-rpms-parallel.yml: run-once setup (version, deps, docker, preview
  container) then a single task invoking the script. Distro binaries are
  extracted via extract-binary.sh for azl3 and azl4.
- build-source.yml: amd64 and arm64 jobs call the new template once.
- Remove now-unused release.yml.

Artifact layout unchanged: azl3 at base path, azl4 under azl4/. Downstream
download-staged.yml and consumer distro-filters are unaffected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

bfjelds commented Jun 24, 2026

/azp run [GITHUB]-trident-pr-e2e

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

bfjelds marked this pull request as ready for review June 25, 2026 00:12
bfjelds requested a review from a team as a code owner June 25, 2026 00:12
Copilot AI review requested due to automatic review settings June 25, 2026 00:12

Copilot AI left a comment

Pull request overview

This PR updates the Trident RPM build pipeline to build Azure Linux 3 (azl3) and Azure Linux 4 (azl4) RPMs concurrently within a single Azure DevOps job, preserving the existing single-artifact layout (azl3 at the base path, azl4 under azl4/).

Changes:

  • Add scripts/build-rpms-parallel.sh to run two docker build invocations in parallel, each with isolated output/log directories, then unpack both tarballs into the expected artifact layout.
  • Add build-rpms-parallel.yml pipeline template that performs shared setup once, runs the parallel build script, then extracts binaries per distro.
  • Update build-source.yml to use the new template once per architecture, and remove the old per-distro release.yml template.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| scripts/build-rpms-parallel.sh | New bash driver to run azl3/azl4 Docker RPM builds concurrently and unpack results into the established artifact contract. |
| .pipelines/templates/stages/trident_rpms/build-rpms-parallel.yml | New pipeline template that performs shared setup once, invokes the parallel build script, then runs per-distro binary extraction. |
| .pipelines/templates/stages/trident_rpms/build-source.yml | Switch amd64/arm64 jobs to use the new parallel template instead of two sequential template invocations. |
| .pipelines/templates/stages/trident_rpms/release.yml | Removed obsolete template previously invoked once per distro. |

Comment thread .pipelines/templates/stages/trident_rpms/build-source.yml Outdated
Comment thread scripts/build-rpms-parallel.sh
- build-source.yml: drop the redundant arm64 'Start Docker' step; the
  build-rpms-parallel.yml shared setup already starts Docker for both arches.
- build-rpms-parallel.sh: validate arg count and refuse an empty, root, or
  non-absolute work_dir before the rm -rf guard.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
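
The kind of guard this commit describes might look like the following sketch. It assumes the script takes a single positional work_dir argument, which is not confirmed here, and `validate_work_dir` is a hypothetical name.

```shell
#!/usr/bin/env bash
set -uo pipefail

# Return 0 only when exactly one argument is given and it is a non-empty,
# absolute path other than the root directory.
validate_work_dir() {
    if [ "$#" -ne 1 ]; then
        echo "usage: validate_work_dir <work_dir>" >&2
        return 2
    fi
    local work_dir="$1"
    case "${work_dir}" in
        "")    echo "work_dir must not be empty" >&2; return 1 ;;
        [!/]*) echo "work_dir must be absolute: ${work_dir}" >&2; return 1 ;;
    esac
    # A path made only of slashes (/, //, ...) is the root directory.
    if [ -z "${work_dir//\//}" ]; then
        echo "work_dir must not be the root directory" >&2
        return 1
    fi
    return 0
}
```

A caller would run `validate_work_dir "$@" || exit 1` before any `rm -rf "${work_dir}"`, so a missing or mistyped argument cannot expand into a destructive path.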

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comment thread scripts/build-rpms-parallel.sh
Comment thread scripts/build-rpms-parallel.sh Outdated
The two builds run as concurrent background subshells; unconditional set -x
interleaved trace lines from both on the task stderr, undermining the
per-distro log files. Default to no xtrace and enable it only when
BUILD_RPMS_DEBUG=1.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
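
A minimal sketch of that opt-in trace switch follows; `maybe_enable_xtrace` is a hypothetical wrapper, and only the BUILD_RPMS_DEBUG variable name comes from the commit message.

```shell
#!/usr/bin/env bash

# Turn on xtrace only when BUILD_RPMS_DEBUG=1, so the concurrent build
# subshells do not interleave trace lines on the shared stderr by default.
maybe_enable_xtrace() {
    if [ "${BUILD_RPMS_DEBUG:-0}" = "1" ]; then
        set -x
    fi
}
```

Running the script as `BUILD_RPMS_DEBUG=1 scripts/build-rpms-parallel.sh ...` would then restore tracing for a debugging session without changing the default output.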