fix: llama.cpp backend updates make sure artifacts are present by fl0rianr · Pull Request #2518 · lemonade-sdk/lemonade

fl0rianr · 2026-07-01T18:38:15Z

Summary

Harden the llama.cpp release update workflow so backend pins are updated only when the expected release assets are available.

Relates to #2492

What changed

- Adds an asset availability check before updating llama.cpp backend pins.
- Keeps existing pins when a backend release is incomplete.
- Preserves the existing PR validation path: pull requests build and validate against the current checked-in pins.
- Scheduled/manual runs can still create update PRs, but only after verified assets and successful validation.
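
A minimal sketch of the gating rule described above, not the workflow's actual code. The function name `select_pin_updates`, the backend keys, and the tag strings are all hypothetical; the sketch only shows that a pin moves to a candidate tag when that backend's assets were verified, and otherwise stays at the checked-in value.

```python
def select_pin_updates(current_pins, candidate_tags, assets_complete):
    """Return the new pin mapping; incomplete backends keep their current pin.

    All three arguments are dicts keyed by backend name (a hypothetical shape).
    """
    new_pins = {}
    for backend, current in current_pins.items():
        candidate = candidate_tags.get(backend, current)
        # Only move the pin when every expected asset for this backend exists.
        if assets_complete.get(backend, False):
            new_pins[backend] = candidate
        else:
            new_pins[backend] = current
    return new_pins


pins = select_pin_updates(
    current_pins={"cuda": "b100", "vulkan": "b100"},
    candidate_tags={"cuda": "b101", "vulkan": "b101"},
    assets_complete={"cuda": False, "vulkan": True},
)
print(pins)  # {'cuda': 'b100', 'vulkan': 'b101'}
```

Treating a missing availability entry as "incomplete" is a deliberately conservative default in this sketch: an unknown backend never gets its pin moved.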

kenvandine · 2026-07-02T13:54:39Z

Overall, this nicely matches the PRs we have in llama.cpp and stable-diffusion.cpp, ensuring families are complete before including them in a release. However, there is a minor bug:

CUDA_SMS includes sm_121, which llama.cpp never builds. Our release.yml only builds 7 variants (sm_75 through sm_120), with no sm_121. That's a stable-diffusion.cpp thing (GB10/Blackwell arm64 support), not llama.cpp's. Since CUDA_SMS here has 8 entries including sm_121, the cuda requirement list will always be missing 3 assets (windows-cuda-sm_121-x64.7z, ubuntu-cuda-sm_121-x64.tar.xz, -arm64.tar.xz) that can never exist in lemonade-sdk/llama.cpp releases. That means cuda_available evaluates to false forever, and the CUDA backend pin can never auto-update through this workflow, even when all 21 real CUDA assets (7 variants × 3 assets each) are complete. Looks like the list was copied from stable-diffusion.cpp's CUDA_SMS (which correctly has sm_121) without accounting for the difference between the two repos' build matrices.
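
To make the failure mode concrete, here is a small sketch under stated assumptions. The `llama-{tag}-` prefix, the reading of `-arm64.tar.xz` as an ubuntu arm64 asset, and the helper names are assumptions for illustration; only the asset suffixes and the name `cuda_available` come from the comment above. It shows that one SM entry the producer never builds makes an all-assets check fail permanently.

```python
# Hypothetical name templates modelled on the asset names quoted in the review;
# the "llama-{tag}-" prefix is an assumption made for illustration.
CUDA_ASSET_TEMPLATES = (
    "llama-{tag}-windows-cuda-{sm}-x64.7z",
    "llama-{tag}-ubuntu-cuda-{sm}-x64.tar.xz",
    "llama-{tag}-ubuntu-cuda-{sm}-arm64.tar.xz",
)


def required_cuda_assets(tag, sms):
    """Expand every SM variant into the three assets expected for it."""
    return [t.format(tag=tag, sm=sm) for sm in sms for t in CUDA_ASSET_TEMPLATES]


def cuda_available(tag, sms, published):
    """True only if every required CUDA asset is present in the release."""
    return all(name in published for name in required_cuda_assets(tag, sms))


# Only two of the seven real SM variants are listed, to keep the example short.
built_sms = ("sm_75", "sm_120")
published = set(required_cuda_assets("b1", built_sms))

print(cuda_available("b1", built_sms, published))                # True
print(cuda_available("b1", built_sms + ("sm_121",), published))  # False
```

The second call fails no matter how complete the release is, because the three `sm_121` names are required but never published.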

fl0rianr · 2026-07-02T14:35:56Z

Good catch, thanks! Minor adaptation implemented.

kenvandine · 2026-07-02T14:53:35Z

Re-reviewed after the sm_121 fix — confirmed `CUDA_SMS` now has exactly 7 entries matching lemonade-sdk/llama.cpp's actual build matrix, and the `rocm-stable`/`cuda` all-or-nothing family groupings correctly mirror the producer-side "drop the whole family if any piece is missing" logic from lemonade-sdk/llama.cpp#16 and lemonade-sdk/stable-diffusion.cpp#14.

Found one more instance of the same bug class, on the ROCm-nightly side:

```python
ROCM_NIGHTLY_ARCHES = (
"gfx1150",
"gfx1151",
"gfx1152",
"gfx103X",
"gfx110X",
"gfx120X",
)
```

I checked lemonade-sdk/llamacpp-rocm's build workflow and `gfx1152` doesn't appear anywhere in it: not in the default `GFX_TARGETS` matrix (`gfx1151,gfx1150,gfx120X,gfx110X,gfx103X,gfx90a,gfx908`) and not in `rocm_asset_families`. Conversely, that default list does build `gfx90a` and `gfx908`, neither of which is in `ROCM_NIGHTLY_ARCHES`.

Same failure mode as the `sm_121` bug: `llama-{tag}-windows-rocm-gfx1152-x64.zip` and the ubuntu equivalent can never exist in a llamacpp-rocm release, so `rocm_nightly` will permanently report incomplete and that pin can never auto-update, even when every target llamacpp-rocm actually ships is complete.

My guess is `gfx1152` got pulled in from stable-diffusion.cpp's `GPU_TARGETS` list (a compile-time ISA list for TheRock), which does include it — but that's a different thing from llamacpp-rocm's release asset names. I don't have enough context on GPU support plans to know whether the right fix is dropping `gfx1152` from the required list, or whether llamacpp-rocm should start shipping a `gfx1152` asset — that's a call for whoever owns that repo's roadmap.
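
The mismatch described above can be checked mechanically. The sketch below uses the two lists exactly as quoted in this comment; the helper `diff_targets` is a hypothetical name, and nothing here is taken from either repository's code.

```python
# Consumer-side list as quoted in the review (before the fix).
ROCM_NIGHTLY_ARCHES = (
    "gfx1150",
    "gfx1151",
    "gfx1152",
    "gfx103X",
    "gfx110X",
    "gfx120X",
)

# Producer-side default matrix as quoted in the review.
GFX_TARGETS = "gfx1151,gfx1150,gfx120X,gfx110X,gfx103X,gfx90a,gfx908"


def diff_targets(required, produced_csv):
    """Compare a required-arch list against a comma-separated producer matrix."""
    produced = {t.strip() for t in produced_csv.split(",") if t.strip()}
    required = set(required)
    never_produced = sorted(required - produced)  # required but never built
    not_required = sorted(produced - required)    # built but never checked
    return never_produced, not_required


never_produced, not_required = diff_targets(ROCM_NIGHTLY_ARCHES, GFX_TARGETS)
print(never_produced)  # ['gfx1152']
print(not_required)    # ['gfx908', 'gfx90a']
```

Any non-empty `never_produced` result means the corresponding gate can never pass, which is the bug class both review rounds flagged.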

fl0rianr · 2026-07-02T15:38:06Z

Good catch again — aligned ROCM_NIGHTLY_ARCHES with the actual lemonade-sdk/llamacpp-rocm producer matrix: removed gfx1152 and added gfx90a / gfx908.

kenvandine · 2026-07-02T15:47:17Z

Re-reviewed after c560cc3 — confirmed `ROCM_NIGHTLY_ARCHES` now reads:

```python
ROCM_NIGHTLY_ARCHES = (
"gfx1151",
"gfx1150",
"gfx120X",
"gfx110X",
"gfx103X",
"gfx90a",
"gfx908",
)
```

which is an exact match for lemonade-sdk/llamacpp-rocm's default `GFX_TARGETS` (`gfx1151,gfx1150,gfx120X,gfx110X,gfx103X,gfx90a,gfx908`). No leftover `gfx1152`, and `sm_121` is still correctly absent from `CUDA_SMS`. That was the only hunk that changed since my last pass — everything else (the `rocm-stable`/`cuda` all-or-nothing family groupings, the reused-asset skip logic, the PR body table) is unchanged and still looks correct.

With both asset lists now matching their producers' real output, this looks good to me.
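
For readers unfamiliar with the all-or-nothing family grouping mentioned in the reviews, a minimal sketch follows. The asset names, family contents, and the function `complete_families` are hypothetical; the only thing taken from the thread is the rule that a family counts as available only when every one of its pieces is present.

```python
def complete_families(families, published):
    """Return the names of families whose every required asset is published."""
    published = set(published)
    return [name for name, assets in families.items() if set(assets) <= published]


# Hypothetical asset names and family contents, for illustration only.
families = {
    "cuda": ["cuda-a.7z", "cuda-b.tar.xz"],
    "rocm-stable": ["rocm-a.zip", "rocm-b.tar.xz"],
}
published = ["cuda-a.7z", "cuda-b.tar.xz", "rocm-a.zip"]  # rocm-b is missing

print(complete_families(families, published))  # ['cuda']
```

A single missing piece drops the whole `rocm-stable` family here, which is why every name in a required list has to be something the producer really ships.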

kenvandine

Thanks, this looks good now.

fix: llama.cpp backend updates make sure artifacts are present

ced664f

github-actions bot added labels on Jul 1, 2026: engine::llamacpp (llama.cpp backend (LlamaCppServer); GPU/CPU LLM inference (Vulkan, ROCm, Metal)), area::ci (CI / GitHub Actions / self-hosted runner infrastructure), enhancement (New feature or request)

fl0rianr mentioned this pull request Jul 1, 2026

ci: fail release if any expected artifacts are missing lemonade-sdk/llama.cpp#16

Merged

fl0rianr requested a review from kenvandine July 2, 2026 00:41

remove unsupported llama.cpp CUDA sm_121 gate

449025f

fix: align ROCm nightly asset gate with llamacpp-rocm targets

c560cc3

kenvandine approved these changes Jul 2, 2026

fl0rianr added this pull request to the merge queue Jul 2, 2026

Merged via the queue into main with commit e9ae1b2 Jul 2, 2026
83 checks passed

fl0rianr deleted the fl0rianr/harden_llama_backend_auto_updater branch July 2, 2026 18:47

fl0rianr merged 3 commits into main from fl0rianr/harden_llama_backend_auto_updater