fix: llama.cpp backend updates make sure artifacts are present #2518
Overall, this nicely matches the PRs we have in llama.cpp and stable-diffusion.cpp, ensuring families are complete before including them in a release. However, there is a minor bug: `CUDA_SMS` includes sm_121, which llama.cpp never builds. Our release.yml only builds 7 variants (sm_75 through sm_120), with no sm_121. That target is a stable-diffusion.cpp thing (GB10/Blackwell arm64 support), not llama.cpp's. Since `CUDA_SMS` here has 8 entries including sm_121, the cuda requirement list will always be missing 3 assets (windows-cuda-sm_121-x64.7z, ubuntu-cuda-sm_121-x64.tar.xz, and its -arm64.tar.xz counterpart) that can never exist in lemonade-sdk/llama.cpp.
Good catch, thanks! Minor adaptation implemented.
Re-reviewed after the sm_121 fix: confirmed `CUDA_SMS` now has exactly 7 entries matching lemonade-sdk/llama.cpp's actual build matrix, and the `rocm-stable`/`cuda` all-or-nothing family groupings correctly mirror the producer-side "drop the whole family if any piece is missing" logic from lemonade-sdk/llama.cpp#16 and lemonade-sdk/stable-diffusion.cpp#14.

Found one more instance of the same bug class, on the ROCm-nightly side, in `ROCM_NIGHTLY_ARCHES`. I checked lemonade-sdk/llamacpp-rocm's build workflow and `gfx1152` doesn't appear anywhere in it: not in the default `GFX_TARGETS` matrix (`gfx1151,gfx1150,gfx120X,gfx110X,gfx103X,gfx90a,gfx908`), not in `rocm_asset_families`, nowhere. That default list does build `gfx90a` and `gfx908`, neither of which is in `ROCM_NIGHTLY_ARCHES`. Same failure mode as the `sm_121` bug: `llama-{tag}-windows-rocm-gfx1152-x64.zip` and the ubuntu equivalent can never exist in a llamacpp-rocm release, so `rocm_nightly` will permanently report incomplete and that pin can never auto-update, even when the 6 real targets llamacpp-rocm actually ships are all complete.

My guess is `gfx1152` got pulled in from stable-diffusion.cpp's `GPU_TARGETS` list (a compile-time ISA list for TheRock), which does include it, but that's a different thing from llamacpp-rocm's release asset names. I don't have enough context on GPU support plans to know whether the right fix is dropping `gfx1152` from the required list or having llamacpp-rocm start shipping a `gfx1152` asset. That's a call for whoever owns that repo's roadmap.
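The mismatch described above comes down to a set difference between the consumer's required list and the producer's build matrix. A minimal sketch of such a check, assuming a pre-fix `ROCM_NIGHTLY_ARCHES` that is illustrative only (all the thread confirms is that it contained `gfx1152` and lacked `gfx90a`/`gfx908`), with `diff_arches` as a hypothetical helper name:

```python
# Default GFX_TARGETS string as quoted from lemonade-sdk/llamacpp-rocm's build workflow.
PRODUCER_GFX_TARGETS = "gfx1151,gfx1150,gfx120X,gfx110X,gfx103X,gfx90a,gfx908"

# ASSUMPTION: illustrative pre-fix consumer list; the real contents may differ.
ROCM_NIGHTLY_ARCHES = ["gfx1151", "gfx1150", "gfx1152", "gfx120X", "gfx110X", "gfx103X"]


def diff_arches(required, producer_targets):
    """Compare a required-arch list against a comma-separated producer matrix.

    Returns (never_built, not_required), both as sorted lists.
    """
    built = {t.strip() for t in producer_targets.split(",") if t.strip()}
    wanted = set(required)
    return sorted(wanted - built), sorted(built - wanted)


never_built, not_required = diff_arches(ROCM_NIGHTLY_ARCHES, PRODUCER_GFX_TARGETS)
print("required but never built:", never_built)
print("built but not required:", not_required)
```

Any entry in the first list makes the family permanently incomplete, while entries in the second list are shipped assets the updater would never check.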
Good catch again. Aligned `ROCM_NIGHTLY_ARCHES` with the actual lemonade-sdk/llamacpp-rocm producer matrix: removed gfx1152 and added gfx90a / gfx908.
Re-reviewed after c560cc3: confirmed `ROCM_NIGHTLY_ARCHES` is now an exact match for lemonade-sdk/llamacpp-rocm's default `GFX_TARGETS` (`gfx1151,gfx1150,gfx120X,gfx110X,gfx103X,gfx90a,gfx908`). No leftover `gfx1152`, and `sm_121` is still correctly absent from `CUDA_SMS`. That was the only hunk that changed since my last pass; everything else (the `rocm-stable`/`cuda` all-or-nothing family groupings, the reused-asset skip logic, the PR body table) is unchanged and still looks correct. With both asset lists now matching their producers' real output, this looks good to me.
kenvandine left a comment
Thanks, this looks good now.
Summary
Harden the llama.cpp release update workflow so backend pins are updated only when the expected release assets are available.
Relates to #2492
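The all-or-nothing rule summarized above can be sketched as a small gate: a family's pin is eligible for update only if every asset it requires is present in the release. This is an illustration under stated assumptions, not the workflow's actual code. `complete_families` is a hypothetical helper, and the asset names and family contents below are made up, loosely following the naming patterns quoted in the review.

```python
def complete_families(required_by_family, available_assets):
    """Return the families whose required assets are all present.

    A family with an empty requirement list is treated as incomplete,
    so a misconfigured family never triggers a pin update.
    """
    available = set(available_assets)
    return [
        family
        for family, required in required_by_family.items()
        if required and set(required) <= available
    ]


# HYPOTHETICAL data for illustration only.
required_by_family = {
    "cuda": [
        "windows-cuda-sm_75-x64.7z",
        "ubuntu-cuda-sm_75-x64.tar.xz",
    ],
    "rocm-stable": [
        "windows-rocm-gfx1151-x64.zip",
        "ubuntu-rocm-gfx1151-x64.zip",
    ],
}
release_assets = [
    "windows-cuda-sm_75-x64.7z",
    "ubuntu-cuda-sm_75-x64.tar.xz",
    "windows-rocm-gfx1151-x64.zip",  # ubuntu counterpart missing
]

updatable = complete_families(required_by_family, release_assets)
print("families safe to update:", updatable)
```

With this shape, a single missing asset holds back the whole family, which is why a required entry that the producer never builds blocks updates permanently.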
What changed