Add gfx1152 support #97

Open
danielholanda wants to merge 2 commits into main from dholanda/gfx1152

@danielholanda

Contributor

Summary

Enable nightly builds for the gfx1152 target (Krackan Point). TheRock now publishes nightly tarballs for this target at therock-nightly-tarball.s3.amazonaws.com (therock-dist-{windows,linux}-gfx1152-*.tar.gz).
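For anyone scripting against these tarballs, here is a minimal sketch of how a download URL could be composed from the naming pattern above. It assumes the `*` in the pattern is a version string of the form seen later in this thread (for example `7.14.0a20260521`) and that the bucket is served over HTTPS. The function name `nightly_tarball_url` is hypothetical and not part of this PR.

```python
from urllib.parse import urlunsplit

# Host and file-name pattern as quoted in the summary above.
BUCKET_HOST = "therock-nightly-tarball.s3.amazonaws.com"


def nightly_tarball_url(platform: str, version: str, target: str = "gfx1152") -> str:
    """Compose a URL of the form therock-dist-{platform}-{target}-{version}.tar.gz.

    Assumption: `version` fills the wildcard in the published pattern.
    """
    if platform not in ("windows", "linux"):
        raise ValueError(f"unsupported platform: {platform!r}")
    name = f"therock-dist-{platform}-{target}-{version}.tar.gz"
    return urlunsplit(("https", BUCKET_HOST, "/" + name, "", ""))


if __name__ == "__main__":
    print(nightly_tarball_url("windows", "7.14.0a20260521"))
```

This only builds the string; it does not check that a tarball for the given version actually exists in the bucket.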

Related discussions: #50

@danielholanda danielholanda changed the title from "Add gfx1152 support" to "[DRAFT] Add gfx1152 support" May 20, 2026
@danielholanda danielholanda marked this pull request as draft May 20, 2026 04:32
@danielholanda danielholanda marked this pull request as ready for review May 20, 2026 04:32
@clee

clee commented May 20, 2026

Hmm, the resulting build detects my GPU but llama-cli shows gibberish when I enter text using any of the qwen3.5 models I have locally. Using llama-cli from the Vulkan build works as expected.

I used llama-ubuntu-rocm-gfx1152-x64.zip for my testing.

@danielholanda

Contributor Author

@ppanchad-amd Can you please follow up with TheRock team to check if this is a known issue on their side?

@petmav

petmav commented May 21, 2026

> Hmm, the resulting build detects my GPU but llama-cli shows gibberish when I enter text using any of the qwen3.5 models I have locally. Using llama-cli from the Vulkan build works as expected.
>
> I used llama-ubuntu-rocm-gfx1152-x64.zip for my testing.

Can also attest to this. I built llama.cpp for Windows against the 7.14 gfx1152 tarball (https://therock-nightly-tarball.s3.amazonaws.com/therock-dist-windows-gfx1152-7.14.0a20260521.tar.gz) and only got gibberish out of qwen3 and gemma models, no matter the quant/size. Might be an issue on TheRock's end, as the builds aren't sanity tested yet. CPU is Ryzen 7 350 (GPU is 860M). Interestingly enough, lemonade server recognises it, but says it's only supported on Linux as a backend.

@sofiageo

Member

In lemonade discord some users had success with the rocm-stable and gfx1152. Models: gemma4 E4B Q4_K, Bonsai 1.7B and 8B.

Just commenting it here in case it helps identify what's wrong

@soulafein83

> In lemonade discord some users had success with the rocm-stable and gfx1152. Models: gemma4 E4B Q4_K, Bonsai 1.7B and 8B.
>
> Just commenting it here in case it helps identify what's wrong

Just to clarify, as the user who did those tests on gfx1152:
Bonsai models actually work fine. Gemma 3 and 4 and Llama 3.2 all get stuck in infinite loops, spitting out either tokens or question marks.
However, llama-bench runs on the ROCm backend and finishes without any issues.

@ckuethe

ckuethe commented May 21, 2026

that was me with the working bonsai and broken gemma4. happy to test stuff and gather diagnostics.

@mgehre-amd

mgehre-amd commented May 21, 2026

Which checkpoint files created gibberish? Do you have a command line to reproduce?

@ckuethe

ckuethe commented May 21, 2026

not near the computer right now so I'm pulling this from discord...

./llama-cli --no-mmap -b 4096 -ub 4096 -fa 1 -ctk q8_0 -ctv q8_0 -m /var/lib/lemonade/.cache/huggingface/hub/models--unsloth--Qwen3.6-35B-A3B-GGUF/snapshots/a483e9e6cbd595906af30beda3187c2663a1118c/Qwen3.6-35B-A3B-UD-Q4_K_XL.gguf --ctx-size 262144 --jinja

Also tried with
models--unsloth--gemma-4-E4B-it-GGUF/snapshots/653803f092503c04a65164346f3208a36e707693/gemma-4-E4B-it-Q4_K_M.gguf

and I got rid of the extra ctk/ctv/ub/b/ctx-size flags too
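For anyone reproducing this, a minimal sketch of how the reduced command line could be derived from the full one. It assumes each dropped flag takes exactly one value and that the remaining flags (`--no-mmap`, `-fa 1`, `-m`, `--jinja`) were left unchanged, which the comment above does not state explicitly. `MODEL` is a placeholder for the full `.gguf` path, and `strip_flags` is a hypothetical helper, not part of llama.cpp.

```python
import shlex

# Flags the comment above reports dropping; each is assumed to take one value.
DROPPED_FLAGS = frozenset({"-ctk", "-ctv", "-ub", "-b", "--ctx-size"})


def strip_flags(argv, dropped=DROPPED_FLAGS):
    """Return argv without the dropped flags and the value following each one."""
    kept = []
    skip_value = False
    for arg in argv:
        if skip_value:
            # This argument is the value of a dropped flag; discard it too.
            skip_value = False
            continue
        if arg in dropped:
            skip_value = True
            continue
        kept.append(arg)
    return kept


# MODEL is a placeholder; substitute the full .gguf path from the comment above.
full = shlex.split(
    "./llama-cli --no-mmap -b 4096 -ub 4096 -fa 1 -ctk q8_0 -ctv q8_0 "
    "-m MODEL --ctx-size 262144 --jinja"
)

if __name__ == "__main__":
    print(shlex.join(strip_flags(full)))
```

Running both the full and the reduced command against the same model is one way to check whether the gibberish depends on the KV-cache quantization, batch size, or context-size settings.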

@petmav

petmav commented May 21, 2026

I've opened a separate issue (#98) for the question-mark output, as it might be more of an upstream thing. Let me know if it should just be merged here.

@soulafein83

soulafein83 commented Jun 8, 2026

ggml-org/llama.cpp#24129

I think the issue was on the llama.cpp side. Pull Request #24129, which has now been merged, should fix it.

Update:
Just wanted to confirm that the llama.cpp ROCm 7.13 build (tag b9559) from here:
https://github.com/lemonade-sdk/llama.cpp/releases/tag/b9559
works completely out of the box on gfx1152 (tested on a laptop with Ryzen 7 AI 350).
Verified with:

  • Gemma-4-e4b
  • Qwen3.5 9b

@ckuethe

ckuethe commented Jun 11, 2026

Using upstream llama.cpp with ROCm 7.13, I'm seeing repeated and serious success.

running an overnight test with lemonade bench, but a few multiturn sessions have worked with bonsai-1.7B, through gemma4-12B and qwen3.6-35B, up to qwen3-coder-next-80B

@soulafein83

llama-b9628.txt
These are my logs from llama-bench on my laptop with a Ryzen 7 AI 350 (Krackan Point), 16 GB.


7 participants