Summary
lemond's model downloader hangs indefinitely (0 bytes, no error) when pulling a model whose Hugging Face repo is stored on the Xet storage backend. The repository metadata is fetched and the file list is resolved, but the blob download never transfers a single byte and never returns an error; it stalls until the client disconnects.
Environment
- Lemonade: lemonade-server 10.8.1~24.04 (PPA lemonade-team/stable)
- OS: Zorin OS 18.1 (Ubuntu 24.04 / noble)
- GPU: 2× AMD Radeon AI PRO R9700 (gfx1201, RDNA4)
- Backend: llamacpp (rocm); HF cache at /ml/huggingface-cache/huggingface/hub/
Reproduction
lemonade pull Qwen-AgentWorld-35B-A3B-GGUF-UD-Q4_K_XL
(repo: unsloth/Qwen-AgentWorld-35B-A3B-GGUF — public, not gated, Xet-backed)
Or directly against the server:
curl -X POST http://127.0.0.1:13305/api/v1/pull \
-H 'Content-Type: application/json' \
-d '{"model_name":"Qwen-AgentWorld-35B-A3B-GGUF-UD-Q4_K_XL"}'
# -> HTTP 000 after 20s (no response; nothing written to disk)
Server log (lemond.service)
(ModelManager) Downloading model: unsloth/Qwen-AgentWorld-35B-A3B-GGUF (variant: UD-Q4_K_XL)
(ModelManager) Fetching repository file list from Hugging Face...
(ModelManager) Using commit hash: 3a305abf5cfd119ee999dfe929c433746edd8d63
(ModelManager) Repository contains 25 files
(ModelManager) Identified files to download:
(ModelManager) - ...:Qwen-AgentWorld-35B-A3B-UD-Q4_K_XL.gguf
(ModelManager) - ...:mmproj-F16.gguf
(ModelManager) Created download manifest
(ModelManager) Downloading: Qwen-AgentWorld-35B-A3B-UD-Q4_K_XL.gguf...
<-- hangs here forever, 0 bytes, no progress, no error -->
(Server) Client disconnected, cancelling download
Evidence it's Xet-specific (not a bad URL / network issue)
The plain HTTPS path works perfectly from the same host:
# resolve URL 302 -> us.aws.cdn.hf.co (xet-bridge), x-linked-size: 22324804864
curl -L -r 0-1048575 .../resolve/main/Qwen-AgentWorld-35B-A3B-UD-Q4_K_XL.gguf
# -> HTTP 206, 1048576 bytes in 0.44s
The repo carries Xet headers (x-xet-hash, link: ...rel="xet-reconstruction-info" → cas-server.xethub.hf.co). Non-Xet repos (e.g. older GGUFs) download fine through lemond. Downloading the same two files with huggingface_hub + HF_HUB_DISABLE_XET=1 succeeds over plain HTTPS into the same cache.
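For illustration, the header check I used to tell Xet-backed files from plain ones can be sketched in Python as follows. This is a minimal sketch, not lemond's code: the function name is hypothetical, and the only inputs it relies on are the two markers quoted above (the x-xet-hash header and a link header containing rel="xet-reconstruction-info").

```python
def is_xet_backed(headers):
    """Return True if the response headers carry the Xet markers seen in this report.

    `headers` is any mapping of header name to value. Header names are
    compared case-insensitively, as HTTP requires.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    if "x-xet-hash" in lowered:
        return True
    # Fall back to the link relation observed on the resolve response.
    return 'rel="xet-reconstruction-info"' in lowered.get("link", "")
```

A downloader could run a check like this on the resolve response and log which path it is about to take, which would make a stall like this one much easier to attribute.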
What I tried
I set HF_HUB_DISABLE_XET=1 and HF_XET_DISABLE=1 in the lemond service environment (/etc/lemonade/conf.d/) and restarted the service. Nothing changed: the downloader still stalls, so it does not appear to honor the standard HF Xet opt-out env vars.
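To make the opt-out behavior I was hoping for concrete, here is a minimal Python sketch of an environment check. The function name and the set of accepted truthy spellings are my assumptions; only the two variable names come from what I set on the service.

```python
import os

# Assumption: which spellings count as "enabled"; adjust to taste.
TRUTHY = {"1", "on", "yes", "true"}

def xet_disabled(env=None):
    """Return True if either Xet opt-out variable is set to a truthy value.

    `env` defaults to the process environment; passing a dict makes the
    check easy to exercise without touching os.environ.
    """
    env = os.environ if env is None else env
    return any(
        env.get(name, "").strip().lower() in TRUTHY
        for name in ("HF_HUB_DISABLE_XET", "HF_XET_DISABLE")
    )
```

If a check like this returned True, the downloader would skip Xet reconstruction and use the plain HTTPS resolve URL, which is the path that works from this host.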
Expected behavior
lemond should either (a) correctly complete Xet-backed downloads, or (b) fall back to the plain HTTPS CDN path when Xet reconstruction stalls, and in any case time out with an error instead of hanging silently at 0 bytes.
Suggestions
- Honor HF_HUB_DISABLE_XET / HF_XET_DISABLE as an escape hatch.
- Add a stall/timeout on the blob transfer so a hung Xet reconstruction surfaces an error rather than hanging.
- Given the growing number of Unsloth/HF repos migrating to Xet, this will affect many popular GGUF models.
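The stall timeout suggested above could look roughly like the Python sketch below. It is only an illustration of the idea, not a proposal for lemond's actual transfer loop: every name here is hypothetical, and it assumes the reader returns None when no data arrived within a short poll interval and b"" at end of stream.

```python
import time

class StallError(Exception):
    """Raised when a transfer makes no progress for too long."""

def copy_with_stall_timeout(read_chunk, write_chunk,
                            stall_seconds=30.0, clock=time.monotonic):
    """Copy chunks from read_chunk() to write_chunk() and return the byte count.

    read_chunk() returns bytes, b"" at end of stream, or None when nothing
    arrived during its own short poll interval (assumed, so this loop does
    not spin). If no bytes arrive for stall_seconds, StallError is raised.
    """
    total = 0
    last_progress = clock()
    while True:
        chunk = read_chunk()
        if chunk == b"":
            return total  # clean end of stream
        if chunk:
            write_chunk(chunk)
            total += len(chunk)
            last_progress = clock()  # any received bytes reset the timer
        elif clock() - last_progress >= stall_seconds:
            raise StallError(
                f"no data for {stall_seconds}s after {total} bytes"
            )
```

The timer is reset by progress rather than measured from the start of the download, so a slow but steady 22 GB transfer is never cut off, while the 0-byte hang described here would surface as an error after stall_seconds. The caller could then retry over the plain HTTPS path.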
Happy to provide full debug logs or test a build on gfx1201 hardware.