This issue tracks:
failures reproducible on CI but not on dev machines, or vice versa
failures that occur inconsistently across CI machines, possibly due to inter-machine differences
Huggingface Authentication Errors
E huggingface_hub.errors.GatedRepoError: 401 Client Error. (Request ID: Root=1-67c8d3e8-377d2afb1b6871bc6963be58;bb06d16a-9f8c-493f-9e37-fe7fd4f83d5a)
E
E Cannot access gated repo for url https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/vae/config.json.
E Access to model black-forest-labs/FLUX.1-dev is restricted. You must have access to it and be authenticated to access it. Please log in.
Sometimes sharktank CI fails with Huggingface reporting a failure to download model files.
This section tracks which models and which runners are affected, with links to the logs.
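To help keep that list of affected models up to date, a small script could pull the repo name and request ID out of a saved CI log. The following is only a sketch: the function name `summarize_gated_errors` is hypothetical, and the two regular expressions assume the log lines look like the `GatedRepoError` excerpt above.

```python
import re

# Matches the "Access to model <repo> is restricted" line seen in the excerpt above.
GATED_RE = re.compile(r"Access to model (?P<repo>\S+) is restricted")
# Matches the request ID attached to the error message.
REQUEST_ID_RE = re.compile(r"\(Request ID: (?P<rid>[^)]+)\)")


def summarize_gated_errors(log_text):
    """Return (repo, request_id) pairs for each gated-repo error in a log.

    Hypothetical helper: it pairs each "Access to model" line with the
    most recent request ID seen before it, or None if there was none.
    """
    results = []
    last_request_id = None
    for line in log_text.splitlines():
        rid = REQUEST_ID_RE.search(line)
        if rid:
            last_request_id = rid.group("rid")
        gated = GATED_RE.search(line)
        if gated:
            results.append((gated.group("repo"), last_request_id))
            last_request_id = None
    return results
```

Running this over the log of a failed job gives one entry per gated-repo failure, which can be pasted here together with the runner name and the link to the job.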
Shortfin LLM Sharded Integration Tests work fine locally, but fail on CI with:
2025-03-05T19:17:22.6032392Z Memory access fault by GPU node-10 (Agent handle: 0x557ae2a43b20) on address 0x7ef88f5d0000. Reason: Unknown.
See: https://github.com/nod-ai/shark-ai/actions/runs/13681855838/job/38256089307?pr=1021
One suspected cause of the inconsistency is that the CI machines have ROCk module version 6.10.10, while the machine where it worked (mi300x-3) has ROCk module version 6.12.3. There might also be other version inconsistencies.
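To make this kind of mismatch easier to spot, the version strings from two machines could be compared programmatically. This is a minimal sketch, assuming each machine's version is available as text containing the phrase "ROCk module version X.Y.Z", captured from whatever tool reports it. The helper names `parse_rock_version` and `rock_versions_match` are hypothetical.

```python
import re

# Assumes the version appears as "ROCk module version X.Y.Z" somewhere in the text.
ROCK_RE = re.compile(r"ROCk module version (\d+(?:\.\d+)*)")


def parse_rock_version(text):
    """Return the ROCk module version as a tuple of ints, or None if absent."""
    match = ROCK_RE.search(text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def rock_versions_match(text_a, text_b):
    """Return True only if both texts report the same ROCk module version."""
    version_a = parse_rock_version(text_a)
    version_b = parse_rock_version(text_b)
    return version_a is not None and version_a == version_b


# The two versions reported in this issue.
ci_version = parse_rock_version("ROCk module version 6.10.10")
dev_version = parse_rock_version("ROCk module version 6.12.3")
print(ci_version, dev_version, ci_version == dev_version)
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating "6.10" as older than "6.9".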