Our integration tests currently download weights for many Hugging Face models using `hf_datasets.py`.

Currently, caching is not working on:
- linux-mi300-gpu-1
- azure-cpubuilder-linux-scale

This causes 20+ minute model weight re-downloads on every CI run for shortfin llm.

To address this, we need a directory that:
- has enough space to hold a significant portion of the models in `/home/xidaren2/shark-ai/sharktank/sharktank/utils/hf_datasets.py` (my estimate is on the order of hundreds of GB)
- supports symlinking to the CI working dir
- (optionally) is periodically cleared or downsized to evict old models

This directory should then be set as the `HF_HUB_CACHE` environment variable for the CI tasks that use it.

This would speed up the sharktank and shortfin integration tests considerably. In my previous experiments, caching hf_hub cut a 40 minute test down to 6 minutes.
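The requirements above could be sketched roughly as follows. This is a minimal, stdlib-only illustration, not the project's actual CI setup: the helper names `configure_hf_cache` and `evict_oldest`, the example paths, and the size budget are all hypothetical. It assumes each top-level entry in the cache directory corresponds to one cached model, and it uses directory modification time as a rough proxy for "old".

```python
from __future__ import annotations

import os
import shutil
from pathlib import Path


def configure_hf_cache(cache_dir: Path, workdir_link: Path | None = None) -> Path:
    """Create a persistent cache dir, optionally symlink it into the CI
    working dir, and export it as HF_HUB_CACHE (hypothetical helper)."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    if workdir_link is not None:
        # Only create the link if nothing (not even a dangling link) is there.
        if not (workdir_link.is_symlink() or workdir_link.exists()):
            workdir_link.parent.mkdir(parents=True, exist_ok=True)
            workdir_link.symlink_to(cache_dir, target_is_directory=True)
    # Must be set before huggingface_hub is imported by the test process.
    os.environ["HF_HUB_CACHE"] = str(cache_dir)
    return cache_dir


def dir_size(path: Path) -> int:
    """Total bytes of regular files under path; symlinks are not counted."""
    return sum(
        f.lstat().st_size
        for f in path.rglob("*")
        if f.is_file() and not f.is_symlink()
    )


def evict_oldest(cache_dir: Path, max_bytes: int) -> list[str]:
    """Delete the oldest top-level entries until the cache fits max_bytes.

    Assumption: directory mtime approximates how recently a model was used.
    """
    entries = sorted(
        (p for p in cache_dir.iterdir() if p.is_dir()),
        key=lambda p: p.stat().st_mtime,
    )
    sizes = {p: dir_size(p) for p in entries}
    total = sum(sizes.values())
    removed: list[str] = []
    for entry in entries:
        if total <= max_bytes:
            break
        shutil.rmtree(entry)
        total -= sizes[entry]
        removed.append(entry.name)
    return removed


# Example usage (paths and budget are placeholders, not real runner config):
#   configure_hf_cache(Path("/data/hf_hub_cache"), Path.cwd() / "hf_hub_cache")
#   evict_oldest(Path("/data/hf_hub_cache"), max_bytes=300 * 1024**3)
```

In practice the variable would more likely be set in the CI job's environment than from Python, since `huggingface_hub` reads `HF_HUB_CACHE` when it is first imported. The library's own cache-scanning utility (`scan_cache_dir`) may also be a better basis for eviction than directory timestamps.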