ci: run the x64 and arm ci on the github machines instead #16183
fix test-quantize-perf just like #12306
I added the Arm machines as well, though for some reason that run is failing on rerank score 2 when running the rerank model.
My best guess is that the SVE path has a bug, but I don't have a machine to test this. If we resolve the failing workflow, we can decommission the 4 self-hosted CPU runners in favor of GitHub-hosted runners.
Looks like it works correctly without SVE.
Yeah, it looks like an SVE issue: it passes fine on the self-hosted machine with no SVE, and it's passing here as well with SVE manually turned off. I don't have an Arm SVE machine either, so I can't really dig deeper into why this is failing.
In this case, I think it is worth adding 2 Arm workflows:
ggml-ci-arm64-cpu-high-perf
ggml-ci-arm64-cpu-high-perf-sve
The second one will have SVE enabled and will fail for the time being, but we will be aware of the issue and fix it later.
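A rough sketch of how those two jobs might look in a GitHub Actions workflow file. Only the two job names come from the suggestion above. The runner label, the `NO_SVE` variable, the placeholder `run` steps, and the use of `continue-on-error` are assumptions for illustration, not the configuration this PR actually ships.

```yaml
# Hypothetical sketch: only the job names are taken from the discussion.
jobs:
  ggml-ci-arm64-cpu-high-perf:
    runs-on: ubuntu-22.04-arm        # assumed GitHub-hosted Arm runner label
    env:
      NO_SVE: "1"                    # hypothetical switch; the build script would need to honor it
    steps:
      - uses: actions/checkout@v4
      - run: echo "build and test with SVE disabled"   # placeholder for the real CI script

  ggml-ci-arm64-cpu-high-perf-sve:
    runs-on: ubuntu-22.04-arm        # assumed GitHub-hosted Arm runner label
    continue-on-error: true          # assumption: keep the known SVE failure visible without blocking
    steps:
      - uses: actions/checkout@v4
      - run: echo "build and test with SVE enabled"    # placeholder for the real CI script
```

Keeping the SVE run as its own job means the failure stays visible on every run while the non-SVE job remains a reliable signal.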
…6183)
* run the x64 ci on regular machines
* set up the same thing for arm
  fix test-quantize-perf just like ggml-org#12306
* try to disable sve
* add another sve run
This basically lets me run the x64 CI on local forks using the standard GitHub machines. Currently, with a filled ccache, this takes around 4 minutes for the low-perf run and 15 minutes for the high-perf run.
I'm leaving this as a draft for now to see if there's interest in doing this instead of using our self-hosted machines.