Fixed 4bit compare UT on XPU #2843
Conversation
Hi @BenjaminBossan, please help review. Thanks!
Hi, thanks for the PR. I assume you are from the same team as @yao-matrix?
Do you mean that the latest bitsandbytes release makes the tests pass? Why is that version not used for CI then? Or do you mean the tests require bitsandbytes installed from source? In that case, I would change the condition so that the skip depends on the installed bitsandbytes version:
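For illustration, a minimal sketch of such a version-gated skip (the minimum version here is a placeholder; the actual bound that makes the tests pass would need to be confirmed):

```python
import pytest
from packaging import version

import bitsandbytes as bnb

# Placeholder: the actual minimum version that fixes the XPU results
# still needs to be determined.
MIN_BNB_VERSION = "0.48.0"

# Skip marker based on the installed bitsandbytes version rather than
# on the hardware platform.
requires_recent_bnb = pytest.mark.skipif(
    version.parse(bnb.__version__) < version.parse(MIN_BNB_VERSION),
    reason=f"4bit comparison tests need bitsandbytes >= {MIN_BNB_VERSION}",
)
```

A test would then be decorated with `@requires_recent_bnb` instead of a blanket XPU skip.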
Echoing Benjamin: our target is to cover all cases on XPU, so as a general principle, let's be very cautious about platform-specific skips. Please follow Benjamin's suggestion and figure out which bnb version is needed to pass. @YangKai0616
Sorry for my unclear expression. What I meant is that the current XPU can pass all of bitsandbytes' own tests.

Additionally, I tested the FP32/4bit matmul, Linear outputs, and quantization precision on both CUDA (A100) and XPU, and performed a cross-platform comparison. The results show that the 4-bit quantization precision on XPU and CUDA (A100) aligns perfectly, and the differences in the Linear, MLP, and matmul computations are reasonable. The increased differences for large-size matmuls demonstrate the effect of accumulated error. Considering that the ground truth for tests such as `test_regression.py::TestOpt4bitBnb::test_lora_4bit` was generated on CUDA, these small platform differences are enough to make the comparisons fail on XPU. Thanks!
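For illustration, a minimal sketch of the kind of round-trip precision check described above (the device selection and tensor sizes are illustrative, not the exact setup used):

```python
import torch
import bitsandbytes.functional as F

# Pick whichever accelerator is present; the comparison above was run
# once on CUDA (A100) and once on XPU.
device = "xpu" if hasattr(torch, "xpu") and torch.xpu.is_available() else "cuda"

# NF4 quantize/dequantize round trip; the max abs error measures the
# 4-bit quantization precision on this device. Identical values across
# platforms indicate that the quantization itself aligns.
w = torch.randn(1024, 1024, dtype=torch.float32, device=device)
q, state = F.quantize_4bit(w, quant_type="nf4")
w_dq = F.dequantize_4bit(q, state)
print("max abs round-trip error:", (w - w_dq).abs().max().item())
```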
I see, thanks for explaining. Honestly, I'm considering removing these tests completely, as they are actually tests for bitsandbytes and not PEFT. We added them to PEFT because bnb didn't have its own CI for this at the time, but now the picture has changed. Therefore, I would like to avoid putting too much extra work into this, like providing separate artifacts per architecture. I started a discussion with the bnb maintainers about removing the bnb tests in PEFT and will reply here once we have reached a conclusion.
@YangKai0616 After discussing with bnb colleagues, we decided to remove the bnb regression tests from PEFT.
Thank you for your prompt response! Yes, you can see that the 4-bit BNB model is also used in the test case `test_regression.py::TestOpt4bitBnb::test_lora_4bit`. The ground truth based on CUDA GPU is located at this link. Should we maintain a similar ground truth for XPU?
@YangKai0616 Okay, got it. So what could be done: let's add a comment to let the user know that if they run with XPU, the regression artifacts are loaded from a repo outside of Hugging Face's control, so they run this at their own risk.
Similar to #2473. Our internal CI testing for XPU found that the current test results for cases like `test_bnb_regression.py::test_opt_350m_4bit_quant_storage` and `test_regression.py::TestOpt4bitBnb::test_lora_4bit` (similar to `test_bnb_regression.py::test_opt_350m_4bit`) vary with the `bitsandbytes` version and hardware platform. XPU can currently pass all UT files with the latest `bitsandbytes` version. Therefore, can we temporarily disable these specific cases on XPU to prevent CI errors?
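For reference, such a temporary skip could look like the following sketch (the test name is taken from the cases above; the final condition may differ per the review discussion, which favors a version-based check over a platform-based one):

```python
import pytest
import torch

# True only when an Intel XPU device is present.
on_xpu = hasattr(torch, "xpu") and torch.xpu.is_available()


# Temporary, illustrative skip: the comparison's ground truth was
# generated on CUDA, so it is disabled on XPU for now.
@pytest.mark.skipif(on_xpu, reason="4bit comparison ground truth is CUDA-based")
def test_opt_350m_4bit_quant_storage():
    ...
```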