HPU (Intel Gaudi) support for bitsandbytes #1592


Conversation

@ckvermaAI commented Apr 14, 2025

This PR enables bitsandbytes support for HPU (Intel Gaudi) devices.

  1. Add HPU as a supported device.
  2. Create a backend for HPU devices (bitsandbytes/backends/hpu.py).
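The second step follows the usual multi-backend pattern: map each torch device type to its own implementation and dispatch at call time. A minimal sketch of that idea, with hypothetical names (`register_backend`, `HPUBackend`) that are illustrative and not the actual bitsandbytes internals:

```python
# Hypothetical sketch of per-device backend dispatch.
# Names are illustrative, not the real bitsandbytes API.
_backends = {}

def register_backend(device_type, backend):
    """Associate a torch device type string (e.g. "cuda", "hpu") with a backend."""
    _backends[device_type] = backend

class HPUBackend:
    """Stand-in for bitsandbytes/backends/hpu.py."""
    def quantize_4bit(self, values):
        # A real backend would call Gaudi kernels here; identity placeholder.
        return values

register_backend("hpu", HPUBackend())

def get_backend(device_type):
    """Look up the backend for a tensor's device type."""
    if device_type not in _backends:
        raise ValueError(f"no bitsandbytes backend registered for '{device_type}'")
    return _backends[device_type]
```

With this shape, adding a new accelerator is a matter of registering one more backend object rather than branching on device type throughout the library.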

Authored by: Chetan Kumar Verma <[email protected]>
Co-authored-by: Ruheena Suhani Shaik <[email protected]>
Co-authored-by: Bhargav Eede <[email protected]>
Co-authored-by: Vivek Goel <[email protected]>
@vivekgoe

@jiqing-feng @Titus-von-Koeller Please help review these changes. These changes add support for NF4 quantization/dequantization using Intel Gaudi hardware. https://www.intel.com/content/www/us/en/products/details/processors/ai-accelerators/gaudi.html
For now this PR adds support for single-level NF4 quantization only; we are working on second-level NF4 quantization and will add it in a follow-up PR in the near future. Thanks!
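For context on the single-level vs. second-level distinction: NF4 quantization stores, per block of weights, one floating-point absmax scale plus 4-bit indices into a fixed 16-value codebook; "second level" additionally quantizes those absmax scales to save more memory. A rough sketch under that assumption (codebook values approximate the NF4 levels from the QLoRA paper; helper names are illustrative, not the bitsandbytes API):

```python
# Approximate NF4 codebook (16 levels in [-1, 1]), per the QLoRA paper.
NF4_CODE = [-1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848, -0.0911,
            0.0, 0.0796, 0.1609, 0.2461, 0.3379, 0.4407, 0.5626, 0.7230, 1.0]

def quantize_block(values):
    """Single-level quantization of one block:
    keep a full-precision absmax scale plus a 4-bit index per value."""
    absmax = max(abs(v) for v in values) or 1.0  # avoid div-by-zero on all-zero blocks
    idx = [min(range(16), key=lambda i: abs(NF4_CODE[i] - v / absmax))
           for v in values]
    return absmax, idx

def dequantize_block(absmax, idx):
    """Rescale codebook values back to the original range."""
    return [NF4_CODE[i] * absmax for i in idx]
```

Second-level quantization would additionally compress the per-block `absmax` values themselves (e.g. 8-bit with a shared scale), which is the part deferred to the follow-up PR.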


The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Titus-von-Koeller Titus-von-Koeller marked this pull request as ready for review April 15, 2025 15:26
@Titus-von-Koeller Titus-von-Koeller merged commit b090d85 into bitsandbytes-foundation:multi-backend-refactor Apr 15, 2025
1 of 2 checks passed
@Titus-von-Koeller (Collaborator)

The code looks good. Thanks for your work on this; a promising first step!

@vivekgoe @ckvermaAI

Please see this short update about the multi-backend refactor #1596.

Regarding the Intel backend, as discussed in parallel with Ke Ding, PRs migrating existing work from multi-backend-refactor (rather than main) should target the new bitsandbytes-intel repo.

However, some of the pure torch ops and generic CPU functionality still make more sense in the main branch of bitsandbytes, provided they don't depend on Intel IPEX. Please align with @matthewdouglas and me on those; it's probably best to discuss that in our shared Slack channel.

@vivekgoe

@Titus-von-Koeller Thanks for reviewing and merging our PR! If possible, please add me to the shared Slack channel you mentioned; if that needs to be done by someone on the Intel team, let me know and I will follow up internally.

5 participants