Allocate intermediate tensors in the same device as the input tensor #679

Open
polvalente opened this issue Mar 21, 2022 · 3 comments
Labels
area:torchx Applies to Torchx kind:bug Something isn't working

Comments

@polvalente
Contributor

Some functions in Torchx.Backend create intermediate iota tensors and similar helpers. These should be allocated on the same device as the function's input tensor.

@polvalente polvalente added kind:bug Something isn't working area:torchx Applies to Torchx labels Mar 21, 2022
@polvalente polvalente changed the title from "[Torchx] Allocate intermediate tensors in the same device as the input tensor" to "Allocate intermediate tensors in the same device as the input tensor" Mar 21, 2022
@josevalim
Collaborator

We should probably create a helper like we did for vectorization, where we do:

with_device([a, b, c], fn a, b, c ->

end)

That way we can wrap our backend functions with such a helper. Pinging @grzuy because he likely needs something similar for candlex.
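
A minimal sketch of what such a helper might look like, written in plain Elixir with no Nx or Torchx dependency. Everything here is an assumption made for illustration: `FakeTensor`, `iota/2`, and `add_iota/1` are hypothetical stand-ins, and passing the device as the first argument of the callback is one possible design, not the actual Torchx API.

```elixir
defmodule DeviceHelperSketch do
  # Hypothetical stand-in for a backend tensor; NOT the real Torchx struct.
  defmodule FakeTensor do
    defstruct [:data, :device]
  end

  # Takes the device of the first input tensor and calls `fun` with that
  # device followed by the tensors. (Assumption: the callback receives the
  # device explicitly; the helper proposed in the thread may work differently.)
  def with_device([%FakeTensor{device: device} | _] = tensors, fun) do
    apply(fun, [device | tensors])
  end

  # Hypothetical iota-like constructor that requires an explicit device.
  def iota(size, device) when size > 0 do
    %FakeTensor{data: Enum.to_list(0..(size - 1)), device: device}
  end

  # Example backend-style function: the intermediate iota is created on the
  # same device as the input instead of on a default device.
  def add_iota(%FakeTensor{} = tensor) do
    with_device([tensor], fn device, t ->
      indices = iota(length(t.data), device)
      sums = Enum.zip_with(t.data, indices.data, fn x, y -> x + y end)
      %FakeTensor{data: sums, device: device}
    end)
  end
end
```

With this sketch, calling `DeviceHelperSketch.add_iota/1` on a `FakeTensor` whose device is `:cuda` returns a result that is also tagged `:cuda`, because the intermediate tensor never falls back to a default device.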

@ffloyd

ffloyd commented Dec 27, 2024

Hey!

Any update on this?

I think this is an important one because:

  • a lot of developers have Apple Silicon (and many have it provided by their companies)
  • thanks to unified memory, a MacBook Pro with 30+ GB of memory can serve ~22B models. I tested this with ollama and Codestral 22B, for example. It's not blazing fast, but my RTX 3080 cannot run such models at all.
  • a GPU setup with 30+ GB of memory is either more expensive than a MacBook (A100) or somewhat tricky (build a separate machine with several old Tesla GPUs, NVLink them, and pray that your LLM rig works).

Considering this, a huge audience who would like to experiment with LLMs will be forced to use Python or pay for hosted GPUs. :(

@josevalim
Collaborator

For Apple Silicon, there is https://github.com/elixir-nx/emlx. If you would like to use Torchx to target Metal, a pull request is welcome.

3 participants