WIP: Try to speed things up with async #209
This PR adds an `asyncify` function to try to turn sync code into async code. Still testing if this speeds things up.
Currently there seem to be issues with the threading library numba uses. On my Linux box the omp module is used, but this isn't available on OSX, so the workqueue thing has issues. There is also an Intel TBB library, but there are issues with getting that installed (I had to install it system-wide on Linux, but it seems to need to be installed on OSX and the suggested
Some clarifications:
|
...
The latter is caused by the former, right? I wonder if we can use a different library than numba for acceleration, one that might better support the threading we want to do. If we are only using numba to accelerate a slow for loop or something, we could use some other library (maybe Cython or something, no idea which ones support what we want). Or is the issue that numba is already doing multithreading, so we can't do additional threading during the numba call?
Currently our LSH hash is implemented in numba, and it is already using multithreading (to parallelize a for loop; it also avoids materializing a large intermediate array), but that threading is all encapsulated within numba. From the perspective of the caller, this hash call is blocking: it uses threads under the hood, but the caller needs to sit there until it is done. Normally in async, the concurrency comes from waiting (on things like I/O, or on this hash to finish), but the hash never says "ok, I'm waiting, you can run other things." The default way to "asyncify" something like that is to run it in another thread. This gives us back a wrapper that will say "ok, I'm waiting" until the computation is done in that thread, so we can run other things in the meantime.

The issue is that this means we are now launching multithreaded numba code from multiple threads, which their default threading library can't deal with. Different libraries can enable different safety settings (docs). For example, the OpenMP lib on Linux can handle calling multithreaded numba from multiple threads, but it couldn't if we called multithreaded numba code from multiple processes.
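To make the "run it in another thread" idea concrete, here is a minimal sketch of what such a wrapper could look like. It is not the code in this PR: the body of `asyncify` below is an assumption built on the standard library's `asyncio.to_thread` (Python 3.9+), and `slow_hash` is a hypothetical stand-in for the blocking numba-backed hash.

```python
import asyncio
import functools
import time


def asyncify(func):
    """Wrap a blocking function so it can be awaited (illustrative sketch only)."""

    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        # asyncio.to_thread runs func in a worker thread; the await hands
        # control back to the event loop until that thread finishes.
        return await asyncio.to_thread(func, *args, **kwargs)

    return wrapper


def slow_hash(x):
    # Hypothetical stand-in for the blocking hash call; it never yields.
    time.sleep(0.1)
    return x * 2


async def main():
    async_hash = asyncify(slow_hash)
    # Each call blocks in its own worker thread, so the three calls overlap.
    return await asyncio.gather(*(async_hash(i) for i in range(3)))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

If `slow_hash` were itself a multithreaded numba function, each worker thread here would launch its own parallel region, which is exactly the situation the threading library has to tolerate.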
Hm, I see. While there may be a solution, this seems like something that we can punt on for now.