Use zmq-anyio #1291

davidbrochart · 2024-11-07T13:57:48Z

@minrk this is using an AnyIO-compatible pyzmq API (from https://github.com/davidbrochart/zmq-anyio), as discussed in zeromq/pyzmq#2045.

davidbrochart · 2024-11-07T16:28:19Z

I can see with this PR that ipykernel runs on trio when hard-coding the backend here, but how can we choose e.g. from JupyterLab to set trio_loop?

minrk · 2024-11-08T07:52:03Z

Nice! So it looks like for compatibility with Windows, you've gone with spawning a select thread for every waiting socket?

I can see with this PR that ipykernel runs on trio when hard-coding the backend here, but how can we choose e.g. from JupyterLab to set trio_loop?

This should presumably be the same as any other Kernel configuration option, so a config=True traitlet, set e.g. from a CLI arg or env var in the kernelspec (or parameterized kernel, or (most compatible) in the kernel config file (~/.ipython/profile_default/ipython_kernel_config.py).

davidbrochart · 2024-11-08T08:27:45Z

So it looks like for compatibility with Windows, you've gone with spawning a select thread for every waiting socket?

Yes, and I think Tornado does something similar here. This means a zmq_anyio.Socket has to be started asynchronously with an async context manager (or by calling its start() method), but that's consistent with an AnyIO socket.
AnyIO's wait_socket_readable "does NOT work on Windows when using the asyncio backend with a proactor event loop (default on py3.8+)".

Thanks for the kernel configuration, I'll try that 👍

minrk · 2024-11-08T09:53:25Z

I think Tornado does something similar

It does. The big difference is Tornado starts one selector thread per event loop which is far more scalable, whereas zmq-anyio starts one per socket. This makes sense from a library simplicity standpoint since there isn't a thread running when you are done with a socket, but definitely isn't scalable and probably isn't what we should do long term (and why I think anyio should have this built-in, just like tornado).

As I understand it, this means anyio will spawn up to 40 threads by default. That's might be okay for ipykernel, but I'd say it does mean we shouldn't use zmq-anyio in any client places like jupyter-client or jupyter-server. Because as soon as you've got 40 idle zmq sockets waiting for a message (what they spend most of their time doing), any subsequent calls to to_thread will block forever. I do think the first priority here should be universal wait_socket_readable support in anyio itself.

You might be able to provoke this in ipykernel by spawning 41 subshells and not using them, since I think each one adds a socket that will be idle.

You could limit the starvation by making the to_thread function one with a timeout that you re-enter all the time from the main async thread. Idle sockets will still starve the thread pool, just with timeout-sized bubbles instead of forever.

davidbrochart · 2024-11-08T10:09:04Z

Interesting, I hadn't thought about that.
Maybe before moving that part to AnyIO we could have async zmq_anyio.start() and zmq_anyio.stop() that would start/stop the thread?

minrk · 2024-11-08T11:10:19Z

Maybe before moving that part to AnyIO we could have async zmq_anyio.start() and zmq_anyio.stop() that would start/stop the thread?

Sure! I think that's sensible. I don't have enough experience with the task group hierarchy stuff to know what that should look like. I think it's probably appropriate to have some tests in zmq-anyio with a lot of idle sockets (at least more than the thread count, which I think can be set to 1 or 2) to probe this stuff.

If I were the one writing it, I'd implement a zmq_anyio.wait_readable that:

invokes tornado's SelectorThread logic as needed on asyncio as done in pyzmq, and
calls trio.lowlevel.wait_readable on trio
(ideally) accepts integer FD so it doesn't need to call socket.fromfd (like trio.lowlevel.wait_readable and asyncio.add_reader, unlike anyio.wait_socket_readable) which I think doubles the number of open FDs

You should be able to base it on anyio.wait_socket_readable which assumes socket objects, despite there being no actual need for that restriction (I'm guessing it's inherited because trio used to have this same issue, but doesn't anymore).

A smaller, but maybe less clean and less efficient version with a one-time monkeypatch:

if windows and asyncio and proactor:
    # only needed once per asyncio event loop, this is the only situation where a patch is needed
    loop = asyncio.get_running_loop()
    loop.add_reader = selector_add_reader # from tornado's AddThreadSelector
    loop.remove_reader = selector_remove_reader # from tornado's AddThreadSelector

...
# assume wait_socket_readable works, which it should now
await anyio.wait_socket_readable(socket.fromfd(zmq_sock.FD))
# hopefully anyio will fix integer FD support to match underlying asyncio and trio

If you did any of those, there would be the advantage that no actual thread is spawned except in the Windows + Proactor + asyncio case, which would get exactly one thread.

fwiw, I started to extract the tornado feature into its own package, but haven't tested it enough to publish a release if there's some reason to not depend on tornado for this feature (I don't think there is): https://github.com/minrk/async-selector-thread. Requiring tornado for this doesn't mean the tornado IOLoop object ever needs to be created, as the SelectorThread logic is pure asyncio, so there's really no reason not to require tornado for this as long as it's the only package with the required feature.

davidbrochart · 2024-11-09T20:30:32Z

Thanks @minrk, that was very helpful.
I both opened agronholm/anyio#820 and used the approach you described in davidbrochart/zmq-anyio#5.
BTW it seems that directly passing a file descriptor to anyio.wait_socket_readable works fine, no need to call socket.fromfd (with a type mismatch though).

minrk · 2024-11-13T10:23:52Z

I had another thought where you could shutdown the thread if nothing is waiting (when remove_reader is called). This might play nicer with anyio's design of shutting things down when they aren't in use, and you don't need anything hooked up to close unless it's called while waiting on a socket. But it comes at a performance cost because you are probably going to recreate the thread a whole bunch of times (once per message if you only have one socket). I don't actually think we should do that, but it's an idea if there are objections to leaving an idle thread running.

But really by far the most efficient approach is ZMQStream's event-driven on_recv, which registers the FD exactly once and calls handle_events whenever there might be a message, rather than calling add_reader and remove_reader for every message.

davidbrochart · 2024-11-14T17:15:16Z

Still a few tests failing, and trio is not enabled in tests (more failures), but this is taking shape.

davidbrochart · 2024-12-17T13:06:35Z

Tests don't hang anymore, the remaining failures seem to be for test_print_to_correct_cell_from_child_thread on Windows and Ubuntu/PyPy (there are also some ResourceWarning: unclosed file <_io.BufferedReader name=14> treated as errors).

davidbrochart · 2024-12-17T16:06:33Z

I think I'll need some help from @krassowski for the failing test_print_to_correct_cell_*, and @Carreau for the failures related to IPython:

Traceback (most recent call last):
  File "C:\Users\runneradmin\AppData\Local\hatch\env\virtual\ipykernel\iZc2KLS5\test\lib\site-packages\IPython\utils\_process_win32.py", line 124, in system
    return process_handler(cmd, _system_body)
ResourceWarning: unclosed file <_io.BufferedReader name=14>

Carreau · 2024-12-17T16:26:13Z

I think we can tracemalloc.enable() and the resource warning will tell us where the leasked thing is created.

Carreau · 2024-12-17T16:59:47Z

Oh, I had forgotten about PYTHONTRACEMALLOC env variable.

davidbrochart · 2024-12-17T17:16:52Z

But it doesn't show more information 🤔

Carreau · 2024-12-17T17:18:15Z

Well it did not help; I'll try to read the corresponding IPython code, but it's windows base so I'm not sure.

I'm wondering if those two threads could be the reason why:

https://github.com/ipython/ipython/blob/89d672ee48b0e0998159e85619a986837573ee40/IPython/utils/_process_win32.py#L74-L99

I'm far from an expert on threading, but maybe they should be joined somewhere ?

Carreau · 2024-12-17T17:36:02Z

I'm poking at stuff in ipython/ipython#14621 ;

.github/workflows/ci.yml

Carreau · 2024-12-17T20:10:46Z

that ... was not it

I will see if I can create a fast branch that only run the failing test, and iterate on commenting things in IPython to see where the buffer can come from. But I do not have windows, so it's a painful process.

davidbrochart · 2024-12-18T08:05:11Z

Thanks @Carreau.
I installed a Windows VM, I'll see if I can reproduce locally.

Carreau · 2024-12-18T09:27:40Z

I would also not be surprised for this kind of errors to depend on previous tests.

davidbrochart marked this pull request as draft November 7, 2024 13:57

davidbrochart added the enhancement label Nov 7, 2024

davidbrochart mentioned this pull request Nov 7, 2024

_eventloop_set event called from non-async context #1292

Open

minrk mentioned this pull request Nov 12, 2024

ipykernel 7.0.0a0 conda-forge/ipykernel-feedstock#186

Draft

5 tasks

davidbrochart force-pushed the zmq_anyio branch 2 times, most recently from 8c04773 to 90f12c2 Compare November 13, 2024 09:59

davidbrochart force-pushed the zmq_anyio branch from 90f12c2 to ee38f9e Compare November 14, 2024 16:50

davidbrochart marked this pull request as ready for review November 14, 2024 16:50

davidbrochart force-pushed the zmq_anyio branch from 04be394 to b340c2d Compare November 14, 2024 17:06

Use zmq-anyio

1fe492a

davidbrochart force-pushed the zmq_anyio branch from e7a0fde to 1fe492a Compare November 14, 2024 17:10

davidbrochart force-pushed the zmq_anyio branch 5 times, most recently from 9203727 to 130387c Compare November 15, 2024 09:11

Replace thread add_task with start_soon

615ec12

davidbrochart force-pushed the zmq_anyio branch 3 times, most recently from 3f40f1d to 8709b51 Compare November 15, 2024 09:59

Replace _IOPubThread with BaseThread

1834b58

davidbrochart added 4 commits December 5, 2024 17:29

-

4858e82

-

e22861a

-

35a45d1

-

7f11923

davidbrochart force-pushed the zmq_anyio branch from 3ed8418 to 7f11923 Compare December 17, 2024 12:50

davidbrochart added 2 commits December 17, 2024 16:16

-

32253f8

-

c4bf3b7

Enable tracemalloc

82ef8c4

Carreau reviewed Dec 17, 2024

View reviewed changes

.github/workflows/ci.yml Outdated Show resolved Hide resolved

Update .github/workflows/ci.yml

057f62c

Carreau reviewed Dec 17, 2024

View reviewed changes

.github/workflows/ci.yml Outdated Show resolved Hide resolved

Update .github/workflows/ci.yml

03a878d

-

25f939d

davidbrochart force-pushed the zmq_anyio branch from 4e4f945 to 25f939d Compare December 20, 2024 09:18

-

9de65b2

davidbrochart force-pushed the zmq_anyio branch from f01813a to 9de65b2 Compare December 20, 2024 10:31

davidbrochart mentioned this pull request Dec 20, 2024

Fix test_print_to_correct_cell_from_child_thread #1312

Merged

davidbrochart added 3 commits December 20, 2024 13:25

-

b9f9f73

-

e87ba37

-

f0818b6

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use zmq-anyio #1291

Use zmq-anyio #1291

davidbrochart commented Nov 7, 2024 •

edited

Loading

davidbrochart commented Nov 7, 2024

minrk commented Nov 8, 2024

davidbrochart commented Nov 8, 2024

minrk commented Nov 8, 2024

davidbrochart commented Nov 8, 2024

minrk commented Nov 8, 2024

davidbrochart commented Nov 9, 2024

minrk commented Nov 13, 2024

davidbrochart commented Nov 14, 2024

davidbrochart commented Dec 17, 2024

davidbrochart commented Dec 17, 2024

Carreau commented Dec 17, 2024

Carreau commented Dec 17, 2024

davidbrochart commented Dec 17, 2024

Carreau commented Dec 17, 2024

Carreau commented Dec 17, 2024

Carreau commented Dec 17, 2024

davidbrochart commented Dec 18, 2024

Carreau commented Dec 18, 2024

Use zmq-anyio #1291

Are you sure you want to change the base?

Use zmq-anyio #1291

Conversation

davidbrochart commented Nov 7, 2024 • edited Loading

davidbrochart commented Nov 7, 2024

minrk commented Nov 8, 2024

davidbrochart commented Nov 8, 2024

minrk commented Nov 8, 2024

davidbrochart commented Nov 8, 2024

minrk commented Nov 8, 2024

davidbrochart commented Nov 9, 2024

minrk commented Nov 13, 2024

davidbrochart commented Nov 14, 2024

davidbrochart commented Dec 17, 2024

davidbrochart commented Dec 17, 2024

Carreau commented Dec 17, 2024

Carreau commented Dec 17, 2024

davidbrochart commented Dec 17, 2024

Carreau commented Dec 17, 2024

Carreau commented Dec 17, 2024

Carreau commented Dec 17, 2024

davidbrochart commented Dec 18, 2024

Carreau commented Dec 18, 2024

davidbrochart commented Nov 7, 2024 •

edited

Loading