Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Asynchronous modules stop processing events on SIGSTOP #253

Closed
cornerman opened this issue Apr 7, 2016 · 28 comments
Closed

Asynchronous modules stop processing events on SIGSTOP #253

cornerman opened this issue Apr 7, 2016 · 28 comments

Comments

@cornerman
Copy link
Contributor

When i3bar is hidden, the statusline generator gets a SIGSTOP signal. Thus, the asynchronous modules cannot process new events, which currently leads to i3/i3#2280 in i3 - ultimately, this can freeze i3. Depending on how this is solved, it might be desirable to have the event processing in a different process, so it is not stopped. What do you think?

This is also mentioned in #179 for a different use case.

@cornerman cornerman changed the title Asynchronous modules should be :D Asynchronous modules stop processing events on SIGSTOP Apr 7, 2016
@tobes
Copy link
Contributor

tobes commented Apr 8, 2016

Interesting stuff. Do we have an easy way to reproduce this?

@cornerman
Copy link
Contributor Author

Yeah, it is rather easy to reproduce.

First, activate an async module in your i3status.conf:
order += "window_title_async"

Then, hide your bar, so py3status gets stopped:
$ i3-msg bar mode hide

Trigger the event a few times, i.e., switch window focus a few times (until i3 freezes).

@tobes
Copy link
Contributor

tobes commented Apr 8, 2016

Hmm, this is a nice problem. So really the modules need to run even when py3ststus is stopped. But modules like group.py hook quite deeply into py3status.

Maybe we need to have py3status as a thin layer that just spawns the actual py3status and reads/writes data to/from i3bar etc. Since we can't trap SIGSTOP and have to live.

@cornerman
Copy link
Contributor Author

Yeah, at least the asynchronous modules should not be stopped. A thin py3status main process, which forks of the actual worker and prints data sounds like a good idea. Another possibility would be to centralize the event processing into another process. But then this process would need to buffer incoming events when the receiving modules are stopped - not sure whether this would be a viable solution.

@tobes
Copy link
Contributor

tobes commented Apr 8, 2016

Is this due to a recent change in i3, or just a long standing issue that hasn't been noticed before?

I think this will need some thinking through to get a nice fix, as ever there are lots of possible options.

@cornerman
Copy link
Contributor Author

Afaik, this is a long standing issue; the IPC interface was designed to have well-behaving clients. I agree, we should definitely take the time to think about a good solution and also see, how i3 solves it.

@tobes
Copy link
Contributor

tobes commented Apr 8, 2016

So my thought is

py3status - spawns new process (py3) write to i3bar and gets i3bar event messages

py3 - reads config, starts modules including i3status ones, creates i3bar ready output.

We then use shared memory to communicate.

py3

writes

  • output - unicode output in i3bar format
  • updated - boolean set to true when output has been updated

reads

  • event_queue - array of event strings

py3status

reads

  • update - checks to see if new output
  • output - passes straight to i3bar then resets update

writes

  • event_queue appends raw event strings from i3bar

That feels like the most minimal exchange we can get away with, both processes would be checking each other 10 times a second and we'd have some lock contention but it should be fairly minimal.
we might need to do a little bit around passing start up parameters to py3 etc but that should be simple.

@cornerman
Copy link
Contributor Author

Generally, this sounds good. Though, I am not totally sure whether the whole logic of py3status should go into the py3 subprocess. Does it then still make sense to stop the i3status process when we get a SIGSTOP or should it keep on running because the i3status modules do? Also, wouldn't we do the whole processing to generate the i3bar output eventhough it does not get printed because the main process is stopped?

@tobes
Copy link
Contributor

tobes commented Apr 9, 2016

Also, wouldn't we do the whole processing to generate the i3bar output eventhough it does not get printed

I think this is the simplest way to start with.

a) we would need to know that py3status has stopped.

b) some modules trigger other modules to update eg group, so we would have to remember all that. Also the output is only generated when a module has updated so we shouldn't be doing too much work.

Though, I am not totally sure whether the whole logic of py3status should go into the py3 subprocess.

I think that might be easier than trying to split things out. The output function Core.run() treats i3status modules as py3modules now so needs to talk to them etc being in the same process would make that far easier.

@tobes
Copy link
Contributor

tobes commented Apr 10, 2016

I was looking at the i3bar protocol http://i3wm.org/docs/i3bar-protocol.html

{ "version": 1, "stop_signal": 10, "cont_signal": 12, "click_events": true }

stop_signal

Specify to i3bar the signal (as an integer) to send to stop your processing. The default value (if none is specified) is SIGSTOP.

cont_signal

Specify to i3bar the signal (as an integer)to send to continue your processing. The default value (if none is specified) is SIGCONT.

So maybe we can just use a different signal and not do much more than not output to the bar when it's sent the 'stop' that would be a lot cleaner me thinks.

@tobes
Copy link
Contributor

tobes commented Apr 10, 2016

A rough cut using SIGUSR2 to stop

https://github.com/ultrabug/py3status/compare/master...tobes:i3bar-hidden?expand=1

It needs some polish but the basic theory works from my testing.

tobes added a commit to tobes/py3status that referenced this issue Apr 11, 2016
By default i3bar sends a -SIGSTOP to a process sending it data.  This causes
some modules interacting with i3 ipc to be suspended, which lead to locking
issues inside i3.  This patch switches to using SIGUSR2 as the suspend
signal.

The async modules eg window_title_async were affected.

see ultrabug#253
@tobes
Copy link
Contributor

tobes commented Apr 11, 2016

@cornerman I've cleaned up my patch a344685 It's much simpler than I initially made it.

Could you test if this fixes the issue for you. In my testing It appears to work as intended

Really we should look at really 'sleeping' our modules but that can be a thing of the future.

@cornerman
Copy link
Contributor Author

Good idea, I was not aware of this possibility. But, hmm, this patch crashes py3status whenever the SIGCONT signal is received. Not sure, what is happening.

We should definitely be able to SIGSTOP the i3status modules when i3bar is hidden. Also, maybe makes sense to use threading.event instead of sleep until i3bar is running again?

@tobes
Copy link
Contributor

tobes commented Apr 12, 2016

We should definitely be able to SIGSTOP the i3status modules when i3bar is hidden.

This is true but I want to start with the simplest patch we can.

crashes py3status whenever the SIGCONT signal is received.

If you do killall -s USR2 py3status and killall -s CONT py3status from the cli does this also fail?
For me this works fine when the patch is applied to the current master.

Do you have any info on the crash?

@cornerman
Copy link
Contributor Author

Interesting! It works as expected if I run py3status in the console and manually send the signals. But if I run it in i3bar and hide the bar, it dies as soon as the bar is shown again. The logs only say:

Apr 12 11:39:03 genius /py3status[8571]: method window_title returned {'cached_until': 0, 'full_text': 'rxvt:journalctl -af /usr/bin/py3status', 'min_width': 240, 'align': 'left', 'separator_block_width': 20, 'name': 'window_title_async', 'instance': '', 'separator': False}
Apr 12 11:39:03 genius /py3status[8571]: method scratchpad_counter returned {'cached_until': 0, 'full_text': '0 ⌫', 'name': 'scratchpad_async', 'instance': '', 'separator': False, 'align': 'left', 'separator_block_width': 20}
Apr 12 11:39:04 genius /py3status[8571]: py3status: i3status died with code -12. Please try to fix this and reload i3wm (Mod+Shift+R)
Apr 12 11:39:04 genius /py3status[8571]: lock cleared, exiting

@tobes
Copy link
Contributor

tobes commented Apr 12, 2016

Ahh, I had this issue before. I think it is a config thing I'll update the patch.

tobes added a commit to tobes/py3status that referenced this issue Apr 12, 2016
By default i3bar sends a -SIGSTOP to a process sending it data.  This causes
some modules interacting with i3 ipc to be suspended, which lead to locking
issues inside i3.  This patch switches to using SIGUSR2 as the suspend
signal.

The async modules eg window_title_async were affected.

see ultrabug#253
@tobes
Copy link
Contributor

tobes commented Apr 12, 2016

is this any happier

tobes@d1474a1

@cornerman
Copy link
Contributor Author

Sadly, not happier :/

It actually also happens when I launch rofi or when I close i3-nagbar, really not sure what is happening there. Any ideas for debugging?

@tobes
Copy link
Contributor

tobes commented Apr 12, 2016

Does it work if you run py3status in standalone -s mode?

That at least narrows it down to the i3status loop

@tobes
Copy link
Contributor

tobes commented Apr 12, 2016

There is something because before I had this issue but it went away, I'm not sure if it is a particular module that triggers the issue.

@cornerman
Copy link
Contributor Author

Does it work if you run py3status in standalone -s mode?

Yes, this works as expected and does not crash.

@tobes
Copy link
Contributor

tobes commented Apr 12, 2016

OK, This is a good start.

So the problem is in I3status.run() so at least it is a small place to look. So maybe you can find something. If not I'll try to reproduce your breakage. I'm not sure why we see the SIGUSR2 here. Is i3status somehow sending it? I think it is a case of lots of logging.

If you just make the

                            err = self.poller_err.readline()
                            code = i3status_pipe.poll()

stuff continue evil though that is does it work then?

Anyhow we are half way there I think.

@cornerman
Copy link
Contributor Author

Ah, the error code from i3status is negative, so the check in run needs to be:

code = i3status_pipe.poll()
if code == -SIGUSR2:
    continue

@tobes
Copy link
Contributor

tobes commented Apr 12, 2016

Weird

But we now have a working patch - even if needs more improvements.

My view would be to try to get the basic patch into py3status and then add improvements separately. Do you agree or think a different approach is better?

Thanks for testing this

@cornerman
Copy link
Contributor Author

Weird

Yes :)

My view would be to try to get the basic patch into py3status and then add improvements separately. Do you agree or think a different approach is better?

I agree, as of now, we can make i3 crash and therefore this should be fixed asap. Improvements can be done in a later stage. Thanks for fixing this!

@tobes
Copy link
Contributor

tobes commented Apr 12, 2016

Do you want to make a PR for this?

@cornerman
Copy link
Contributor Author

You already have it there on your branch and it was your idea to begin with, so I think you should :)

tobes added a commit to tobes/py3status that referenced this issue Apr 12, 2016
By default i3bar sends a -SIGSTOP to a process sending it data.  This causes
some modules interacting with i3 ipc to be suspended, which lead to locking
issues inside i3.  This patch switches to using SIGUSR2 as the suspend
signal.

The async modules eg window_title_async were affected.

see ultrabug#253
tobes added a commit to tobes/py3status that referenced this issue Apr 12, 2016
By default i3bar sends a -SIGSTOP to a process sending it data.  This causes
some modules interacting with i3 ipc to be suspended, which lead to locking
issues inside i3.  This patch switches to using SIGUSR2 as the suspend
signal.

The async modules eg window_title_async were affected.

see ultrabug#253
@ultrabug
Copy link
Owner

Good job guys, and thank you !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants