Flaky hanging tests after merging #54 #66
Labels
part:tests
Affects the unit, integration and performance (benchmarks) tests
priority:high
Address this as soon as possible
resolution:duplicate
This issue or pull request already exists
type:bug
Something isn't working
What happened?
Pull request #54 introduced a timing issue in the tests, making them flaky on amd64 and probably consistently failing on arm64, because the arm64 CI runs on QEMU, which is extremely slow.
It is probably related to the dependency bump of `client-dispatch`, which in turn bumps the dependency of `channels`, which has a change in how `Timer`s work.
Some investigation has already been done, mainly by @Marenz, but it seems we'll need to spend more time on this to find the root cause.
The issue seems to be that some condition variable is used in a different event loop than the one it was created in.
What did you expect instead?
Tests should run normally.
Affected version(s)
No response
Affected part(s)
Unit, integration and performance tests (part:tests)
Extra information
Here is a capture of logs when this happens: https://gist.github.com/Marenz/1ace8c7c0ccf01db70ceee8f767bb6f9#file-different-eventloop-py-L188.
The error (at least the one about using the wrong loop) seems to always happen inside the clean-up code of `select()`. It might help to add some logging there, such as printing a stack trace when the `select()` is created and when it is being cleaned up, to see if both actions happen in different tests (and different loops).