Http4s Client leaks with ZIO 2.x #640
Comments
Further to this, if enabling the e.g. after running for a while
Which means that 224674 fibers aren't complete, but the app should only have a few fibers running at once.
I have an update after @petoalbert and I have been investigating. It turns out that there are (at least) three bugs in play here:
Below is a screenshot of our app leaking. Blaze Server is running all the time, holding onto a Dispatcher that is indirectly holding the FiberRefs. After some attempts, the following simplified app shows similar behaviour:
Full project (with dependencies) at https://github.com/ollyw/zio-2-fiber-leak-reproducer/blob/main/src/main/scala/FiberLeakViaDispatcherExample.scala Here is a screenshot in the debugger of the latch holding onto the many values in The suspicion is that when
Final update here. It appears the issue specific to the Cats Effect upgrade was caused by one or more issues with the ZIO core library, not cats-interop. After some supervision-related PRs were merged, all seems to work as of 2.0.5+49-a3909794-SNAPSHOT (zio/zio#7676). Thanks for fixing, @adamgfraser.
Background
Since migrating some projects to ZIO 2.0.x there have been a few issues with managed heap space running out. Recent ZIO 2.0.x versions have improved matters (migration from fiber root nursery), but there still seems to be at least one underlying issue. The most recent issue was triggered by a Cats Effect upgrade from 3.3.x to 3.4.x. When trying to reproduce it, a small part of the app's behaviour was simulated in this reproducer. It is unclear whether this is the root cause in the app, but the reproducer is certainly not behaving as expected.
Reproducer
See https://github.com/ollyw/zio-2-fiber-leak-reproducer
The app simply polls an endpoint using Http4s Client + Blaze + ZIO. Over time the number of suspended fibers grows and grows. The essence of the app is basically just
The app logs the number of suspended fibers so it can be seen. It can also dump the fibers if you uncomment the code
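For readers who want to add similar logging to their own app, here is a minimal sketch of one way to count suspended fibers with the ZIO 2 core API. It is not the reproducer's code: the object name `SuspendedFiberCount`, the helper names and the 10-second interval are illustrative assumptions, and it only looks at root fibers as reported by `Fiber.roots`.

```scala
import zio._

// Hypothetical example object; not taken from the reproducer project.
object SuspendedFiberCount extends ZIOAppDefault {

  // Counts the root fibers whose current status is Suspended.
  val countSuspended: UIO[Int] =
    for {
      roots    <- Fiber.roots                   // all root fibers known to the runtime
      statuses <- ZIO.foreach(roots)(_.status)  // snapshot each fiber's status
    } yield statuses.count {
      case _: Fiber.Status.Suspended => true
      case _                         => false
    }

  // Logs the current count; the message format is an assumption.
  val logSuspended: UIO[Unit] =
    countSuspended.flatMap(n => ZIO.logInfo(s"Suspended fibers: $n"))

  // Repeats the logging every 10 seconds (interval chosen arbitrarily).
  def run =
    logSuspended.repeat(Schedule.spaced(10.seconds)).unit
}
```

In a healthy app this number should stay roughly flat, so a count that keeps climbing between log lines is the symptom described here.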
The leak can also be verified by running the app under a tool such as VisualVM and inspecting the heap after a while. If you want to speed up the leak, change the schedule and point the client at a local HTTP endpoint instead of the hardcoded URL (https://zio.dev).
Results
Example trace of the leaking fibers
Do different versions of ZIO affect it?
Does the latest version of Cats-interop fix it?
Version 23.0.0.0 was released a few weeks ago with some improvements for cats-effect "lawfulness". However, this version and the previous version 3.3.0 both show the same behaviour. The fiber trace points to a different line number in ZioAsync, though.
Further diagnosis
The fiber dumps seem to indicate that the leaking fiber is started as a timeout action.
It could be that this issue is related to the existing issue #616 and the follow-on "Support the "onCancel associates over uncancelable boundary" law (ZIO 2/CE 3)" #617. However, it might also be something different. Perhaps Http4s is misusing the effects somehow?