Segmentation fault while terminating worker #56312
Comments
While I'm not able to provide a reproduction at this time, any pointers on what to look out for when debugging this issue would help a lot. I can try to find the bug and submit a fix PR if necessary, but I need some help with it.
Nice... You could build in debug mode, and we may get a richer backtrace. Also, attach a debugger where Node is crashing.
FreeBSD FreeBSD 14.1-RELEASE-p5 GENERIC amd64
Same problem here. It happens when I try to finish the worker from inside the message callback using terminate() or close(). If I don't post a message and just try to close, there is no segmentation fault.
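The pattern described in this comment can be sketched roughly as follows. This is only an illustration of the call sequence, not a confirmed reproduction of the crash; the helper name terminateOnMessage and the inline worker source are hypothetical, and on an unaffected setup the code is expected to finish normally.

```javascript
const { Worker } = require('node:worker_threads');

// Hypothetical worker body: post one message, then stay alive until terminated.
const WORKER_SOURCE = `
  const { parentPort } = require('node:worker_threads');
  parentPort.postMessage('ready');
  setInterval(() => {}, 1000);
`;

// Hypothetical helper: terminate the worker from inside its 'message' callback.
function terminateOnMessage() {
  return new Promise((resolve, reject) => {
    const worker = new Worker(WORKER_SOURCE, { eval: true });
    worker.once('error', reject);
    worker.once('message', (msg) => {
      // terminate() returns a Promise that settles once the worker has exited.
      worker.terminate().then((exitCode) => resolve({ msg, exitCode }), reject);
    });
  });
}
```

Calling terminateOnMessage() on an affected platform and watching whether the process survives would be one way to check whether this sequence alone is enough to trigger the fault.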
Version
v20.17.0
Platform
Subsystem
worker
What steps will reproduce the bug?
I'm still working on creating a minimal reproduction for this bug, but it can be seen reported on this Nuxt issue: nuxt/nuxt#23832
How often does it reproduce? Is there a required condition?
It is consistent: whenever the program tries to terminate a running worker, the segmentation fault happens. It may be related to what exactly the worker is doing, but I wasn't able to pinpoint the root cause, as an open-source repro project does not exist.
What is the expected behavior? Why is that the expected behavior?
The expected behavior is to not have any segmentation fault at all, and for the program to successfully terminate the worker. It is the expected behavior because the program shouldn't crash and is not violating any of the Node APIs.
What do you see instead?
I successfully extracted a core file using a Debug build of current main, and can see the following in lldb when running bt all:
Backtrace of failed program
You can see that thread number 18 (frame 3) stopped the program because of an assertion in V8 while creating the Isolate (DisallowGarbageCollection no_gc;), while thread number 15 was waiting for the previous worker to terminate.
Additional information
I suspect there is an issue with creating workers while waiting for a previous worker to terminate (e.g. does the terminate process use garbage collection? If so, that could be why the worker starting on thread 18 crashes the program, since V8 doesn't expect it to be enabled), but I don't know whether this is expected or not. If it isn't, then there is a bug somewhere that allows this to happen, and it is one of the reasons that creating a small reproduction has been so hard.
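The suspected sequence, starting a new worker while the previous one is still terminating, might look something like the sketch below. It is an assumption about the shape of the problem rather than a verified reproduction; restartWhileTerminating and the idle worker body are made up for illustration.

```javascript
const { Worker } = require('node:worker_threads');

// Hypothetical worker body: keep the worker's event loop alive until terminated.
const IDLE_WORKER = 'setInterval(() => {}, 1000);';

// Resolves once the given worker has started executing JavaScript.
function whenOnline(worker) {
  return new Promise((resolve, reject) => {
    worker.once('online', resolve);
    worker.once('error', reject);
  });
}

// Hypothetical helper: overlap termination of one worker with creation of the next.
async function restartWhileTerminating() {
  const first = new Worker(IDLE_WORKER, { eval: true });
  await whenOnline(first);

  // Start terminating the first worker, but do not wait for it yet.
  const pendingExit = first.terminate();

  // Create the second worker while the first termination is still in flight.
  const second = new Worker(IDLE_WORKER, { eval: true });
  await whenOnline(second);

  const firstExit = await pendingExit;
  const secondExit = await second.terminate();
  return { firstExit, secondExit };
}
```

If the overlap is what matters, running restartWhileTerminating() in a loop on a Debug build could help narrow down whether the V8 assertion depends on this ordering.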