-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix propagating of unhandled exception to native caller #113980
Fix propagating of unhandled exception to native caller #113980
Conversation
The final propagation of unhandled exception into topmost native frames on foreign threads has a bug that hits when the native caller catches the exception and then calls into managed code again. The problem is that we weren't popping out the explicit frames and ExInfos. So when the managed code is entered again, the `Thread::m_pFrame` points to a garbage. The same is true for the `ExInfo` chain. This is causing crashes in a 3rd party application where a foreign thread runs some MFC code that calls to some managed functions regularly and swallows exceptions stemming from it. While investigating the issue, I have found that there is a secondary related problem. When we mark the thread by setting the TSNC_ProcessedUnhandledException flag, the ProcessCLRException intentionally starts ignoring managed frames, because we try to let the exception bubble out to the 3rd party code. But if it catches the exception and calls back to managed code, the TSNC_ProcessedUnhandledException flag stays set and thus when the exception would need to propagate through native frames, all the managed frames would be ignored. This change fixes both of the problems and adds a test that crashes without the fix in both of the cases. Close dotnet#113751
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes an issue related to the improper propagation of unhandled exceptions from native to managed code by ensuring appropriate cleanup of explicit frames and ExInfos. It adds new native callbacks and updates tests to validate the fix in scenarios where the exception is caught and reentered into managed code.
Files not reviewed (3)
- src/coreclr/vm/exceptionhandling.cpp: Language not supported
- src/coreclr/vm/threads.h: Language not supported
- src/tests/Exceptions/ForeignThread/ForeignThreadExceptionsNative.cpp: Language not supported
Comments suppressed due to low confidence (1)
src/tests/Exceptions/ForeignThread/ForeignThreadExceptions.cs:22
- [nitpick] The method name 'ThrowException' is generic and could be more descriptive. Consider renaming it to something like 'ThrowNativeException' to clarify that it throws a native exception.
public static extern void ThrowException();
Ah, I've forgotten to make the new test Windows coreclr only. |
I can see the change has broken two diagnostic tests. I am looking into it |
The SOS tests that examine crash dump of an unhandled exception were failing with this change due to the fact that the explicit frames and ExInfos were popped before the RaiseException. The Watson / crash dump generation kicks in at the end of the first pass of the RaiseException, so the frames need to be popped in the 2nd pass. I have changed it so that they are popped at the first call to the ProcessCLRException on the 2nd pass.
@jkotas SOS tests ran in the CI were failing because the frames were popped too early. They need to stay until the RaiseException completes the first pass so that a crash dump on Windows contains them. |
/ba-g the test failures are infra issues on azurelinux that occur on all recent CI runs. |
The final propagation of unhandled exception into topmost native frames on foreign threads has a bug that hits when the native caller catches the exception and then calls into managed code again. The problem is that we weren't popping out the explicit frames and ExInfos. So when the managed code is entered again, the
Thread::m_pFrame
points to a garbage. The same is true for theExInfo
chain.This is causing crashes in a 3rd party application where a foreign thread runs some MFC code that calls to some managed functions regularly and swallows exceptions stemming from it.
While investigating the issue, I have found that there is a secondary related problem. When we mark the thread by setting the TSNC_ProcessedUnhandledException flag, the ProcessCLRException intentionally starts ignoring managed frames, because we try to let the exception bubble out to the 3rd party code. But if it catches the exception and calls back to managed code, the
TSNC_ProcessedUnhandledException flag stays set and thus when the exception would need to propagate through native frames, all the managed frames would be ignored.
This change fixes both of the problems and adds a test that crashes without the fix in both of the cases.
Close #113751