Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix propagating of unhandled exception to native caller #113980

Merged

Conversation

janvorli
Copy link
Member

The final propagation of unhandled exception into topmost native frames on foreign threads has a bug that hits when the native caller catches the exception and then calls into managed code again. The problem is that we weren't popping out the explicit frames and ExInfos. So when the managed code is entered again, the Thread::m_pFrame points to a garbage. The same is true for the ExInfo chain.

This is causing crashes in a 3rd party application where a foreign thread runs some MFC code that calls to some managed functions regularly and swallows exceptions stemming from it.

While investigating the issue, I have found that there is a secondary related problem. When we mark the thread by setting the TSNC_ProcessedUnhandledException flag, the ProcessCLRException intentionally starts ignoring managed frames, because we try to let the exception bubble out to the 3rd party code. But if it catches the exception and calls back to managed code, the
TSNC_ProcessedUnhandledException flag stays set and thus when the exception would need to propagate through native frames, all the managed frames would be ignored.

This change fixes both of the problems and adds a test that crashes without the fix in both of the cases.

Close #113751

The final propagation of unhandled exception into topmost native frames
on foreign threads has a bug that hits when the native caller catches
the exception and then calls into managed code again.
The problem is that we weren't popping out the explicit frames and
ExInfos. So when the managed code is entered again, the `Thread::m_pFrame`
points to a garbage. The same is true for the `ExInfo` chain.

This is causing crashes in a 3rd party application where a foreign
thread runs some MFC code that calls to some managed functions regularly
and swallows exceptions stemming from it.

While investigating the issue, I have found that there is a secondary
related problem. When we mark the thread by setting the
TSNC_ProcessedUnhandledException flag, the ProcessCLRException
intentionally starts ignoring managed frames, because we try to let the
exception bubble out to the 3rd party code. But if it catches the
exception and calls back to managed code, the
TSNC_ProcessedUnhandledException flag stays set and thus when the
exception would need to propagate through native frames, all the managed
frames would be ignored.

This change fixes both of the problems and adds a test that crashes
without the fix in both of the cases.

Close dotnet#113751
@janvorli janvorli added this to the 10.0.0 milestone Mar 27, 2025
@janvorli janvorli requested a review from jkotas March 27, 2025 18:47
@janvorli janvorli self-assigned this Mar 27, 2025
@Copilot Copilot bot review requested due to automatic review settings March 27, 2025 18:47
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes an issue related to the improper propagation of unhandled exceptions from native to managed code by ensuring appropriate cleanup of explicit frames and ExInfos. It adds new native callbacks and updates tests to validate the fix in scenarios where the exception is caught and reentered into managed code.

Files not reviewed (3)
  • src/coreclr/vm/exceptionhandling.cpp: Language not supported
  • src/coreclr/vm/threads.h: Language not supported
  • src/tests/Exceptions/ForeignThread/ForeignThreadExceptionsNative.cpp: Language not supported
Comments suppressed due to low confidence (1)

src/tests/Exceptions/ForeignThread/ForeignThreadExceptions.cs:22

  • [nitpick] The method name 'ThrowException' is generic and could be more descriptive. Consider renaming it to something like 'ThrowNativeException' to clarify that it throws a native exception.
public static extern void ThrowException();

@janvorli
Copy link
Member Author

Ah, I've forgotten to make the new test Windows coreclr only.

@janvorli
Copy link
Member Author

I can see the change has broken two diagnostic tests. I am looking into it

The SOS tests that examine crash dump of an unhandled exception were
failing with this change due to the fact that the explicit frames and
ExInfos were popped before the RaiseException. The Watson / crash dump
generation kicks in at the end of the first pass of the RaiseException,
so the frames need to be popped in the 2nd pass. I have changed it so
that they are popped at the first call to the ProcessCLRException on the
2nd pass.
@janvorli
Copy link
Member Author

@jkotas SOS tests ran in the CI were failing because the frames were popped too early. They need to stay until the RaiseException completes the first pass so that a crash dump on Windows contains them.

@janvorli
Copy link
Member Author

/ba-g the test failures are infra issues on azurelinux that occur on all recent CI runs.

@janvorli janvorli merged commit 5b36734 into dotnet:main Mar 31, 2025
101 of 104 checks passed
@janvorli janvorli deleted the fix-unhandled-exception-caught-in-external branch March 31, 2025 11:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants