Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mono] Chain SIGSEGV native crashes to the default SIGSEGV handler #110741

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

ivanpovazan
Copy link
Member

@ivanpovazan ivanpovazan commented Dec 16, 2024

Description

The problem

When SIGSEGV gets raised from native code Mono does not create crash reports on iOS-like platforms.
Instead our handler dumps native/managed stack traces and returns from the handler normally at:

mono_handle_native_crash (mono_get_signame (SIGSEGV), &mctx, (MONO_SIG_HANDLER_INFO_TYPE*)info);
if (mono_do_crash_chaining) {
mono_chain_signal (MONO_SIG_HANDLER_PARAMS);
return;

If it returns normally the code continues at the point the signal occurred, reexecuting the same instruction raising the signal again. This resulted in "double faults" in mono runtime, which we handled in such way that we exit from double faulting with exit(-1):

if (!double_faulted) {
g_assertion_disable_global (assert_printer_callback);
} else {
g_async_safe_printf ("\nAn error has occurred in the native fault reporting. Some diagnostic information will be unavailable.\n");
g_async_safe_printf ("\nExiting early due to double fault.\n");
_exit (-1);

preventing any crash log of being created.

This is true even in .NET 7, stack trace of an app that crashes, but does not generate any crash report (full crash report: https://gist.github.com/ivanpovazan/798e5185f13dada2d80e9b023b179a18) :

...
0x109687394 - /Users/ivan/tmp/net7/MacCat7/bin/Debug/net7.0-maccatalyst/maccatalyst-arm64/MacCat7.app/Contents/MacOS/MacCat7 : xamarin_main
0x109ae3574 - /Users/ivan/tmp/net7/MacCat7/bin/Debug/net7.0-maccatalyst/maccatalyst-arm64/MacCat7.app/Contents/MacOS/MacCat7 : main
0x187bdc274 - /usr/lib/dyld : start

Exiting early due to double fault.

Crash chaining

On mobile we enable crash chaining mono_do_crash_chaining=true and signal chaining mono_do_signal_chaining=true :
https://github.com/xamarin/xamarin-macios/blob/27e1882103ef4bc1c8db16c18533039ecc03ee3e/runtime/monovm-bridge.m#L83-L84
which means that after handling the native crash we chain the signal to any previous handlers defined for the signal. However, during registration of mono signal handlers, the default handler is being ignored:

/* if there was already a handler in place for this signal, store it */
if (! (previous_sa.sa_flags & SA_SIGINFO) &&
(SIG_DFL == previous_sa.sa_handler)) {
/* it there is no sa_sigaction function and the sa_handler is default, we can safely ignore this */
} else {
if (mono_do_signal_chaining)
save_old_signal_handler (signo, &previous_sa);
}

For this reason there is no effect when mono_chain_signal is called in the basic setup (no user-defined handler).

The fix

In this PR we are changing the logic of preserving original handlers by considering the default SIGSEGV handlers - when sa.sa_handler == SIG_DFL for signal chaining as well.
When the SIGSEGV native crash occurs and after we finish with our handler, we first try to chain the signal to any saved non-default handler - sa.sa_handler != SIG_DFL, if that fails, then we try chaining to the default one - sa.sa_handler == SIG_DFL. This way we avoid double faulting and properly propagate the SIGSEGV to the OS handler.

Test

Tested on MAUI iOS and MacCatalyst apps where now crash reports are properly generated.
Android behaviour is left unchanged.


Fixes: #106064

@lateralusX
Copy link
Member

lateralusX commented Dec 16, 2024

I'm a little curious if this works on Android or if we have the same issue there? As you said, this is really old code and nothing changed in the runtime around this for a long time, so a little curious how long this has been an issue. Did you end up without a chained signal handler on iOS? If that is the case, we hit this condition:

 if (! (previous_sa.sa_flags & SA_SIGINFO) && 
 		(SIG_DFL == previous_sa.sa_handler)) { 
 	/* it there is no sa_sigaction function and the sa_handler is default, we can safely ignore this */ 

Meaning we won't add anything to chain, that would lead to returning back from signal handler re-executing the faulting instruction,

@lateralusX
Copy link
Member

lateralusX commented Dec 16, 2024

An alternative could be to add an internal chained function in runtime in case you end up with default SIGSEGV handler, that function could restore the default handler and re-raise the signal. That will make sure we get the default handler executed and it won't change the current logic. The default handler should never return, so you could probably end that function with g_assert_not_reached.

@ivanpovazan
Copy link
Member Author

ivanpovazan commented Dec 16, 2024

'm a little curious if this works on Android or if we have the same issue there?

Will run some manual tests.

Did you end up without a chained signal handler on iOS?

Yes, on iOS chaining has no effect, since we only have the default handler active at the point of registering our handlers.

An alternative could be to add an internal chained function in runtime in case you end up with default SIGSEGV handler, that function could restore the default handler and re-raise the signal.

You mean instead of calling abort () as I did in this PR, to have a function which would do something like:

   struct sigaction sa;
    sa.sa_handler = SIG_DFL;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGSEGV, &sa, NULL);
    raise (SIGSEGV);

and call it instead of abort () ?

@lateralusX
Copy link
Member

lateralusX commented Dec 16, 2024

You mean instead of calling abort () as I did in this PR, to have a function which would do something like:

either like that or you could store the struct from original signal handler when installing it and then when restored the exact same struct back to make sure you get all flags restored, like what's done in remove_signal_handler, but I don't think we should remove anything from the table (its not locked), but you could get the old struct back and restore that since it should be the SIG_DFL.

@ivanpovazan ivanpovazan force-pushed the fix-ios-sigsegv-signal-handling branch from de59b5b to 1cfbbca Compare December 17, 2024 18:32
@ivanpovazan
Copy link
Member Author

I'm a little curious if this works on Android

On Android, during registration of Mono signal handlers we get:

12-17 18:58:19.591 28872 28872 D Mono    : Saving old signal handler for signal 8.
12-17 18:58:19.591 28872 28872 D Mono    : Saving old signal handler for signal 4.
12-17 18:58:19.591 28872 28872 D Mono    : Saving old signal handler for signal 7.
12-17 18:58:19.591 28872 28872 D Mono    : Saving old signal handler for signal 31.
12-17 18:58:19.591 28872 28872 D Mono    : Saving old signal handler for signal 6.
12-17 18:58:19.591 28872 28872 D Mono    : Saving old signal handler for signal 11.

which means that original handlers on Android don't have sigaction.sa_handler == SIG_DFL and the crashes are properly chained to the original handlers once they are handled by Mono.

@ivanpovazan ivanpovazan changed the title [mono] Abort after handling native SIGSEGV crash when chaining fails on iOS-like platforms [mono] Chain SIGSEGV native crashes to default handler on iOS-like platforms Dec 18, 2024
@ivanpovazan ivanpovazan changed the title [mono] Chain SIGSEGV native crashes to default handler on iOS-like platforms [mono] Chain SIGSEGV native crashes to the default SIGSEGV handler Dec 18, 2024
@lateralusX
Copy link
Member

LGTM!

@ivanpovazan ivanpovazan marked this pull request as ready for review December 18, 2024 21:01
@ivanpovazan
Copy link
Member Author

/azp run runtime-extra-platforms

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[mono][ios] Mono prevents crash logs from being generated in case SIGSEGV is raised from native code
3 participants