std.debug: fix some corner cases #23927
// Getting the backtrace inside the signal handler (with the ucontext_t)
// gets stuck in a loop on some systems:
const expect_signal_frame_overflow =
    (native_arch == .arm and link_libc); // loops above main()
glibc vs musl? `armeb`, `thumb`, `thumbeb`?
I haven't investigated the failure cases too closely yet. I'm just trying to get the test to compile and not blow up on every architecture. And I'm trying to avoid watering the test down too much on the platforms where it works reliably. That said, the failure I see for this one is when statically linking musl to the test case.
I haven't been building the other ARM variants, so I'll try and mix those in too.
I see this code was unchanged, so do the others work?
native_arch == .mips or
native_arch == .mipsel or
native_arch == .mips64 or
native_arch == .mips64el or
native_arch == .powerpc64 or
native_arch == .powerpc64le;
What's wrong with these?
Generally they don't seem to generate traces at all (either through `dumpCurrentStackTrace` or with `StackIterator`). Zig doesn't have a `ucontext_t` on MIPS. I'm not sure what's up with the PowerPC ones.
On a related note, Zig CI failed on `aarch64-linux` (no libc) because the stack trace gets stuck in a loop above `main` (see https://github.com/ziglang/zig/actions/runs/15104670128/job/42451228605?pr=23927). So I've added `.aarch64` to this list of ignorable failures for now. It's flaky though. The test sometimes builds traces without looping for me locally, and sometimes not. (From what I can tell, a specific build of the test is deterministic, but across multiple builds it's not.)
This test creates three nested stack frames and then tests stack trace creation. Add some additional tests of stack traces by invoking "dumpCurrentStackTrace()" and by using a signal handler's "context" parameter to feed backtrace construction. Make the test case at least runnable on a wide variety of systems (including Windows and WASI). Because `ucontext_t` and `getcontext` are not evenly supported everywhere, some systems are expected to get through only parts of the test.
b95af2b to 37ebc96
This is ready for a review. I think the actual fixes are all straightforward, but the test is generating a lot of stderr spew (both from the dump-stack-trace functions being tested and my verbose
You can capture the output from the build script, e.g. by adding an "expected output" check. See for example #23892.
@@ -13,6 +13,9 @@ const native_arch = builtin.cpu.arch;
const native_os = builtin.os.tag;
const native_endian = native_arch.endian();

/// Maximum distance to walk when iterating through a stack trace.
- /// Maximum distance to walk when iterating through a stack trace.
+ /// Maximum number of frames to walk when iterating through a stack trace.
(native_arch != .wasm32) and
(native_arch != .wasm64); // wasm has no introspection
nit:
- (native_arch != .wasm32) and
- (native_arch != .wasm64); // wasm has no introspection
+ !native_arch.isWasm(); // wasm has no introspection
// gets stuck in a loop on some systems:
const expect_signal_frame_overflow =
    (native_arch == .arm and link_libc) or // loops above main()
    (native_arch == .aarch64); // non-deterministic, sometimes overflows, sometimes not
`aarch64_be`?
(native_arch == .x86_64 and link_libc and builtin.abi.isGnu()) or // stuck on pthread_kill?
(native_arch == .x86_64 and link_libc and builtin.abi.isMusl() and builtin.omit_frame_pointer) or // immediately confused backtrace
(native_arch == .x86_64 and builtin.os.tag.isDarwin()) or // immediately confused backtrace
(native_arch == .aarch64 or native_arch == .aarch64_be) or // non-deterministic, sometimes overflows, sometimes confused
nit:
- (native_arch == .aarch64 or native_arch == .aarch64_be) or // non-deterministic, sometimes overflows, sometimes confused
+ native_arch.isAARCH64() or // non-deterministic, sometimes overflows, sometimes confused
(native_arch == .x86_64 and link_libc and builtin.abi.isMusl() and builtin.omit_frame_pointer) or // immediately confused backtrace
(native_arch == .x86_64 and builtin.os.tag.isDarwin()) or // immediately confused backtrace
(native_arch == .aarch64 or native_arch == .aarch64_be) or // non-deterministic, sometimes overflows, sometimes confused
(native_arch == .riscv64 and link_libc) or // `ucontext_t` not defined yet
`riscv32`? Also has no `ucontext_t` IIRC.
native_arch == .mips or // Missing ucontext_t. Most stack traces are empty ... (with or without libc)
native_arch == .mipsel or // same as .mips
native_arch == .mips64 or // same as .mips
native_arch == .mips64el or // same as .mips
nit:
- native_arch == .mips or // Missing ucontext_t. Most stack traces are empty ... (with or without libc)
- native_arch == .mipsel or // same as .mips
- native_arch == .mips64 or // same as .mips
- native_arch == .mips64el or // same as .mips
+ native_arch.isMIPS() or // Missing ucontext_t. Most stack traces are empty ... (with or without libc)
native_arch == .powerpc64 or // dumpCurrent* useless, StackIterator empty, ctx-based trace empty (with or without libc)
native_arch == .powerpc64le; // same as .powerpc64
nit:
- native_arch == .powerpc64 or // dumpCurrent* useless, StackIterator empty, ctx-based trace empty (with or without libc)
- native_arch == .powerpc64le; // same as .powerpc64
+ native_arch.isPowerPC64() or // dumpCurrent* useless, StackIterator empty, ctx-based trace empty (with or without libc)
Also what about `powerpc`?
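The review nits above all replace per-variant comparisons with a single family predicate. A minimal sketch of why that reads better, using a hypothetical local `Arch` enum rather than the real `std.Target.Cpu.Arch` (the names `isMips`, `isPowerPc64`, and `expectNoTrace` are illustrative assumptions, not the PR's code):

```zig
const std = @import("std");

/// Hypothetical stand-in for the architecture tag; NOT the real `std.Target.Cpu.Arch`.
const Arch = enum {
    x86_64,
    mips,
    mipsel,
    mips64,
    mips64el,
    powerpc64,
    powerpc64le,

    /// Groups every MIPS variant in this sketch behind one predicate.
    fn isMips(arch: Arch) bool {
        return switch (arch) {
            .mips, .mipsel, .mips64, .mips64el => true,
            else => false,
        };
    }

    /// Groups both 64-bit PowerPC endiannesses behind one predicate.
    fn isPowerPc64(arch: Arch) bool {
        return switch (arch) {
            .powerpc64, .powerpc64le => true,
            else => false,
        };
    }
};

/// Hypothetical name: true where this sketch expects no usable stack trace.
fn expectNoTrace(arch: Arch) bool {
    return arch.isMips() or arch.isPowerPc64();
}
```

With family predicates, a newly added variant only has to be listed in one `switch` instead of in every expectation list that mentions the family.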
@alexrp Thanks again for the reviews! One more question before I push a new version up: Should I make changes anywhere to get this test to compile/run against targets other than the default on CI?
I guess you could just change the test's
Add infinite loop detection to the `std.debug` backtraces. Make the backtrace and stacktrace code more robust on corner-case architectures.

Expand the "unwind.zig" test case to exercise `std.debug.dumpCurrentStackTrace()`. And trigger a signal handler so the test can exercise `std.debug.dumpStackTraceFromBase()` and `std.debug.StackIterator.initWithContext()` using a kernel-constructed context.

This is preparation for moving `std.debug` away from `getContext()` (#23801).