
execd: global signal capture/reset causes spurious ECHILD and false command failures #1041

@LavenderQAQ


Summary

execd can report a successfully executed command as a failure (CommandExecError) when cmd.Wait() returns ECHILD ("waitid: no child processes"). The command's stdout is produced correctly, but because Wait() cannot retrieve the child's exit status, the result is surfaced as an error with a hardcoded exit code of 1.

The root trigger is in runCommand / runBackgroundCommand. Both call signal.Notify(signals) with no signal list, which captures ALL signals (including SIGCHLD and SIGURG), and both defer signal.Reset(), which resets signal handling for the whole process rather than for that one command. This appears to interfere with the Go runtime's own use of SIGCHLD/SIGURG (child reaping coordination and async preemption) and races across concurrent/sequential commands, occasionally leaving Wait() unable to reap its own child (ECHILD).
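One possible shape for a narrower handler is sketched below. It is not the execd code and not the patch under test: runWithScopedSignals and forwardedSignals are hypothetical names, and the sketch assumes the handler only exists to forward termination-style signals to the child. It subscribes to an explicit signal list and undoes that subscription with signal.Stop on its own channel, so SIGCHLD and SIGURG are never captured and other commands' handlers are left alone.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"os/signal"
	"syscall"
)

// forwardedSignals is a hypothetical allowlist; the real set execd needs
// is an assumption here. SIGCHLD and SIGURG are deliberately absent.
var forwardedSignals = []os.Signal{
	syscall.SIGINT, syscall.SIGTERM, syscall.SIGHUP, syscall.SIGQUIT,
}

// runWithScopedSignals runs a command, forwards only allowlisted signals
// to it, and returns its exit code. A non-zero exit is not treated as an
// error; err is set only when the command cannot be started or waited on.
func runWithScopedSignals(name string, args ...string) (int, error) {
	cmd := exec.Command(name, args...)

	sigCh := make(chan os.Signal, 1)
	signal.Notify(sigCh, forwardedSignals...) // explicit list, not "all signals"
	defer signal.Stop(sigCh)                  // per-channel, not a global Reset

	if err := cmd.Start(); err != nil {
		return -1, err
	}

	done := make(chan struct{})
	go func() {
		for {
			select {
			case sig := <-sigCh:
				// The error is ignored: the child may already have exited.
				_ = cmd.Process.Signal(sig)
			case <-done:
				return
			}
		}
	}()

	err := cmd.Wait()
	close(done)

	if err == nil {
		return 0, nil
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return exitErr.ExitCode(), nil
	}
	return -1, err
}

func main() {
	code, err := runWithScopedSignals("true")
	fmt.Println("exit code:", code, "err:", err)
}
```

Because signal.Stop only affects the channel it is given, two commands running concurrently with this pattern do not tear down each other's subscriptions when one of them returns.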

Version

  • components/execd: main

Reproduction

Run multiple commands back-to-back via the execd command run API, especially ones that fork a background child. cmd.Wait() intermittently returns ECHILD even though the command already finished and produced correct output.

Expected

A command that runs to completion and produces correct output is reported as success (exit status 0). execd should not surface an internal reaping race as a command failure.

Impact

  • Any caller running execd in a sandbox can intermittently get a false CommandExecError for commands that actually succeeded.
  • More likely under concurrent/sequential command execution and background-fork workloads.
  • Additionally, capturing ALL signals via signal.Notify(signals) steals SIGURG from the Go runtime (used for async goroutine preemption), and the process-global signal.Reset() has side effects on every other command running concurrently in the same process.

Note: I am currently testing a patch. If the issue no longer reproduces after some time, I will submit a pull request.
