n2 leaks fds on macOS #14

nico · 2022-03-28T14:59:20Z

Command::spawn() in the rust stdlib unconditionally calls anon_pipe here: https://github.com/rust-lang/rust/blob/master/library/std/src/sys/unix/process/process_unix.rs#L62

anon_pipe on Linux calls pipe2 to set CLOEXEC on the pipe atomically: https://github.com/rust-lang/rust/blob/521734787ecf80ff12df7ca5998f7ec0b3b7b2c9/library/std/src/sys/unix/pipe.rs#L18

But macOS has no pipe2, so here the stdlib instead calls pipe() followed by set_cloexec: https://github.com/rust-lang/rust/blob/521734787ecf80ff12df7ca5998f7ec0b3b7b2c9/library/std/src/sys/unix/pipe.rs#L35

This means there's a window where the pipe is created but cloexec isn't set on the pipe's FDs yet. If a different thread forks in that window, the pipe's fds get leaked.

(Not that it matters, but set_cloexec is here https://github.com/rust-lang/rust/blob/master/library/std/src/sys/unix/fd.rs#L204 . On macOS and a few other OSs, it calls ioctl with FIOCLEX, which is just 1 syscall. On Linux and a few others, it calls fcntl with F_GETFD/F_SETFD which is 2 syscalls, but we don't need to call set_cloexec at all in this path on Linux.)

This is likely why n2 -j250 runs out of FDs on macOS, while ninja -j250 works fine.

Possible fixes:

Don't fork on background threads. Create subprocess on main thread, and then hand the created subprocess to background thread for waiting on it and to do deps processing
Don't use threads per subprocesses at all, use a select loop model instead (but still use threads for deps processing)
Don't use the Command object on non-Windows either (we already don't on Windows), and add mutexes around fork and pipe/set_cloexec

Spawning subprocesses takes some time (~~we made it async in chromium at some point for that reason~~ we looked at making it async in ninja for that reason), so moving it to the main thread would add some amount of work on to the critical path of n2.

The text was updated successfully, but these errors were encountered:

evmar · 2022-03-30T04:39:47Z

Having thought about this a bit, I am curious about the "spawning subprocesses takes some time" part. If it's a skip-a-frame kind of latency, like up to 100ms occasionally, then it's probably fine to serialize it in the main process, though I guess that would interact poorly with -j1000. If it's a multi second kind of thing then definitely it needs a different approach.

I'd probably just go with a mutex around calling Command::spawn (instead of ::output) for now, with an eye towards rewriting to ppoll eventually (or tokio maybe, but I'm kinda skeptical of async Rust).

piscisaureus · 2022-03-31T08:09:04Z

Command::spawn() in the rust stdlib unconditionally calls anon_pipe here

I think it is unlikely that FDs from this pipe are leaked.

Both ends are short lived and explicitly closed: the write end is closed immediately after fork() returns, the read end is closed one as soon as the child has called exec().

So the only situation it could cause a problem is when the process literally has 100s of forked-but-not-yet-exec'ed children. Theoretically possible, but it sounds like a stretch to me.

bnoordhuis · 2022-03-31T08:12:01Z

Just throwing it out there: libstd uses posix_spawn() under the right conditions (instead of fork + execve) and macOS has a POSIX_SPAWN_CLOEXEC_DEFAULT attribute that does what its name suggests. Teaching libstd about it probably isn't too hard.

we made it async in chromium at some point for that reason

@nico I'm interested, can you say more? My understanding is that as long as you're using fork, there's no getting away from paying the cost of copying the working set.

nico · 2022-03-31T14:14:30Z

I think it is unlikely that FDs from this pipe are leaked.

It's possible this theory is wrong. Background is that n2 needs way more FDs than ninja (ninja -j250 works fine on my mac, n2 -j250 dies with "Too many open files (os error 24"), and I compared fs_usage logs.

Python script to generate build.ninja file that repros the problem

% cat fdtest.py
import textwrap
def write(f):
  f.write(textwrap.dedent('''\
    rule b
      command = sleep 300; touch $out
    '''))
  for i in range(1000):
    f.write(f'build foo{i}: b\n')
  # n2 needs an explicit default target:
  f.write('default')
  for i in range(1000):
    f.write(f' foo{i}')
  f.write('\n')
with open('build.ninja', 'w', encoding='utf-8') as f:
  write(f)
% python3 fdtest.py

I ran something like sudo fs_usage -w -f filesys| grep n2 > fs-n2.txt and then ran n2 -j50 for the build file generated by that python script. Then I did the same with ninja, and compared the two txt files.

This was the main difference that stood out, and there is a race there. So it seems plausible to me, but it's well possible this is not the root cause. (It does fit with things seemingly working fine on Linux -- but Linux also has a higher FD limit.)

@nico I'm interested, can you say more?

Turns out I was confusing things! We didn't do this in chromium, we were talking about doing this in ninja, here: https://bugs.chromium.org/p/chromium/issues/detail?id=829710#c3 (but as far as I know didn't end up merging it)

IIRC the background there was that someone was running ninja on a Windows system with some thing installed that made process creation very slow. But if you're in that situation, your build is going to be slow anyways and it's imho not necessarily something tools should go out of their way to mitigate.

So it's probably fine to just start processes on the main thread.

Fixes an fd leak on all unix systems that don't have pipe2. As of today, that's all unix systems that are not dragonfly bsd, freebsd, linux, netbsd, openbsd, or redox -- in particular, macOS. Fixes evmar#14.

nico · 2022-04-02T02:11:25Z

#25 fixes this locally for me, so it does look like that race creation race is indeed the cause of the fd leak.

WIth that patch applied, and the build.ninja file generated by the script in #14 (comment), I can run n2 -j250 and it works just as well as ninja.

I can also build llvm with n2 -j250 using goma, just like with ninja.

Fixes an fd leak on all unix systems that don't have pipe2. As of today, that's all unix systems that are not dragonfly bsd, freebsd, linux, netbsd, openbsd, or redox -- in particular, macOS. Fixes evmar#14.

nico · 2022-04-02T02:44:48Z

(I also filed an upstream issue.)

Fixes an fd leak on all unix systems that don't have pipe2. As of today, that's all unix systems that are not dragonfly bsd, freebsd, linux, netbsd, openbsd, or redox -- in particular, macOS. Fixes #14.

This avoids issue #14 and gives us lower-level control over the setup of the pipe.

nico · 2023-09-15T12:24:58Z

I think 9c2104a regressed this. The test in #14 (comment) repros this again.

(I don't have permissions to reopen this issue here.)

From #14, this exercises fd exhaustion.

evmar added the help wanted Extra attention is needed label Mar 30, 2022

nico mentioned this issue Apr 2, 2022

guard process creation with a mutex #25

Merged

nico mentioned this issue Apr 2, 2022

std::process::Command::spawn() leaks file descriptors on unix platforms that don't have pipe2() rust-lang/rust#95584

Open

evmar closed this as completed in #25 Apr 2, 2022

evmar added a commit that referenced this issue Jul 7, 2023

use posix_spawn to spawn subprocesses

9c2104a

This avoids issue #14 and gives us lower-level control over the setup of the pipe.

evmar reopened this Sep 15, 2023

evmar added a commit that referenced this issue Jan 7, 2025

check in fdtest.py

aa619d9

From #14, this exercises fd exhaustion.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

n2 leaks fds on macOS #14

n2 leaks fds on macOS #14

nico commented Mar 28, 2022 •

edited

Loading

evmar commented Mar 30, 2022

piscisaureus commented Mar 31, 2022

bnoordhuis commented Mar 31, 2022

nico commented Mar 31, 2022 •

edited

Loading

nico commented Apr 2, 2022 •

edited

Loading

nico commented Apr 2, 2022

nico commented Sep 15, 2023

n2 leaks fds on macOS #14

n2 leaks fds on macOS #14

Comments

nico commented Mar 28, 2022 • edited Loading

evmar commented Mar 30, 2022

piscisaureus commented Mar 31, 2022

bnoordhuis commented Mar 31, 2022

nico commented Mar 31, 2022 • edited Loading

nico commented Apr 2, 2022 • edited Loading

nico commented Apr 2, 2022

nico commented Sep 15, 2023

nico commented Mar 28, 2022 •

edited

Loading

nico commented Mar 31, 2022 •

edited

Loading

nico commented Apr 2, 2022 •

edited

Loading