reproc::run fails with 'Invalid Argument' in Docker container #105

Closed
ChiefGokhlayeh opened this issue Jun 13, 2023 · 5 comments

Comments

ChiefGokhlayeh commented Jun 13, 2023

I come here from debugging an issue with mamba-org/micromamba. Libmamba uses reproc++ to invoke shell scripts.

On my native Arch install this works fine. However, I'm trying to set up a devcontainer using Docker containers.

I tried compiling reproc++ with examples just to be sure, and lo and behold, running the reproc examples in Docker fails:

$ ./build/reproc++/examples/run whoami        
Invalid argument

This applies to all examples in reproc++/examples.

I tested using Debian- and Fedora-based images. Here is a minimal Dockerfile to quickly showcase my problem:

FROM debian:latest

RUN apt-get update \
    && apt-get install -y \
    build-essential \
    cmake \
    git \
    && rm -rf /var/lib/apt/lists/*

Build the Docker image like so:

docker build -t test .

Invoke the Docker container, clone the repository, build the tests and execute them:

$ docker run -it --rm test

#inside the container
root@3a01ceb74d30:/# git clone https://github.com/DaanDeMeyer/reproc.git
...
root@3a01ceb74d30:/# cd reproc/
root@3a01ceb74d30:/reproc# cmake -B build -DREPROC++=ON -DREPROC_EXAMPLES=ON
...
root@3a01ceb74d30:/reproc# cmake --build build
...

#just to test the 'whoami' exists and is an executable binary
root@3a01ceb74d30:/reproc# whoami
root

#now try to run the same program in reproc
root@3a01ceb74d30:/reproc# ./build/reproc++/examples/run whoami
Invalid argument

Again, I tested the same with a Fedora-based image, with the same result.

ChiefGokhlayeh commented Jun 14, 2023

Looks like the issue is limited to Arch hosts (or possibly only kernels 6.3.7 and up). I tested on two machines (both Arch), and the above-mentioned procedure resulted in "Invalid argument" on both. On an Ubuntu 22.04 machine (kernel 5.15.0), the error does not occur.

@ChiefGokhlayeh

I finally got around to actually debugging the issue. The error occurs in the child process (set up GDB with set follow-fork-mode child).

For whatever reason in

static int get_max_fd(void)
{
  struct rlimit limit = { 0 };
  int r = getrlimit(RLIMIT_NOFILE, &limit);
  if (r < 0) {
    return -errno;
  }
  rlim_t soft = limit.rlim_cur;
  if (soft == RLIM_INFINITY || soft > INT_MAX) {
    return INT_MAX;
  }
  return (int) (soft - 1);
}

getrlimit() returns limit.rlim_cur = 1073741816 = 0x3FFFFFF8, so get_max_fd() returns 1073741815. This of course is much larger than the set limit of MAX_FD_LIMIT = 1024 * 1024 = 1048576 = 0x100000.
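
To make the numbers concrete, here is a minimal sketch of the same arithmetic. The helper names max_fd_from_soft_limit and exceeds_fd_limit are hypothetical and not part of reproc; the soft limit is passed in as a parameter so the values from this issue can be checked without calling getrlimit().

```c
#include <limits.h>
#include <stdbool.h>
#include <sys/resource.h>

/* Cap quoted in this issue: 1024 * 1024 = 1048576. */
#define MAX_FD_LIMIT (1024 * 1024)

/* Same arithmetic as get_max_fd() above, with the soft limit passed in
   (hypothetical helper, not reproc's API). */
int max_fd_from_soft_limit(rlim_t soft)
{
  if (soft == RLIM_INFINITY || soft > INT_MAX) {
    return INT_MAX;
  }
  return (int) (soft - 1);
}

/* True when the highest possible descriptor number is above the cap. */
bool exceeds_fd_limit(int max_fd)
{
  return max_fd > MAX_FD_LIMIT;
}
```

With the soft limit of 1073741816 seen in the container, max_fd_from_soft_limit() yields 1073741815, which exceeds the cap. With a soft limit of 1024 it yields 1023, which does not.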

The check max_fd > MAX_FD_LIMIT therefore triggers and the child process aborts. This should report EMFILE, but errno is not overwritten, so the child incorrectly reports EINVAL to the parent, a value left over from earlier suppressed errors during

for (int signal = 0; signal < 32; signal++) {
  r = sigaction(signal, &action, NULL);
  if (r < 0 && errno != EINVAL) {
    r = -errno;
    goto finish;
  }
}

(e.g. sigaction(SIGKILL, &action, NULL) -> errno=EINVAL which is ok and gets ignored, but errno is still set to EINVAL).
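
The stale errno effect can be reproduced in isolation. The sketch below is not reproc's code: the function names are hypothetical, and it uses SIG_IGN (an assumption, chosen because POSIX specifies EINVAL for an attempt to ignore SIGKILL). It contrasts a limit check that returns -errno with one that returns -EMFILE explicitly.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>

#define MAX_FD_LIMIT (1024 * 1024)

/* Mimics the ignored failure: changing the SIGKILL disposition fails with
   EINVAL, and errno keeps that value after the error is suppressed. */
int errno_after_ignored_sigaction(void)
{
  struct sigaction action;
  memset(&action, 0, sizeof(action));
  action.sa_handler = SIG_IGN; /* assumption: SIG_IGN, see lead-in */
  sigemptyset(&action.sa_mask);

  errno = 0;
  int r = sigaction(SIGKILL, &action, NULL);
  if (r < 0 && errno != EINVAL) {
    return -1; /* unexpected failure */
  }
  return errno; /* still EINVAL, although the error was "ignored" */
}

/* Problematic pattern: the limit check is not a failed system call, so
   errno holds whatever an earlier call left behind. */
int check_max_fd_stale(int max_fd)
{
  if (max_fd > MAX_FD_LIMIT) {
    return -errno;
  }
  return 0;
}

/* Explicit pattern: report the intended error regardless of errno. */
int check_max_fd_explicit(int max_fd)
{
  if (max_fd > MAX_FD_LIMIT) {
    return -EMFILE;
  }
  return 0;
}
```

Calling errno_after_ignored_sigaction() and then check_max_fd_stale(1073741815) yields -EINVAL, matching the misleading "Invalid argument" message, whereas check_max_fd_explicit(1073741815) yields -EMFILE.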

ChiefGokhlayeh commented Jun 14, 2023

Well, I guess I found the issue. Inside my Docker container (and only in the container) the file descriptor limit is set to an insanely high number, but not unlimited.

$ ulimit -a
-t: cpu time (seconds)              unlimited
-f: file size (blocks)              unlimited
-d: data seg size (kbytes)          unlimited
-s: stack size (kbytes)             8192
-c: core file size (blocks)         unlimited
-m: resident set size (kbytes)      unlimited
-u: processes                       unlimited
-n: file descriptors                1073741816
-l: locked-in-memory size (kbytes)  8192
-v: address space (kbytes)          unlimited
-x: file locks                      unlimited
-i: pending signals                 253585
-q: bytes in POSIX msg queues       819200
-e: max nice                        0
-r: max rt priority                 0
-N 15: rt cpu time (microseconds)   unlimited

ChiefGokhlayeh commented Jun 14, 2023

I was able to resolve the issue by running my Docker container with a more sane file descriptor limit:

docker run --ulimit nofile=1024:1024 ...

In a VS Code devcontainer.json, simply add:

{
    //...

    "runArgs": [
        "--ulimit", "nofile=1024:1024"
    ],

    // ...
}

For a Docker Compose file, follow https://stackoverflow.com/a/58093008/4069539
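
Where the container's run arguments cannot be changed, a program could in principle lower its own soft limit before spawning child processes, since children inherit resource limits. This is only a sketch of that idea, not something tested against reproc in this thread; cap_nofile_soft_limit is a hypothetical helper built on the standard getrlimit() and setrlimit() calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/resource.h>

/* Lowers the soft RLIMIT_NOFILE limit to `cap` if it is currently higher.
   Returns 0 on success or a negative errno value on failure.
   Hypothetical helper, not part of reproc. */
int cap_nofile_soft_limit(rlim_t cap)
{
  struct rlimit limit = { 0 };
  if (getrlimit(RLIMIT_NOFILE, &limit) < 0) {
    return -errno;
  }
  if (limit.rlim_cur != RLIM_INFINITY && limit.rlim_cur <= cap) {
    return 0; /* already at or below the cap */
  }
  limit.rlim_cur = cap; /* the hard limit is left untouched */
  if (setrlimit(RLIMIT_NOFILE, &limit) < 0) {
    return -errno;
  }
  return 0;
}
```

Calling cap_nofile_soft_limit(1024) early in main() would roughly mirror the soft half of --ulimit nofile=1024:1024. Unlike the Docker flag, it leaves the hard limit as it was.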

@ChiefGokhlayeh

Duplicate of #82
