Skip to content

fix(docker): remap container UID/GID at runtime to avoid volume mount permission errors#1923

Open
davison wants to merge 1 commit intopaperclipai:masterfrom
radiusred:fix/docker-volumes
Open

fix(docker): remap container UID/GID at runtime to avoid volume mount permission errors#1923
davison wants to merge 1 commit intopaperclipai:masterfrom
radiusred:fix/docker-volumes

Conversation

@davison
Copy link
Copy Markdown

@davison davison commented Mar 27, 2026

Thinking Path

Paperclip orchestrates AI agents for zero-human companies — it runs persistent processes that read and write agent state to a data directory.

To run Paperclip in a hosted or cloud environment, operators use Docker — mounting a host directory as /paperclip so data persists across container restarts.

The container runs as the node user (UID 1000). When the host directory is owned by a different UID (any non-root user on many Linux distros), writes to the mounted volume fail with permission errors. This is a very common problem, raised repeatedly in the community — as seen again recently in #dev.

The standard Docker solution is to remap the container user's UID/GID at startup to match the host directory owner — well-established practice, typically implemented with gosu and a minimal shell entrypoint.

The existing Dockerfile has no mechanism for this: it uses a static USER node directive with no entrypoint script, so there is no opportunity to perform the remap.

This PR adds USER_UID/USER_GID build args for build-time pre-baking, installs gosu, and adds a minimal docker-entrypoint.sh that re-checks and remaps the UID/GID at container startup before dropping privileges and exec-ing the server.

What Changed

  • Dockerfile
    • Added ARG USER_UID=1000 and ARG USER_GID=1000 build args (defaults to 1000 — no behaviour change for existing deployments)
    • Added gosu to the base apt install
    • Added usermod/groupmod at build time to pre-bake the UID/GID into the image layer
    • Added USER_UID and USER_GID to the ENV block so they are available at runtime
    • Replaced static USER node directive with ENTRYPOINT ["docker-entrypoint.sh"]
    • Copies docker-entrypoint.sh to /usr/local/bin/ and makes it executable
  • docker-entrypoint.sh (new file)
    • Reads USER_UID/USER_GID env vars at startup
    • Calls usermod/groupmod if the running UID/GID differ from the node user (handles runtime override without rebuild)
    • Calls chown -R node:node /paperclip to correct volume ownership
    • Drops to node user via exec gosu node "$@" — no persistent root process

Note on customisation: A question that keeps coming up is how to add tools, MCP servers, or extra packages to the image (e.g. for specific agent adapters). The cleanest pattern is to build a downstream image using Paperclip as a base:

FROM paperclipai/paperclip:latest
# Install your tools on top
RUN apt-get update && apt-get install -y ripgrep && rm -rf /var/lib/apt/lists/*

This PR keeps the upstream image minimal and focused. Documenting this pattern properly is a good follow-up.

Verification

Build and run with a non-default UID:

# Build with custom UID/GID baked in
docker build --build-arg USER_UID=1001 --build-arg USER_GID=1001 -t paperclip-test .

# Or override at runtime without rebuild
docker run -e USER_UID=1001 -e USER_GID=1001 \
  -v /host/paperclip/data:/paperclip \
  paperclip-test

# Confirm server runs as expected UID
docker exec <container> id
# uid=1001(node) gid=1001(node)

# Confirm volume ownership is corrected
docker exec <container> ls -la /paperclip
# drwxr-xr-x node node ...

Default build (no args): identical behaviour to the previous image — server runs as node (UID 1000), entrypoint is a no-op remap and proceeds immediately to exec gosu node.

Risks

  • Entrypoint runs briefly as root before the gosu drop — this is intentional and necessary for usermod/chown. The server process itself never runs as root. This is the standard pattern for this use case (Bitnami images, LinuxServer.io images, official Redis/Postgres images all do the same).
  • gosu adds a small dependency (~1 MB). gosu is a well-known, audited privilege-drop tool that avoids the TTY and signal-handling issues of su/sudo in containers.
  • Existing deployments unaffected — the entrypoint detects no UID/GID change and immediately exec-s the server with zero overhead.

Checklist

  • Thinking path traces from project context to change
  • Tests run locally and pass
  • Tests added or updated where applicable (no unit tests exist for the Dockerfile; this is a runtime change — verification steps above cover it)
  • Before/after screenshots included (N/A — no UI change)
  • Risks considered and documented
  • Will address all Greptile and reviewer comments before merge

@davison davison changed the title fix: remap container UID/GID at runtime to avoid volume mount permission errors fix(docker): remap container UID/GID at runtime to avoid volume mount permission errors Mar 27, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 27, 2026

Greptile Summary

This PR adds runtime UID/GID remapping to the Paperclip Docker image, resolving the persistent volume mount permission errors that occur when the host directory is owned by a non-1000 UID.\n\nThe implementation is clean and follows established container patterns (Bitnami, LinuxServer.io, official Redis/Postgres). Three significant issues surfaced across prior review rounds — ARG propagation to the production stage ENV, unconditional chown -R on every startup, and the missing usermod -g call that left the primary GID in /etc/passwd stale — and all three have been correctly resolved in the head commit (a0e94b49).\n\n- Dockerfile: ARG USER_UID/USER_GID correctly re-declared in both base and production stages; ENV block properly forwards ${USER_UID} / ${USER_GID} build-arg values; gosu installed; ENTRYPOINT wired up.\n- docker-entrypoint.sh: changed=0 flag ensures chown -R only runs on actual remaps; groupmod -o -g + usermod -g correctly updates both /etc/group and /etc/passwd; exec gosu node \"$@\" properly replaces the root shell with a privilege-dropped server process.\n- One remaining P2: there is no guard against USER_UID=0 / USER_GID=0, which would silently remap the node user to root and run the server as UID 0, contradicting the PR's stated invariant that "the server process itself never runs as root."

Confidence Score: 5/5

Safe to merge — all prior P0/P1 issues are resolved; one P2 suggestion remains (root UID guard).

All three previously identified blocking issues (ARG propagation, unconditional chown, primary GID not updated) have been fixed in the head commit. The only remaining finding is a P2 defensive suggestion to reject USER_UID=0/USER_GID=0. Per the confidence guidance, a PR with only P2 findings defaults to 5/5.

docker-entrypoint.sh — minor hardening suggestion for UID/GID=0 guard

Important Files Changed

Filename Overview
Dockerfile Adds gosu, build-time usermod/groupmod with correctly re-declared ARGs in the production stage, runtime ENV forwarding, and entrypoint wiring. All previously flagged issues resolved.
docker-entrypoint.sh Changed-flag pattern correctly gates chown; groupmod + usermod -g correctly updates both /etc/group and /etc/passwd; exec gosu properly drops privileges. No guard against USER_UID/GID=0 (P2).
Prompt To Fix All With AI
This is a comment left during a code review.
Path: docker-entrypoint.sh
Line: 5-6

Comment:
**No guard against root UID/GID values**

If an operator (accidentally or intentionally) sets `USER_UID=0` or `USER_GID=0`, the entrypoint will remap the `node` user to UID/GID 0, then `exec gosu node "$@"` will start the server process as root — silently defeating the privilege-drop that this PR is designed to enforce. The PR description explicitly calls out "the server process itself never runs as root" as a key design invariant.

A simple guard at the top of the remap logic would enforce that:

```suggestion
PUID=${USER_UID:-1000}
PGID=${USER_GID:-1000}

if [ "$PUID" = "0" ] || [ "$PGID" = "0" ]; then
    echo "ERROR: USER_UID and USER_GID must not be 0 (root). Got UID=$PUID GID=$PGID" >&2
    exit 1
fi
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (3): Last reviewed commit: "fix: support host UID/GID mapping for vo..." | Re-trigger Greptile

Dockerfile Outdated
Comment on lines +61 to +62
USER_UID=1000 \
USER_GID=1000 \
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Build-arg USER_UID/USER_GID not propagated to runtime ENV

ARG values are scoped to the stage they are declared in. ARG USER_UID=1000 / ARG USER_GID=1000 are declared in the base stage, but the production stage never re-declares them. As a result, the ENV block hardcodes the literal 1000 rather than forwarding the build-arg value.

Concrete failure scenario: docker build --build-arg USER_UID=1001 -t paperclip-test .

  1. base stage: usermod -u 1001 node runs correctly — node user is UID 1001 in the image layer.
  2. production stage: ENV USER_UID=1000 — runtime default is still 1000.
  3. Container start: entrypoint reads PUID=1000, sees id -u node is 1001, and runs usermod -o -u 1000 nodeundoing the build-time baking entirely.

The fix is to re-declare the ARGs in the production stage and use them in the ENV block:

ARG USER_UID=1000
ARG USER_GID=1000

ENV NODE_ENV=production \
  ...
  USER_UID=${USER_UID} \
  USER_GID=${USER_GID} \
  ...
Prompt To Fix With AI
This is a comment left during a code review.
Path: Dockerfile
Line: 61-62

Comment:
**Build-arg `USER_UID`/`USER_GID` not propagated to runtime `ENV`**

`ARG` values are scoped to the stage they are declared in. `ARG USER_UID=1000` / `ARG USER_GID=1000` are declared in the `base` stage, but the `production` stage never re-declares them. As a result, the `ENV` block hardcodes the literal `1000` rather than forwarding the build-arg value.

**Concrete failure scenario:** `docker build --build-arg USER_UID=1001 -t paperclip-test .`

1. `base` stage: `usermod -u 1001 node` runs correctly — `node` user is UID 1001 in the image layer.
2. `production` stage: `ENV USER_UID=1000` — runtime default is still **1000**.
3. Container start: entrypoint reads `PUID=1000`, sees `id -u node` is 1001, and runs `usermod -o -u 1000 node`**undoing the build-time baking entirely**.

The fix is to re-declare the ARGs in the `production` stage and use them in the `ENV` block:

```
ARG USER_UID=1000
ARG USER_GID=1000

ENV NODE_ENV=production \
  ...
  USER_UID=${USER_UID} \
  USER_GID=${USER_GID} \
  ...
```

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in the latest push (268a75e). Re-declared ARG USER_UID=1000 / ARG USER_GID=1000 at the top of the production stage and changed the ENV block to USER_UID=${USER_UID} / USER_GID=${USER_GID} so the build-arg value is correctly forwarded to the runtime environment.

fi

# Ensure correct ownership of the data volume
chown -R node:node /paperclip
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 chown -R runs unconditionally on every container start

chown -R node:node /paperclip is not guarded behind the UID/GID-change branches — it executes on every startup, including the common default case where no remapping is needed. This contradicts the PR description's claim that "the entrypoint detects no UID/GID change and immediately exec-s the server with zero overhead."

For an AI agent orchestration platform, /paperclip accumulates persistent state (logs, databases, agent artefacts). A recursive chown over a large directory tree on every container restart introduces measurable startup latency that will grow over time.

The standard remedy is to move chown inside the branches so it only runs when a remap actually happened:

changed=0

if [ "$(id -u node)" -ne "$PUID" ]; then
    echo "Updating node UID to $PUID"
    usermod -o -u "$PUID" node
    changed=1
fi

if [ "$(id -g node)" -ne "$PGID" ]; then
    echo "Updating node GID to $PGID"
    groupmod -o -g "$PGID" node
    changed=1
fi

if [ "$changed" = "1" ]; then
    chown -R node:node /paperclip
fi
Prompt To Fix With AI
This is a comment left during a code review.
Path: docker-entrypoint.sh
Line: 20

Comment:
**`chown -R` runs unconditionally on every container start**

`chown -R node:node /paperclip` is not guarded behind the UID/GID-change branches — it executes on **every startup**, including the common default case where no remapping is needed. This contradicts the PR description's claim that "the entrypoint detects no UID/GID change and immediately exec-s the server with zero overhead."

For an AI agent orchestration platform, `/paperclip` accumulates persistent state (logs, databases, agent artefacts). A recursive `chown` over a large directory tree on every container restart introduces measurable startup latency that will grow over time.

The standard remedy is to move `chown` inside the branches so it only runs when a remap actually happened:

```sh
changed=0

if [ "$(id -u node)" -ne "$PUID" ]; then
    echo "Updating node UID to $PUID"
    usermod -o -u "$PUID" node
    changed=1
fi

if [ "$(id -g node)" -ne "$PGID" ]; then
    echo "Updating node GID to $PGID"
    groupmod -o -g "$PGID" node
    changed=1
fi

if [ "$changed" = "1" ]; then
    chown -R node:node /paperclip
fi
```

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in the latest push (268a75e). Added a changed=0 flag; chown -R node:node /paperclip now only runs when changed=1 (i.e. when usermod or groupmod actually executed). Default-UID containers skip the chown entirely and exec immediately.

@radiusred-cody radiusred-cody bot force-pushed the fix/docker-volumes branch from 3b88682 to 268a75e Compare March 27, 2026 16:59
@davison
Copy link
Copy Markdown
Author

davison commented Mar 27, 2026

Updated the PR with fixes for both P1 Greptile findings:

  1. ARG scope — Re-declared ARG USER_UID/USER_GID in the production stage; ENV now uses ${USER_UID}/${USER_GID} so build-args correctly propagate to runtime.
  2. Conditional chown — Added changed=0 flag; chown -R node:node /paperclip only runs when a UID/GID remap actually occurred. Default deployments (UID 1000) skip the chown entirely.

Both fixes are in commit 268a75e.

1 similar comment
@davison
Copy link
Copy Markdown
Author

davison commented Mar 27, 2026

Updated the PR with fixes for both P1 Greptile findings:

  1. ARG scope — Re-declared ARG USER_UID/USER_GID in the production stage; ENV now uses ${USER_UID}/${USER_GID} so build-args correctly propagate to runtime.
  2. Conditional chown — Added changed=0 flag; chown -R node:node /paperclip only runs when a UID/GID remap actually occurred. Default deployments (UID 1000) skip the chown entirely.

Both fixes are in commit 268a75e.

@davison
Copy link
Copy Markdown
Author

davison commented Mar 27, 2026

@greptileai

Comment on lines +17 to +21

if [ "$(id -g node)" -ne "$PGID" ]; then
echo "Updating node GID to $PGID"
groupmod -o -g "$PGID" node
changed=1
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 GID remapping is silently broken

groupmod -o -g "$PGID" node updates the GID record in /etc/group but does not update the user's primary GID field stored in /etc/passwd. When exec gosu node "$@" runs, gosu calls getpwnam("node") which reads the primary GID from /etc/passwd — still the original value. The process therefore always starts with the old GID, regardless of what USER_GID is set to.

Concrete consequence: with USER_GID=1001:

  1. groupmod -o -g 1001 node/etc/group now says node:x:1001:
  2. chown -R node:node /paperclip → sets /paperclip gid to 1001 (resolving via /etc/group)
  3. exec gosu node → reads /etc/passwd, primary gid = 1000 (unchanged) → process runs as gid=1000
  4. New files created by the server inside /paperclip are owned by gid=1000, not the intended 1001 — the host user with gid=1001 has no group write access to them

The same issue affects the build-time groupmod step in the Dockerfile (line 11).

The fix is to call usermod -g "$PGID" node after groupmod, which updates the primary GID entry in /etc/passwd:

Suggested change
if [ "$(id -g node)" -ne "$PGID" ]; then
echo "Updating node GID to $PGID"
groupmod -o -g "$PGID" node
changed=1
if [ "$(id -g node)" -ne "$PGID" ]; then
echo "Updating node GID to $PGID"
groupmod -o -g "$PGID" node
usermod -g "$PGID" node
changed=1
fi

And in the Dockerfile build-time block:

RUN usermod -u $USER_UID --non-unique node \
  && groupmod -g $USER_GID --non-unique node \
  && usermod -g $USER_GID node
Prompt To Fix With AI
This is a comment left during a code review.
Path: docker-entrypoint.sh
Line: 17-21

Comment:
**GID remapping is silently broken**

`groupmod -o -g "$PGID" node` updates the GID record in `/etc/group` but does **not** update the user's primary GID field stored in `/etc/passwd`. When `exec gosu node "$@"` runs, `gosu` calls `getpwnam("node")` which reads the primary GID from `/etc/passwd` — still the original value. The process therefore always starts with the old GID, regardless of what `USER_GID` is set to.

Concrete consequence: with `USER_GID=1001`:
1. `groupmod -o -g 1001 node``/etc/group` now says `node:x:1001:`
2. `chown -R node:node /paperclip` → sets `/paperclip` gid to 1001 (resolving via `/etc/group`)
3. `exec gosu node` → reads `/etc/passwd`, primary gid = 1000 (unchanged) → process runs as gid=1000
4. New files created by the server inside `/paperclip` are owned by gid=1000, not the intended 1001 — the host user with gid=1001 has no group write access to them

The same issue affects the build-time `groupmod` step in the `Dockerfile` (line 11).

The fix is to call `usermod -g "$PGID" node` after `groupmod`, which updates the primary GID entry in `/etc/passwd`:

```suggestion
if [ "$(id -g node)" -ne "$PGID" ]; then
    echo "Updating node GID to $PGID"
    groupmod -o -g "$PGID" node
    usermod -g "$PGID" node
    changed=1
fi
```

And in the `Dockerfile` build-time block:
```dockerfile
RUN usermod -u $USER_UID --non-unique node \
  && groupmod -g $USER_GID --non-unique node \
  && usermod -g $USER_GID node
```

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in the latest push (a0e94b4). Added usermod -g "$PGID" node after groupmod in docker-entrypoint.sh, and && usermod -g $USER_GID node after groupmod in the base stage Dockerfile. This updates the primary GID field in /etc/passwd so gosu node resolves the correct GID from the start.

Add USER_UID/USER_GID build args and gosu-based entrypoint to remap
the node user's UID/GID at container startup, avoiding permission
errors when mounting a host directory as /paperclip.

- Build-time: usermod/groupmod adjust the node user to match build args
- Runtime: docker-entrypoint.sh re-adjusts UID/GID from env vars and
  drops privileges via gosu before starting the server

Co-Authored-By: Paperclip <noreply@paperclip.ing>
@radiusred-cody radiusred-cody bot force-pushed the fix/docker-volumes branch from 268a75e to a0e94b4 Compare March 27, 2026 17:11
@davison
Copy link
Copy Markdown
Author

davison commented Mar 27, 2026

@greptileai

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant