fix(docker): remap container UID/GID at runtime to avoid volume mount permission errors#1923
fix(docker): remap container UID/GID at runtime to avoid volume mount permission errors#1923davison wants to merge 1 commit intopaperclipai:masterfrom
Conversation
Greptile SummaryThis PR adds runtime UID/GID remapping to the Paperclip Docker image, resolving the persistent volume mount permission errors that occur when the host directory is owned by a non-1000 UID.\n\nThe implementation is clean and follows established container patterns (Bitnami, LinuxServer.io, official Redis/Postgres). Three significant issues surfaced across prior review rounds — ARG propagation to the Confidence Score: 5/5Safe to merge — all prior P0/P1 issues are resolved; one P2 suggestion remains (root UID guard). All three previously identified blocking issues (ARG propagation, unconditional chown, primary GID not updated) have been fixed in the head commit. The only remaining finding is a P2 defensive suggestion to reject USER_UID=0/USER_GID=0. Per the confidence guidance, a PR with only P2 findings defaults to 5/5. docker-entrypoint.sh — minor hardening suggestion for UID/GID=0 guard Important Files Changed
Prompt To Fix All With AIThis is a comment left during a code review.
Path: docker-entrypoint.sh
Line: 5-6
Comment:
**No guard against root UID/GID values**
If an operator (accidentally or intentionally) sets `USER_UID=0` or `USER_GID=0`, the entrypoint will remap the `node` user to UID/GID 0, then `exec gosu node "$@"` will start the server process as root — silently defeating the privilege-drop that this PR is designed to enforce. The PR description explicitly calls out "the server process itself never runs as root" as a key design invariant.
A simple guard at the top of the remap logic would enforce that:
```suggestion
PUID=${USER_UID:-1000}
PGID=${USER_GID:-1000}
if [ "$PUID" = "0" ] || [ "$PGID" = "0" ]; then
echo "ERROR: USER_UID and USER_GID must not be 0 (root). Got UID=$PUID GID=$PGID" >&2
exit 1
fi
```
How can I resolve this? If you propose a fix, please make it concise.Reviews (3): Last reviewed commit: "fix: support host UID/GID mapping for vo..." | Re-trigger Greptile |
Dockerfile
Outdated
| USER_UID=1000 \ | ||
| USER_GID=1000 \ |
There was a problem hiding this comment.
Build-arg
USER_UID/USER_GID not propagated to runtime ENV
ARG values are scoped to the stage they are declared in. ARG USER_UID=1000 / ARG USER_GID=1000 are declared in the base stage, but the production stage never re-declares them. As a result, the ENV block hardcodes the literal 1000 rather than forwarding the build-arg value.
Concrete failure scenario: docker build --build-arg USER_UID=1001 -t paperclip-test .
basestage:usermod -u 1001 noderuns correctly —nodeuser is UID 1001 in the image layer.productionstage:ENV USER_UID=1000— runtime default is still 1000.- Container start: entrypoint reads
PUID=1000, seesid -u nodeis 1001, and runsusermod -o -u 1000 node— undoing the build-time baking entirely.
The fix is to re-declare the ARGs in the production stage and use them in the ENV block:
ARG USER_UID=1000
ARG USER_GID=1000
ENV NODE_ENV=production \
...
USER_UID=${USER_UID} \
USER_GID=${USER_GID} \
...
Prompt To Fix With AI
This is a comment left during a code review.
Path: Dockerfile
Line: 61-62
Comment:
**Build-arg `USER_UID`/`USER_GID` not propagated to runtime `ENV`**
`ARG` values are scoped to the stage they are declared in. `ARG USER_UID=1000` / `ARG USER_GID=1000` are declared in the `base` stage, but the `production` stage never re-declares them. As a result, the `ENV` block hardcodes the literal `1000` rather than forwarding the build-arg value.
**Concrete failure scenario:** `docker build --build-arg USER_UID=1001 -t paperclip-test .`
1. `base` stage: `usermod -u 1001 node` runs correctly — `node` user is UID 1001 in the image layer.
2. `production` stage: `ENV USER_UID=1000` — runtime default is still **1000**.
3. Container start: entrypoint reads `PUID=1000`, sees `id -u node` is 1001, and runs `usermod -o -u 1000 node` — **undoing the build-time baking entirely**.
The fix is to re-declare the ARGs in the `production` stage and use them in the `ENV` block:
```
ARG USER_UID=1000
ARG USER_GID=1000
ENV NODE_ENV=production \
...
USER_UID=${USER_UID} \
USER_GID=${USER_GID} \
...
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Fixed in the latest push (268a75e). Re-declared ARG USER_UID=1000 / ARG USER_GID=1000 at the top of the production stage and changed the ENV block to USER_UID=${USER_UID} / USER_GID=${USER_GID} so the build-arg value is correctly forwarded to the runtime environment.
docker-entrypoint.sh
Outdated
| fi | ||
|
|
||
| # Ensure correct ownership of the data volume | ||
| chown -R node:node /paperclip |
There was a problem hiding this comment.
chown -R runs unconditionally on every container start
chown -R node:node /paperclip is not guarded behind the UID/GID-change branches — it executes on every startup, including the common default case where no remapping is needed. This contradicts the PR description's claim that "the entrypoint detects no UID/GID change and immediately exec-s the server with zero overhead."
For an AI agent orchestration platform, /paperclip accumulates persistent state (logs, databases, agent artefacts). A recursive chown over a large directory tree on every container restart introduces measurable startup latency that will grow over time.
The standard remedy is to move chown inside the branches so it only runs when a remap actually happened:
changed=0
if [ "$(id -u node)" -ne "$PUID" ]; then
echo "Updating node UID to $PUID"
usermod -o -u "$PUID" node
changed=1
fi
if [ "$(id -g node)" -ne "$PGID" ]; then
echo "Updating node GID to $PGID"
groupmod -o -g "$PGID" node
changed=1
fi
if [ "$changed" = "1" ]; then
chown -R node:node /paperclip
fiPrompt To Fix With AI
This is a comment left during a code review.
Path: docker-entrypoint.sh
Line: 20
Comment:
**`chown -R` runs unconditionally on every container start**
`chown -R node:node /paperclip` is not guarded behind the UID/GID-change branches — it executes on **every startup**, including the common default case where no remapping is needed. This contradicts the PR description's claim that "the entrypoint detects no UID/GID change and immediately exec-s the server with zero overhead."
For an AI agent orchestration platform, `/paperclip` accumulates persistent state (logs, databases, agent artefacts). A recursive `chown` over a large directory tree on every container restart introduces measurable startup latency that will grow over time.
The standard remedy is to move `chown` inside the branches so it only runs when a remap actually happened:
```sh
changed=0
if [ "$(id -u node)" -ne "$PUID" ]; then
echo "Updating node UID to $PUID"
usermod -o -u "$PUID" node
changed=1
fi
if [ "$(id -g node)" -ne "$PGID" ]; then
echo "Updating node GID to $PGID"
groupmod -o -g "$PGID" node
changed=1
fi
if [ "$changed" = "1" ]; then
chown -R node:node /paperclip
fi
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Fixed in the latest push (268a75e). Added a changed=0 flag; chown -R node:node /paperclip now only runs when changed=1 (i.e. when usermod or groupmod actually executed). Default-UID containers skip the chown entirely and exec immediately.
3b88682 to
268a75e
Compare
|
Updated the PR with fixes for both P1 Greptile findings:
Both fixes are in commit 268a75e. |
1 similar comment
|
Updated the PR with fixes for both P1 Greptile findings:
Both fixes are in commit 268a75e. |
|
|
||
| if [ "$(id -g node)" -ne "$PGID" ]; then | ||
| echo "Updating node GID to $PGID" | ||
| groupmod -o -g "$PGID" node | ||
| changed=1 |
There was a problem hiding this comment.
GID remapping is silently broken
groupmod -o -g "$PGID" node updates the GID record in /etc/group but does not update the user's primary GID field stored in /etc/passwd. When exec gosu node "$@" runs, gosu calls getpwnam("node") which reads the primary GID from /etc/passwd — still the original value. The process therefore always starts with the old GID, regardless of what USER_GID is set to.
Concrete consequence: with USER_GID=1001:
groupmod -o -g 1001 node→/etc/groupnow saysnode:x:1001:chown -R node:node /paperclip→ sets/paperclipgid to 1001 (resolving via/etc/group)exec gosu node→ reads/etc/passwd, primary gid = 1000 (unchanged) → process runs as gid=1000- New files created by the server inside
/paperclipare owned by gid=1000, not the intended 1001 — the host user with gid=1001 has no group write access to them
The same issue affects the build-time groupmod step in the Dockerfile (line 11).
The fix is to call usermod -g "$PGID" node after groupmod, which updates the primary GID entry in /etc/passwd:
| if [ "$(id -g node)" -ne "$PGID" ]; then | |
| echo "Updating node GID to $PGID" | |
| groupmod -o -g "$PGID" node | |
| changed=1 | |
| if [ "$(id -g node)" -ne "$PGID" ]; then | |
| echo "Updating node GID to $PGID" | |
| groupmod -o -g "$PGID" node | |
| usermod -g "$PGID" node | |
| changed=1 | |
| fi |
And in the Dockerfile build-time block:
RUN usermod -u $USER_UID --non-unique node \
&& groupmod -g $USER_GID --non-unique node \
&& usermod -g $USER_GID nodePrompt To Fix With AI
This is a comment left during a code review.
Path: docker-entrypoint.sh
Line: 17-21
Comment:
**GID remapping is silently broken**
`groupmod -o -g "$PGID" node` updates the GID record in `/etc/group` but does **not** update the user's primary GID field stored in `/etc/passwd`. When `exec gosu node "$@"` runs, `gosu` calls `getpwnam("node")` which reads the primary GID from `/etc/passwd` — still the original value. The process therefore always starts with the old GID, regardless of what `USER_GID` is set to.
Concrete consequence: with `USER_GID=1001`:
1. `groupmod -o -g 1001 node` → `/etc/group` now says `node:x:1001:`
2. `chown -R node:node /paperclip` → sets `/paperclip` gid to 1001 (resolving via `/etc/group`)
3. `exec gosu node` → reads `/etc/passwd`, primary gid = 1000 (unchanged) → process runs as gid=1000
4. New files created by the server inside `/paperclip` are owned by gid=1000, not the intended 1001 — the host user with gid=1001 has no group write access to them
The same issue affects the build-time `groupmod` step in the `Dockerfile` (line 11).
The fix is to call `usermod -g "$PGID" node` after `groupmod`, which updates the primary GID entry in `/etc/passwd`:
```suggestion
if [ "$(id -g node)" -ne "$PGID" ]; then
echo "Updating node GID to $PGID"
groupmod -o -g "$PGID" node
usermod -g "$PGID" node
changed=1
fi
```
And in the `Dockerfile` build-time block:
```dockerfile
RUN usermod -u $USER_UID --non-unique node \
&& groupmod -g $USER_GID --non-unique node \
&& usermod -g $USER_GID node
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Fixed in the latest push (a0e94b4). Added usermod -g "$PGID" node after groupmod in docker-entrypoint.sh, and && usermod -g $USER_GID node after groupmod in the base stage Dockerfile. This updates the primary GID field in /etc/passwd so gosu node resolves the correct GID from the start.
Add USER_UID/USER_GID build args and gosu-based entrypoint to remap the node user's UID/GID at container startup, avoiding permission errors when mounting a host directory as /paperclip. - Build-time: usermod/groupmod adjust the node user to match build args - Runtime: docker-entrypoint.sh re-adjusts UID/GID from env vars and drops privileges via gosu before starting the server Co-Authored-By: Paperclip <noreply@paperclip.ing>
268a75e to
a0e94b4
Compare
Thinking Path
What Changed
DockerfileARG USER_UID=1000andARG USER_GID=1000build args (defaults to 1000 — no behaviour change for existing deployments)gosuto the base apt installusermod/groupmodat build time to pre-bake the UID/GID into the image layerUSER_UIDandUSER_GIDto theENVblock so they are available at runtimeUSER nodedirective withENTRYPOINT ["docker-entrypoint.sh"]docker-entrypoint.shto/usr/local/bin/and makes it executabledocker-entrypoint.sh(new file)USER_UID/USER_GIDenv vars at startupusermod/groupmodif the running UID/GID differ from the node user (handles runtime override without rebuild)chown -R node:node /paperclipto correct volume ownershipnodeuser viaexec gosu node "$@"— no persistent root processNote on customisation: A question that keeps coming up is how to add tools, MCP servers, or extra packages to the image (e.g. for specific agent adapters). The cleanest pattern is to build a downstream image using Paperclip as a base:
This PR keeps the upstream image minimal and focused. Documenting this pattern properly is a good follow-up.
Verification
Build and run with a non-default UID:
Default build (no args): identical behaviour to the previous image — server runs as
node(UID 1000), entrypoint is a no-op remap and proceeds immediately toexec gosu node.Risks
gosudrop — this is intentional and necessary forusermod/chown. The server process itself never runs as root. This is the standard pattern for this use case (Bitnami images, LinuxServer.io images, official Redis/Postgres images all do the same).gosuadds a small dependency (~1 MB).gosuis a well-known, audited privilege-drop tool that avoids the TTY and signal-handling issues ofsu/sudoin containers.Checklist