Bound ReadSymlink blob reads to PATH_MAX (#12)

Merged
elithrar merged 2 commits into cloudflare:main from Nadav0077:bound-readlink-blob on May 5, 2026

Conversation

Nadav0077 commented Apr 18, 2026

Contributor

Fixes #11.

What

ReadSymlink in internal/fusefs/fuse_unix.go now reads the hydrated Git symlink blob through an io.LimitReader capped at 4096 bytes (Linux PATH_MAX) and returns ENAMETOOLONG for anything larger, instead of calling os.ReadFile with no bound.

Why

Git stores a symlink target as the raw blob body, so its size is controlled by the repository. Without a cap, a single readlink(2) against a hostile repo forces a full-blob allocation, and on blobless clones it also triggers a large lazy fetch. Real symlinks on Linux can never exceed PATH_MAX, so a bounded read is also the correct upper limit for valid input.

Tests

New readsymlink_unix_test.go covers the empty, short, at-limit (4096), over-limit (4097), far-over-limit (1 MiB), and missing-cache cases. The helper was extracted so it can be unit tested without a real FUSE mount.

Git stores a symlink target as the raw blob body, so the size of the target is controlled by the repository. ReadSymlink was calling os.ReadFile on the hydrated blob with no cap, which means a single readlink(2) against a hostile repo could drive an arbitrarily large allocation (and on blobless clones, a large lazy fetch).

Read through an io.LimitReader bounded at 4096 bytes (Linux PATH_MAX) and return ENAMETOOLONG past that. Tests cover the empty, short, at-limit, over-limit, far-over-limit, and missing-cache paths.
Amanibhavam added a commit to defn/defn that referenced this pull request Apr 18, 2026
Two small defence-in-depth fixes from upstream cloudflare/artifact-fs,
both still open there at the time of this commit.

#12 (Bound ReadSymlink blob reads to PATH_MAX):
  ReadSymlink called os.ReadFile on the cached blob with no upper
  bound, so a repo could park an arbitrarily large payload behind a
  mode 120000 entry and every readlink(2) would materialise the whole
  thing. Caps the read at 4096 bytes (Linux PATH_MAX) via
  io.LimitReader and returns ENAMETOOLONG beyond that.
  Upstream PR: cloudflare/artifact-fs#12
  (author: Nadav0077; preserves upstream test as-is)

#14 (Clamp negative inode size to zero):
  inodeAttrs cast a signed int64 size to uint64 unchecked. If the
  stored size_bytes in the snapshot or overlay sqlite is ever
  negative (corruption), -1 wraps to ~18 EB and the mount publishes
  that huge st_size to the kernel. Changes inodeAttrs to accept
  int64, clamps negatives to 0, and converts only at the FUSE
  boundary.
  Upstream PR: cloudflare/artifact-fs#14
  (author: Nadav0077; preserves upstream test as-is)
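
For readers unfamiliar with the wraparound mentioned under #14, the sketch below shows the unchecked conversion next to a clamped one. `clampSize` is a hypothetical helper written for this illustration; it is not taken from cloudflare/artifact-fs#14.

```go
package main

import "fmt"

// clampSize is a hypothetical helper: it treats a negative stored size as 0
// and converts to uint64 only after that check.
func clampSize(sizeBytes int64) uint64 {
	if sizeBytes < 0 {
		return 0
	}
	return uint64(sizeBytes)
}

func main() {
	corrupt := int64(-1)

	// The unchecked conversion wraps around to the largest uint64 value.
	fmt.Println(uint64(corrupt)) // prints 18446744073709551615

	// The clamped conversion reports an empty file instead.
	fmt.Println(clampSize(corrupt)) // prints 0
	fmt.Println(clampSize(4096))    // prints 4096
}
```
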
@elithrar

Collaborator

Did you run into this as an issue?

@Nadav0077

Contributor Author

@elithrar
No. I was reading through ReadSymlink and noticed the os.ReadFile on the hydrated blob has no upper bound, while the target it produces is just the raw contents of a Git blob whose size is repo-controlled. A mode 120000 entry with a multi-megabyte body would cause every readlink(2) on that path to materialize the whole blob into memory (and on a blobless clone, to fetch it). Since Linux PATH_MAX is 4096, anything past that is guaranteed to be junk as a symlink target anyway, so a bounded read seemed like the correct shape either way.

@elithrar elithrar merged commit 38efaa5 into cloudflare:main May 5, 2026
1 check passed
elithrar commented May 5, 2026

Collaborator

Thanks for this @Nadav0077!

@elithrar elithrar mentioned this pull request May 6, 2026

Development

Successfully merging this pull request may close these issues.

ReadSymlink reads Git symlink blobs without a size bound

2 participants