Bound ReadSymlink blob reads to PATH_MAX #12
Merged
Git stores a symlink target as the raw blob body, so the size of the target is controlled by the repository. ReadSymlink was calling os.ReadFile on the hydrated blob with no cap, which means a single readlink(2) against a hostile repo could drive an arbitrarily large allocation (and on blobless clones, a large lazy fetch). Read through an io.LimitReader bounded at 4096 bytes (Linux PATH_MAX) and return ENAMETOOLONG past that. Tests cover the empty, short, at-limit, over-limit, far-over-limit, and missing-cache paths.
Amanibhavam added a commit to defn/defn that referenced this pull request on Apr 18, 2026
Two small defence-in-depth fixes from upstream cloudflare/artifact-fs, both still open there at the time of this commit.

#12 (Bound ReadSymlink blob reads to PATH_MAX): ReadSymlink called os.ReadFile on the cached blob with no upper bound, so a repo could park an arbitrarily large payload behind a mode 120000 entry and every readlink(2) would materialise the whole thing. Caps the read at 4096 bytes (Linux PATH_MAX) via io.LimitReader and returns ENAMETOOLONG beyond that.
Upstream PR: cloudflare/artifact-fs#12 (author: Nadav0077; preserves upstream test as-is)

#14 (Clamp negative inode size to zero): inodeAttrs cast a signed int64 size to uint64 unchecked. If the stored size_bytes in the snapshot or overlay sqlite is ever negative (corruption), -1 wraps to ~18 EB and the mount publishes that huge st_size to the kernel. Changes inodeAttrs to accept int64, clamps negatives to 0, and converts only at the FUSE boundary.
Upstream PR: cloudflare/artifact-fs#14 (author: Nadav0077; preserves upstream test as-is)
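The clamp described for #14 can be sketched in a few lines of Go. This is only an illustration under assumptions: `clampSize` is a hypothetical name, and the real `inodeAttrs` signature and surrounding FUSE types are not shown in this commit message.

```go
package main

import "fmt"

// clampSize is a hypothetical helper illustrating the #14 fix: a negative
// stored size is treated as zero, and the int64 -> uint64 conversion happens
// only after that check.
func clampSize(sizeBytes int64) uint64 {
	if sizeBytes < 0 {
		return 0
	}
	return uint64(sizeBytes)
}

func main() {
	corrupt := int64(-1)

	// An unchecked conversion wraps around to the maximum uint64 value.
	fmt.Println(uint64(corrupt)) // 18446744073709551615

	// The clamped conversion reports an empty file instead.
	fmt.Println(clampSize(corrupt)) // 0
	fmt.Println(clampSize(4096))    // 4096
}
```

Keeping the value signed until the last moment means the negative case stays detectable; once it has been converted to uint64, a corrupt size is indistinguishable from a genuinely enormous one.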
Collaborator
Did you run into this as an issue?
Contributor Author
@elithrar
Collaborator
Thanks for this @Nadav0077!
Fixes #11.

What
ReadSymlink in internal/fusefs/fuse_unix.go now reads the hydrated Git symlink blob through an io.LimitReader capped at 4096 bytes (Linux PATH_MAX) and returns ENAMETOOLONG for anything larger, instead of calling os.ReadFile with no bound.

Why
Git stores a symlink target as the raw blob body, so its size is controlled by the repository. Without a cap, a single readlink(2) against a hostile repo forces a full-blob allocation, and on blobless clones it also triggers a large lazy fetch. Real symlinks on Linux can never exceed PATH_MAX, so a bounded read is also the correct upper limit for valid input.

Tests
New readsymlink_unix_test.go covers the empty, short, at-limit (4096), over-limit (4097), far-over-limit (1 MiB), and missing-cache cases. The helper was extracted so it can be unit tested without a real FUSE mount.
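A minimal sketch of the bounded read, assuming the extracted helper takes an io.Reader. The names `readBoundedTarget`, `maxSymlinkTarget`, and `probe` are hypothetical and not taken from the PR's diff. Reading one byte past the cap is one common way to tell "exactly at the limit" apart from "over the limit" without buffering the whole blob; the PR's actual code may do this differently. The missing-cache case is not modelled here, since it depends on how the blob file is opened.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"syscall"
)

// maxSymlinkTarget mirrors the 4096-byte cap (Linux PATH_MAX) named in the PR.
const maxSymlinkTarget = 4096

// readBoundedTarget is a hypothetical helper, not the PR's actual function.
// It reads at most maxSymlinkTarget+1 bytes, so an oversized blob is detected
// without ever buffering more than one byte past the cap.
func readBoundedTarget(r io.Reader) (string, error) {
	data, err := io.ReadAll(io.LimitReader(r, maxSymlinkTarget+1))
	if err != nil {
		return "", err
	}
	if len(data) > maxSymlinkTarget {
		return "", syscall.ENAMETOOLONG
	}
	return string(data), nil
}

// probe feeds an n-byte in-memory blob through the helper and reports the
// length of the returned target together with any error.
func probe(n int) (int, error) {
	blob := bytes.NewReader(bytes.Repeat([]byte{'a'}, n))
	target, err := readBoundedTarget(blob)
	return len(target), err
}

func main() {
	// Sizes taken from the test cases listed in the PR description.
	for _, n := range []int{0, 5, 4096, 4097, 1 << 20} {
		got, err := probe(n)
		tooLong := errors.Is(err, syscall.ENAMETOOLONG)
		fmt.Printf("size=%d len=%d tooLong=%v\n", n, got, tooLong)
	}
}
```

In this sketch the 1 MiB case allocates no more than 4097 bytes of payload, which is the property the cap is meant to guarantee.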