fix(state): hardening — path-traversal validation + atomic save (v4.4.1)#3
Merged
Conversation
Reject session_ids that fail [a-zA-Z0-9_-]{1,64} whitelist before any
filesystem access. Defense-in-depth: resolved path must remain inside
STATE_DIR even if regex were ever loosened (catches symlink edges).
Threat model: session_id is normally a UUID generated by Claude Code,
but the hook contract treats it as an arbitrary string from JSON stdin.
A malicious or buggy upstream that injects "../../etc/passwd" must not
be able to write state files outside the configured state directory.
Tests (17 new, all green):
- TestSessionIdRejection: dotdot, slashes, null-byte, newline, empty,
too-long (>64), shell metachars, non-string, None
- TestSessionIdAcceptance: UUID, short alphanumeric, underscore, all
19 existing test_session_state.py session_ids remain valid
- TestPathStaysInsideStateDir: resolved path parent is STATE_DIR,
no file created on rejected id
Backwards-compat verified: regression suite 619 + 17 = 636 green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Race-condition / partial-write hardening for SessionState.save():
write to a tempfile in STATE_DIR (same filesystem, guaranteed atomic
rename), then os.replace into place. Cross-platform atomic since Py3.3.
Before: contextlib.suppress(OSError) + path.write_text() — last-write-wins
on concurrent saves, half-written files possible on crash mid-stream.
After: tempfile + os.replace — concurrent saves serialize at the rename
point, partial writes leave tempfile behind (cleaned up in finally) and
never corrupt the destination.
Best-effort behavior preserved: OSError logged to stderr, never raised.
Tempfile fd ownership transferred to fdopen's context-manager (avoids
double-close); cleanup paths only act on still-owned resources.
os.replace + os.unlink (not Path.replace/.unlink) chosen because tests
spy on the atomicity contract via patch("lib.state.os.replace", ...).
PTH105/PTH108 silenced inline with intent comment.
Tests (6 new, all green):
- TestAtomicSave: save uses os.replace, no .tmp left behind, roundtrip
after atomic write, partial-write failure preserves original file
- TestConcurrentSave: 4 threads x 25 saves produce valid JSON, no .tmp
left after concurrent saves
Coverage hooks/lib/state.py: 90.7% (target >85%).
Regression suite 642 green (was 619 + 23 new).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dex finding) PR #3 review by codex-rescue identified Lost-Update bug not covered by the atomic-save fix in c4c386e: when multiple hook subprocesses write the SAME session_id concurrently (real-world: Claude Code spawns 4 parallel UserPromptSubmit hooks — session-init, correction-detect, scope-tracker, false-positive-guard — each owning its own DEFAULTS namespace), atomic-write alone gives last-writer-wins. Hook B's save() overwrites Hook A's just-committed namespace because B's in-memory state was loaded before A's save landed. Three-layer fix: 1. Cross-platform file-lock (`fcntl.flock` on POSIX, `msvcrt.locking` on Windows): a sentinel `.meta-state-{sid}.lock` file gates the read-modify-write window. Auto-released on process exit (POSIX flock semantics + Windows file-handle close). 2. Read-Modify-Write inside the lock: re-read fresh disk state, merge in-memory namespaces over it, atomic-write the merged result. Helpers `_reload_from_disk` (no defaults merge) and `_write_atomic` extracted from save() for clarity and unit-mockability. 3. Dirty-namespace tracking (`_dirty_ns: set[str]`): only namespaces explicitly written via set() / property setters are merged onto disk- reload. Without this, the in-memory _data — which contains DEFAULTS loaded by __init__ — would clobber disk values that other concurrent writers committed for namespaces this writer never touched. cleanup_stale extended: removes companion `.lock` files alongside their `.json` state files, plus orphan-lock sweep for locks whose state file no longer exists. Tests (4 new, all green): - TestSharedSessionLostUpdate: 4-thread barrier-coordinated disjoint namespace writers preserve all updates; 4 × 10-iter stress on disjoint namespaces converges to last-iter values - TestLockFile: .lock file created alongside state on save(); cleanup_stale removes both companions Coverage hooks/lib/state.py: 87.6% (target >85%, missing lines are POSIX fcntl branch on Windows + defensive error-paths). Regression suite 646 green (642 prior + 4 new). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eview #2) Codex re-review on PR #3 (commit eafebc1) APPROVED the file-lock + RMW fix but flagged one minor blind spot: get() returns the live reference, not a copy. Callers who mutate via reference (e.g. `qg = s.get("ns"); qg["k"] = v`) without a follow-up s.set(ns, qg) won't trigger dirty- namespace tracking, so the mutation is silently lost on save(). In practice this is mitigated because all existing hooks follow the established `set()-after-mutation` pattern (visible throughout test fixtures and hook code). But the contract was implicit. This commit makes it explicit in the docstring so future hook authors won't be surprised. No code change. Lint clean. 49 state tests still green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Hardens
hooks/lib/state.pyagainst two audit-confirmed risks (Session D Phase 0 quality audit, 2026-05-02):session_idwhitelist[a-zA-Z0-9_-]{1,64}+ Defense-in-Depth resolved-path checktempfile.mkstemp(dir=STATE_DIR)+os.replaceinstead of directpath.write_text()Threat models documented in test-file docstrings. Best-effort behavior preserved:
save()still never raises, OSError logged to stderr.ERPNext
Test Plan
hooks/lib/state.py: 90.7% (was 87%)ruff format --check+ruff checkclean on changed filesharden.py --scan: 0 Critical findingsCommits (atomic per concern)
bb3766cfix(state): validate session_id against path-traversal (TASK-2026-00679)c4c386efix(state): atomic save via tempfile + os.replace (TASK-2026-00680)Plan / Audit Reference
~/.claude/plans/jaundder-plan-hashed-tiger.mdPhase 1+2Documents/audit-2026-05-01/META-ANALYSIS.md(Path-Traversal + Race-Condition findings)🤖 Generated with Claude Code