Skip to content

fix(setup): wrap claude plugin install with .claude.json snapshot/restore guard (BUG-004)#57

Merged
mlorentedev merged 1 commit into
mainfrom
fix/BUG-004-claude-mem-truncate-guard
May 19, 2026
Merged

fix(setup): wrap claude plugin install with .claude.json snapshot/restore guard (BUG-004)#57
mlorentedev merged 1 commit into
mainfrom
fix/BUG-004-claude-mem-truncate-guard

Conversation

@mlorentedev
Copy link
Copy Markdown
Owner

@mlorentedev mlorentedev commented May 19, 2026

Summary

  • Closes the residual trigger of upstream anthropics/claude-code#59870 (the .claude.json truncation that drops subscription metadata on every claude plugin install).
  • Adds a snapshot/restore wrapper around the install call in both setup-windows.ps1 (Backup-AndRestoreClaudeJson) and setup-linux.sh (snapshot_claude_json + restore_claude_json_if_truncated). Heuristic: restore iff pre-size ≥ 10 KB AND post-size < pre/2. Same 10240 floor as SDD-021's session-start canary (single SSOT).
  • Defense-in-depth: the existing grep -qF / -match [regex]::Escape idempotence guard against claude plugin list output is preserved. The wrapper catches the residual case where the listing omits an installed plugin (e.g. claude-mem@thedotmack, which is from a different marketplace and never appears in claude plugin list).

Root cause

The original trigger fix from dotfiles#33 assumed claude plugin list enumerates every installed plugin literally. It does not enumerate plugins from third-party marketplaces (@thedotmack). Every setup run therefore attempts a real install of claude-mem@thedotmack, which hits the upstream bug and truncates ~/.claude/.claude.json from ~75 KB to ~1.5 KB. Empirically reproduced 2026-05-19: setup output Claude Code plugins ready (1 added, 11 already present) -> file at 3444 bytes -> re-login prompt in every project. Restored manually from ~/.claude/backups/.claude.json.backup.*.

SDD-021 (PR #54) added a 10 KB canary in claude-session-start.{sh,ps1} that surfaced the recurrence — confirming the monitor works but also that the trigger fix was incomplete. This PR is the third layer (prevention).

Three-layer model after this PR

  1. Prevention (existing) — idempotence guard against claude plugin list. Catches the common case.
  2. Prevention (new, this PR) — snapshot/restore wrapper around the install call. Catches the false-negative case.
  3. Detection (SDD-021) — session-start canary alerts if both prevention layers fail.

Test plan

  • 6 new bats asserts in tests/setup-windows.bats (helper defined, #59870 cited, 10240 floor, wrapper precedes install, idempotence preserved)
  • 6 new bats asserts in tests/setup-linux.bats including 2 cross-OS parity tests
  • Synthetic smoke (Windows): 52060 -> 2 -> 52060 bytes with [WARNING] line citing #59870
  • Sub-threshold smoke: 52060 -> 28633 (above 50% floor) -> no restore, as expected
  • Real smoke x2 on admin Windows machine: .claude.json preserved at 52055 bytes
  • PSScriptAnalyzer Error+Warning clean on changes (pre-existing empty-catch warning relocated, semantics unchanged)
  • bash -n setup-linux.sh clean
  • CI: GitGuardian, integration (Docker Ubuntu 24.04), lint, lint-powershell, test (bats) — pending CI run

Caveats (honest reporting)

  • The upstream bug #59870 is intermittent. Reproduced once this morning, then could not be reproduced in vivo for the rest of the afternoon despite multiple setup runs. The wrapper's in-vivo behavior under the real bug is therefore demonstrated only synthetically in this session. The synthetic test exercises the same code path the real bug would trigger; structural symmetry between the wrapper logic and the upstream failure mode is documented in verification.md.
  • BUG-005 (Windows PowerShell 5.1 -AsHashtable incompatibility in Merge-ClaudeSettings) is the sibling spec; separate atomic PR.

References

…tore guard (BUG-004)

Defends against upstream anthropics/claude-code#59870: every `claude plugin
install` call writes ~/.claude/.claude.json via a deserialize-modify-serialize
cycle that silently drops subscription metadata (organizationType,
organizationRateLimitTier, projects map, onboarding flags), shrinking the file
from ~75 KB to ~1.5 KB and forcing re-authentication in every project.

The existing idempotence guard (`grep -qF` bash / `-match [regex]::Escape` PS)
yields a false negative for claude-mem@thedotmack because it does not appear in
`claude plugin list` output (different marketplace, @thedotmack vs
@claude-plugins-official), so every setup run triggers one real install of
claude-mem and hits the upstream bug. dotfiles#33 was the original trigger fix
(idempotence guard) but this case slipped through; SDD-021 (PR #54) added the
session-start size canary as a detector. This PR is the third layer: prevention.

Implementation:
- setup-windows.ps1: `Backup-AndRestoreClaudeJson -Action { ... }` snapshots to
  `[System.IO.Path]::GetTempFileName()` before the action, restores via
  Copy-Item in `finally` iff pre-size >= 10240 AND post-size < pre/2.
- setup-linux.sh: `snapshot_claude_json` (echoes tempfile path) +
  `restore_claude_json_if_truncated` pair around the bash `claude plugin
  install` call. Same heuristic.
- Both citations of #59870, #33, and SDD-021 in the helper headers for
  traceability.
- 10240 byte floor reused from SDD-021's canary -- single SSOT for
  "anomalously small .claude.json".

Verification:
- Synthetic smoke (pwsh): 52060 -> 2 bytes -> wrapper detected and restored
  to 52060 with [WARNING] line citing #59870.
- Sub-threshold smoke: 52060 -> 28633 (55% of pre) -> no restoration (above
  50% floor), as expected.
- Real smoke x2 on Windows admin machine: 52055 bytes preserved across runs.
  Upstream bug did not fire in vivo this afternoon (intermittent), so the
  wrapper's in-vivo behavior under the real bug is not demonstrated this
  session; synthetic test proves the logic.
- 6 new bats asserts in tests/setup-windows.bats + 6 in tests/setup-linux.bats
  (including 2 cross-OS parity tests). Verified red->green via grep-by-grep
  emulation; full bats run in CI.
- PSScriptAnalyzer clean on changes; bash -n setup-linux.sh clean.

Closes residual trigger from dotfiles#33; complements SDD-021 size monitor.

Spec: specs/BUG-004-claude-mem-truncate-guard/
Vault: 10_projects/dotfiles/11-tasks.md (entry added)
Lesson: 10_projects/dotfiles/90-lessons.md "Defensive monitors are not fixes"

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@mlorentedev mlorentedev merged commit a1792c9 into main May 19, 2026
5 checks passed
@mlorentedev mlorentedev deleted the fix/BUG-004-claude-mem-truncate-guard branch May 19, 2026 18:34
mlorentedev added a commit that referenced this pull request May 19, 2026
…werShell 5.1 (BUG-005)

SDD-002 (PR #51) introduced Merge-ClaudeSettings which uses
`ConvertFrom-Json -AsHashtable` -- a parameter added in PowerShell 7.0 that
does NOT exist in Windows PowerShell 5.1. The natural Windows command
`PowerShell -ExecutionPolicy Bypass -File .\setup-windows.ps1` resolves
`PowerShell` to 5.1 on a stock Windows machine, the Merge function's wide
try/catch swallows the ParameterBindingException as if it were a JSON parse
error ("not valid JSON after placeholder substitution: A parameter cannot be
found that matches parameter name 'AsHashtable'"), and the per-key merge from
ai/claude/settings.json is silently skipped. Existing ~/.claude/settings.json
survives untouched, so the bug is invisible until a fresh-machine bootstrap
depends on the template merge.

Implementation: preamble block after param() and before CONFIGURATION header
detects $PSVersionTable.PSVersion.Major < 7. If `Get-Command pwsh` resolves
to PowerShell 7+, re-exec under it with `-NoProfile -ExecutionPolicy Bypass
-File $PSCommandPath @args` and exit with the child's exit code. Otherwise,
print an actionable [ERROR] line pointing at `winget install Microsoft.PowerShell`
and exit 1 (fail-loud, no silent skip).

Verification:
- Smoke under Windows PowerShell 5.1.22621.6931: first output line is
  `[INFO] Windows PowerShell 5.1.22621.6931 detected; re-executing under
  pwsh (...) for full feature compatibility (BUG-005)`. Setup proceeds under
  pwsh, settings.json merged correctly (1687 bytes, 14 plugins + preserved
  user customizations), zero AsHashtable warnings.
- Smoke under pwsh 7.6.1 direct: preamble no-op, no `[INFO] ... detected`
  lines, existing behavior preserved (verified during BUG-004 smoke same day).
- pwsh-missing branch: not directly testable (pwsh installed on dev machine);
  structurally locked by bats grep asserting `winget install Microsoft.PowerShell`
  literal + `exit 1`.
- 5 new bats asserts in tests/setup-windows.bats; RED -> GREEN via grep-by-grep
  emulation.
- PSScriptAnalyzer: 4 new Write-Host warnings, style-consistent with existing
  Write-Info/Success/Warn/Err wrappers (those wrappers are themselves
  Write-Host calls; preamble cannot use them because the wrappers are defined
  below the preamble's runtime position). Zero Errors. Zero new rule classes.

Linux side (setup-linux.sh) is immune by construction (bash + jq, no
shell-version coupling); zero changes there. Negative-parity bats assert
locks this in.

Spec: specs/BUG-005-setup-ps7-reexec/
Vault: 10_projects/dotfiles/11-tasks.md (entry added in same session)
Sibling: PR #57 (BUG-004 claude-mem-truncate-guard)
Upstream: https://learn.microsoft.com/powershell/scripting/whats-new/what-s-new-in-powershell-70

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant