Skip to content

feat(workspace): enforce personal-workspace immutability invariants#266

Merged
mgoldsborough merged 4 commits into
mainfrom
worktree-agent-a1c79404636eac9f1
May 22, 2026
Merged

feat(workspace): enforce personal-workspace immutability invariants#266
mgoldsborough merged 4 commits into
mainfrom
worktree-agent-a1c79404636eac9f1

Conversation

@mgoldsborough
Copy link
Copy Markdown
Contributor

@mgoldsborough mgoldsborough commented May 22, 2026

Summary

Personal workspaces (isPersonal === true) become immutable from a user/permission standpoint. Motivated by a production observation: one user's personal workspace had three admins. Stage 1 introduced personal workspaces but didn't enforce the invariants that follow from "personal" — Stage 1.1 makes the store the source of truth.

Four invariants enforced by WorkspaceStore

  1. Members locked to [{ userId: ownerUserId, role: "admin" }]. addMember / removeMember / updateMemberRole and update({ members }) patches all reject mutations against personal workspaces.
  2. isPersonal frozen post-create, both directions.
  3. ownerUserId frozen on personal workspaces.
  4. ownerUserId forbidden on non-personal workspaces (the two fields travel together).

bundles, name, about, customInstructions stay freely mutable — these are workspace-content edits, not identity edits.

Typed error → HTTP 422

PersonalWorkspaceInvariantError (new, src/workspace/errors.ts) carries { workspaceId, reason } where reason ∈ { members_mutation, is_personal_frozen, owner_user_id_frozen, owner_user_id_on_non_personal }. The HTTP handler maps it to 422 personal_workspace_invariant with the same structured body — mirrors the ConversationCorruptedError → 422 precedent from Stage 1.

The typed class doesn't survive the in-process MCP serialization boundary, so the workspace-mgmt tool handlers encode it into structuredContent and handleToolCall re-detects → 422. External MCP clients see the same isError: true result with the structured payload.

Retroactive cleanup

scripts/cleanup-personal-workspace-members.ts (alias: bun run cleanup:personal-workspace-members) repairs pre-Stage-1.1 data. Idempotent; dry-run by default, --apply to write. A personal workspace missing ownerUserId is a hard-error — operator must triage.

Operators must run this after deploying to converge any multi-admin personal workspaces.

A wrapping Makefile target was scoped for deployments/nimblebrain/Makefile but isn't part of this PR — that submodule has unrelated in-progress work. Will land as a separate submodule commit.

Test plan

  • Unit: 19 cases covering all four invariants positively + adversarial topology (bundles / name updates still succeed on personal workspaces). test/unit/workspace/personal-workspace-invariants.test.ts.
  • Unit: provisioning self-heal — 2 cases pinning the concurrent-create-delete race (ensureUserWorkspace recreates when canonical is deleted between create-conflict and re-read; gives up after 3 attempts under pathological churn). test/unit/workspace/provisioning.test.ts.
  • Integration: HTTP boundary — /v1/tools/call manage_workspaces returns 422 with structured body on add_member and update_member against a personal workspace; non-identity updates still succeed. test/integration/api/personal-workspace-invariants.test.ts.
  • Integration: cleanup script — multi-admin → sole-owner-admin, idempotent on canonical shape, hard-errors on missing ownerUserId, untouched on non-personal, dry-run is default. test/integration/scripts/cleanup-personal-workspace-members.test.ts.
  • Full bun run verify: static green, 2993 unit / 0 fail, 258 web / 0, 551 integration / 0 (12 skip), 17 smoke / 0.

Changes since initial PR

Round 1 of QA review landed on this branch. Three Critical findings addressed:

  • Provisioning regression fixed. The original PR replaced ensureUserWorkspace's self-healing recreate path with a throw, which would have 500'd login under a concurrent create-delete race. Fold reconcileConflict back into ensureUserWorkspace as a bounded retry loop (3 attempts) — shorter code than the original AND restores the self-heal. Two new unit tests pin the race shape.
  • OSS hygiene. Removed tenant references and seed-user names that named specific people from comments, tests, AGENTS.md, CHANGELOG.md, and the cleanup script's docstring. The repo is public OSS; internal infra context doesn't belong in commit history.
  • Documentation drift fixed. AGENTS.md previously documented a make cleanup-personal-workspace-members target that this PR doesn't deliver. Replaced with the real entry point (bun run cleanup:personal-workspace-members).

Personal workspaces (`isPersonal === true`) are now sole-owner-by-design.
The store throws `PersonalWorkspaceInvariantError` on attempts to mutate
the locked fields and the HTTP layer maps it to a structured 422 — the
silent-strip behavior that allowed multi-admin personal workspaces on
hq production is gone.

Invariants enforced by `WorkspaceStore`:
1. members locked to `[{ userId: ownerUserId, role: "admin" }]` —
   `addMember` / `removeMember` / `updateMemberRole` and any
   `update({ members })` patch reject mutations on personal workspaces.
2. `isPersonal` frozen post-create (both directions).
3. `ownerUserId` frozen on personal workspaces.
4. `ownerUserId` forbidden on non-personal workspaces.

`bundles`, `name`, `about`, `customInstructions` remain freely mutable.

The HTTP mapping mirrors the `ConversationCorruptedError → 422` precedent:
`PersonalWorkspaceInvariantError` becomes `422 personal_workspace_invariant`
with `{ workspaceId, reason }`. Because the typed error class doesn't
survive the in-process MCP serialization boundary, the workspace-mgmt
tool handlers encode it into `structuredContent` so `handleToolCall` can
re-detect and emit the 422.

Pre-Stage-1.1 data is repaired by `scripts/cleanup-personal-workspace-members.ts`
(also `bun run cleanup:personal-workspace-members` and a `make` target).
Idempotent, dry-run by default; `--apply` writes. A personal workspace
missing `ownerUserId` is a hard-error — operator must triage.
@mgoldsborough mgoldsborough marked this pull request as ready for review May 22, 2026 00:57
…oc drift

Addresses the three Critical findings on PR #266's QA review.

**Provisioning regression (C2).** The original PR replaced
`ensureUserWorkspace`'s self-healing recreate with a throw — under a
concurrent create-then-delete race, `reconcileConflict` would surface
"workspace disappeared between create-conflict and re-read" and bubble
out of every auth provider (oidc/workos/dev), producing a 500 on
login. The "callers retry" doc claim wasn't true.

Folded `reconcileConflict` back into `ensureUserWorkspace` as a
bounded retry loop (3 attempts). Body is shorter than the prior code
AND covers the race the original self-healed against. The loop
shapes:
  attempt 1: read → null → create → conflict from another caller
  attempt 2: read → null (third caller deleted) → create succeeds
  attempt 3: only reached under pathological create+delete churn

Two new unit tests pin the race in `provisioning.test.ts` using a
stub store with controlled get/create call counts. Pathological case
surfaces a diagnosable error message naming the attempt count.

**OSS hygiene (C1).** The repo is public OSS; internal tenant context
and seed-user names that named specific people don't belong in commit
history. Six sites neutralized:
  - src/workspace/workspace-store.ts:268 — "on hq production" → "in production"
  - test/unit/workspace/workspace-store.test.ts:360 — same
  - scripts/cleanup-personal-workspace-members.ts:8 — "the hq production tenant" → "a production tenant"
  - test/integration/scripts/cleanup-personal-workspace-members.test.ts — "user_mat"/"user_mario" → "user_b"/"user_c"; drop "hq-production" comment
  - AGENTS.md:153 — "seen on hq production" → "observed in production"
  - CHANGELOG.md:90 — drop "(observed on hq production)" parenthetical

**Documentation drift (C3).** AGENTS.md previously documented
`make cleanup-personal-workspace-members CLIENT=<x> ENV=<y>` as a
real operator entry point, but this PR doesn't ship that Makefile
addition (the deployments submodule has unrelated in-progress work).
Operators reading the docs would reach for a target that doesn't
exist. Replaced with the real entry point: `bun run cleanup:personal-
workspace-members`. PR body refreshed to match.

Verify: 2993 unit / 0 fail (+2 race tests), 258 web / 0, 551
integration / 0 (12 skip), 17 smoke / 0. All static checks green.
Round-1 QA flagged the same drift in AGENTS.md, fixed there but
missed the parallel site in CHANGELOG. Symmetric-defense miss on my
part — the new AGENTS.md convention says to find parallel sites
when fixing a check, and I didn't grep for sibling
`make cleanup-personal-workspace-members` references before pushing
round-1.

This is the operator-facing surface for "you must run this after
deploying", so a wrong command here is materially worse than in
AGENTS.md.

Confirmed no other stray make-target references remain in the diff
via `grep -nE 'make (cleanup|heal|migrate)-...'`.
@mgoldsborough mgoldsborough added the qa-reviewed QA review completed with no critical issues label May 22, 2026
…04636eac9f1

# Conflicts:
#	AGENTS.md
#	package.json
@mgoldsborough mgoldsborough merged commit 8994d56 into main May 22, 2026
4 checks passed
@mgoldsborough mgoldsborough deleted the worktree-agent-a1c79404636eac9f1 branch May 26, 2026 03:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

qa-reviewed QA review completed with no critical issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant