Skip to content

feat(bootstrap): resource-action-map for synth-time validation#165

Open
scottschreckengaust wants to merge 8 commits into
mainfrom
feat/bootstrap-action-map
Open

feat(bootstrap): resource-action-map for synth-time validation#165
scottschreckengaust wants to merge 8 commits into
mainfrom
feat/bootstrap-action-map

Conversation

@scottschreckengaust
Copy link
Copy Markdown
Contributor

Summary

Closes #124
Closes #164

Creates a mapping from CloudFormation resource types to required IAM actions (CRUD lifecycle), scoped to all resource types in this app's synthesized template. Introduces getRequiredBootstrapPolicies() for downstream consumption by the Aspect (#125) and preflight validator (#126). Gates ECS construct on compute_type context variable (replaces comment toggle).

Stack position

PR 5 for #120 — least-privilege CDK bootstrap policies as code

Prior: Custom template generator + compute variants (PR #162, #123)

This PR: Resource-action-map + ECS context gate + required-policies module

Next: CDK Aspect for policy envelope checking (#125)

Key decisions

  • ECS context gate (refactor(compute): gate ECS construct on compute_type context instead of comment toggle #164): Construct is always in source, compute_type governs synthesis — no commenting/uncommenting
  • getRequiredBootstrapPolicies(computeType): Single function declaring what the app needs, consumed by Aspect and preflight
  • Dual-config synth-coverage test: Validates map completeness for both agentcore and ecs configurations
  • Map scoped to this app resources (~60 types): Unknown types produce warnings, not errors
  • All map actions within configured policy set: Test enforces the map never requires more than policies allow

Deliverables

Test plan

  • All existing CDK tests pass
  • Map covers all resource types in both synth configurations
  • All mapped actions exist in the combined policy set (wildcard-aware)
  • getRequiredBootstrapPolicies returns correct sets for each compute type
  • tsc --noEmit compiles cleanly
  • No circular imports between preflight/ and policies/

Open questions

  • SQS: AWS::SQS::Queue is in the template but no policy has SQS actions — needs investigation (may require policy update + version bump)

Implementation plan

See: docs/superpowers/plans/2026-05-21-resource-action-map.md

Blocked by: #123 (PR #162)
References: RFC #120, ADR-002

🤖 Generated with Claude Code

@scottschreckengaust scottschreckengaust force-pushed the feat/bootstrap-action-map branch from d31fd4d to d3a9804 Compare May 21, 2026 07:50
@scottschreckengaust
Copy link
Copy Markdown
Contributor Author

┌─────────┬──────┬───────────────────────────────────────────┐
│ Commit  │ Task │                   What                    │
├─────────┼──────┼───────────────────────────────────────────┤
│ d3a9804 │ 0    │ ECS context gate (closes #164)            │
├─────────┼──────┼───────────────────────────────────────────┤
│ 5ed8db3 │ 1    │ getRequiredBootstrapPolicies(computeType) │
├─────────┼──────┼───────────────────────────────────────────┤
│ 83099e1 │ 2    │ Resource-action-map (57 CF types)         │
├─────────┼──────┼───────────────────────────────────────────┤
│ ed0cf6b │ 3    │ Dual-config synth-coverage test           │
└─────────┴──────┴───────────────────────────────────────────┘

Note: this branch currently sits on top of feat/bootstrap-template (#162). When #162 merges to main, I'll retarget and rebase per ADR-001 §8 — the scaffold commit
(f46cfb7) will be skippable and the #123 commits will drop out, leaving just the 4 clean #124 commits on main.

Comment thread cdk/src/bootstrap/required-policies.ts Outdated
Comment thread cdk/src/bootstrap/required-policies.ts
Comment thread cdk/src/stacks/agent.ts
scottschreckengaust and others added 7 commits June 3, 2026 18:29
Replace comment toggle with proper context gate. ECS resources only
synthesize when compute_type=ecs is passed. Default (agentcore) behavior
unchanged. Closes #164

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…are policy selection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Maps all CloudFormation resource types used by the ABCA stack to their
required IAM actions per lifecycle phase (create/read/update/delete).
Actions are sourced from CloudTrail-validated policies in DEPLOYMENT_ROLES.md.
Tests validate structure, format, and policy coverage (with known gaps
for SQS, S3 bucket lifecycle, and Lambda ESM/Layer actions documented).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Validates that all resource types in the synthesized CloudFormation
template have entries in the resource-action-map. Tests agentcore from
existing cdk.out and attempts ECS synth gracefully skipping when AWS
credentials are unavailable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
compute_type drives which compute policy is needed — agentcore and ecs
are independent choices, not base+optional. An operator deploying only
ECS should not require agentcore permissions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The resource-action-map test previously synthesized into cdk/cdk.out.ecs/
inside the repo tree. CDK's AgentRuntimeArtifact.fromAsset(repoRoot)
fingerprints the entire tree, so when github-tags.test runs in parallel it
can stat synth.lock mid-lifecycle and hit ENOENT.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@scottschreckengaust scottschreckengaust force-pushed the feat/bootstrap-action-map branch from a3fcb8f to 684817e Compare June 3, 2026 19:06
@krokoko
Copy link
Copy Markdown
Contributor

krokoko commented Jun 4, 2026

on it

@krokoko
Copy link
Copy Markdown
Contributor

krokoko commented Jun 5, 2026

Review — feat(bootstrap): resource-action-map for synth-time validation

Thanks for this — the direction is great. Converting the bootstrap permissions into a versioned, testable map advances the "bounded blast radius" tenet from RFC #120, and replacing the comment-toggle with a real compute_type context gate (#164) is a clear improvement that keeps the ECS path always-compilable. The data model is clean and the pure-function unit tests are genuinely strong.

I do think a few things should be addressed before merge. Verdict: request changes — all correctable in a focused follow-up.

Blocking

B1 — The map is missing the ECS resource types this PR enables, and the included test is red.
EcsAgentCluster synthesizes AWS::ECS::Cluster and AWS::ECS::TaskDefinition (cdk/src/constructs/ecs-agent-cluster.ts:79,108), but RESOURCE_ACTION_MAP has no AWS::ECS::* entry and neither type is in the test's SKIP_TYPES. Running cdk synth -c compute_type=ecs and then the suite fails:

Test "all ecs resource types have map entries"
  Received + ["AWS::ECS::Cluster", "AWS::ECS::TaskDefinition"]

The sibling policy compute-ecs.ts already grants the matching ecs:* actions, so the map is the only thing out of sync. Suggested fix — add both entries, deriving actions from compute-ecs.ts:

'AWS::ECS::Cluster': {
  create: ['ecs:CreateCluster', 'ecs:TagResource', 'ecs:PutClusterCapacityProviders'],
  read:   ['ecs:DescribeClusters', 'ecs:ListTagsForResource'],
  update: ['ecs:UpdateCluster', 'ecs:UpdateClusterSettings', 'ecs:PutClusterCapacityProviders', 'ecs:TagResource', 'ecs:UntagResource'],
  delete: ['ecs:DeleteCluster'],
},
'AWS::ECS::TaskDefinition': {
  create: ['ecs:RegisterTaskDefinition', 'ecs:TagResource'],
  read:   ['ecs:DescribeTaskDefinition', 'ecs:ListTaskDefinitions', 'ecs:ListTagsForResource'],
  update: ['ecs:RegisterTaskDefinition', 'ecs:TagResource', 'ecs:UntagResource'],
  delete: ['ecs:DeregisterTaskDefinition'],
},

B2 — Both "Synth coverage" tests pass/skip silently, so the map's only drift-defense isn't load-bearing. (cdk/test/bootstrap/resource-action-map.test.ts)

  • The agentcore test reads cdk.out/backgroundagent-dev.template.json; if it's missing → if (types.length === 0) return; runs zero assertions. cdk.out/ is gitignored and cdk:test depends only on :compile (not :synth), running parallel to :synth:quiet — so on a clean CI runner the template may not exist yet and the test passes vacuously.
  • The ECS test wraps the whole synth in a bare catch { …; return; } // skip gracefully, which swallows every failure mode (a real synth/cdk-nag regression, a timeout, an npx blip) and reports green. This is most likely why CI is green while the test is actually red locally — CI hits the skip path.

Suggested fix: synthesize in-process so the tests are self-contained, and add a non-vacuous guard:

expect(types.length).toBeGreaterThan(0); // fail if synth produced nothing
expect(types.filter(t => !SKIP_TYPES.has(t) && !RESOURCE_ACTION_MAP[t])).toEqual([]);

For the ECS shell-out, capture the error and throw with stderr rather than return; if a no-tooling skip is genuinely needed, gate it on an explicit env flag and use it.skip so the skip is reported, not hidden.

B3 — The compute_type gate (the actual behavioral change) has no direct test. (cdk/src/stacks/agent.ts:566-619)
Nothing asserts that default context produces no ECS resources, that compute_type=ecs produces them, or that ecsConfig is wired into TaskOrchestrator only in the ECS case. The touched assertion in github-tags.test.ts:142 was changed 'ecs''custom-compute', which (understandably, to avoid the Docker asset build) removes the sole incidental exercise of the ECS path. Suggested:

test('default → no ECS', () => { const t = synth({}); t.resourceCountIs('AWS::ECS::Cluster', 0); });
test('compute_type=ecs → ECS cluster + task def', () => {
  const t = synth({ compute_type: 'ecs' });
  t.resourceCountIs('AWS::ECS::Cluster', 1);
  t.resourceCountIs('AWS::ECS::TaskDefinition', 1);
});

plus an assertion that the orchestrator receives ecsConfig only when ecs.

Non-blocking suggestions

  • Unused for now: getRequiredBootstrapPolicies, RESOURCE_ACTION_MAP, getActionsForResource, getAllMappedActions have no production caller yet (only barrels + tests). Totally fine as staged scaffolding for the Aspect (feat(bootstrap): CDK Aspect for policy envelope checking #125) / preflight (feat(bootstrap): live-account preflight validator #126) — just worth stating explicitly in the description so reviewers don't expect synth-time enforcement yet.
  • Untracked gaps: KNOWN_GAP_SERVICES/KNOWN_GAP_ACTIONS (sqs:*, s3:CreateBucket+lifecycle, Lambda EventSourceMapping/LayerVersion) are genuine — the stack creates those resources but no policy grants the actions, so the "all mapped actions exist in policies" test passes only by excluding the cases it most needs to catch. Could each be tied to a tracking issue (// gap tracked in #NNN), noted in DEPLOYMENT_ROLES.md, and guarded with a "gap set only shrinks" assertion?
  • Test polish: temp-dir rmSync is duplicated rather than in a finally (:233,241); a couple of throws after an expect().toBe(true) that already aborts are dead (:101-104,:116-119); getActionsForResource only asserts create/delete, never read/update.
  • The ...(ecsCluster && { ecsConfig: {...} }) spread in agent.ts:596 is correct (ecsConfig is optional and spreading a falsy is a no-op) — flagging only because it's an unusual pattern.

Docs / security

No docs changes needed for this internal scaffolding, and the Starlight mirror sync isn't triggered (no docs//CONTRIBUTING.md edits). The map is inert until #125/#126 consume it, so there's no new IAM/network surface at deploy time — but when the Aspect lands, please make sure the KNOWN_GAP_* exclusions don't translate into under- or over-scoped roles.

Really nice foundation overall — just want the coverage tests to actually fail when they should, and the ECS entries added, before this goes in. 🙏

Review assisted by Claude Code (code-reviewer, silent-failure-hunter, pr-test-analyzer agents).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants