RFC: Automated Deployment Pipeline with Protected Environments
Status: Draft (revised per feedback)
Author: @scottschreckengaust
Related: #70 (context stack names), #72 (ephemeral cleanup)
Summary
Establish a GitHub Actions deployment pipeline that:
- Builds and synthesizes CDK once per compute_type in build.yml (always all registered types)
- Stores cdk-<compute_type>.out as immutable deployment artifacts (synth once, deploy exact artifact)
- Gates all deployments behind a protected GitHub environment (deploy) requiring manual approval, triggered by the deploy label (with optional type qualifiers)
- Deploys to AWS using OIDC federation, assuming CDK bootstrap roles (no long-lived credentials)
- Stack naming: main-<compute_type>-prd for production, ephemeral names for PRs/branches
- On successful deployment: creates a GitHub Release (drafted → published) with tagged main and cdk-*.out artifacts
- Cleanup targets stacks tagged with github:* context keys (identified by github:sha != none), gated behind approval with cancel-in-progress concurrency
Decisions (from discussion)
| Question | Decision |
| --- | --- |
| PR deployments | Opt-in via deploy label (with optional type qualifiers) |
| Synth strategy | Once in build.yml for ALL registered compute_types; deploy the exact artifact, no re-synth |
| Cleanup approval | Always manually gated; later runs cancel prior pending requests |
| Cost gate | No; resource review in approval is sufficient |
| Permissions boundary | Yes; use CDK bootstrap roles (deploy, lookup, file-publishing, image-publishing) |
| main deploy approval | Always required; never skipped, even after PR merge |
| Deploy selection | Label-driven: deploy = all registered types, deploy:<type> = only that type |
| Baselines | Per compute_type against main-<compute_type>-prd, stored as release artifacts |
Design
Architecture
┌─────────────────────────────────────────────────────────────────────┐
│ GitHub Actions │
│ │
│ build.yml (CI) — every push/PR │
│ ├─ steps: install → compile → test → lint → synth (per compute_type)│
│ ├─ matrix: ALL registered compute_types (static list, always built) │
│ ├─ artifact: cdk-<compute_type>.out (immutable, uploaded per leg) │
│ └─ output: stack_name, is_protected, compute_type │
│ │
│ deploy.yml (CD) — on `deploy` label OR main merge │
│ ├─ trigger: label added + build success, OR push to main │
│ ├─ environment: "deploy" (ALWAYS requires approval, no bypass) │
│ ├─ matrix: filtered by labels (deploy=all, deploy:<type>=one) │
│ ├─ steps: │
│ │ ├─ download cdk-<compute_type>.out artifact (exact build output)│
│ │ ├─ configure-aws-credentials (OIDC → CDK bootstrap roles) │
│ │ ├─ baseline-diff (compare vs last release baseline) │
│ │ ├─ post diff summary to deployment log │
│ │ ├─ cdk deploy --app cdk-<compute_type>.out --require-approval never │
│ │ └─ on success: draft release → tag → attach artifacts → publish │
│ └─ concurrency: one deploy at a time per stack │
│ │
│ cleanup.yml │
│ ├─ trigger: schedule (every 4h) + workflow_dispatch │
│ ├─ environment: "deploy" (ALWAYS requires approval) │
│ ├─ concurrency: cancel-in-progress (later runs cancel prior) │
│ └─ steps: find stacks with github:* tags → force-detach ENIs → del │
└─────────────────────────────────────────────────────────────────────┘
│
│ OIDC (aws-actions/configure-aws-credentials)
│ role-to-assume: CDK deploy role
▼
┌─────────────────────────────────────────────────────────────────────┐
│ AWS Account │
│ ├─ IAM OIDC Provider (token.actions.githubusercontent.com) │
│ ├─ CDK Bootstrap Roles (permissions boundary): │
│ │ ├─ cdk-hnb659fds-deploy-role-* │
│ │ ├─ cdk-hnb659fds-lookup-role-* │
│ │ ├─ cdk-hnb659fds-file-publishing-role-* │
│ │ └─ cdk-hnb659fds-image-publishing-role-* │
│ ├─ CloudFormation Stacks (tagged: github:sha != 'none') │
│ │ ├─ main-agentcore-prd (protected, terminationProtection=true) │
│ │ ├─ main-ecs-prd (protected, terminationProtection=true) │
│ │ ├─ pr-42-abc1234-agentcore (ephemeral, tagged) │
│ │ └─ commit-abc1234-ecs (ephemeral, tagged) │
│ └─ CDK Bootstrap (cdk-toolkit stack) │
└─────────────────────────────────────────────────────────────────────┘
Label-Driven Deploy Selection
Key principle: Build ALL, deploy selectively
build.yml always synthesizes all registered compute_types (today: [agentcore]). Labels only control what deploy.yml deploys.
Labels
| Label | Types deployed | Use case |
| --- | --- | --- |
| deploy | All registered types | Standard full deployment |
| deploy:agentcore | agentcore only | Deploy only agentcore |
| deploy:ecs | ecs only | Deploy only ECS (when available) |
| deploy:* | All (same as deploy) | Explicit "all" synonym |
| No deploy* label | Nothing deployed | Default (CI only) |
Resolution logic (in deploy.yml)
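- name: Resolve deploy targets from labels
  id: targets
  env:
    # Pass labels through env instead of inlining the expression into the script,
    # so label text is never interpreted by the shell
    LABELS: ${{ toJson(github.event.pull_request.labels.*.name) }}
    EVENT_NAME: ${{ github.event_name }}
  run: |
    # Push events carry no PR labels; treat a missing label list as empty
    LABELS=$(echo "${LABELS:-null}" | jq -c '. // []')
    # All registered compute_types (must match build.yml matrix)
    ALL_TYPES='["agentcore"]'
    if [[ "$EVENT_NAME" == "push" ]]; then
      # Push to main = all registered types (assumes deploy.yml only triggers on pushes to main)
      echo "matrix=$ALL_TYPES" >> "$GITHUB_OUTPUT"
    elif echo "$LABELS" | jq -e 'index("deploy:*")' > /dev/null; then
      # deploy:* = all (explicit synonym)
      echo "matrix=$ALL_TYPES" >> "$GITHUB_OUTPUT"
    elif echo "$LABELS" | jq -e '[.[] | select(startswith("deploy:"))] | length > 0' > /dev/null; then
      # Specific type labels: deploy only those (-c keeps the JSON on one line for GITHUB_OUTPUT)
      TYPES=$(echo "$LABELS" | jq -c '[.[] | select(startswith("deploy:")) | ltrimstr("deploy:")]')
      echo "matrix=$TYPES" >> "$GITHUB_OUTPUT"
    elif echo "$LABELS" | jq -e 'index("deploy")' > /dev/null; then
      # Plain "deploy" = all registered types
      echo "matrix=$ALL_TYPES" >> "$GITHUB_OUTPUT"
    else
      echo 'matrix=[]' >> "$GITHUB_OUTPUT"
    fi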
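The label rules can also be expressed as a small pure function, which makes the precedence easy to unit-test outside a workflow. This is a minimal sketch, not the pipeline's implementation: `resolveDeployTargets` is a hypothetical name, and dropping unregistered types (for example `deploy:ecs` before `ecs` is in the build matrix) is an assumption that the shell step does not make.

```typescript
// Hypothetical helper mirroring the label table: deploy:* wins, then specific
// deploy:<type> labels, then plain deploy, otherwise nothing is deployed.
function resolveDeployTargets(labels: string[], allTypes: string[]): string[] {
  if (labels.indexOf("deploy:*") !== -1) {
    return allTypes.slice();
  }

  const prefix = "deploy:";
  const requested = labels
    .filter((label) => label.indexOf(prefix) === 0)
    .map((label) => label.slice(prefix.length));

  if (requested.length > 0) {
    // Assumption: types without a build artifact are silently dropped.
    return allTypes.filter((t) => requested.indexOf(t) !== -1);
  }

  return labels.indexOf("deploy") !== -1 ? allTypes.slice() : [];
}
```

With `["agentcore", "ecs"]` registered, the labels `["deploy", "deploy:ecs"]` resolve to `["ecs"]`, because a type-qualified label takes precedence over the plain one, matching the order of the branches in the shell step.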
Release Flow
Successful deployments from main produce GitHub Releases:
main merge
  → build.yml (synth ALL registered compute_types in matrix)
  → upload artifacts: cdk-agentcore.out, (cdk-ecs.out when available, ...)
  → deploy.yml (approval gate; downloads exact artifacts, label filters which deploy)
  → successful deployment
  → Draft Release created:
      Tag: v<date>-<short-sha> (e.g. v2026.05.11-abc1234)
      Assets:
        - cdk-agentcore.out.tar.gz
        - (cdk-ecs.out.tar.gz when available)
        - agentcore.resource-types.json (baseline)
        - (ecs.resource-types.json when available)
  → Publish Release
Baselines live in releases, not in the repo. The diff step downloads the baseline from the latest published release for that compute_type:
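- name: Download baseline from latest release
  env:
    GH_TOKEN: ${{ github.token }} # gh needs a token to call the GitHub API inside Actions
  run: |
    # With no tag argument, gh release view returns the latest release
    LATEST=$(gh release view --json tagName -q .tagName 2>/dev/null || echo "")
    if [[ -n "$LATEST" ]]; then
      # A missing baseline asset is not fatal: the diff then treats everything as new
      gh release download "$LATEST" \
        --pattern "${{ matrix.compute_type }}.resource-types.json" \
        --dir /tmp/baseline/ || true
    fi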
This means:
- No baseline commits polluting the repo history
- Baselines are immutable (tied to a release tag)
- First deploy (no prior release) has no baseline → everything shows as "new" (correct)
- Rollback = re-deploy from a prior release's cdk-*.out artifact
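The release tag format from the flow above, v<date>-<short-sha>, can be sketched as a small function. The name `releaseTag`, the use of UTC for the date, and the 7-character short SHA are assumptions for illustration; the RFC only fixes the overall shape shown in its example.

```typescript
// Hypothetical helper: builds a tag like v2026.05.11-abc1234.
// Assumes the date is taken in UTC and the short SHA is the first 7 characters.
function releaseTag(date: Date, sha: string): string {
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  const day = [
    String(date.getUTCFullYear()),
    pad(date.getUTCMonth() + 1), // getUTCMonth() is zero-based
    pad(date.getUTCDate()),
  ].join(".");
  return "v" + day + "-" + sha.slice(0, 7);
}
```

Because the tag embeds the commit, a rollback can locate the matching cdk-*.out asset from the tag alone.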
Synth-Once, Deploy-Exact Artifact
The cdk.out is synthesized exactly once per compute_type during build.yml. deploy.yml never re-synthesizes; it downloads and deploys the exact artifact:
# build.yml (always synths ALL registered types)
strategy:
  matrix:
    compute_type: [agentcore] # extend when new types are ready
# Context is generated into cdk/cdk.context.json before build
- name: Generate CDK context
  run: |
    jq -n \
      --arg compute_type "${{ matrix.compute_type }}" \
      --arg stackName "backgroundagent-dev" \
      --arg sha "$TAG_SHA" \
      ... \
      '{ "compute_type": $compute_type, "stackName": $stackName, "github:sha": $sha, ... }' \
      > cdk/cdk.context.json
- uses: actions/upload-artifact@v4
  with:
    name: cdk-${{ matrix.compute_type }}-out
    # With multiple paths, the artifact root is their common ancestor (cdk/)
    path: |
      cdk/cdk.out/
      cdk/cdk.context.json
# deploy.yml (no synth; uses exact artifact from build)
- uses: actions/download-artifact@v4
  with:
    name: cdk-${{ matrix.compute_type }}-out
    path: cdk-${{ matrix.compute_type }}.out/
- name: Deploy
  # The cloud assembly sits in the cdk.out/ subdirectory of the downloaded artifact
  run: npx cdk deploy --app cdk-${{ matrix.compute_type }}.out/cdk.out --all --require-approval never
This guarantees that what was tested in CI is exactly what gets deployed: no new Date() drift, no environment-variable differences, and no CDK version skew between synth runs.
Permissions: CDK Bootstrap Role Assumption
The GitHub OIDC role only needs permission to assume the CDK bootstrap roles. This is the CDK security best practice:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": [
        "arn:aws:iam::ACCOUNT:role/cdk-hnb659fds-deploy-role-*",
        "arn:aws:iam::ACCOUNT:role/cdk-hnb659fds-lookup-role-*",
        "arn:aws:iam::ACCOUNT:role/cdk-hnb659fds-file-publishing-role-*",
        "arn:aws:iam::ACCOUNT:role/cdk-hnb659fds-image-publishing-role-*"
      ]
    },
    {
      "Sid": "CleanupENIs",
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeNetworkInterfaces",
        "ec2:DetachNetworkInterface",
        "ec2:DeleteNetworkInterface",
        "cloudformation:ListStacks",
        "cloudformation:DescribeStacks",
        "cloudformation:DeleteStack",
        "cloudformation:ListStackResources"
      ],
      "Resource": "*"
    }
  ]
}
Trust policy (OIDC):
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::ACCOUNT:oidc-provider/token.actions.githubusercontent.com"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
      },
      "StringLike": {
        "token.actions.githubusercontent.com:sub": "repo:aws-samples/sample-autonomous-cloud-coding-agents:*"
      }
    }
  }]
}
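To see which workflow runs the StringLike condition admits, it can help to evaluate candidate sub claims against the pattern. The sketch below approximates IAM's StringLike matching (case-sensitive, `*` for any run of characters, `?` for a single character); `matchesStringLike` is a hypothetical helper written for illustration, not an AWS API. For jobs that reference an environment, GitHub sets the sub claim to the form repo:OWNER/REPO:environment:NAME.

```typescript
// Hypothetical helper approximating IAM StringLike semantics.
function matchesStringLike(pattern: string, value: string): boolean {
  // Escape regex metacharacters except the two IAM wildcards, * and ?.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const source = "^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$";
  return new RegExp(source).test(value);
}

const repoPrefix = "repo:aws-samples/sample-autonomous-cloud-coding-agents";

// The pattern in the trust policy admits every ref and environment in the repo.
const wildcardPattern = repoPrefix + ":*";
// A narrower pattern (an option, not something the RFC specifies) would admit
// only jobs running in the "deploy" environment.
const environmentPattern = repoPrefix + ":environment:deploy";

const fromEnvironment = repoPrefix + ":environment:deploy";
const fromBranch = repoPrefix + ":ref:refs/heads/feature-x";

const wildcardAdmitsBranch = matchesStringLike(wildcardPattern, fromBranch);
const narrowAdmitsBranch = matchesStringLike(environmentPattern, fromBranch);
const narrowAdmitsEnvironment = matchesStringLike(environmentPattern, fromEnvironment);
```

Under these assumptions the wildcard pattern accepts the branch-based subject while the environment-scoped pattern rejects it, which is the trade-off to weigh if the approval gate is meant to be the only path to AWS credentials.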
Stack Naming and Tagging
| Git ref | Label | Stack name | Protected |
| --- | --- | --- | --- |
| main | (auto) | main-agentcore-prd | true |
| main | deploy:ecs | main-ecs-prd | true |
| PR #42 | deploy | pr-42-abc1234-agentcore | false |
| PR #42 | deploy:ecs | pr-42-abc1234-ecs | false |
| Branch push | deploy | commit-abc1234-agentcore | false |
All stacks deployed via this pipeline carry the 13 github:* tags applied via CDK context (PR #91, #93). Cleanup identifies CI-deployed stacks by checking github:sha != none. Each stack additionally gets a compute_type tag:
Tags.of(stack).add('compute_type', computeType);
The compute_type tag enables per-type baseline queries and cost attribution.
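The naming table can be summarized as a function from git ref and compute_type to a stack name and protection flag. This is a sketch under stated assumptions: the `DeployRef` type, the `StackIdentity` interface, and `resolveStack` are hypothetical names, and the patterns are read directly off the example rows rather than taken from the build.yml implementation.

```typescript
// Hypothetical model of the three ref kinds in the naming table.
type DeployRef =
  | { kind: "main" }
  | { kind: "pr"; number: number; shortSha: string }
  | { kind: "branch"; shortSha: string };

interface StackIdentity {
  stackName: string;
  isProtected: boolean;
}

function resolveStack(ref: DeployRef, computeType: string): StackIdentity {
  switch (ref.kind) {
    case "main":
      // Production stacks are the only protected ones.
      return { stackName: `main-${computeType}-prd`, isProtected: true };
    case "pr":
      return { stackName: `pr-${ref.number}-${ref.shortSha}-${computeType}`, isProtected: false };
    case "branch":
      return { stackName: `commit-${ref.shortSha}-${computeType}`, isProtected: false };
  }
}
```

For example, `resolveStack({ kind: "pr", number: 42, shortSha: "abc1234" }, "ecs")` yields the stack name pr-42-abc1234-ecs with `isProtected` set to false, matching the fourth row of the table.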
GitHub Environment: deploy
| Setting | Value | Rationale |
| --- | --- | --- |
| Required reviewers | ≥1 reviewer, NOT the actor who triggered | Prevents self-approval |
| Wait timer | 0 (manual approval is the gate) | - |
| Deployment branches | All branches | Allow PR deploys via label |
| Allow administrators to bypass | No | No bypass for anyone |
| Prevent self-review | Yes | Enforce separation of duties |
Environment secrets:
| Secret | Value |
| --- | --- |
| AWS_ROLE_ARN | arn:aws:iam::ACCOUNT:role/GitHubActionsCDKRole |
| AWS_REGION | us-east-1 |
Cleanup Workflow
name: Cleanup Ephemeral Stacks
on:
  schedule:
    - cron: '0 */4 * * *' # every 4 hours
  workflow_dispatch:
    inputs:
      max_age_hours:
        description: 'Max age in hours (0 = all non-protected)'
        default: '0'
      dry_run:
        description: 'Dry run mode'
        type: boolean
        default: true
concurrency:
  group: cleanup-ephemeral
  cancel-in-progress: true # later runs cancel prior pending requests
jobs:
  cleanup:
    runs-on: ubuntu-latest
    environment: deploy # ALWAYS requires approval
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: ${{ secrets.AWS_REGION }}
      - name: Run cleanup
        env:
          MAX_AGE_HOURS: ${{ inputs.max_age_hours || '0' }}
        run: ./scripts/cleanup-ephemeral-stacks.sh --tag-key github:sha --tag-value-not none
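The selection rule that the cleanup script is expected to apply can be sketched as a pure filter. This is an illustration only: the `StackSummary` shape and `selectStacksForCleanup` are hypothetical, the actual logic lives in cleanup-ephemeral-stacks.sh, and using termination protection as the "protected" signal is an assumption based on the architecture diagram.

```typescript
// Hypothetical, simplified view of a CloudFormation stack.
interface StackSummary {
  name: string;
  createdAt: Date;
  terminationProtection: boolean;
  tags: { [key: string]: string };
}

// Returns names of stacks eligible for deletion:
//  - tagged github:sha with a value other than "none" (CI-deployed)
//  - not protected (assumed: terminationProtection is false)
//  - at least maxAgeHours old, where 0 means "all non-protected"
function selectStacksForCleanup(stacks: StackSummary[], maxAgeHours: number, now: Date): string[] {
  const cutoffMs = maxAgeHours * 60 * 60 * 1000;
  return stacks
    .filter((stack) => {
      const sha = stack.tags["github:sha"];
      const ciDeployed = sha !== undefined && sha !== "none";
      const ageMs = now.getTime() - stack.createdAt.getTime();
      const oldEnough = maxAgeHours === 0 || ageMs >= cutoffMs;
      return ciDeployed && !stack.terminationProtection && oldEnough;
    })
    .map((stack) => stack.name);
}
```

Keeping the rule free of AWS calls makes it straightforward to test that main-*-prd stacks and untagged, manually created stacks are never selected.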
Resource Baseline and Diff (via Releases)
Diff output example (shown to approver in Step Summary)
## ⚠️ New AWS Resource Types (agentcore)
The following resource types are NEW compared to latest release v2026.05.10-fa647ca:
+ AWS::EKS::Cluster
+ AWS::EKS::Nodegroup
+ AWS::IAM::OpenIDConnectProvider
Approver action: Verify cost model, quotas, security posture, and cleanup behavior.
## Resource count: 47 → 50 (+3)
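The resource-type diff behind this summary amounts to a set difference between the baseline and the freshly synthesized template. The sketch below assumes each side is available as a plain list of CloudFormation type names; that format, along with the names `diffResourceTypes` and `formatAddedTypes`, is hypothetical and not confirmed by the RFC.

```typescript
interface ResourceTypeDiff {
  added: string[];
  removed: string[];
}

// De-duplicates and sorts so the output is stable across runs.
function uniqueSorted(types: string[]): string[] {
  return types.filter((t, i) => types.indexOf(t) === i).sort();
}

// Assumption: baseline and current are lists of type names such as "AWS::EKS::Cluster".
function diffResourceTypes(baseline: string[], current: string[]): ResourceTypeDiff {
  const base = uniqueSorted(baseline);
  const next = uniqueSorted(current);
  return {
    added: next.filter((t) => base.indexOf(t) === -1),
    removed: base.filter((t) => next.indexOf(t) === -1),
  };
}

// Renders added types in the "+ <type>" style of the example summary.
function formatAddedTypes(diff: ResourceTypeDiff): string {
  return diff.added.map((t) => "+ " + t).join("\n");
}
```

With an empty baseline (first deploy), every type in the template lands in `added`, which matches the "everything shows as new" behavior described for the first release.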
Approval Gate: What Reviewers Should Check
The deployment summary provides:
- Resource type diff from baseline (new/removed services)
- Full cdk diff (property-level changes from the synthesized artifact)
- Compute type and stack name being deployed
- Labels that triggered the deployment
Per new resource type, verify:
| Check | How |
| --- | --- |
| Cost model | AWS Pricing / awspricing MCP |
| Service quotas | aws service-quotas list-service-quotas --service-code <code> |
| Security posture | Public endpoints? VPC-only? Encryption at rest? |
| IAM blast radius | What * permissions does CDK grant? |
| Cleanup behavior | RemovalPolicy.DESTROY? Orphan risk? |
| Regional availability | Available in target region? |
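For the IAM blast radius check, a reviewer could scan policy statements for wildcard grants before reading them in detail. This is a rough sketch: `PolicyStatement` models only the fields used here, `findWildcardGrants` is a hypothetical helper, and treating any `*` in an action or a bare `*` resource as worth flagging is an assumption about what the review wants to surface.

```typescript
// Simplified shape of an IAM policy statement (only the fields used below).
interface PolicyStatement {
  Sid?: string;
  Effect: "Allow" | "Deny";
  Action: string | string[];
  Resource: string | string[];
}

function toList(value: string | string[]): string[] {
  return typeof value === "string" ? [value] : value;
}

// Flags Allow statements with a wildcard in any action, or with Resource "*".
function findWildcardGrants(statements: PolicyStatement[]): PolicyStatement[] {
  return statements.filter((statement) => {
    if (statement.Effect !== "Allow") {
      return false;
    }
    const wildcardAction = toList(statement.Action).some((a) => a.indexOf("*") !== -1);
    const wildcardResource = toList(statement.Resource).some((r) => r === "*");
    return wildcardAction || wildcardResource;
  });
}
```

Applied to the GitHub role policy in this RFC, only the CleanupENIs statement would be flagged, since its Resource is `*` while the sts:AssumeRole statement is scoped to role ARN prefixes.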
Implementation Plan
Phase 1: Foundation
- GitHub environment deploy (no self-approval, no bypass, prevent self-review)
- sts:AssumeRole permission to CDK bootstrap roles
Phase 2: Build pipeline
- build.yml matrix (PR #91), currently [agentcore]
- cdk.context.json with all 13 github:* tags + compute_type + stackName (PR #91)
- github:* resource tags via CDK context (PR #91, #93)
- cdk-<compute_type>-out immutable artifact per matrix leg (PR #91)
- compute_type read from context in CDK and applied as resource tag (PR #97)
- computeVariant → compute_type rename in build.yml context generation (PR #97)
Phase 3: Deploy pipeline
- deploy.yml: downloads exact artifact, never re-synths (PR #98)
- Label-driven deploy selection (deploy = all, deploy:<type> = one) (PR #98)
- cdk diff output to step summary
Phase 4: Cleanup
- cleanup-ephemeral-stacks.sh targeting by github:sha tag presence
- cleanup.yml with approval gate and cancel-in-progress
Phase 5: Observability
- CONTRIBUTING.md
Security Considerations
- No long-lived credentials: OIDC only → assumes CDK bootstrap roles
- Permissions boundary: GitHub role can ONLY assume the 4 CDK bootstrap roles, plus the ENI and CloudFormation actions needed for cleanup
- No self-approval: Enforced at GitHub environment level
- No admin bypass: Even org owners must get approval
- Audit trail: GitHub deployment history + CloudTrail
- Tag-based targeting: Cleanup identifies stacks by github:sha tag (applied to all CI-deployed stacks)
- Termination protection: main-*-prd stacks cannot be accidentally deleted
- Artifact integrity: What CI tested is exactly what gets deployed (no re-synth)
References
- PR #91: synth-per-variant build with github:* context in artifact
- PR #93: github:* resource tags (13 total)