Skip to content

feat(notifications): Linear dispatcher in fanout consumer (cost/turns/duration)#243

Open
isadeks wants to merge 28 commits into
aws-samples:mainfrom
isadeks:feat/239-linear-fanout-dispatcher
Open

feat(notifications): Linear dispatcher in fanout consumer (cost/turns/duration)#243
isadeks wants to merge 28 commits into
aws-samples:mainfrom
isadeks:feat/239-linear-fanout-dispatcher

Conversation

@isadeks
Copy link
Copy Markdown
Contributor

@isadeks isadeks commented Jun 2, 2026

Closes #239. Stacked on #241 (the screenshot pipeline) — review #241 first; this PR is the dispatcher that composes with it for the screenshot-from-Linear-issue feedback loop.

Summary

  • New Linear dispatcher in fanout-task-events.ts, mirroring the slack/github/email pattern. Single-entry registration via the existing NotificationChannel + CHANNEL_DEFAULTS + DISPATCHERS extension surface.
  • Posts a deterministic final-status comment to the linked Linear issue on terminal task events. Three framing modes: ✅ task_completed, ⚠️ error_max_turns-with-PR, ❌ all other terminal states.
  • Catches the case where the agent crashes (OOM, SDK buffer overflow, max_turns) before reaching its own step-3 completion comment — see ABCA-91 in feat(notifications): platform-side final-status Linear comment with cost/turns/duration #239's description.
  • Reuses the existing postIssueComment helper; no new outbound Linear API surface.

Architecture

  • Channel registration: 1 line in NotificationChannel, 1 entry in CHANNEL_DEFAULTS (terminal events only), 1 entry in DISPATCHERS map.
  • dispatchToLinear gates on channel_source === 'linear' so non-Linear tasks short-circuit after one DDB Get — same shape as the Slack dispatcher's gate.
  • renderLinearFinalStatusComment is exported for testability and cleanly separates the (event_type, pr_url) → header logic from the body assembly.
  • Construct gets two new guarded props (linearWorkspaceRegistryTable, linearOauthSecretArnPattern) matching the slackSecretArnPattern pattern. A deployment without LinearIntegration gets no dangling IAM grants.
  • FanOutConsumer moved below LinearIntegration in agent.ts so it can receive the registry table reference (LinearIntegration depends on runtime and orchestrator, both already constructed earlier).

Test plan

  • 6 new Linear-dispatcher tests: happy path, failed without PR, ABCA-91 max_turns-with-PR, channel_source short-circuit, missing channel_metadata, postIssueComment graceful no-op
  • Existing routing tests updated for the new dispatcher count (4 channels: slack/github/linear/email)
  • Full CDK suite: 1816 passing, 101 suites (15 min on M1)
  • CI: build + tests pass on PR
  • Smoke test on dev stack (deferred until feat(notifications): preview-deploy screenshot pipeline (provider-agnostic) #241 lands so this doesn't fight against the in-flight stack)

Out of scope (deferred per #239)

  • Linear reaction-flip logic (the ❌ vs ⚠️ on the original issue)
  • Migrating the agent's three-comment prompt contract to platform-side (step 3 in the prompt becomes redundant once this lands)
  • Real-time progress comments — Linear's save_comment doesn't support edit, so this is post-once on terminal events only

isadeks and others added 24 commits June 2, 2026 13:21
… yet)

Lambda + AgentCore Browser plumbing for capturing screenshots of
preview deployments. Provider-agnostic — listens for GitHub
deployment_status events from any source (Vercel, Amplify Hosting,
Netlify, GitHub Actions custom CD).

This commit lands the handler / construct code only. Stack wiring
follows in the next commit.
- New `GitHubScreenshotIntegration` construct (mirrors `LinearIntegration`):
  bundles the screenshot bucket, dedup table, signing-secret placeholder,
  receiver Lambda, processor Lambda, and the API Gateway route. cdk-nag
  suppressions added inline (HMAC auth instead of Cognito; AgentCore
  Browser sessions have no per-resource ARN; Secrets Manager rotation
  is owned by GitHub).

- Wired into `agent.ts` after the LinearIntegration block. Reuses the
  existing `githubTokenSecret` (the processor uses ABCA's main GitHub
  token to look up which PR a deploy SHA belongs to and post the
  screenshot comment — no new credential).

- Three new stack outputs: `GitHubWebhookUrl`, `GitHubWebhookSecretArn`,
  `ScreenshotBucketName`.

- Bumped agent.test.ts table count from 13 to 14 to account for the
  new dedup table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ot bucket

cdk-nag's S2 fires on any bucket that has `blockPublicPolicy: false`
even when the policy is intentionally permissive. Add the suppression
with the same rationale as S1/S5 — public reads are required by
GitHub Markdown renderers and Linear `imageUploadFromUrl`, and the
read grant is prefix-scoped to `screenshots/*`.

Caught when the first deploy attempt aborted at synth-time on the new
GitHubScreenshotIntegration construct.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The first deploy attempt failed at CFN-execute time on the bucket
policy:

  s3:PutBucketPolicy ... because public policies are prevented by
  the BlockPublicPolicy setting in S3 Block Public Access.

Account-level Block Public Access is on for this AWS account, which
overrides per-bucket BPA settings. Disabling it would change the
security posture of the whole account, so route around the constraint
with the AWS-recommended pattern: private S3 + CloudFront with Origin
Access Control.

Changes:
- `ScreenshotBucket` is now `BLOCK_ALL` BPA, no public bucket policy.
  Adds a `cloudfront.Distribution` whose origin is the bucket via
  `S3BucketOrigin.withOriginAccessControl`. The distribution policy is
  scoped to the CloudFront service principal only, so account-level
  BPA accepts it.
- Processor reads `SCREENSHOT_PUBLIC_HOST` (the CloudFront domain)
  instead of building an S3 URL. PR comments now embed
  `https://<dist>.cloudfront.net/screenshots/...` URLs.
- New stack output `ScreenshotCloudFrontDomain`.
- Bucket-level S2/S5 suppressions removed (no longer applicable —
  bucket is private). Distribution gets CFR1/CFR2/CFR3/CFR4/CFR7
  suppressions with rationales.

Heads up on deploy time: CloudFront distributions take 5-15 min to
provision on first create.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The CommonRuleSet was 403'ing GitHub deployment_status webhooks before
the request reached our Lambda — the deployment payload contains
absolute Vercel preview URLs in the body, which trips GenericRFI_BODY.

Mirror the Linear webhook exemption: the GitHub webhook path is
HMAC-verified in the Lambda, parsed as strict JSON, never
interpolated into SQL/HTML, and rate-limited by the priority-3 rule.
CRS still applies to every other route.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…loyment

GitHub's `deployment_status` webhook puts the deployed URL on the
*status* object, not the deployment itself. The deployment object is
immutable per (sha, environment); the status changes through the
deploy lifecycle (`pending` → `success`) and carries the URL only
once the deploy finishes.

Symptom: receiver kept short-circuiting `success` events from Vercel
with `{ok: true, skipped_no_url: true}` because we read the wrong
field. Verified by inspecting the webhook delivery payload via
`gh api .../deliveries/<id> --jq .request.payload.deployment_status` —
URL was there all along.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dshake

Node 24's global WebSocket (from undici) does NOT support arbitrary
HTTP headers on the upgrade request — passing them as the second arg
gets silently ignored. AgentCore Browser's WSS handshake requires
SigV4-signed Authorization + X-Amz-* headers, so the connection was
opening but then getting rejected, which surfaced as an empty
`error` event ("AgentCore Browser WebSocket error: ").

Switch to the `ws` package which natively supports `options.headers`.
Also add an `unexpected-response` handler so HTTP-level handshake
failures (403, 400) surface with status codes instead of empty errors.

Smoke verified locally — the ws-based path opens cleanly against
example.com and Vercel preview URLs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lambda runtime returned a 403 on the WSS upgrade despite well-formed
SigV4 headers — `ws` rewrites the Host header during the upgrade
GET, which invalidates the canonical-request signature we computed
against the original Host. This works locally because Node's tooling
on macOS keeps the original Host through the handshake, but the
Lambda runtime's TLS stack normalizes differently.

Switch to query-parameter SigV4 (presigned URL): SignatureV4.presign
returns a wss://...?X-Amz-Algorithm=...&X-Amz-Signature=... URL where
the auth lives in the URL itself, so any Host-header rewriting
downstream doesn't break the signature.

Smoke verified locally — presigned URL connects cleanly to AgentCore
Browser and the screenshot pipeline runs end-to-end (6.3s, valid
PNG, captures example.com correctly).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The minimal IAM I shipped earlier (`StartBrowserSession`,
`StopBrowserSession`, `GetBrowserSession`, `UpdateBrowserStream`)
wasn't enough — the WSS automation-stream connect requires an
additional `ConnectBrowserAutomationStream`-flavored action that
isn't in the public CLI command list. Lambda invocations were
opening sessions cleanly but 403'ing on the WSS upgrade.

Widen to `bedrock-agentcore:*` to unblock the e2e flow. Followup:
scope back down to the specific connect action once it's documented
or surfaced via CloudTrail decoded-message-on-deny.

Smoke verified: PR #1 on isadeks/vercel-abca-linear now receives a
screenshot comment within ~7s of the deployment_status webhook.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends the screenshot processor to find a Linear issue via the PR's
title/body and post the same image comment there.

Approach (no GSI write-back needed):
- Regex-extract Linear identifier (e.g. `ABCA-42`) from PR title/body.
  These are present whether the agent put them there
  (`task_description` carries the identifier) or Linear's own GitHub
  integration auto-injected the back-reference on PR open.
- Scan `LinearWorkspaceRegistryTable` for `status=active` workspaces.
  Per-workspace, query Linear's `issueVcsBranchSearch` (which accepts
  the human-readable identifier) and accept the first exact-match
  hit.
- Post the markdown image comment via the existing `postIssueComment`
  helper from Phase 2.0b.

The Linear post is best-effort — if the registry table isn't wired,
the identifier doesn't extract, or the lookup misses, the GitHub PR
comment still lands. New env var `LINEAR_WORKSPACE_REGISTRY_TABLE_NAME`
is optional on the processor; the construct only sets it when the
prop is provided.

CDK: `GitHubScreenshotIntegrationProps` gains an optional
`linearWorkspaceRegistryTable`. When provided, the processor's IAM
grows: ReadData on the registry, GetSecretValue+PutSecretValue on
`bgagent-linear-oauth-*`. `agent.ts` wires
`linearIntegration.workspaceRegistryTable` into the screenshot
construct.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Some providers (Vercel, Netlify) post deployment_status faster than
the agent can run `gh pr create`. Retry the GitHub PR-lookup with
backoff so the screenshot finds the open PR rather than dropping the
event when the timing is reversed.
…issue comment spam

Move the trigger-label check ahead of every user-facing comment path in
the Linear webhook processor, and switch the default trigger label from
'bgagent' to 'abca'. An unlabeled issue is now a true no-op: no comment,
no reaction, no createTaskCore, no DDB writes — regardless of whether
the project is onboarded.

Why: workspace webhooks fire workspace-wide. A single un-onboarded team
in the same Linear workspace produced 47 identical "❌ project isn't
onboarded" comments on GRO-783 in 5 minutes because every Issue event
(create/update/label-change) hit the not-onboarded gate before the
label gate. With the gate order flipped, only issues that explicitly
opt in via the trigger label can ever generate user-facing feedback.

Per-project label_filter override is still respected — the project
mapping lookup now happens once, before the label gate, instead of after.

Tests: two new regression tests pin the spam scenario (unlabeled issue
in a non-onboarded project, and unlabeled issue with no projectId) to
zero side effects. Full CDK suite (89 suites / 1572 tests) passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the operator walkthrough for wiring up the AgentCore-Browser
preview-deploy screenshot pipeline.
Mirrors the Linear webhook-info pattern so docs and onboarding don't
have to embed stack-specific URLs or copy-paste aws CLI invocations.

Two subcommands:
- `webhook-info` — read-only. Reads GitHubWebhookUrl + GitHubWebhookSecretArn
  from the CFN stack outputs and prints values to paste into a GitHub
  repo's webhook config (Settings → Webhooks → Add webhook). Includes
  the event-type ('Deployment statuses') and content-type guidance
  that operators consistently miss.
- `set-webhook-secret` — interactive PutSecretValue against the stack
  output ARN. Replaces the cargo-cult `aws secretsmanager put-secret-
  value` operators were copy-pasting from the screenshot setup notes.
  Warns before overwriting an existing real secret (heuristic: a CDK-
  seeded JSON placeholder starts with `{`; a real GitHub secret won't).

No CDK changes — both stack outputs were already there. Pure CLI add.
The pipeline was always provider-agnostic — it listens for GitHub
deployment_status events, which Vercel, AWS Amplify, Netlify, and
any GitHub-Actions-driven CD pipeline all post. Code comments,
inline strings, and the setup guide referenced Vercel as if it were
the only supported path; this commit aligns the surfacing with what
the code actually does.

Code:
- Linear comment body: "after the Vercel preview deploy finished"
  → "after the deploy finished" (the GitHub PR comment already
  said this; just the Linear path was inconsistent)
- Webhook receiver doc-comment + envelope interface comment: drop
  Vercel-only language; explain that the `environment` filter
  (`SCREENSHOT_TARGET_ENVIRONMENT` env var) is configurable per-
  provider, with a table of common values
- Processor PR-race comment: explain that the gap is also seen on
  Netlify/Amplify, not unique to Vercel
- AgentCore Browser comment: drop Vercel-specific phrasing on
  "what we don't try to be clever about"
- GitHubScreenshotIntegration construct prop docstring: explain
  the per-provider env-name conventions

Docs:
- Rename VERCEL_SETUP_GUIDE.md → DEPLOY_PREVIEW_SCREENSHOTS_GUIDE.md
- Lead with a "works with any provider that posts deployment_status"
  table (Vercel / Amplify / Netlify / GitHub Actions custom CD,
  with "out-of-the-box?" yes/no per provider)
- Keep Vercel as the worked example since it's what we smoke-tested,
  but add a "skip Steps 1-2" callout for non-Vercel providers
- New "Configuring for non-Vercel providers" section with the
  SCREENSHOT_TARGET_ENVIRONMENT override pointer
- Replace 4a/4b's CFN-output spelunking with `bgagent github
  webhook-info` + `bgagent github set-webhook-secret` (commands
  shipped in 1c1b618)
- Troubleshooting: mention that 401 "Invalid signature" is the
  set-webhook-secret-mismatch case
- Sync registration: register as DEPLOY_PREVIEW_SCREENSHOTS_GUIDE
  in sync-starlight.mjs route map + the explicit mirror call;
  added to astro.config.mjs sidebar after the PAK runbook

No CDK structural changes — the construct prop, env-var, and code
behaviour were already provider-agnostic. Pure surfacing fix.
…eamble

Step 3 (repo onboarding + Linear project mapping) duplicated work
the Prerequisites section already establishes ('Linear OAuth installed
for at least one workspace'). If the user followed the Linear setup
guide, both are done. If they didn't, Step 4's smoke test fails fast
and the troubleshooting routes them back. Net: 30 lines of doc gone,
no information lost.

Renumbered Step 4 → 3 and Step 5 → 4 (and the 4a/b/c → 3a/b/c
sub-steps).

Also dropped the 'demo configuration optimizes for "look, it works"
rather than security posture' framing on the production-hardening
section. The list of followups stands on its own; the framing reads
as condescending toward someone reaching the bottom of the guide.
… state

Public docs that say 'followup' read as commitments to do that work.
Reframe gaps as current limitations with neutral language:
- 'Production hardening (followups)' → 'Production hardening
  considerations'; bullets describe what to think about, not what
  ABCA promises to ship
- Netlify table row: 'followup to support pattern matching' →
  '⚠ workable today only by picking one specific PR's
  environment string; broader pattern matching isn't shipped'
- Vercel auth callout: 'tracked as a followup' → 'currently not
  implemented'
- Non-Vercel providers table: drop 'followup aws-samples#96 covers prefix
  routing' reference (issue numbers don't belong in user-facing
  docs)

Net: same information, no implicit roadmap commitments.
The screenshot pipeline only needs GitHub. Linear-side posting was
phrased as a hard requirement throughout the guide because the demo
flow happens to use Linear, but a non-Linear team gets a perfectly
useful integration: screenshots land on GitHub PRs, the Linear
lookup silently no-ops.

Reframings:
- Lead-in: 'on both the open GitHub PR AND the linked Linear issue'
  → 'on the open GitHub PR. If you also have Linear configured,
  the same screenshot is posted to the linked Linear issue as a
  bonus.' Plus a note on the gating (LinearWorkspaceRegistryTable
  having active rows is what flips the Linear path on).
- 'How it works': step 4 (Linear post) marked optional with the
  silent-skip behaviour spelled out
- Architecture comment: 'GitHub PR comment + Linear issue comment'
  → '... (+ Linear issue comment if linked)'
- Prerequisites: Linear OAuth marked optional with rationale
- Smoke test: rewritten as PR-driven by default ('open any PR on
  the configured repo'), with Linear-driven path as a follow-on
  paragraph ('If you also have Linear configured...')
- Troubleshooting: 'Linear is best-effort' → 'opt-in and best-
  effort', explicit note that skipping is normal without Linear
GitHub PR comment now reads 'From [preview link](url)' and Linear
comment reads '[Preview link](url)' instead of pasting the bare URL.
Cleaner visual when the same comment is posted on both surfaces.
Closes the doc gaps from the screenshot feature followup list:

- USER_GUIDE.md: new 'Preview-deploy screenshots (optional)' subsection
  under Notifications, points at DEPLOY_PREVIEW_SCREENSHOTS_GUIDE.md.
- COST_MODEL.md: 'Optional: deploy-preview screenshots' table covering
  AgentCore Browser session, Lambda processor, S3, CloudFront line items
  (~$0.01 per screenshot, dominated by Browser session time).
- ROADMAP.md: marks the feature shipped under Notification plane with a
  one-line description of the trigger model and post-deploy latency.

Mirrors regenerated via docs/scripts/sync-starlight.mjs.
The 'Inviting teammates' section was missing the prerequisite that the
teammate needs their own ABCA account (Cognito user + configured CLI)
before they can redeem a Linear invite code. New flow walks through:

  Admin: invite-user (Cognito) → invite-user <slug> (Linear)
  Teammate: configure --from-bundle → login → linear link <code>

with cross-references to USER_GUIDE.md's 'Joining an existing
deployment' for the Cognito-side details. Also corrects the stale
'auto-links the person running the wizard' claim — setup now offers
an inline picker (opt-in by admin), not an automatic mapping.
Last batch of stale 'Vercel' framing in CLI command output, missed in
the original de-Vercel-ize sweep. Provider-agnostic now: webhook-info
header reads 'preview-deploy screenshot pipeline', the closing note
lists Vercel/Amplify/Netlify/GitHub Actions as example providers, and
the smoke-test instruction says 'push to a PR-attached branch' rather
than 'trigger a Vercel preview deploy'.

No behaviour change; pure copy.
…mutation)

The local build's eslint --fix step rewrote a no-interpolation
template literal to single quotes; CI's 'Fail build on mutation'
guard caught that the mutation wasn't committed. Apply the fix.
…ost/turns/duration

Closes aws-samples#239.

Adds a Linear dispatcher to the fanout consumer alongside the existing
slack/github/email dispatchers. Posts a deterministic final-status
comment on terminal task events for Linear-origin tasks, with cost,
turns/max_turns, duration, and PR link rendered.

Three framing modes by (event_type, pr_url):

  ✅ task_completed                 → 'Task completed'
  ⚠️ error_max_turns + pr_url       → 'Shipped a PR but stopped early'
  ❌ all other terminal states      → 'Task <subtype>: <classifier title>'

The ⚠️ case is the motivating ABCA-91 scenario: agent hit max_turns on
turn 101 after shipping PR aws-samples#35; previous behaviour was a silent ❌
reaction with no metrics surfaced to the requester. The platform-side
comment fires deterministically even when the agent crashes (OOM,
SDK buffer overflow, max_turns) before reaching its own step-3
completion comment.

Architecture:
- One new entry in NotificationChannel + CHANNEL_DEFAULTS + DISPATCHERS
  in fanout-task-events.ts. Dispatcher gates on channel_source ===
  'linear' so non-Linear tasks short-circuit after one DDB Get.
- Reuses the existing postIssueComment helper from
  shared/linear-feedback.ts (already in use by the screenshot pipeline
  + orchestrator failure-reporting paths).
- New construct props linearWorkspaceRegistryTable + linearOauthSecretArnPattern
  guard the IAM grants the same way slackSecretArnPattern does — a
  deployment without LinearIntegration gets no dangling permission to
  bgagent-linear-oauth-*.
- FanOutConsumer instantiation moved below LinearIntegration in agent.ts
  so it can receive the registry table reference.

Tests: 92 passing in fanout-task-events.test.ts (1816 across full
CDK suite). New Linear-dispatcher describe block covers happy path,
failed without PR, ABCA-91 max_turns-with-PR, channel_source short-
circuit, missing metadata, and postIssueComment-returning-false
graceful no-op.
@isadeks isadeks requested a review from a team as a code owner June 2, 2026 22:17
@krokoko
Copy link
Copy Markdown
Contributor

krokoko commented Jun 2, 2026

Looking 👁️

@krokoko
Copy link
Copy Markdown
Contributor

krokoko commented Jun 2, 2026

Review — Linear dispatcher in fanout consumer (#239)

Thanks for this — it's a clean, well-tested addition and it closes a real UX gap. The ABCA-91 case (agent crashes at max_turns after shipping a PR, leaving the Linear requester with only a ❌ reaction and no metrics) is exactly the kind of "outcomes must stay inspectable" gap worth closing, and doing it platform-side so it fires even when the agent never reaches its own step-3 comment is the right call.

A few things I especially liked:

  • The dispatcher reuses the existing NotificationChannel + CHANNEL_DEFAULTS + DISPATCHERS surface and mirrors Slack's channel_source gate and best-effort failure contract — same idiom, real per-channel substance, no copy-paste.
  • Numerics are coerced at the DDB boundary with coerceNumericOrNull, matching the GitHub dispatcher.
  • IAM grants are guarded behind optional props, so a deployment without Linear onboarding gets no dangling permissions.
  • The mockDdbSend refactor from index-based assertions to command-type filtering is the correct defensive fix now that GitHub + Linear both hit the shared Get mock.
  • Docs are in good shape — I ran node scripts/sync-starlight.mjs on the branch and the Starlight mirror is in sync, so CI's mutation guard should pass.

Verdict: Approve with nits. One item I'd like scoped down before merge, plus some polish.

Please address before merge

1. bedrock-agentcore:* action wildcard — least privilege (cdk/src/constructs/github-screenshot-integration.ts:195)

actions: ['bedrock-agentcore:*'],
resources: ['*'],

The handler only calls StartBrowserSession, StopBrowserSession, and the SigV4-presigned ConnectBrowserAutomationStream WSS endpoint. resources: ['*'] is defensible (ephemeral sessions have no stable ARN, and the IAM5 suppression documents it), but the action wildcard grants every AgentCore action (memory, runtime, gateway, identity, code-interpreter) we don't use. I see the commit history notes this was widened to unblock e2e with a "scope back down" follow-up — this is a good moment to do it:

actions: [
  'bedrock-agentcore:StartBrowserSession',
  'bedrock-agentcore:StopBrowserSession',
  'bedrock-agentcore:ConnectBrowserAutomationStream',
],

If ConnectBrowserAutomationStream can't be confirmed as the exact IAM action name (it wasn't in the public CLI list), CloudTrail's decoded deny message will have it. If it genuinely can't be resolved, keeping the wildcard is acceptable as long as the blocker is documented inline with a tracking issue rather than a bare "followup."

Non-blocking nits

  1. Doc-vs-code mismatch on the ⚠️ frame (renderLinearFinalStatusComment JSDoc, fanout-task-events.ts). The comment keys case 2 on error_max_turns as an eventType, but error_max_turns is an agent_status, never an event_type — the real trigger is !isCompleted && prUrl != null (any terminal failure that still produced a PR). The test correctly passes event_type: 'task_failed'. Worth aligning the comment to the real condition.

  2. classifyError returns UNKNOWN_CLASSIFICATION, not null, for unmatched non-empty messages (error-classifier.ts:359). The dispatcher's inline comment says it "returns null … the renderer falls back to a generic frame," but a failed task with any non-empty unmatched error_message will actually render ❌ Task failed: Unexpected error. Not a bug — just pick the intended behavior and align the comment with it.

  3. The classifier title is dropped on the ⚠️ path. For the ABCA-91 case (max-turns + PR), classification.title ("Exceeded max turns") is computed but not rendered — the requester sees the PR but not why it stopped, which is the most useful context for the motivating issue. Consider appending it to the ⚠️ frame.

  4. Log severity (dispatchToLinear): the missing_env and post_failed paths log at INFO, whereas the GitHub dispatcher's equivalent missing-env path is WARN. Since this comment is the only completion signal for the crash case, a misconfigured env var or revoked OAuth token failing every post is worth a WARN + error_id so it's alarmable. (The underlying linear-feedback.ts failures already WARN, so this is a backstop.)

  5. Stale comment in routeEvent (fanout-task-events.ts): "at most 3 channels" — now 4 (slack/github/linear/email).

  6. Output description leftover (agent.ts:883): one ScreenshotBucketName description still reads "Vercel-preview screenshots" after the de-Vercel-ize sweep. Cosmetic.

Tests

Coverage is solid and the suite passes locally (86 tests in fanout-task-events.test.ts). A few gaps the issue's acceptance criteria named that would be cheap to close:

  • Missing LINEAR_WORKSPACE_REGISTRY_TABLE_NAME env path — the env is set at module load and never unset, so the deploy-misconfig safety valve is unexercised.
  • renderLinearFinalStatusComment null-metric rendering — every fixture supplies non-null metrics, so the fallbacks and formatDuration boundaries (<60s, exact-minute Nm) aren't covered. This is exactly the crash-before-metrics case the feature is built for; a small table-driven test calling the exported renderer directly would do it.
  • error_max_turns without pr_url — the ⚠️↔❌ boundary (the prUrl != null flip) isn't asserted in that direction.

(The missing-OAuth-token graceful no-op, AC item 5, is covered at the helper layer in linear-feedback.test.ts — no need to duplicate it here.)

One small heads-up: since the PR is stacked on #241, the diff against main carries the full screenshot pipeline too — happy to review just the #239 delta if you retarget the base, otherwise no worries.

Nice work overall — scope the IAM action and this is good to go.


Reviewed with the help of automated agents (code-reviewer, silent-failure-hunter, pr-test-analyzer). 🤖

@isadeks isadeks marked this pull request as draft June 2, 2026 23:23
@codecov-commenter
Copy link
Copy Markdown

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

Thanks for integrating Codecov - We've got you covered ☂️

isadeks added 3 commits June 3, 2026 18:24
Addresses the non-blocking nits from aws-samples#239 review:

- JSDoc on renderLinearFinalStatusComment now describes the actual
  (eventType, prUrl) discriminator rather than 'error_max_turns' as
  an event type. The agent_status discrimination lives in the error
  classifier, not in the dispatcher's framing logic.
- Inline comment on classifyError result corrected: returns null only
  for empty error_message, UNKNOWN_CLASSIFICATION (title 'Unexpected
  error') for any non-empty unmatched message.
- ⚠️ frame now appends classifier title — for ABCA-91 the requester
  sees 'Shipped a PR but stopped early — Exceeded max turns', not
  just the bare PR-shipped frame.
- missing_env and post_failed log paths bumped from INFO to WARN
  with error_id tags so missing-env / post-failure are alarmable.
  The Linear comment is the only completion signal for the
  agent-crash case, so silent drops defeat the dispatcher's purpose.
- Stale 'at most 3 channels' comment in routeEvent updated to 4.

Test coverage:
- New test: LINEAR_WORKSPACE_REGISTRY_TABLE_NAME unset → WARN + skip
  (the deploy-misconfig safety valve was unexercised).
- New test: error_max_turns WITHOUT pr_url renders ❌, not ⚠️
  (the ⚠️↔❌ boundary the other direction).
- New describe block: 8 direct tests of renderLinearFinalStatusComment
  covering null-metric fallbacks, formatDuration boundaries (<60s,
  exact-minute Nm, mixed Nm Ss), classifier-title rendering on ⚠️ and
  ❌ frames, and the no-trailing-colon when errorTitle is null.

Total: 102 tests in fanout-task-events.test.ts (was 92), 1826
passing across the full CDK suite.
… dev smoke

After the first dev deploy of the Linear dispatcher (aws-samples#239), two
near-duplicate things showed up on the Linear thread:

1. The platform's ✅ comment carried PR: <url> while the agent's
   step-2 'PR opened' comment had already posted the same link one
   slot earlier. Two clickable copies of the same URL adds noise.
2. The agent's step-3 'task completed' free-form comment stacked
   right next to the platform's ✅ structured comment with full
   metrics. Two completion comments back-to-back with overlapping
   intent — the platform one is strictly more informative.

Changes:

- renderLinearFinalStatusComment: render PR url ONLY on the ⚠️
  shipped-but-stopped-early path. On ✅, the agent's step-2 comment is
  guaranteed to have fired with the PR link; on ⚠️ the agent may have
  crashed before step-2 (e.g. ABCA-91 max-turns on turn 101), so the
  platform comment is the backup signal and the PR url has to be
  there.
- Updated the corresponding test to assert not.toContain on the ✅
  fixture's PR URL.
- Removed step 3 from the Linear-channel prompt contract in
  prompt_builder.py. Replaced with an explicit prohibition against
  posting a final 'task completed/failed' comment, with a sentence
  pointing the agent at the platform fan-out plane (aws-samples#239) as the
  source of truth for terminal status.

Net Linear thread shape post-task: agent posts start (1) + PR-opened
(2); platform posts terminal ✅/⚠️/❌ (3). One PR url, one completion
comment. Krokoko predicted this exact migration in their PR-243
review — 'the agent prompt can drop step 3 once the platform side is
reliable.'

Targeted suite still 102 passing.
@isadeks isadeks marked this pull request as ready for review June 5, 2026 16:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(notifications): platform-side final-status Linear comment with cost/turns/duration

3 participants