feat(campaign-sender): campaigns.send consumer with tabular idempotency and pause/stop recheck (EVO-1217)#47

Merged: dpaes merged 7 commits into develop from feat/EVO-1217 on Jun 10, 2026

Conversation

@nickoliveira23

Summary

  • New campaign-sender runner (story 4.3): consumes campaigns.send pages published by the packer (4.2), hydrates contacts from the CRM and dispatches each PENDING contact through the shared CrmInboxDispatcher (2.2).
  • Tabular idempotency via CampaignContactStatus (2.3): non-PENDING contacts are skipped (skipped: already sent); SENT/FAILED updates are conditional on status='PENDING' so replica races never double-mark (FR30, NFR16).
  • Pause/stop honored mid-batch: Campaign.status rechecked before every dispatch through a per-instance 5s TTL cache; on Paused/Stopped the batch aborts, acks and logs aborted: campaign paused (FR21–FR24).
  • 4xx/5xx dispatch failures mark the contact FAILED immediately with the reason in the structured log — retry/backoff is story 4.5 by design (FR33).
  • RUN_MODE=campaign-sender wired in main.ts/app.module.ts, replacing the 1.1 stub (FR5); consumer group campaign-sender-campaigns.send supports horizontal replicas (FR6).
  • Observability (descoped here from EVO-1223): per-message duration (p50/p95/p99), throughput, categorized error counters, and a 15s consumer-lag poll backed by a new IMessageBroker.getTopicLag (Kafka: high-watermark minus committed offsets; RabbitMQ: queue ready count via idempotent assertQueue).
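
The per-page flow described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the BatchDeps interface, the processPage function, the "Running" status name and the log field names are all hypothetical, and the real service depends on TypeORM and the shared dispatcher, which are not shown here.

```typescript
// Hypothetical shapes; the PR's real entities and services are not shown in the description.
type ContactStatus = "PENDING" | "SENT" | "FAILED";
type CampaignStatus = "Running" | "Paused" | "Stopped"; // "Running" is an assumed name

interface ContactRow {
  contactId: string;
  status: ContactStatus;
}

interface BatchDeps {
  // Expected to be answered from a short-lived cache (5s TTL in the PR).
  getCampaignStatus(campaignId: string): Promise<CampaignStatus>;
  // Resolves to the HTTP status of the downstream dispatch call.
  dispatch(contactId: string): Promise<number>;
  // Conditional update: resolves to true only if the row was still PENDING.
  mark(contactId: string, to: "SENT" | "FAILED"): Promise<boolean>;
  log(message: string, fields?: Record<string, unknown>): void;
}

type BatchOutcome = "completed" | "aborted";

async function processPage(
  campaignId: string,
  rows: ContactRow[],
  deps: BatchDeps,
): Promise<BatchOutcome> {
  for (const row of rows) {
    // Tabular idempotency: anything that is no longer PENDING is skipped.
    if (row.status !== "PENDING") {
      deps.log("skipped: already sent", { contactId: row.contactId });
      continue;
    }
    // Pause/stop is rechecked before every single dispatch.
    const status = await deps.getCampaignStatus(campaignId);
    if (status === "Paused" || status === "Stopped") {
      deps.log("aborted: campaign paused", { campaignId, status });
      return "aborted"; // the caller acks; remaining rows stay PENDING
    }
    const httpStatus = await deps.dispatch(row.contactId);
    if (httpStatus >= 400) {
      // 4xx/5xx fail immediately; retry/backoff is out of scope here.
      await deps.mark(row.contactId, "FAILED");
      deps.log("dispatch failed", { contactId: row.contactId, httpStatus });
      continue;
    }
    await deps.mark(row.contactId, "SENT");
  }
  return "completed";
}
```

Returning "aborted" instead of throwing keeps the pause path on the normal ack route, which matches the summary's "aborts, acks and logs" wording.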

Security

  • No new HTTP surface (pure broker consumer); payload validated with the zod contract at the edge, invalid messages dropped to DLQ.
  • CRM dispatch reuses the shared dispatcher (X-Service-Token via env); contact ids only ever bound through TypeORM parameterized queries.
  • Defensive dedupe of contactIds prevents double-send on a malformed page.
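
As a small illustration of the defensive dedupe mentioned above, a first-occurrence-preserving dedupe could look like the sketch below. The function name dedupeContactIds is hypothetical; the PR description does not show the actual helper.

```typescript
// Hypothetical helper: removes repeated ids while keeping first-occurrence order,
// so a malformed page that lists the same contact twice dispatches it only once.
function dedupeContactIds(contactIds: readonly string[]): string[] {
  // A Set keeps insertion order, so the original page order is preserved.
  return Array.from(new Set(contactIds));
}
```

For example, dedupeContactIds(["a", "b", "a"]) yields ["a", "b"].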

Test plan

  • npm run typecheck — clean
  • npm test -- src/runners/campaign-sender — 30 tests, all 6 ACs covered literally (batch SENT+sent_at, already-sent skip, mid-batch pause abort, 4xx FAILED, subscribe on boot, metrics/lag)
  • npm test -- src/shared/broker src/runners/campaign-packer — regression green
  • Boot smoke: npm run dev:campaign-sender joins group campaign-sender-campaigns.send with all 12 partitions assigned and stays alive (stub previously exited 0)
  • Known pre-existing failure on develop (campaigns.controller.spec.ts stop/accountId) — unrelated, reproduces on clean checkout

Changed Files

  • src/runners/campaign-sender/ — module, campaigns-send.consumer.ts, campaign-sender.service.ts, batch-dispatcher.service.ts, errors, specs
  • src/shared/broker/interfaces/message-broker.interface.ts + Kafka/RabbitMQ adapters — additive getTopicLag
  • src/main.ts, src/app.module.ts — RUN_MODE wire-up

Deferred ACs

  • AC6 (metrics "visible at GET /metrics"): metrics are emitted on the shared Prometheus register, but worker modes don't start an HTTP server — per-mode HTTP exposure is story 5.1 (EVO-1226), same pattern as event-process in EVO-1223. Accepted by @nickoliveira23 on 2026-06-10.

Linked Issue

  • EVO-1217

nickoliveira23 and others added 7 commits June 10, 2026 11:24
Kafka sums high-watermark minus committed offset across partitions for the
mode's consumer group; RabbitMQ reads the ready count of the mode's queue.
Feeds the consumer_lag gauge required by NFR33.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
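
The Kafka side of the lag calculation in the commit above (high-watermark minus committed offset, summed across partitions) can be sketched as a pure function. The input shapes are assumptions, loosely modeled on string offsets as Kafka clients commonly report them; the function name sumConsumerLag and the choice to count an uncommitted partition from offset 0 are also assumptions, not taken from the PR.

```typescript
// Assumed shapes: offsets arrive as decimal strings, one entry per partition.
interface PartitionHighWatermark {
  partition: number;
  high: string;
}

interface PartitionCommitted {
  partition: number;
  offset: string; // "-1" is assumed to mean "nothing committed yet"
}

function sumConsumerLag(
  highs: readonly PartitionHighWatermark[],
  committed: readonly PartitionCommitted[],
): number {
  const committedByPartition = new Map<number, number>();
  committed.forEach((c) => {
    committedByPartition.set(c.partition, Number(c.offset));
  });

  let total = 0;
  highs.forEach((h) => {
    const high = Number(h.high);
    const raw = committedByPartition.get(h.partition);
    // Assumption: a missing or negative committed offset counts from 0.
    const position = raw === undefined || raw < 0 ? 0 : raw;
    // Clamp at 0 so a stale high-watermark read never produces negative lag.
    total += Math.max(0, high - position);
  });
  return total;
}
```

Number is used for brevity; offsets beyond 2^53 would need BigInt.
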
…RM inbox (EVO-1217)

Loads the batch's MessageTemplate once (missing template is terminal),
renders legacy-compatible placeholders per contact and delegates delivery
to the shared CrmInboxDispatcher (story 2.2).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…use recheck (EVO-1217)

Processes one campaigns.send page: skips non-PENDING contacts (FR30/NFR16),
rechecks Campaign.status before every dispatch through a 5s TTL cache
(FR21-FR24) and records SENT/FAILED via updates conditional on PENDING so
replica races never double-mark. 4xx/5xx fail immediately - retry is 4.5.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
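
The "conditional on PENDING" update in the commit above is what keeps two replicas from double-marking a contact. The sketch below simulates it in memory so it runs standalone; the class, method and table names are hypothetical, and the real code goes through TypeORM rather than a Map.

```typescript
type ContactStatus = "PENDING" | "SENT" | "FAILED";

interface StatusRow {
  status: ContactStatus;
  sentAt: Date | null;
}

// In-memory stand-in for the status table (hypothetical).
class InMemoryContactStatusTable {
  private readonly rows = new Map<string, StatusRow>();

  insertPending(key: string): void {
    this.rows.set(key, { status: "PENDING", sentAt: null });
  }

  // Mirrors, in spirit, a statement such as:
  //   UPDATE <status table> SET status = :to, sent_at = :now
  //   WHERE id = :key AND status = 'PENDING'
  // and returns the number of affected rows (0 or 1).
  updateIfPending(key: string, to: "SENT" | "FAILED", now: Date): number {
    const row = this.rows.get(key);
    if (!row || row.status !== "PENDING") {
      return 0; // another replica already claimed this contact
    }
    row.status = to;
    row.sentAt = to === "SENT" ? now : null;
    return 1;
  }

  get(key: string): StatusRow | undefined {
    return this.rows.get(key);
  }
}
```

A caller treats an affected-row count of 0 as "someone else already recorded the outcome" and does not overwrite it. The guard only protects the status write: it does not stop two replicas from both dispatching before either update lands.
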
…s (EVO-1217)

Consumer mirrors the packer pattern: contract validation up-front,
correlationId wrapping and shared ack/nack policy (terminal -> DLQ).
Emits per-message duration (p50/p95/p99), throughput, categorized errors
and a 15s background consumer-lag poll via PipelineMetricsService (NFR33).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…1 stub (EVO-1217)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…-closing 404 (EVO-1217)

checkQueue on a missing queue raises a channel-level error that closes the
adapter's single shared channel, so a best-effort metrics poll could take
down publish/ack for the whole process. assertQueue with the consumer's
exact declaration is idempotent and returns the same messageCount.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
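
The switch from checkQueue to assertQueue described in the commit above can be sketched as follows. The QueueChannel interface is a minimal structural view of the channel method used, written here so the example runs without a broker; the function name readyCount and the option values in the test are hypothetical.

```typescript
// Minimal structural view of the reply and channel (assumed shapes).
interface AssertQueueReply {
  queue: string;
  messageCount: number;
  consumerCount: number;
}

interface QueueDeclaration {
  durable?: boolean;
  arguments?: Record<string, unknown>;
}

interface QueueChannel {
  assertQueue(queue: string, options?: QueueDeclaration): Promise<AssertQueueReply>;
}

// Reads the ready-message count by re-declaring the queue.
// The declaration must match the consumer's own declaration exactly:
// a mismatch is itself a channel-level error (PRECONDITION_FAILED).
async function readyCount(
  channel: QueueChannel,
  queue: string,
  declaration: QueueDeclaration,
): Promise<number> {
  const reply = await channel.assertQueue(queue, declaration);
  return reply.messageCount;
}
```

Sharing one declaration object between the consumer setup and the lag poll is one way to keep the two from drifting apart.
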
…e, cache bound (EVO-1217)

- hydrate only PENDING contacts so a redelivered already-SENT page skips
  cheaply instead of re-fetching the whole batch from the CRM (NFR16)
- dedupe payload contactIds: a repeated id would dispatch twice since the
  in-memory row stays PENDING after the first send
- bound the campaign status cache (prune expired entries past 1000)
- log 'campaign contact failed' instead of 'dispatch failed' on the
  pre-dispatch contact_not_found path; JSON-stringify object-valued custom
  attributes in template rendering

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
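
The bounded status cache from the commit above (5s TTL, expired entries pruned once the map grows past a limit) might look roughly like this. The class name, constructor parameters and injectable clock are assumptions made for the sketch; only the TTL and the 1000-entry threshold come from the PR text.

```typescript
interface CacheEntry<V> {
  value: V;
  expiresAt: number;
}

// Hypothetical per-instance TTL cache with a soft size bound.
class TtlCache<V> {
  private readonly entries = new Map<string, CacheEntry<V>>();
  private readonly ttlMs: number;
  private readonly maxEntries: number;
  private readonly now: () => number;

  constructor(ttlMs: number, maxEntries: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // lazily drop the expired entry
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    // Prune only when the bound is reached, so the common path stays O(1).
    if (this.entries.size >= this.maxEntries) this.pruneExpired();
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get size(): number {
    return this.entries.size;
  }

  private pruneExpired(): void {
    const t = this.now();
    this.entries.forEach((entry, key) => {
      if (entry.expiresAt <= t) this.entries.delete(key);
    });
  }
}
```

A cache for campaign status would then be constructed as new TtlCache(5000, 1000). Pruning only removes expired entries, so this is a soft bound: if every entry is still live the map can exceed the limit until the entries age out.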

@sourcery-ai sourcery-ai Bot left a comment

Sorry @nickoliveira23, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

@dpaes dpaes left a comment

✅ Approved (with nits) — EVO-1217

Reviewed adversarially; cloned the PR head to verify load-bearing context. Base develop, MERGEABLE/CLEAN.

AC verification

  • AC1–AC5 ✓ — dispatch → markSent conditional-on-PENDING (replica-race-safe status lock); pending-only hydration; pause/stop rechecked before every dispatch via a bounded 5s TTL cache; 4xx → FAILED with reason logged; RUN_MODE=campaign-sender wired (1.1 stub removed), consumer subscribes and stays alive. Producer (campaign-packer) publishes exactly the CampaignsSendContract the consumer validates; every dependency exists with the assumed signature, incl. the new additive IMessageBroker.getTopicLag on both Kafka and RabbitMQ adapters.
  • AC6 ✓ emitted / ⚠️ partial on the GET /metrics HTTP-exposure clause only — worker modes don't start an HTTP server; deferred to EVO-1226 (5.1), accepted with a dated callout. Same pattern as EVO-1223.

Non-blocking (DM'd to the dev):

  • [Medium] dispatch-then-claim → cross-replica double-SEND under redelivery (at-least-once; consistent with the card's status-lock model, not a spec violation). Worth a note when retry/rate-limit (4.4/4.5) land.
  • [Low] getTopicLag has no broker.contract.spec coverage (the only CI gate that runs real brokers) — its Kafka offset math / RabbitMQ assertQueue path is unit-mocked only. Consumer-group naming verified to match subscribe().
  • [Nit] AC5 card text names a non-existent script (start:dev:campaign-sender vs the real dev:campaign-sender); processed_params array-through-Record cast (behavior-preserving).

Note: CI does not run jest, tsc, or lint, so I recommend running npm test -- src/runners/campaign-sender and npm run typecheck locally before merge. Squash-merging to develop.

@dpaes dpaes merged commit 2c88250 into develop Jun 10, 2026
4 checks passed
@dpaes dpaes deleted the feat/EVO-1217 branch June 10, 2026 15:27