fix(daemon): deduplicate Dolt alerts and use nudge instead of full restart #3488

Open
jw409 wants to merge 1 commit into gastownhall:main from jw409:fix/daemon-deduplicate-alerts-3474

jw409 commented on Apr 2, 2026

Summary

  • Adds doltAlertTracker with fingerprint-based cooldowns (5 min) to prevent wisp storms during Dolt crash loops
  • Switches sendCrashAlert and sendUnhealthyAlert from gt mail send (permanent Dolt commit) to gt nudge (ephemeral, zero cost)
  • Keeps escalation on gt mail send with --wisp-type escalation tag — escalations must survive session death
  • Removes witness mail fan-out; witnesses already detect degraded state via the DOLT_UNHEALTHY signal file
  • Resets alert tracker when backoff resets after a healthy period
  • Fixes wispTypeToCategory test to match current 2-arg signature
  • Adds nolint:misspell annotations for bead ID references (gt-ect)
  • Fixes serialised → serialized typo in convoy_manager.go
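
The fingerprint-based cooldown in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: only the name doltAlertTracker and the 5-minute window come from the description above. The field layout, the method names (shouldSend, reset), the fingerprint string, and the fakeClock helper are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// alertCooldown mirrors the 5-minute window named in the PR description.
const alertCooldown = 5 * time.Minute

// doltAlertTracker remembers when each alert fingerprint was last sent.
// Fields and methods are hypothetical; the real type may differ.
type doltAlertTracker struct {
	mu       sync.Mutex
	lastSent map[string]time.Time
	now      func() time.Time // injectable clock so the sketch is testable
}

func newDoltAlertTracker(now func() time.Time) *doltAlertTracker {
	return &doltAlertTracker{lastSent: make(map[string]time.Time), now: now}
}

// shouldSend reports whether an alert with this fingerprint may go out now.
// When it returns true it also records the send time.
func (t *doltAlertTracker) shouldSend(fingerprint string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	now := t.now()
	if last, ok := t.lastSent[fingerprint]; ok && now.Sub(last) < alertCooldown {
		return false // same alert already sent inside the cooldown window
	}
	t.lastSent[fingerprint] = now
	return true
}

// reset forgets all cooldowns, e.g. after the server has recovered.
func (t *doltAlertTracker) reset() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.lastSent = make(map[string]time.Time)
}

// fakeClock is a manually advanced clock used only for demonstration.
type fakeClock struct{ t time.Time }

func newFakeClock() *fakeClock            { return &fakeClock{t: time.Unix(0, 0)} }
func (c *fakeClock) Now() time.Time        { return c.t }
func (c *fakeClock) Advance(d time.Duration) { c.t = c.t.Add(d) }

func main() {
	clk := newFakeClock()
	tracker := newDoltAlertTracker(clk.Now)

	fmt.Println(tracker.shouldSend("dolt-crash")) // true: first occurrence
	fmt.Println(tracker.shouldSend("dolt-crash")) // false: inside cooldown
	clk.Advance(alertCooldown)
	fmt.Println(tracker.shouldSend("dolt-crash")) // true: cooldown elapsed
}
```

Keying the map on a fingerprint rather than on the alert type alone means distinct failures can still alert independently while a crash loop repeating the same failure stays quiet.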

Follows the communication hygiene principle: mail only for messages that must survive session death.
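
One way to picture that principle is a small routing function that picks the gt subcommand per alert kind. The sketch below is illustrative only: the alertKind type and gtSubcommand function are hypothetical names, and only the leading arguments named in this PR (gt nudge, gt mail send, --wisp-type escalation) appear. Recipients, message text, and any other arguments are omitted because the description does not specify them.

```go
package main

import "fmt"

// alertKind is a hypothetical classification of daemon alerts.
type alertKind int

const (
	alertCrash alertKind = iota
	alertUnhealthy
	alertEscalation
)

// gtSubcommand returns the leading gt arguments for an alert kind.
// Only escalations use mail, because they must survive session death;
// routine crash and unhealthy alerts use the ephemeral nudge.
func gtSubcommand(kind alertKind) []string {
	switch kind {
	case alertEscalation:
		return []string{"mail", "send", "--wisp-type", "escalation"}
	default:
		return []string{"nudge"}
	}
}

func main() {
	fmt.Println(gtSubcommand(alertCrash))      // [nudge]
	fmt.Println(gtSubcommand(alertUnhealthy))  // [nudge]
	fmt.Println(gtSubcommand(alertEscalation)) // [mail send --wisp-type escalation]
}
```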

Fixes #3474

Test plan

  • Dolt alerts deduplicated — same alert not sent repeatedly within 5 min cooldown
  • Nudge used for routine crash/unhealthy alerts, mail reserved for escalations
  • Alert tracker resets when server recovers (backoff reset path)
  • Builds clean
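
The third item, the backoff reset path, could be exercised along these lines. This is a sketch under stated assumptions: restartBackoff, its doubling policy, the one-minute cap, and the onReset hook are all hypothetical stand-ins for whatever the daemon actually uses. The only point taken from the PR is that the alert tracker is reset at the same moment the backoff resets.

```go
package main

import (
	"fmt"
	"time"
)

// restartBackoff is a hypothetical stand-in for the daemon's restart backoff.
type restartBackoff struct {
	base    time.Duration
	max     time.Duration
	current time.Duration
	onReset func() // hook for clearing alert cooldowns on recovery
}

func newRestartBackoff(onReset func()) *restartBackoff {
	base := time.Second
	return &restartBackoff{base: base, max: time.Minute, current: base, onReset: onReset}
}

// recordCrash returns the delay to wait before the next restart and
// doubles the delay for the following crash, up to max.
func (b *restartBackoff) recordCrash() time.Duration {
	delay := b.current
	b.current *= 2
	if b.current > b.max {
		b.current = b.max
	}
	return delay
}

// recordHealthy is called after a healthy period. It resets the delay
// and fires the hook so suppressed alerts can be sent again.
func (b *restartBackoff) recordHealthy() {
	b.current = b.base
	if b.onReset != nil {
		b.onReset()
	}
}

func main() {
	resets := 0
	b := newRestartBackoff(func() { resets++ })

	fmt.Println(b.recordCrash()) // 1s
	fmt.Println(b.recordCrash()) // 2s
	b.recordHealthy()
	fmt.Println(b.recordCrash()) // 1s
	fmt.Println(resets)          // 1
}
```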

🇺🇸 Reid Wiseman — Commander
🇺🇸 Victor Glover — Pilot
🇺🇸 Christina Koch — Mission Specialist
🇨🇦 Jeremy Hansen — Mission Specialist

Artemis II. Open source for all — on this planet and beyond it.

- Add doltAlertTracker with fingerprint-based cooldowns (5min) to
  prevent wisp storms during Dolt crash loops
- Switch sendCrashAlert and sendUnhealthyAlert from gt mail send to
  gt nudge (ephemeral, zero-cost — no Dolt commit per alert)
- Keep escalation on gt mail send (must survive session death) with
  --wisp-type escalation tag
- Remove witness mail fan-out; witnesses detect degraded state via
  DOLT_UNHEALTHY signal file instead
- Reset alert tracker when backoff resets after healthy period
- Fix wispTypeToCategory test to match current 2-arg signature
- Add nolint:misspell annotations for bead ID references (gt-ect)
- Fix serialised → serialized typo in convoy_manager

Fixes gastownhall#3474
github-actions bot added the status/needs-triage label (Inbox: we haven't looked at it yet) on Apr 2, 2026