@sauerdaniel
Owner

Summary

  • Improves gastown shutdown reliability
  • Addresses issues with gt down not properly stopping all components

Test plan

  • Run gt down
  • Verify all sessions are properly terminated
  • Confirm clean shutdown

sauerdaniel added a commit that referenced this pull request Jan 12, 2026
Changed from nuclear=true to nuclear=false when polecats self-destruct
via gt done. The nuclear flag bypasses ALL safety checks including the
cleanup_status field that was added as part of ZFC #10 to prevent
accidental work loss.

Now polecats will validate their self-reported cleanup_status before
removing themselves, consistent with how the witness handler handles
cleanup.

Fixes steveyegge#360
Three related fixes for polecat lifecycle management:

1. Push branch to origin before self-nuke (done.go)
   - Ensures work is preserved on remote before worktree cleanup
   - Prevents orphaned local-only branches

2. Respect cleanup_status in selfNukePolecat (done.go)
   - Changed nuclear=true to nuclear=false
   - Validates cleanup_status before removal
   - Prevents destruction with uncommitted/unpushed work

3. Respawn done polecats with hooked work (manager.go, handlers.go)
   - loadFromBeads now checks hook_bead field
   - Added FindPolecatsWithHookedWork() and RespawnPolecatWithHookedWork()
   - Witness can auto-respawn polecats that have pending work

Fixes steveyegge#360
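The nuclear/cleanup_status behavior described above can be sketched as follows. This is a minimal illustration, not the code in done.go: the function name `checkSelfNuke` and the three status values are hypothetical, since the commit message does not list the real cleanup_status values.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical cleanup_status values; the PR does not list the real ones.
const (
	statusClean       = "clean"
	statusUncommitted = "has_uncommitted"
	statusUnpushed    = "has_unpushed"
)

var errUnsafeRemoval = errors.New("refusing to remove polecat: work may be lost")

// checkSelfNuke reports whether a polecat may remove itself.
// With nuclear=true every safety check is bypassed; with nuclear=false
// the self-reported cleanup status must be clean.
func checkSelfNuke(cleanupStatus string, nuclear bool) error {
	if nuclear {
		return nil
	}
	if cleanupStatus != statusClean {
		return fmt.Errorf("%w (cleanup_status=%q)", errUnsafeRemoval, cleanupStatus)
	}
	return nil
}

func main() {
	fmt.Println(checkSelfNuke(statusClean, false))
	fmt.Println(checkSelfNuke(statusUnpushed, false))
	fmt.Println(checkSelfNuke(statusUncommitted, true))
}
```

Only the second call returns an error; the third shows why nuclear=true was unsafe for self-destruct, because it skips the check even with uncommitted work.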
Two fixes for daemon-managed agent startup:

1. Boot watchdog CLAUDE.md creation (boot.go, templates.go)
   - Add CreateBootCLAUDEmd function to templates package
   - Add EnsureCLAUDEmd method to create context before session spawn
   - Enables Boot to perform intelligent triage decisions

2. Deacon startup auto-execution (deacon.go)
   - Execute gt prime directly via SendKeys instead of nudge message
   - Prevents text appearing in prompt area without execution
   - Fixes endless restart loop in Claude Code v2.1.4+

1. Attach new patrol wisp to hook for autonomous continuation
   - Ensures witness continues patrol after session restart

2. Add --hook flag to SessionStart hooks in createPatrolHooks
   - Properly signals hook attachment during session creation

After polecats push their work branches to origin before self-nuke,
the refinery was only deleting local branches after merge, leaving
stale remote branches accumulating.

Added remote branch deletion in handleSuccess and handleSuccessFromQueue
to clean up both local and remote copies after successful merge.

When convoy leg beads complete, they now record output_path metadata
so synthesis workflows can discover and aggregate outputs without
hunting through worktrees or guessing branch names.

Changes:
- formula.go: parse output section, include output_path in leg descriptions
- convoy.go: add Description field to issueDetails for metadata parsing
- synthesis.go: parse output_path from leg descriptions with template fallback

Fixes steveyegge#303

Adds support for configuring a separate push URL (fork) when the
upstream repository is read-only. This allows polecats to push to
a personal fork while still pulling from the upstream repository.

Changes:
- Added PushURL field to RigConfig and Rig struct
- Added PushURL to AddRigOptions
- Added ConfigurePushURL function to git package
- Configure push URL in bare repo when PushURL is set

Usage:
  gt rig add --git-url=https://github.com/upstream/repo \
             --push-url=https://github.com/user/fork \
             myrig
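One way a separate push URL can be set on a repository is `git remote set-url --push`, which leaves the fetch URL untouched. The sketch below assumes that approach and a remote named origin; the actual ConfigurePushURL in the git package may differ, and both function names here are illustrative.

```go
package main

import (
	"fmt"
	"os/exec"
)

// pushURLArgs builds the git arguments that set a push-only URL for a remote.
func pushURLArgs(remote, pushURL string) []string {
	return []string{"remote", "set-url", "--push", remote, pushURL}
}

// configurePushURL is a hypothetical helper: it runs git inside repoDir so
// that pushes go to pushURL while fetches keep using the original URL.
func configurePushURL(repoDir, remote, pushURL string) error {
	cmd := exec.Command("git", pushURLArgs(remote, pushURL)...)
	cmd.Dir = repoDir
	if out, err := cmd.CombinedOutput(); err != nil {
		return fmt.Errorf("git remote set-url --push: %w: %s", err, out)
	}
	return nil
}

func main() {
	// Print the command that would run for the usage example above.
	fmt.Println("git", pushURLArgs("origin", "https://github.com/user/fork"))
}
```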
Add Community section with link to Discord server for real-time
support and collaboration.

Fixes steveyegge#305

sauerdaniel and others added 19 commits January 12, 2026 20:23
…/witness-improvements', 'pr/refinery-branch-cleanup', 'pr/synthesis-output-metadata', 'pr/push-url-config' and 'pr/discord-link'

The post-startup nudges were arriving before Claude Code's input was
ready, causing only the Enter key to make it through (empty input).

Changes:
- Pass "gt prime" as CLI argument to Claude Code startup command
- Remove unreliable post-startup nudges and timing delays
- The SessionStart hook provides a backup propulsion mechanism

The CLI prompt approach is more reliable because the prompt is queued
before Claude even starts, avoiding timing issues entirely.

Fixes: gt-x7p3

The boot role was added but the test expectation wasn't updated,
causing TestRoleNames to fail.

Fixes: gt-j7wl

Apply the same fix as Mayor (d509f7c) to Deacon, Witness, Refinery,
and Polecat. Post-startup nudges arrive before Claude Code's input is
ready, causing only the Enter key to make it through (empty input).

Changes for each agent:
- Pass "gt prime" as CLI argument to startup command
- Remove unreliable post-startup nudges and timing delays
- Keep SessionStart hook as backup propulsion mechanism

The CLI prompt approach is more reliable because the prompt is queued
before Claude even starts, avoiding timing issues entirely.

Fixes: gt-mghw

Boot agent was getting wrong settings template due to:
1. RoleTypeFor() missing "boot" - fell through to Interactive
2. spawnTmux() not calling EnsureSettingsForRole()

Add "boot" to autonomous roles list and call EnsureSettingsForRole()
in spawnTmux() to create proper .claude/settings.json for Boot.

Fixes: gt-hnjp

Adds per-agent-type health tracking to the Mayor's tmux statusline, showing
working/idle counts for Polecats, Witnesses, Refineries, and Deacon.

All agent types are always displayed, even when no agents of that type are
running (shows as '0/0 😺').

Format: active: 4/4 😺 6/10 👁️ 7/10 🏭 1/1 ⛪

- Abbreviate long rig names (design_forge→df, gastown→gt, etc.)
- Update tests for new abbreviations
- Addresses issue hq-dn15

- Add AgentCrew to tracked agent types in mayor statusline
- Show 👷 icon for crew agents
- Display crew count in statusline (e.g., 👷1/5)
- Removes crew from skip filter so they're properly tracked

Fixes issue where crew agents were not shown in statusline.

Active rigs now appear first (alphabetically), followed by parked/docked
rigs (also alphabetically). This makes it easier to see which rigs are
operational at a glance.
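The ordering described above (active rigs first, each group alphabetical) can be expressed as a two-key sort. A minimal sketch, assuming a simplified `rig` type in which a single boolean stands in for the parked/docked states:

```go
package main

import (
	"fmt"
	"sort"
)

// rig is a simplified stand-in; Active=false covers parked and docked rigs.
type rig struct {
	Name   string
	Active bool
}

// sortRigs orders active rigs before inactive ones, alphabetically within
// each group.
func sortRigs(rigs []rig) {
	sort.Slice(rigs, func(i, j int) bool {
		if rigs[i].Active != rigs[j].Active {
			return rigs[i].Active
		}
		return rigs[i].Name < rigs[j].Name
	})
}

func main() {
	rigs := []rig{{"zeta", true}, {"alpha", false}, {"beta", true}, {"gamma", false}}
	sortRigs(rigs)
	for _, r := range rigs {
		fmt.Println(r.Name, r.Active)
	}
}
```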
Move dynamic status content from status-right to status-left to
utilize available space and prevent rig name truncation.

- SetStatusFormat: Now sets status-right with compact identity
- SetDynamicStatus: Now sets status-left with dynamic content
- Increased status-left-length to 150 for more space
- Removed time from dynamic status (was %H:%M)

Fixes hq-s1il

The issue describes Mayor not monitoring convoys, but the root cause
is that Deacon's patrol loop never called the existing infrastructure
(gt convoy stranded + mol-convoy-feed).

This implements the daemon-driven convoy progression approach
(suggested option #1 in the issue).

Changes:
- Added feed-stranded-convoys step to mol-deacon-patrol formula
- Deacon now runs gt convoy stranded --json each patrol cycle
- For each stranded convoy, dispatches mol-convoy-feed dog
- Updated dependency chain
- Bumped formula version from 8 to 9

- Removed space between counts and emojis (e.g., "3 😺" → "3😺")
- Removed space between emojis and counts/subjects (e.g., "📬 3" → "📬3")
- Removed space between hook emoji and text (e.g., "🪝 work" → "🪝work")

IsClaudeRunning was calling IsAgentRunning (which calls GetPaneCommand),
then immediately calling GetPaneCommand again. This duplicate subprocess
call was slowing down gt startup and daemon heartbeat operations.

Changed IsAgentRunning to return (bool, string) - the running status
and the pane command it checked. IsClaudeRunning now reuses the
command instead of making a redundant tmux subprocess call.

Fixes gt-kpii: zombie session detection slows gt up
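The shape of this change can be sketched as below. The pane lookup is passed in as a function so the example runs without tmux. The shell list and the rule that "node" or a name containing "claude" counts as Claude are assumptions for illustration, not the project's actual detection logic.

```go
package main

import (
	"fmt"
	"strings"
)

// paneCommandFunc stands in for the tmux lookup (GetPaneCommand in the commit).
type paneCommandFunc func(session string) (string, error)

// isAgentRunning returns the running status and the pane command it checked,
// so callers can reuse the command instead of querying tmux again.
// Treating any non-shell command as a running agent is an assumption.
func isAgentRunning(get paneCommandFunc, session string) (bool, string) {
	cmd, err := get(session)
	if err != nil || cmd == "" {
		return false, ""
	}
	shells := map[string]bool{"bash": true, "zsh": true, "sh": true, "fish": true}
	return !shells[cmd], cmd
}

// isClaudeRunning reuses the command returned by isAgentRunning: one lookup
// per check instead of two.
func isClaudeRunning(get paneCommandFunc, session string) bool {
	running, cmd := isAgentRunning(get, session)
	return running && (cmd == "node" || strings.Contains(cmd, "claude"))
}

func main() {
	calls := 0
	get := func(string) (string, error) { calls++; return "node", nil }
	fmt.Println(isClaudeRunning(get, "gt-boot"), "lookups:", calls)
}
```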
The notifyRecipient function was using NudgeSession which sends
notifications to the input buffer. Changed to use SendNotificationBanner
which displays the banner in the message history using echo.

This fixes the issue where notification banners appeared in Claude
Code's input buffer instead of in the conversation history.

Fixes hq-nc9mr
Replaces: hq-1qhj

Previously, the witness statusline only showed the crew count when it was
greater than 0. Now all agent types (polecats 😺 and crew 👷) are always
displayed, even when their count is 0.

For Claude Code sessions, mail notifications now use NudgeSession instead
of SendNotificationBanner. This ensures notifications appear in the message
history rather than being injected into the input buffer.

Fixes: hq-1qhj

Changed the statusline format from "1/10😺" to "😺1/10" to match
the documented format in the comment. This ensures the icon appears
before the working/total counts for all agent types.
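A minimal sketch of the icon-first format, with every agent type rendered even at 0/0. The `agentCount` type and the single-space separator between segments are assumptions; only the "😺1/10" segment shape comes from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// agentCount is a hypothetical per-agent-type tally.
type agentCount struct {
	Icon    string
	Working int
	Total   int
}

// formatStatus renders each agent type as icon followed by working/total,
// including types whose counts are zero.
func formatStatus(counts []agentCount) string {
	parts := make([]string, 0, len(counts))
	for _, c := range counts {
		parts = append(parts, fmt.Sprintf("%s%d/%d", c.Icon, c.Working, c.Total))
	}
	return strings.Join(parts, " ")
}

func main() {
	fmt.Println(formatStatus([]agentCount{{"😺", 1, 10}, {"👷", 0, 0}}))
}
```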
The warning when processes respawn after 'gt down --all' now includes
more comprehensive troubleshooting guidance, including checking gt status
and mentioning that the gt daemon itself could be the cause.

After the polecat self-nuke fix, branches are now pushed to origin
before the polecat's worktree is deleted. The refinery was only
deleting local branches after merge, leaving stale remote branches.

Fix: Updated handleSuccess and handleSuccessFromQueue to also delete
the remote branch from origin after deleting the local branch.

Related to: hq-nju99, GitHub issue steveyegge#359

Boot was designed as an ephemeral triage agent that runs on each daemon
tick, observes Deacon's state, and exits. However, Boot was getting stuck
at interactive prompts after completing triage, which prevented the daemon
from spawning fresh Boot instances.

Fix: Create CLAUDE.md for Boot that instructs it to:
1. Check Deacon status and heartbeat
2. Take action if needed (nudge/restart Deacon)
3. Exit immediately using `tmux kill-session -t gt-boot`

This ensures Boot functions as designed: an ephemeral watchdog that runs
triage and exits, allowing the daemon to spawn a fresh Boot instance on
each heartbeat.

Related: hq-6p7g4

Fixes hq-lglmw

When gt sling assigns work to a polecat, it now automatically
attaches the mol-polecat-work molecule to the polecat's agent bead.

Changes:
- Added attachPolecatWorkMolecule() function that cooks the formula
  and attaches the molecule to the polecat's agent bead
- Added molecule attachment call after hooking work (single sling mode)
- Added molecule attachment call after hooking work (batch sling mode)
- Implementation is idempotent (checks if already attached)
- Non-blocking: logs warnings but doesn't fail sling operation

Issue steveyegge#197: Polecat fails to hook when slinging a bead with a molecule
to a rig.

Root cause: attachPolecatWorkMolecule was running 'bd cook' from the
polecat's worktree (which doesn't have a .beads directory) instead of
from the rig directory where the bead database lives.

Fix: Use beads.ResolveHookDir() to resolve the correct rig directory
for running bd commands, consistent with how the hook command works.

The sling command needs to handle .repo.git symlinks correctly
for polecat spawning across all rigs.

Related: hq-dp3ss

Fixes bug where work slung to 'done' polecats (no active tmux session)
would never get processed. Now when gt sling resolves an existing
polecat target and finds no active session, it spawns a fresh polecat
instead of failing or leaving the work stuck.

This addresses hq-50u3h: 43+ stale convoys were not progressing because
polecats in 'done' state had work hooked to them but weren't processing it.

The health tracking loop in runMayorStatusLine was counting all
agents regardless of whether their rig was registered in rigs.json.
This caused count discrepancies when sessions existed for unregistered
rigs.

Now the health tracking loop applies the same registeredRigs filter
that the earlier rig status loop uses, ensuring consistent counts
across all statusline displays.

Fixes hq-auhq
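The filter described above amounts to skipping any session whose rig is not in the registered set before counting. A minimal sketch, with a hypothetical `agentSession` type and a plain map standing in for the contents of rigs.json:

```go
package main

import "fmt"

// agentSession is a simplified stand-in for a discovered tmux session.
type agentSession struct {
	Rig     string
	Working bool
}

// countRegistered tallies working and total agents, ignoring sessions whose
// rig is not registered.
func countRegistered(sessions []agentSession, registeredRigs map[string]bool) (working, total int) {
	for _, s := range sessions {
		if !registeredRigs[s.Rig] {
			continue // session belongs to an unregistered rig
		}
		total++
		if s.Working {
			working++
		}
	}
	return working, total
}

func main() {
	registered := map[string]bool{"gastown": true}
	sessions := []agentSession{
		{Rig: "gastown", Working: true},
		{Rig: "gastown", Working: false},
		{Rig: "old_rig", Working: true}, // not registered, so not counted
	}
	w, t := countRegistered(sessions, registered)
	fmt.Printf("%d/%d\n", w, t)
}
```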
Boot was designed to be a watchdog that runs on daemon ticks and manages
Deacon lifecycle, but it wasn't functioning because Boot's CLAUDE.md
context file was missing from the boot directory.

Changes:
- Add CreateBootCLAUDEmd function to templates package
- Add EnsureCLAUDEmd method to Boot to create CLAUDE.md from template
- Update spawnTmux to call EnsureCLAUDEmd before creating session
- Add "boot" to RoleNames list

This ensures Boot has proper context when spawned by the daemon, enabling
it to perform intelligent triage (start/wake/nudge/interrupt decisions)
instead of running without instructions.

Fixes: hq-6p7g4

Fixes steveyegge#210 - Creating a convoy as mayor results in prefix mismatch

The town-level beads database is initialized with issue_prefix=hq,
but convoy creation was generating IDs with hq-cv- prefix, causing
bd create to fail with prefix mismatch error.

Changed convoy ID generation from hq-cv-<hash> to hq-<hash>. Convoys
are distinguished by type=convoy attribute, not by special ID prefix.

Comprehensive research on media processing optimization covering:
- Performance bottleneck analysis (I/O, CPU, memory)
- Parallel processing strategies (pipeline, data, hybrid)
- Multi-layer caching architecture (Redis + local SSD)
- Format optimization matrix and codec comparisons
- Cost reduction opportunities (40-60% estimated savings)
- 6-week proof of concept implementation plan
- Recommended technology stack and code examples

Deliverables complete: Performance audit, optimization recommendations, PoC plan.

Implements GitHub issue steveyegge#220 - Worktree setup hook for injecting
local configurations.

When polecats are spawned, their worktrees are created from the rig's
repo. Previously, there was no way to inject custom configurations
during this process.

Now users can place executable hooks in <rig>/.runtime/setup-hooks/
to run custom scripts during worktree creation:

  rig/
    .runtime/
      setup-hooks/
        01-git-config.sh    <- Inject git config
        02-copy-secrets.sh  <- Copy secrets
        99-finalize.sh      <- Final setup

Features:
- Hooks execute in alphabetical order
- Non-executable files are skipped with a warning
- Hooks run with worktree as working directory
- Environment variables: GT_WORKTREE_PATH, GT_RIG_PATH
- Hook failures are non-fatal (warn but continue)

Example hook to inject git config:
  #!/bin/sh
  git config --local user.signingkey ~/.ssh/key.asc
  git config --local commit.gpgsign true

Related to: hq-fq2zg, GitHub issue steveyegge#220
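The hook-running rules listed above (alphabetical order, skip non-executable files with a warning, run in the worktree, non-fatal failures) can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: the names `hookFile`, `planHooks` and `runSetupHooks` are hypothetical, the executable check uses Unix permission bits, and symlinked hooks are ignored for brevity.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"sort"
)

// hookFile is a hypothetical record of a candidate hook.
type hookFile struct {
	Name string
	Mode os.FileMode
}

// planHooks sorts candidates by name and separates runnable hooks from
// files that lack any executable bit.
func planHooks(files []hookFile) (run, skipped []string) {
	sorted := append([]hookFile(nil), files...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })
	for _, f := range sorted {
		if f.Mode&0o111 == 0 {
			skipped = append(skipped, f.Name)
			continue
		}
		run = append(run, f.Name)
	}
	return run, skipped
}

// runSetupHooks runs every executable hook in hooksDir (an absolute path)
// with the worktree as working directory. Failures only produce warnings.
func runSetupHooks(hooksDir, worktree, rigPath string) {
	entries, err := os.ReadDir(hooksDir)
	if err != nil {
		return // no hooks directory: nothing to run
	}
	var files []hookFile
	for _, e := range entries {
		if info, err := e.Info(); err == nil && info.Mode().IsRegular() {
			files = append(files, hookFile{Name: e.Name(), Mode: info.Mode()})
		}
	}
	run, skipped := planHooks(files)
	for _, name := range skipped {
		fmt.Fprintf(os.Stderr, "warning: skipping non-executable hook %s\n", name)
	}
	for _, name := range run {
		cmd := exec.Command(filepath.Join(hooksDir, name))
		cmd.Dir = worktree
		cmd.Env = append(os.Environ(),
			"GT_WORKTREE_PATH="+worktree, "GT_RIG_PATH="+rigPath)
		if out, err := cmd.CombinedOutput(); err != nil {
			fmt.Fprintf(os.Stderr, "warning: hook %s failed: %v\n%s", name, err, out)
		}
	}
}

func main() {
	run, skipped := planHooks([]hookFile{
		{"99-finalize.sh", 0o755},
		{"README", 0o644},
		{"01-git-config.sh", 0o755},
	})
	fmt.Println("run:", run, "skipped:", skipped)
}
```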
Fixes steveyegge#291 - gastown is very hard to kill/shutdown/stop

Changes:
- Add shutdown coordination: daemon checks shutdown.lock and skips
  heartbeat auto-restarts during shutdown to prevent fighting shutdown
- Extend grace period from 100ms to 30 seconds for graceful session exit
- Add polling to detect when sessions exit gracefully before force kill
- Add orphaned Claude/node process detection in shutdown verification

The daemon's heartbeat now checks for shutdown.lock (created by gt down)
and skips auto-restart logic when shutdown is in progress. This prevents
the daemon from restarting agents that were intentionally killed during
shutdown.

Sessions now receive Ctrl-C and have up to 30 seconds to exit cleanly,
with polling every 500ms to detect graceful exit. Only sessions that
don't exit within the grace period are force-killed.

Shutdown verification now includes detection of orphaned Claude/node
processes that may be left behind when tmux sessions are killed but
child processes don't terminate.
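The two mechanisms described above, the lock-file check in the heartbeat and the polled grace period, can be sketched like this. The helper names and the relative lock path are assumptions; the 30-second grace period and 500ms poll interval are the values stated in the commit message.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

const (
	gracePeriod  = 30 * time.Second       // from the commit message
	pollInterval = 500 * time.Millisecond // from the commit message
)

// shutdownInProgress reports whether the lock file exists. A heartbeat
// would skip its auto-restart logic while this returns true.
func shutdownInProgress(lockPath string) bool {
	_, err := os.Stat(lockPath)
	return err == nil
}

// waitForExit polls alive until it reports false or the grace period
// elapses. It returns true when the session exited on its own, and false
// when the caller should fall back to a force kill.
func waitForExit(alive func() bool, grace, interval time.Duration) bool {
	deadline := time.Now().Add(grace)
	for alive() {
		if !time.Now().Before(deadline) {
			return false
		}
		time.Sleep(interval)
	}
	return true
}

func main() {
	// Hypothetical lock location; the PR only names the file shutdown.lock.
	if shutdownInProgress("shutdown.lock") {
		fmt.Println("shutdown in progress: heartbeat skips auto-restart")
		return
	}
	checks := 0
	alive := func() bool { checks++; return checks < 3 } // gone on the third poll
	fmt.Println("exited gracefully:", waitForExit(alive, gracePeriod, pollInterval))
}
```

Polling lets a session that exits after one second return control immediately instead of always waiting the full 30 seconds.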
The sling refactor (cd2de6e) split the 1560-line sling.go into 7 focused
modules, but left duplicate function declarations in the original file.
This commit removes the duplicates, keeping only the implementations in
the split files.

Also fixes related build issues:
- Remove unused claude import from boot/boot.go
- Fix IsAgentRunning() calls to handle multiple return values
- Fix atomic operation on startedAny counter in start.go
- Remove duplicate health tracking code in statusline.go
- Add missing imports (strings, config) to sling.go
sauerdaniel force-pushed the polecat/organic-mkabz4tm branch from f26d421 to eea3230 on January 13, 2026 04:50
sauerdaniel force-pushed the main branch 4 times, most recently from a67da82 to 60ed204, on January 20, 2026 21:42
2 participants