Read this before merging any change that touches process spawning, path handling, shell commands, network binding, file I/O, runtime/agent/workspace plugins, or anything that does platform-specific work.
AO ships on macOS, Linux, and Windows. All three are first-class — every change must keep all three working.
Never write `process.platform === "win32"` in new code. Use `isWindows()` from `@aoagents/ao-core`. If you need branching the helper doesn't cover, add it to `packages/core/src/platform.ts` (or one of the targeted helpers in the inventory); never inline it at the call site.
This isn't stylistic. The branching in platform.ts is centrally tested with Object.defineProperty(process, "platform", …) so both Windows and POSIX paths are exercised on every CI runner. Inline process.platform checks are invisible to that test pattern, drift out of sync, and produce the bugs that took weeks to track down on the way to shipping the Windows port.
If you find yourself typing `process.platform`:
- Stop. Look at the helper inventory below; the helper you need almost certainly exists already.
- If it doesn't, ask: "Could a future feature also need this branch?" Almost always yes. Add a function to `platform.ts` (or the closest existing helper module) and test both branches.
- Only if the branch is genuinely a one-off (e.g. a single test guarding a Linux-only assertion) is an inline check acceptable, and even then prefer `isWindows()` for readability.
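As a rough illustration of the rule above, a new branch might live in a shared helper like the one below. This is a minimal sketch, not code from the repo: `isWindows` here is a local stand-in for the export from `@aoagents/ao-core` (it takes the platform as a parameter only so both branches can be exercised on one host), and `nullDevice` is a hypothetical helper invented for the example.

```typescript
// Local stand-in for isWindows() from @aoagents/ao-core. Assumption: the real
// helper reads process.platform; the parameter exists only to make this
// sketch testable on any host.
function isWindows(platform: NodeJS.Platform = process.platform): boolean {
  return platform === "win32";
}

// Hypothetical helper (not in the documented inventory): one place that knows
// the platform's null device, so call sites never branch themselves.
function nullDevice(platform: NodeJS.Platform = process.platform): string {
  return isWindows(platform) ? "NUL" : "/dev/null";
}

// A call site stays branch-free and simply asks the helper.
function describeDiscardTarget(platform: NodeJS.Platform = process.platform): string {
  return `discarding output to ${nullDevice(platform)}`;
}
```

Because the branch sits in one function, a single pair of tests covers it for every caller.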
If your change does any of the following, you must read the relevant section below:
| If you're touching… | …read |
|---|---|
| `process.spawn`, `child_process`, runtime plugins | The two runtimes, Process management |
| `process.kill`, signals, process-tree teardown | Process management |
| Anything with file paths (compare, join, walk) | Paths |
| Shell commands (`exec`, command strings) | Shell |
| `server.listen`, sockets, `localhost` | Networking |
| tmux / lsof / pkill / which / coreutils shell-outs | POSIX-only tools |
| Adding a new `if (process.platform === "win32")` | The Golden Rule, Helper inventory |
| Agent plugins (PATH wrappers, hooks, launch commands) | Agent plugin helpers |
| Activity detection / JSONL processing | Activity-state helpers |
| Tests for any of the above | Testing for cross-platform behaviour |
| Anything else? | At minimum, the pre-merge checklist |
Every helper you need to write Windows-safe code. Memorise the imports — these are the building blocks.
```typescript
import {
  isWindows,
  getDefaultRuntime,
  getShell,
  killProcessTree,
  findPidByPort,
  getEnvDefaults,
} from "@aoagents/ao-core";
```

| Symbol | Purpose | Notes |
|---|---|---|
| `isWindows(): boolean` | The canonical OS check. Always use this instead of `process.platform === "win32"`. | Constant-time. Trivially mockable in tests. |
| `getDefaultRuntime(): "tmux" \| "process"` | Returns `"process"` on Windows, `"tmux"` elsewhere. Used by `ao start` / startup-preflight to default runtime selection. | Don't hardcode `"tmux"`. |
| `getShell(): { cmd, args(command) }` | Resolves the shell for non-interactive command execution. POSIX → `/bin/sh -c`. Windows → priority order: `AO_SHELL` env override → `pwsh` → `powershell.exe` (absolute path, robust to degraded PATH) → `powershell` → `cmd.exe`. Cached. | Use this whenever you need to run any shellish string. Don't assume bash. |
| `killProcessTree(pid, signal?)` | Kills a process and its descendants. Windows → `taskkill /T /F /PID <pid>`. POSIX → `process.kill(-pid, signal)` with direct-PID fallback. Guards `pid > 0`. | Never write `process.kill(-pid, …)` directly. Negative PIDs are POSIX-only. |
| `findPidByPort(port): Promise<string \| null>` | Finds the LISTENING PID on a port. Windows → parses `netstat -ano`. POSIX → `lsof -ti :PORT -sTCP:LISTEN`. | Use this; don't shell-out yourself. |
| `getEnvDefaults(): { HOME, SHELL, TMPDIR, PATH, USER }` | Returns platform-correct env defaults: Windows reads `USERPROFILE` / `TEMP` / `USERNAME`, POSIX reads `HOME` / `SHELL` / `TMPDIR` / `USER`. | Use instead of hardcoding `/tmp`, `~`, `$HOME`. |
| `_resetShellCache()` | Test-only: clears the cached shell resolution. | `@internal`. |
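To make the Windows branch of `findPidByPort` more concrete, here is a minimal sketch of what parsing `netstat -ano` output could look like. It is not the real implementation: `parseListeningPid` is a hypothetical name, and the sketch assumes the state column reads `LISTENING` (other locales may differ).

```typescript
// Hypothetical parser for `netstat -ano` output (not the real findPidByPort).
// Expected row shape: Proto  Local Address  Foreign Address  State  PID
function parseListeningPid(netstatOutput: string, port: number): string | null {
  for (const line of netstatOutput.split(/\r?\n/)) {
    const cols = line.trim().split(/\s+/);
    // Skip headers, blank lines, and UDP rows (which have no State column).
    if (cols.length < 5 || cols[0] !== "TCP") continue;
    const [, local, , state, pid] = cols;
    // The local address ends with ":<port>" for IPv4 (0.0.0.0:3000) and IPv6 ([::]:3000).
    if (state === "LISTENING" && local.endsWith(`:${port}`)) return pid;
  }
  return null;
}
```

Matching on the local address only (second column) matters: an `ESTABLISHED` client row can carry the same port in its foreign-address column and must not be mistaken for the listener.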
```typescript
import { pathsEqual, canonicalCompareKey } from "../../src/lib/path-equality.js";
```

| Symbol | Purpose |
|---|---|
| `pathsEqual(a, b): boolean` | "Same filesystem entry" comparison. Resolves both via `realpathSync` (falls back to literal on error), then lowercases on Windows so `D:\Foo` == `d:\foo`. |
| `canonicalCompareKey(input): string` | Stable Map/Set key for a path. Expands `~`, resolves to absolute, calls `realpathSync`, lowercases on Windows. |

Rule: never compare paths with `===`. Always go through these.
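The comparison described in the table can be sketched roughly as follows. This is an illustration under stated assumptions, not the real `path-equality` module: the `Sketch` names are hypothetical, and `platform` is a parameter only so both branches can be exercised on a single host (repo code would call `isWindows()` instead).

```typescript
import { realpathSync } from "node:fs";

// Builds a comparison key: realpath when possible, literal input otherwise,
// lowercased on Windows where the filesystem is case-insensitive.
function compareKeySketch(p: string, platform: NodeJS.Platform = process.platform): string {
  let resolved = p;
  try {
    resolved = realpathSync(p); // follows symlinks; throws if the path does not exist
  } catch {
    // Fall back to the literal input, as the table describes.
  }
  return platform === "win32" ? resolved.toLowerCase() : resolved;
}

function pathsEqualSketch(
  a: string,
  b: string,
  platform: NodeJS.Platform = process.platform,
): boolean {
  return compareKeySketch(a, platform) === compareKeySketch(b, platform);
}
```

A plain `a === b` would report `D:\Foo` and `d:\foo` as different on Windows, which is exactly the class of bug the rule guards against.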
Only used by Windows runtime code, but exported from @aoagents/ao-core so the CLI's ao stop can find detached pty-hosts that taskkill /T cannot reach.
```typescript
import {
  registerWindowsPtyHost,
  unregisterWindowsPtyHost,
  getWindowsPtyHosts,
  clearWindowsPtyHostRegistry,
} from "@aoagents/ao-core";
```

| Symbol | Purpose |
|---|---|
| `registerWindowsPtyHost(entry)` | Add/replace a `{sessionId, ptyHostPid, pipePath}` entry in `~/.agent-orchestrator/windows-pty-hosts.json`. Called when `runtime-process` spawns a pty-host. |
| `unregisterWindowsPtyHost(sessionId)` | Remove on session destroy. |
| `getWindowsPtyHosts(): WindowsPtyHostEntry[]` | Return all entries whose PID is still alive (probes via `process.kill(pid, 0)`, treating `EPERM` as alive). Auto-prunes dead ones. |
| `clearWindowsPtyHostRegistry()` | Wipe the file (recovery / tests). |
Use these whenever you need to talk to a Windows pty-host over its named pipe. The mux WS server, runtime-process, and sweepWindowsPtyHosts all go through this module — never write to a \\.\pipe\… directly.
```typescript
import {
  getPipePath,
  connectPtyHost,
  ptyHostSendMessage,
  ptyHostGetOutput,
  ptyHostIsAlive,
  ptyHostKill,
  MessageParser,
  encodeMessage,
} from "@aoagents/ao-plugin-runtime-process";
```

| Symbol | Purpose |
|---|---|
| `getPipePath(sessionId)` | Returns `\\.\pipe\ao-pty-<sessionId>`. Don't construct the path manually. |
| `connectPtyHost(pipePath, timeoutMs?)` | Open a `net.Socket` to the named pipe with timeout. |
| `ptyHostSendMessage(pipePath, message)` | Send keystrokes; chunks into ≤512-char pieces with 15 ms gaps to dodge ConPTY input-buffer truncation. |
| `ptyHostGetOutput(pipePath, lines?)` | Request scrollback buffer. Returns `""` on timeout. |
| `ptyHostIsAlive(pipePath)` | Liveness probe; `true` ≡ pipe reachable. |
| `ptyHostKill(pipePath)` | Cooperative shutdown (host disposes ConPTY then exits). Silently succeeds if pipe is unreachable. |
| `MessageParser`, `encodeMessage` | Frame-protocol primitives if you're writing new pty-host integrations. |
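The chunk-and-pause behaviour described for `ptyHostSendMessage` can be sketched as below. This is only an illustration of the technique: `chunkMessage` and `sendChunked` are hypothetical names, and `write` is a caller-supplied callback standing in for the actual pipe write, which this document does not show.

```typescript
// Split a message into pieces of at most `maxChars` UTF-16 code units.
function chunkMessage(message: string, maxChars = 512): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < message.length; i += maxChars) {
    chunks.push(message.slice(i, i + maxChars));
  }
  return chunks;
}

// Write each chunk, pausing `gapMs` between chunks (not after the last one).
async function sendChunked(
  message: string,
  write: (chunk: string) => void,
  gapMs = 15,
): Promise<void> {
  const chunks = chunkMessage(message);
  for (let i = 0; i < chunks.length; i++) {
    write(chunks[i]);
    if (i < chunks.length - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, gapMs));
    }
  }
}
```

One caveat of this naive slicing: it can split a surrogate pair at a chunk boundary. Whether the real helper guards against that is not stated here.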
```typescript
import { sweepWindowsPtyHosts } from "@aoagents/ao-plugin-runtime-process";
```

`sweepWindowsPtyHosts(): Promise<{ attempted, gracefullyExited, forceKilled, failed }>` iterates the registry, sends a graceful `MSG_KILL_REQ`, polls up to 500 ms, then calls `killProcessTree` for stragglers. Called by `ao stop`. No-op on non-Windows.
The exit-poll inside this function is the canonical EPERM/ESRCH pattern — copy it whenever you probe a Windows process for liveness:
```typescript
while (Date.now() < deadline) {
  try {
    process.kill(entry.ptyHostPid, 0);
  } catch (err: unknown) {
    // EPERM = alive but unsignalable (cross-context on Windows) → fall through to force-kill.
    // ESRCH (or anything else) = process is gone → mark exited.
    if ((err as { code?: string }).code !== "EPERM") {
      exited = true;
    }
    break;
  }
  await new Promise((r) => setTimeout(r, 25));
}
```

```typescript
// packages/web/server/tmux-utils.ts
import { validateSessionId, resolvePipePath } from "@/server/tmux-utils";
// packages/web/src/lib/windows-pty-cleanup.ts
import { stopStaleWindowsPtyHosts } from "@/lib/windows-pty-cleanup";
```

| Symbol | Purpose |
|---|---|
| `validateSessionId(id): boolean` | Charset/length guard. Always validate any session ID before using it in a tmux command, named-pipe path, or shell argument; these are user-controllable inputs. |
| `resolvePipePath(sessionId, projectId?)` | Reads the session metadata file and returns the `pipePath` field stored by `runtime-process`. Returns `null` on non-Windows. Used by the mux WS server when relaying pipe traffic. |
| `stopStaleWindowsPtyHosts(projectDir)` | Defensive sweeper. Uses a PowerShell `Get-CimInstance Win32_Process` query to find pty-hosts whose command line contains a project dir, then `taskkill`s them. No-op on non-Windows. Use as a recovery escape hatch, not in the hot path. |
```typescript
import { setupPathWrapperWorkspace, buildAgentPath } from "@aoagents/ao-core";
```

| Symbol | Purpose |
|---|---|
| `setupPathWrapperWorkspace(workspacePath)` | Installs `~/.ao/bin` PATH wrappers for `gh` / `git` so AO can intercept agent commands. Cross-platform. On Windows it generates `.cjs` + `.cmd` wrapper pairs (skipping bash); on Unix it generates the bash equivalents. Every agent plugin that uses PATH-wrapper interception (codex, kimicode, aider, opencode) must call this; never reimplement. |
| `buildAgentPath(basePath?)` | Prepends `~/.ao/bin` to PATH using the right separator (`;` on Windows, `:` on Unix). Use when constructing the agent's env. |
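For intuition, the PATH construction that `buildAgentPath` is described as doing might look roughly like this. It is a sketch, not the real helper: the function name is hypothetical, and `homeDir` and `platform` are explicit parameters purely so the example is self-contained and testable.

```typescript
import * as path from "node:path";

// Prepend <home>/.ao/bin to a PATH string using the platform's own
// directory separator and PATH-list delimiter.
function buildAgentPathSketch(
  basePath: string,
  homeDir: string,
  platform: NodeJS.Platform = process.platform,
): string {
  // path.win32 / path.posix expose the right `join` and `delimiter` per platform.
  const p = platform === "win32" ? path.win32 : path.posix;
  const wrapperDir = p.join(homeDir, ".ao", "bin");
  return basePath ? `${wrapperDir}${p.delimiter}${basePath}` : wrapperDir;
}
```

Using `path.delimiter` (or its `win32` / `posix` variants) avoids hardcoding `;` or `:` at the call site.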
```typescript
import {
  appendActivityEntry,
  readLastActivityEntry,
  checkActivityLogState,
  getActivityFallbackState,
  classifyTerminalActivity,
  recordTerminalActivity,
  readLastJsonlEntry,
} from "@aoagents/ao-core";
```

`getActivityFallbackState` is mandatory for new agent plugins. See the agent-plugin section in the root `CLAUDE.md` for the full contract. The relevant cross-platform note is this: AO activity JSONL works the same on all platforms, so write your activity-detection logic against it, not against `tmux capture-pane` / `ps` output.
```typescript
import { shellEscape } from "@aoagents/ao-core";
```

`shellEscape(arg)` produces a safely-quoted argument. Always use it when interpolating any value into a shell command line, even on Windows. Windows quoting rules are messier than POSIX and the helper handles them.
```typescript
import { forwardSignalsToChild } from "../lib/shell.js";
```

`forwardSignalsToChild(pid, child)`: call only on POSIX (`if (!isWindows() && pid)`). On Windows, Ctrl+C reaches the entire console group natively; explicit forwarding is harmful (double-signals).
| Variable | Effect |
|---|---|
| `AO_SHELL` | Override `getShell()` resolution. Set to an absolute path or shell name (`pwsh`, `cmd`, `bash`, …). Args are inferred from basename. The supported escape hatch for Git Bash users on Windows. |
| `AO_BASH_PATH` | Used by `script-runner.ts` on Windows to locate bash before falling back to Git Bash auto-detection. WSL bash is intentionally excluded. |
| Platform | Default runtime | How PTYs work |
|---|---|---|
| macOS / Linux | `tmux` | Real tmux server, POSIX signals, Unix sockets |
| Windows | `process` | node-pty + ConPTY, named pipes (`\\.\pipe\ao-pty-…`), pty-host helper process |
Pick the runtime via getDefaultRuntime(), never hardcode. Plugin code that runs across runtimes must handle both — for Windows that means no tmux shell-outs, no SIGTERM/SIGKILL group kills, no POSIX-only tools.
For the architectural detail of how the Windows pty-host, named-pipe protocol, and mux WS Windows branch fit together, see the "Windows Runtime Architecture" section at the bottom of docs/ARCHITECTURE.md.
- `process.kill(pid, 0)` distinguishes liveness on POSIX, but on Windows it can throw `EPERM` when the target exists in a different security context. Treat `EPERM` as alive but unsignalable (fall through to force-kill); only `ESRCH` (or any other code) means the process is gone. The pattern is shown in the `sweepWindowsPtyHosts` snippet above: copy it, don't bare-catch. The same pattern lives in `runtime-process` `destroy()` (around line 290) and was the bug fix that prompted this section.
- Never `process.kill(-pid, …)` to kill a process group. Negative PIDs are POSIX-only and become a no-op or worse on Windows. Use `killProcessTree()`.
- Graceful shutdown before SIGKILL on Windows: SIGKILL'ing the pty-host while ConPTY is mid-spawn orphans `conpty_console_list_agent.exe` and triggers a Windows Error Reporting dialog (`0x800700e8`). Send the cooperative kill (`ptyHostKill`) first, poll for exit ~500 ms, then `killProcessTree`.
- `pid <= 0` guard: `process.kill(0, …)` signals the current process group on Unix. Always guard `pid > 0` before signalling.
- Detached children: on Windows `ao start` does NOT detach its dashboard child (so Ctrl+C reaches the whole console group natively); on POSIX it does. Use `detached: !isWindows()` rather than always-`true` or always-`false`.
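The guard, group-kill, and fallback rules in the list above can be sketched together as follows. This is a hedged approximation of what the inventory says `killProcessTree` does, not its real source: the function name, the boolean return value, and the local `isWindows` stand-in are assumptions made for the example.

```typescript
import { execFileSync } from "node:child_process";

// Local stand-in for isWindows() from @aoagents/ao-core.
const isWindows = (): boolean => process.platform === "win32";

function killProcessTreeSketch(pid: number, signal: NodeJS.Signals = "SIGTERM"): boolean {
  // pid 0 would signal our own process group on POSIX; reject it and negatives.
  if (!Number.isInteger(pid) || pid <= 0) return false;

  if (isWindows()) {
    try {
      // /T = include child processes, /F = force termination.
      execFileSync("taskkill", ["/T", "/F", "/PID", String(pid)], {
        stdio: "ignore",
        windowsHide: true,
      });
      return true;
    } catch {
      return false; // taskkill exits non-zero, e.g. when the PID is already gone
    }
  }

  try {
    process.kill(-pid, signal); // negative PID targets the whole process group (POSIX only)
    return true;
  } catch {
    try {
      process.kill(pid, signal); // fallback: the PID was not a group leader
      return true;
    } catch {
      return false;
    }
  }
}
```

Keeping the `pid > 0` guard at the top means neither branch can ever be reached with a value that would signal the caller's own process group.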
- Filesystems are case-insensitive on Windows (NTFS) and macOS (default APFS), and case-sensitive on Linux. `D:\Foo` and `d:\foo` are the same directory on Windows; `/foo` and `/Foo` are different directories on Linux. Compare paths via `pathsEqual()`, never `===`.
- Always use `path.join()` / `path.sep`. Never hardcode `/` or `\` separators. Never split paths on `/` to walk segments.
- Drive letters and UNC paths exist. A path can start with `C:\`, `\\?\C:\`, `\\server\share\`, or `D:`. Don't assume paths begin with `/`.
- Paths can contain spaces (`C:\Program Files\…`, `C:\Users\Some Name\…`). Always quote when interpolating into shell commands; prefer `execFile` over `exec`.
- HOME / tmp paths differ: use `getEnvDefaults()` rather than hardcoding `/tmp`, `~`, or `$HOME`.
- Drive-letter slugs: when encoding a path as a filename slug (used by Claude Code's session-JSONL lookup), `C:\Users\dev\project` → `C--Users-dev-project`. Preserve the leading drive-letter dash; don't strip the colon-replacement.
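The drive-letter slug example above is consistent with a simple character substitution, sketched here. The exact set of characters the real encoder replaces is an assumption (this sketch replaces `:`, `\`, and `/`), and `pathToSlugSketch` is a hypothetical name.

```typescript
// Replace the drive colon and every path separator with "-".
// "C:\Users" becomes "C--Users": one dash for ":" and one for "\".
function pathToSlugSketch(p: string): string {
  return p.replace(/[:\\/]/g, "-");
}
```

The double dash after the drive letter is what the bullet asks you to preserve; collapsing repeated dashes would make the slug diverge from the example.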
- Default shell on Windows is PowerShell, not bash. Bash syntax (`&&` chains, `$VAR`, `2>/dev/null`, here-docs) doesn't work reliably in `cmd.exe` and is only partially supported by PowerShell. When you need to run anything shellish from Node, prefer `execFile` with explicit args; if you must use a shell, route through `getShell()`.
- PowerShell call operator: a launch command that begins with a quoted absolute path needs `&` prepended on Windows (e.g. `& "C:\path\to\bin.exe" arg1`) or PowerShell parses the quoted path as a string expression. The `agent-codex` and `agent-kimicode` plugins do this in `formatLaunchCommand`.
- No `/dev/null` on Windows: use `NUL`, or just discard the stream in Node.
- Env vars in PowerShell: `$env:NAME`, not `$NAME`. Line continuation is backtick (`` ` ``), not backslash.
- `.cmd` / `.bat` / `.exe` shims: spawning npm-installed CLIs (e.g. `codex`, `where`) needs `shell: true` on Windows so `PATHEXT` is consulted; without a shell, Node will not launch `.cmd` / `.bat` shims. Pattern: `spawn(cmd, args, { shell: isWindows(), windowsHide: true })`.
- `windowsHide: true` on every `spawn` / `execFile` you don't want flashing a console window.
- Always `shellEscape()` any value that ends up in a shell command line, even on Windows. Windows quoting rules are tricky and the helper handles them.
- Avoid pipes / redirection in shell strings: they don't behave consistently across cmd.exe / PowerShell / bash. Build the pipeline in Node with stream APIs instead.
- `$(cat …)` substitution doesn't exist in PowerShell or cmd.exe. If you're inlining a file's contents into a command line, read it in Node and pass the contents as an argument (e.g. `--append-system-prompt <content>`).
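The call-operator rule from the list above can be sketched like this. The plugins' actual `formatLaunchCommand` is not shown in this document, so the name, signature, and the deliberately naive quoting are all assumptions; real code should quote through `shellEscape()`.

```typescript
import * as path from "node:path";

// Build a launch command string. Absolute binary paths are double-quoted, and
// on Windows the quoted path gets a leading "&" so PowerShell treats it as a
// command invocation rather than a string expression.
function formatLaunchCommandSketch(
  binaryPath: string,
  args: string[],
  platform: NodeJS.Platform = process.platform,
): string {
  const win = platform === "win32";
  const isAbsolute = win ? path.win32.isAbsolute(binaryPath) : path.posix.isAbsolute(binaryPath);
  if (!isAbsolute) return [binaryPath, ...args].join(" ");

  const quoted = `"${binaryPath}"`;
  const head = win ? `& ${quoted}` : quoted;
  return [head, ...args].join(" ");
}
```

A bare command name such as `codex` is left alone in this sketch, since the `&` prefix is only needed when the command starts with a quoted string.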
- Bind to `127.0.0.1` explicitly, not `localhost`, when starting local servers. On Windows `localhost` resolves to `::1` first; if the server only listens on IPv4 the client stalls ~21 s before the kernel falls back. The same problem reverses if you bind IPv6-only.
- Named pipes are the Windows IPC primitive (`\\.\pipe\…`); the relay code already handles them in `mux-websocket.ts` via `handleWindowsPipeMessage`. Don't introduce Unix-socket assumptions in new code paths.
- Firewall prompts: any `0.0.0.0` bind on Windows can pop a Windows Defender Firewall prompt the first time it runs. Stick to loopback unless there's a real reason.
- Pipe path injection: a pipe path is constructed from a session ID; always validate that ID with `validateSessionId()` before passing to `getPipePath()` or interpolating into any system call.
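As an illustration of the validate-then-construct order in the last bullet, consider the sketch below. The charset and length limits are assumptions (the real `validateSessionId` rules are not stated in this document), and both function names are hypothetical stand-ins for `validateSessionId` and `getPipePath`.

```typescript
// Assumed rule for the example only: 1 to 64 characters from [A-Za-z0-9_-].
const SESSION_ID_PATTERN = /^[A-Za-z0-9_-]{1,64}$/;

function validateSessionIdSketch(id: string): boolean {
  return SESSION_ID_PATTERN.test(id);
}

// Mirrors the documented shape \\.\pipe\ao-pty-<sessionId>, but refuses to
// build a path from input that has not passed validation.
function pipePathForSketch(sessionId: string): string {
  if (!validateSessionIdSketch(sessionId)) {
    throw new Error(`invalid session id: ${JSON.stringify(sessionId)}`);
  }
  return `\\\\.\\pipe\\ao-pty-${sessionId}`;
}
```

Rejecting separators and dots up front means an ID such as `..\other` can never reach the pipe namespace or a shell argument.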
tmux, screen, lsof, pkill, which, most coreutils — gone on Windows. If you need their function, either branch through platform.ts or use a Node API instead.
Examples already in platform.ts:
- `findPidByPort` uses `netstat -ano` on Windows vs `lsof` elsewhere
- `killProcessTree` uses `taskkill /T /F` vs POSIX signal-based kill
- `getShell` resolves PowerShell on Windows vs `/bin/sh` on POSIX
If you find yourself reaching for a POSIX-only binary in new code, add the Windows alternative to platform.ts rather than gating the feature.
When writing or modifying an agent plugin (packages/plugins/agent-*), these are the patterns to follow:
- Use `setupPathWrapperWorkspace` for PATH-wrapper interception (gh / git). It auto-handles bash vs `.cmd` + `.cjs` wrappers per platform.
- `isProcessRunning` must short-circuit on Windows when it would have used tmux or `ps -eo`: `if (isWindows()) return false` (or implement a real Windows check via tasklist / signal-0 with EPERM handling; never assume tmux exists).
- `detect()` spawn options should be `{ shell: isWindows(), windowsHide: true }` so `.cmd` shims resolve via `PATHEXT` and no console window flashes.
- Stderr suppression: the cursor plugin's `detect()` previously bled stderr to the user's console on Windows; it now uses `stdio: ['ignore', 'pipe', 'ignore']` for the probe. Match that pattern.
- `getCachedProcessList()` (Claude Code) should return `""` on Windows; `ps -eo` doesn't exist.
- `formatLaunchCommand`: when the binary is at a quoted absolute path, prepend `&` on Windows so PowerShell parses it as a call.
- `systemPromptFile`: instead of `$(cat <file>)` shell substitution, read the file in Node and inline as `--append-system-prompt <content>`.
- Codex binary resolution: prefer `.cmd` shims (npm) over `.exe` (Cargo) on Windows; use `where.exe` (not `which`).
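The `detect()` options from the list above fit together roughly as in this sketch. It is not any plugin's real `detect()`: the function name, the `--version` probe argument, and the local `isWindows` stand-in are assumptions for the example.

```typescript
import { spawnSync } from "node:child_process";

// Local stand-in for isWindows() from @aoagents/ao-core.
const isWindows = (): boolean => process.platform === "win32";

// Probe whether a CLI is installed by running it once and checking the exit code.
function detectBinarySketch(command: string): boolean {
  const result = spawnSync(command, ["--version"], {
    shell: isWindows(), // lets .cmd shims resolve via PATHEXT on Windows
    windowsHide: true, // no console window flash
    stdio: ["ignore", "pipe", "ignore"], // keep the probe's stderr off the user's console
    encoding: "utf8",
  });
  return result.error === undefined && result.status === 0;
}
```

A missing binary surfaces as `result.error` on POSIX and as a non-zero exit status when a shell is involved, so the sketch checks both.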
The activity-detection contract in CLAUDE.md is platform-agnostic — same JSONL on all platforms — but the inputs (terminal output) come from different runtimes. Use recordTerminalActivity from core (which delegates to classifyTerminalActivity → appendActivityEntry) so you don't have to think about platform.
The mandatory getActivityFallbackState step (see CLAUDE.md "Activity detection architecture") is what keeps the dashboard alive when a native agent API is unavailable — which on Windows happens more often than on Unix because more things shell-out and fail silently. Skipping it has historically broken stuck-detection on Windows.
CI runs on Linux, macOS, and Windows. To make platform-specific code reviewable in a single host environment and to catch regressions even when one runner is unavailable:
- Any new function in `platform.ts` (or platform-branching elsewhere) must have both an `it.skipIf(process.platform !== "win32")` test and a POSIX test. See `packages/cli/__tests__/lib/path-equality.test.ts` for the pattern (it mocks `process.platform` via `Object.defineProperty` to exercise both branches on a single CI host).
- For process-kill / EPERM-handling code, add a unit test that simulates `process.kill` throwing `{ code: "EPERM" }` and asserts force-kill is still attempted. The `runtime-process` test suite has examples (look for "win32 destroy when graceful shutdown times out").
- Plugin tests that hit a tmux runtime must `skipIf(isWindows())`. Plugin tests that hit `runtime-process` should run on all platforms.
- For path code, test mixed-case inputs and inputs with spaces.
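One way to make the EPERM rule unit-testable is to inject the kill function, as in this sketch. It is an assumption-laden illustration, not the `runtime-process` code: `KillFn`, `probeLiveness`, and `shouldForceKill` are hypothetical names.

```typescript
type KillFn = (pid: number, signal: number) => void;

// Classify a signal-0 probe: EPERM means the process exists but cannot be
// signalled from this context; any other error means it is gone.
function probeLiveness(pid: number, kill: KillFn): "alive" | "alive-unsignalable" | "gone" {
  try {
    kill(pid, 0);
    return "alive";
  } catch (err: unknown) {
    return (err as { code?: string }).code === "EPERM" ? "alive-unsignalable" : "gone";
  }
}

// Hypothetical teardown decision: force-kill anything not confirmed gone.
function shouldForceKill(pid: number, kill: KillFn): boolean {
  return probeLiveness(pid, kill) !== "gone";
}
```

In production the injected function would wrap the real call, for example `(pid, sig) => process.kill(pid, sig)`; in a test it can be a stub that throws an error carrying `code: "EPERM"`.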
Pattern for mocking platform on Linux CI:
```typescript
let originalPlatform: PropertyDescriptor | undefined;

beforeEach(() => {
  originalPlatform = Object.getOwnPropertyDescriptor(process, "platform");
});

afterEach(() => {
  if (originalPlatform) Object.defineProperty(process, "platform", originalPlatform);
});

function setPlatform(p: NodeJS.Platform) {
  Object.defineProperty(process, "platform", { value: p, configurable: true });
}
```

Before saying "done" on any feature, verify each of these (or mark N/A with reasoning):
- No raw `process.platform` checks: used `isWindows()` from `@aoagents/ao-core`?
- Process spawning: used `runtime-process` (Windows) or `runtime-tmux` (POSIX) abstractions? Shell-out used `shellEscape` + `getShell` or `execFile`? `windowsHide: true` and `shell: isWindows()` for `.cmd` / `.bat` resolution?
- Process killing: distinguished `EPERM` from `ESRCH`? No negative PIDs? Used `killProcessTree`? Guarded `pid > 0`? Cooperative kill before force-kill on Windows?
- Paths: used `pathsEqual` for comparison? `path.join` for construction? No `===`, no hardcoded `/` or `\`?
- Shell: no bash-isms (`&&` chains, `$(cat)`, `$VAR`, `/dev/null`)? `&` prefix for quoted-path PowerShell calls? Routed through `getShell()` or used `execFile`?
- Networking: explicit `127.0.0.1` instead of `localhost`? Validated session IDs before constructing pipe paths?
- Runtimes: both `runtime-tmux` and `runtime-process` paths covered? `isProcessRunning` works for tmux TTY and PID signal-0 with EPERM handling?
- Agent plugins: `setupPathWrapperWorkspace` instead of bash hooks? `getActivityFallbackState` fallback in `getActivityState`?
- New platform branching: went into `platform.ts` (or another shared helper), not inline at call sites?
- Tests: both Windows and POSIX branches covered (mock `process.platform` if you can't run on both)?
If you can't say "yes" or "N/A" to all ten, your change probably breaks Windows.
```typescript
// Platform check, runtime/shell/env defaults, process kill, port lookup
import {
  isWindows, getDefaultRuntime, getShell,
  killProcessTree, findPidByPort, getEnvDefaults,
  shellEscape,
  setupPathWrapperWorkspace, buildAgentPath,
  registerWindowsPtyHost, unregisterWindowsPtyHost,
  getWindowsPtyHosts, clearWindowsPtyHostRegistry,
  appendActivityEntry, readLastActivityEntry,
  checkActivityLogState, getActivityFallbackState,
  classifyTerminalActivity, recordTerminalActivity,
  readLastJsonlEntry,
} from "@aoagents/ao-core";

// Path comparison (CLI package)
import { pathsEqual, canonicalCompareKey }
  from "../../src/lib/path-equality.js";

// Windows pty-host pipe protocol + sweep
import {
  getPipePath, connectPtyHost, ptyHostSendMessage,
  ptyHostGetOutput, ptyHostIsAlive, ptyHostKill,
  MessageParser, encodeMessage,
  sweepWindowsPtyHosts,
} from "@aoagents/ao-plugin-runtime-process";

// Web-side helpers
import { validateSessionId, resolvePipePath }
  from "@/server/tmux-utils";
import { stopStaleWindowsPtyHosts }
  from "@/lib/windows-pty-cleanup";

// CLI-only signal forwarding (POSIX only; guard with !isWindows())
import { forwardSignalsToChild } from "../lib/shell.js";
```

If a helper you need isn't in this list, that's a strong signal you should add it to `platform.ts` (or the closest existing module) rather than write platform-branching at the call site.