+
+
+ Connecting GitHub and inspecting the results. Credentials never leave your machine.
+
+
+`vana` collects your data from platforms you use. You log in through a browser
+on your machine, and the CLI saves it locally as JSON.
+
+### Highlights
+
+- **Fully local**: credentials and collected data never leave your machine
+- **Any platform**: connects through a browser session, not a restricted API
+- **Inspectable**: collected data is JSON you can summarize, query, or pipe
+- **Agent-ready**: `--json` and `--no-input` flags for scripts and AI agents
+- **Session caching**: log in once, reconnect faster next time
+- **Extensible**: connectors are standalone modules; add new platforms without touching the core
+
+## Install
+
+macOS (Homebrew):
+
+```bash
+brew install vana-com/tap/vana
+```
+
+macOS and Linux:
+
+```bash
+curl -fsSL https://cli.vana.com/install.sh | sh
+```
+
+Windows (PowerShell):
+
+```powershell
+irm https://cli.vana.com/install.ps1 | iex
+```
+
+Verify with `vana --version`.
+
+## Quick start
+
+```console
+$ vana connect github
+
+ Connect GitHub
+
+ ✓ Signed in
+ ✓ Profile
+ ✓ Repositories — 8 found
+ ✓ Starred
+
+ ✓ Connected GitHub.
+ Collected your GitHub data and synced it to your Personal Server.
+
+ Next: vana data show github
+```
+
+Your data is on disk at `~/.vana/results/`.
+
+```bash
+vana data show github --json | jq '.summary'
+```
+
+Explore further:
+
+```bash
+vana sources # See all available platforms
+vana connect # Interactive source picker
+vana status # What's connected, what needs attention
+```
+
+## How it works
+
+1. `vana connect <source>` launches a browser on your machine
+2. You log in (credentials stay on your machine)
+3. The CLI collects your data and saves it to `~/.vana/`
+4. `vana data show <source>` summarizes what was collected
+
+Sessions are cached, so reconnecting is faster next time. Your data is files
+on disk. Inspect, move, or delete them whenever you want.
+
+## Commands
+
+| Command | What it does |
+| ----------------------------- | -------------------------------------------- |
+| `vana connect [source]` | Connect a platform and collect your data |
+| `vana connect --detach` | Connect in the background |
+| `vana sources` | List available platforms |
+| `vana data list` | Show all collected datasets |
+| `vana data show <source>`     | Summarize a collected dataset                |
+| `vana status` | Connection health and system overview |
+| `vana collect [source]` | Re-collect data from a connected source |
+| `vana schedule add` | Schedule daily collection (launchd/cron) |
+| `vana skills list` | List available agent skills |
+| `vana skills install <skill>` | Install a skill for your AI agent            |
+| `vana mcp` | Start MCP server (Claude Code, Cursor, etc.) |
+| `vana doctor` | Diagnose installation and runtime health |
+| `vana logs [source]` | View run logs |
+| `vana setup` | Install or repair the browser runtime |
+
+Run `vana --help` for detailed usage.
+
+### For scripts and AI agents
+
+Commands support structured output:
+
+```bash
+vana connect github --json --no-input # Machine-safe — never prompts
+vana data show github --json | jq # Pipe collected data anywhere
+vana sources --json # Discover platforms programmatically
+```
+
+`--json` writes structured output to stdout. `--no-input` guarantees no
+interactive prompts. The CLI exits `1` if input is needed. See the
+[exit code reference](docs/CLI-EXIT-CODE-MATRIX.md) for the full contract.
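A minimal sketch of consuming this contract from a script, assuming the JSONL stream described here (one JSON object per line, ending with an `outcome` event that carries a `status` field; exact field names may differ from the shipped CLI):

```typescript
// Parse a captured --json stream and pull out the final outcome event.
// The event shape here is an assumption for illustration.
interface CliEvent {
  type: string;
  status?: string;
  [key: string]: unknown;
}

function finalOutcome(jsonl: string): CliEvent | undefined {
  return jsonl
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as CliEvent)
    .filter((event) => event.type === "outcome")
    .pop();
}
```

A caller would pipe `vana connect github --json --no-input` into a buffer and hand it to `finalOutcome`, branching on `status` rather than scraping human output.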
+
+## Sources
+
+`vana` connects to any platform that has a web login. Connectors handle the
+automation for each source.
+
+The CLI shares its connector format with
+[DataConnect](https://github.com/vana-com/data-connect) and the
+[data-connectors](https://github.com/vana-com/data-connectors) repository.
+
+Available: **GitHub**, **ChatGPT**, **Instagram**, **LinkedIn**, **Spotify**,
+**YouTube**, **Shop**, **Oura Ring**, **Uber**.
+
+Run `vana sources` to see what's available on your install.
+
+Missing a platform?
+[Request one](https://github.com/vana-com/cli/issues/new?template=source-request.yml)
+· [Build a connector](docs/building-connectors.md)
+
+## Ecosystem
+
+`vana` is the CLI for [Vana](https://vana.org)'s data portability network. It
+shares connectors and local storage with
+[DataConnect](https://github.com/vana-com/data-connect), the desktop app. For
+building apps that request user data, see the
+[Connect SDK](https://github.com/vana-com/vana-connect).
+
+## Privacy
+
+**Credentials**: You log in through a browser on your machine. Vana never
+sees your password, token, or session cookie.
+
+**Collected data**: Saved to `~/.vana/` as local files. Nothing is
+uploaded.
+
+**Browser sessions**: Cached in `~/.vana/browser-profiles/` for faster
+reconnects. Delete them any time.
+
+**Telemetry**: None.
+
+## Troubleshooting
+
+```bash
+vana doctor # Runtime, browser, and state health
+vana logs <source>   # Latest run log for a source
+```
+
+| Problem | Fix |
+| ----------------------- | ------------------------------------------ |
+| Browser runtime missing | `vana setup` |
+| Login expired           | `vana connect <source>` to re-authenticate |
+| Connector fails         | `vana logs <source>` for details           |
+
+## Uninstall
+
+Remove the CLI:
+
+```bash
+brew uninstall vana # Homebrew
+rm -f ~/.local/bin/vana # Script install (macOS / Linux)
+```
+
+Remove collected data and state:
+
+```bash
+rm -rf ~/.vana
+```
+
+## Documentation
+
+- [Building connectors](docs/building-connectors.md)
+- [Exit code reference](docs/CLI-EXIT-CODE-MATRIX.md)
+- [Architecture](docs/architecture.md)
+
+## Community
+
+- [Issues](https://github.com/vana-com/cli/issues): bugs and source requests
+- [Discussions](https://github.com/vana-com/cli/discussions): questions and ideas
+- [Discord](https://discord.gg/vana): chat with the team
+- [Contributing](CONTRIBUTING.md)
+
+## License
+
+MIT
diff --git a/docs/CLI-AGENT-FRIENDLY.md b/docs/CLI-AGENT-FRIENDLY.md
new file mode 100644
index 00000000..24f36fef
--- /dev/null
+++ b/docs/CLI-AGENT-FRIENDLY.md
@@ -0,0 +1,212 @@
+# CLI Agent-Friendly Integration
+
+_March 16, 2026_
+
+Research and roadmap for making the `vana` CLI a first-class tool for AI agents
+(coding agents, automation pipelines, MCP clients).
+
+Based on 2026 benchmarks, production CLI patterns, and emerging standards.
+
+## Current state
+
+The CLI already ships several agent-friendly features:
+
+- `--json` flag on all commands (JSONL streaming events to stdout)
+- `--no-input` and `--yes` flags to skip interactive prompts
+- `--quiet` flag to suppress non-essential output
+- Typed `CliOutcomeStatus` codes in JSONL outcomes
+- Zod schemas for all output types (`cliEventSchema`, `cliStatusSchema`, etc.)
+- Consistent exit codes (0 success, 1 failure)
+
+## Why CLI > MCP for agents (the data)
+
+ScaleKit benchmark (Claude Sonnet 4, 5 GitHub tasks, 25 runs each):
+
+| Approach | Tokens/task | Reliability | Monthly cost @10k ops |
+| ---------------------------- | --------------- | ------------ | --------------------- |
+| CLI alone | 1,365 – 9,386 | 100% (25/25) | ~$3.20 |
+| CLI + SKILL.md (~800 tokens) | 2,816 – 12,210 | 100% (25/25) | ~$3.50 |
+| MCP (direct) | 32,279 – 82,835 | 72% (18/25) | ~$55.20 |
+| MCP (via gateway) | ~CLI-level | ~99% | ~$5.00 |
+
+**Key insight:** An 800-token skill file cuts agent tool calls and latency by
+about a third each versus naive CLI usage. MCP costs 4–32x more tokens than
+CLI for equivalent tasks.
+
+The 2026 consensus: **CLIs are the most cost-effective and reliable transport
+for AI agents.** The winning pattern is CLI + Skills, with optional MCP for
+IDE/agent integrations that expect it.
+
+## Three competing discovery standards
+
+### AGENTS.md (Linux Foundation / Agentic AI Foundation)
+
+Plain Markdown at repo root. Consumed by 60,000+ projects. Supported by Claude
+Code, OpenAI Codex, Cursor, Gemini CLI, VS Code Copilot, Devin, Aider.
+Think of it as "README for agents."
+
+We already ship one at `/AGENTS.md`.
+
+### SKILL.md (Anthropic / agentskills.io)
+
+Structured format with YAML frontmatter (`name`, `description`, `license`,
+`compatibility`, `allowed-tools`) plus Markdown instructions. Progressive
+disclosure: ~100 tokens of metadata loaded at startup, full instructions
+(<5,000 tokens) loaded only when the skill activates.
+
+Works across Claude Code, OpenAI Codex, and OpenClaw.
+
+### MCP tool definitions (Anthropic MCP spec)
+
+JSON-schema tool definitions exposed via the Model Context Protocol. Most
+structured option but most token-expensive upfront (55,000 tokens for GitHub's
+43 tools).
+
+## Roadmap
+
+### Tier 1 — Low effort, high impact
+
+#### a) Ship a SKILL.md alongside the CLI
+
+```yaml
+---
+name: vana-connect
+description: >
+ Connect personal data from web platforms (GitHub, Spotify, Shop, etc.)
+ via headless browser automation. Use when collecting user data, checking
+ connection status, or managing data sources.
+allowed-tools: Bash(vana:*)
+---
+```
+
+Document each command with examples, expected JSON output shapes, and error
+recovery patterns. This is the single highest-ROI move — 800 tokens of
+guidance reduces agent errors by a third.
+
+#### b) `AGENT` environment variable detection
+
+When `AGENT` is set, auto-enable `--json` and `--no-input`. This lets any
+coding agent use `vana` without remembering flags. Mirrors the `CI=true`
+convention. Already implemented by Goose (`AGENT=goose`), Amp (`AGENT=amp`).
+
+Active proposal: github.com/agentsmd/agents.md/issues/136
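A sketch of what that detection could look like, assuming the convention simply defaults machine-safe flags on; `applyAgentDefaults` and its behavior are illustrative, not the shipped implementation:

```typescript
// When AGENT is set (e.g. AGENT=goose), append --json and --no-input
// unless the caller already passed them explicitly.
function applyAgentDefaults(
  argv: string[],
  env: Record<string, string | undefined>,
): string[] {
  if (!env.AGENT) return argv;
  const defaults = ["--json", "--no-input"];
  const missing = defaults.filter((flag) => !argv.includes(flag));
  return [...argv, ...missing];
}
```

Mirroring `CI=true`, this keeps the flags authoritative: an agent that passes them explicitly sees no change in behavior.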
+
+#### c) Enrich error JSON with `retryable` and `suggestion` fields
+
+```json
+{
+ "type": "outcome",
+ "status": "runtime_error",
+ "error_code": "runtime_error",
+ "retryable": true,
+ "suggestion": "run `vana setup --yes` to install the runtime"
+}
+```
+
+Follows RFC 9457 structured error pattern. Cloudflare's implementation is the
+gold standard — delivers 98% token reduction vs unstructured errors.
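A sketch of how an agent might branch on these proposed fields instead of pattern-matching error text; the outcome shape is assumed from the example above:

```typescript
// Decide the next step from a structured error outcome.
// Field names follow the proposed shape, not a shipped schema.
interface ErrorOutcome {
  type: "outcome";
  status: string;
  retryable?: boolean;
  suggestion?: string;
}

function nextAction(outcome: ErrorOutcome): string {
  if (outcome.retryable) return outcome.suggestion ?? "retry";
  return "abort";
}
```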
+
+#### d) Route human messages to stderr in JSON mode
+
+Currently `createEmitter.info()` suppresses messages in JSON mode. Instead,
+write them to stderr so agents get both the structured JSONL on stdout and
+human-readable context on stderr.
+
+### Tier 2 — Medium effort, strong differentiation
+
+#### e) Self-describing command (`vana describe`)
+
+Output machine-readable command metadata for just-in-time discovery:
+
+```json
+{
+ "name": "vana",
+ "version": "0.1.0",
+ "commands": [
+ {
+ "name": "connect",
+ "args": [{ "name": "source", "required": true }],
+ "flags": ["--json", "--no-input", "--yes", "--quiet"],
+ "description": "Connect a data source and collect personal data",
+ "exit_codes": { "0": "success", "1": "failure" },
+ "output_schema": "CliEvent | CliOutcome (JSONL)"
+ }
+ ]
+}
+```
+
+Agents use this instead of parsing `--help` text.
+
+#### f) MCP server mode (`vana mcp`)
+
+Follow the oclif-plugin-mcp-server pattern. Build a thin MCP wrapper that maps
+existing Commander commands to MCP tools. Key benefit: any MCP-compatible agent
+(Claude Code, Cursor, Gemini CLI, VS Code Copilot) discovers and uses `vana`
+natively.
+
+Since we use Commander (not oclif), this would be a custom implementation, but
+the pattern is well-documented.
+
+#### g) Semantic exit codes
+
+Map `CliOutcomeStatus` values to distinct exit codes so agents can branch
+without parsing JSON:
+
+| Exit code | Status | Meaning |
+| --------- | ----------------------- | -------------------------------- |
+| 0 | `connected_and_synced` | Success, data synced |
+| 0 | `connected_local_only` | Success, data saved locally |
+| 1 | `runtime_error` | General failure |
+| 2 | `needs_input` | Recoverable with interactive run |
+| 3 | `setup_required` | Recoverable with `vana setup` |
+| 4 | `auth_failed` | Authentication problem |
+| 5 | `connector_unavailable` | Source not supported yet |
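A sketch of branching on these proposed codes without parsing JSON; the recovery strings are illustrative:

```typescript
// Map the proposed semantic exit codes to a recovery hint.
// Anything unmapped falls back to inspecting the logs.
function recoveryFor(exitCode: number): string | undefined {
  switch (exitCode) {
    case 0:
      return undefined;               // success, nothing to do
    case 2:
      return "rerun interactively";   // needs_input
    case 3:
      return "vana setup --yes";      // setup_required
    case 4:
      return "vana connect <source>"; // auth_failed
    default:
      return "inspect vana logs";     // runtime_error, connector_unavailable
  }
}
```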
+
+### Tier 3 — Forward-looking
+
+#### h) Publish JSON schemas for all output types
+
+Export Zod schemas as JSON Schema files alongside the package, so agents can
+validate output programmatically.
+
+#### i) `--dry-run` for `connect`
+
+Output what the connector would do without executing — useful for agents to
+preview before committing to a long-running operation.
+
+#### j) Session continuation tokens
+
+For multi-step flows (setup -> connect -> verify), emit a `next_command` field
+in outcomes so agents can chain commands without hardcoded logic:
+
+```json
+{
+ "type": "outcome",
+ "status": "connected_local_only",
+ "next_commands": ["vana data show github", "vana connect spotify"]
+}
+```
+
+## Production examples (2026)
+
+| Project | Pattern |
+| ------------------------ | ----------------------------------------------------------- |
+| oclif-plugin-mcp-server | Auto-discovers CLI commands, exposes as MCP tools |
+| Google Workspace CLI | Built-in MCP server for Drive, Gmail, Calendar, Docs |
+| MCPShim | Daemon that turns any MCP server into shell commands |
+| mcp-cli (Philipp Schmid) | Dynamic discovery: 47K tokens → 400 tokens (99% reduction) |
+| Stripe CLI v1.37.2 | AI agent detection via User-Agent headers |
+| Vercel agent-browser | Accessibility-tree snapshots, `--json`, ref-based selection |
+
+## Sources
+
+- [ScaleKit: MCP vs CLI Benchmarking](https://www.scalekit.com/blog/mcp-vs-cli-use)
+- [CLI is the New MCP](https://oneuptime.com/blog/post/2026-02-03-cli-is-the-new-mcp/view)
+- [Why CLI Tools Are Beating MCP](https://jannikreinhard.com/2026/02/22/why-cli-tools-are-beating-mcp-for-ai-agents/)
+- [SKILL.md Specification](https://agentskills.io/specification)
+- [AGENTS.md Standard](https://agents.md/)
+- [AGENT env var proposal](https://github.com/agentsmd/agents.md/issues/136)
+- [RFC 9457 Structured Errors](https://noise.getoto.net/2026/03/11/slashing-agent-token-costs-by-98-with-rfc-9457-compliant-error-responses/)
+- [mcp-cli Dynamic Discovery](https://www.philschmid.de/mcp-cli)
+- [oclif-plugin-mcp-server](https://github.com/npjonath/oclif-plugin-mcp-server)
+- [Writing CLI Tools AI Agents Want to Use](https://dev.to/uenyioha/writing-cli-tools-that-ai-agents-actually-want-to-use-39no)
diff --git a/docs/CLI-AUDIENCE-CONTRACT.md b/docs/CLI-AUDIENCE-CONTRACT.md
new file mode 100644
index 00000000..d9372af0
--- /dev/null
+++ b/docs/CLI-AUDIENCE-CONTRACT.md
@@ -0,0 +1,299 @@
+# `vana-connect` CLI Audience Contract
+
+_As of March 12, 2026_
+
+## Purpose
+
+This document defines what must be true for the `vana connect` CLI to serve both:
+
+- humans using the terminal directly
+- coding agents like Codex and Claude Code
+
+The product should not split into two different CLIs. It should present one command model with mode behavior that works well for both audiences.
+
+## Core decision
+
+`vana connect` should use:
+
+- **one command grammar**
+- **one underlying lifecycle**
+- **multiple output and prompt modes**
+
+The split between audiences should happen primarily in:
+
+- output formatting
+- prompt behavior
+- verbosity
+- non-interactive guarantees
+
+Not in top-level commands.
+
+## Audience 1: Humans
+
+### What humans need
+
+- commands they can guess
+- clean first-run guidance
+- clear trust boundaries
+- visible progress
+- human summaries of success
+- direct explanations of failure and recovery
+
+### What must be true in the UX
+
+- the first command should work without reading extensive docs
+- install/setup should be explained before any major local changes happen
+- credentials should never feel like they are disappearing into a black box
+- success should be summarized in plain language
+- the user should always know what to do next
+
+### What to avoid for humans
+
+- raw JSON by default
+- unexplained artifact paths as the main success message
+- ambiguous partial matches without confirmation
+- silent fallback behavior
+- prompts that assume repo knowledge
+
+## Audience 2: Coding agents
+
+### What agents need
+
+- stable command grammar
+- stable exit codes
+- stable machine-readable output
+- deterministic prompt behavior
+- minimal token waste
+- no hidden state transitions
+
+### What must be true in the UX
+
+- there must be a machine-readable mode
+- stdout/stderr behavior must be predictable
+- non-interactive mode must fail instead of hanging
+- event types and field names should be stable
+- help output and command behavior should be compact and unsurprising
+
+### What to avoid for agents
+
+- decorative output in machine mode
+- forced interactive flows when a flag could suppress them
+- hidden install side effects without explicit acknowledgement
+- inconsistent error shapes
+- long prose where structured output is possible
+
+## Shared requirements
+
+These are requirements for both audiences.
+
+### 1. Trust
+
+The CLI must make these things explicit:
+
+- what is being installed
+- where state is stored locally
+- when credentials are needed
+- where data ended up
+
+### 2. Legibility
+
+The CLI must make these things easy to answer:
+
+- what just happened?
+- what state am I in now?
+- what do I do next?
+
+### 3. Recovery
+
+The CLI must make common failure modes recoverable without reading source code or repo docs.
+
+### 4. Composability
+
+The CLI must support being embedded in:
+
+- shell scripts
+- agents
+- future desktop or service wrappers
+
+## Proposed mode model
+
+This is the intended starting point for MVP.
+
+### Default mode
+
+Audience:
+
+- humans
+
+Behavior:
+
+- concise human-readable output
+- prompts when needed
+- clear next-step hints
+- summarized success and failure
+
+### `--json`
+
+Audience:
+
+- agents
+- scripts
+- advanced users
+
+Behavior:
+
+- machine-readable output only
+- stable event objects
+- no decorative formatting
+
+### `--no-input`
+
+Audience:
+
+- automation
+- agents in non-interactive mode
+
+Behavior:
+
+- fail if input is needed
+- never block waiting for user input
+
+### `--yes`
+
+Audience:
+
+- automation
+- users who want one-shot bootstrap
+
+Behavior:
+
+- auto-approve safe install/setup prompts
+
+### `--quiet`
+
+Audience:
+
+- scripts
+- users who want reduced chatter
+
+Behavior:
+
+- reduce non-essential status output
+- preserve important warnings/errors
+
+## Why `--json` is probably the key agent mode
+
+At least for MVP, `--json` is likely more important than an `--agent` flag.
+
+Why:
+
+- it is easier to reason about
+- it matches strong prior art
+- it focuses on contract shape rather than audience labeling
+- it avoids creating a second quasi-interface
+
+An explicit `--agent` flag should only be introduced if it adds semantics that cannot be cleanly expressed through:
+
+- `--json`
+- `--no-input`
+- `--yes`
+- `--quiet`
+
+## Output contract expectations
+
+### In human mode
+
+Success should look like:
+
+- source connected
+- brief summary of what was collected
+- whether data was ingested or only stored locally
+- next step
+
+Failure should look like:
+
+- short problem statement
+- one suggested fix
+- whether retry is safe
+
+### In machine mode
+
+Output should:
+
+- use stable event types
+- include stable identifiers and file paths when relevant
+- clearly distinguish success, warning, and failure states
+- avoid free-form chatter
+
+The existing `run-connector` event model is a strong starting point and should be formalized rather than replaced.
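One way to formalize that is a discriminated union over event types; the names below are illustrative, not the current model:

```typescript
// Stable event types as a discriminated union: agents can switch on
// `type` and the compiler enforces the per-variant fields.
type CliEvent =
  | { type: "phase-start"; source: string; phase: string }
  | { type: "phase-complete"; source: string; phase: string }
  | { type: "outcome"; status: "success" | "warning" | "failure"; paths?: string[] };

function isTerminal(event: CliEvent): boolean {
  return event.type === "outcome";
}
```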
+
+## Command behavior principles
+
+### Principle 1: Same command, different presentation
+
+Example:
+
+```bash
+vana connect steam
+vana connect steam --json
+vana connect steam --json --no-input
+```
+
+These should represent the same lifecycle, not different products.
+
+### Principle 2: Auto-do the obvious, but narrate it
+
+If runtime is missing, the CLI can install it as part of `connect`, but it should first explain:
+
+- what is missing
+- what it will install
+- where it will go
+
+### Principle 3: Never surprise automation
+
+If the caller asked for non-interactive behavior, the CLI must fail clearly instead of waiting for a prompt response indefinitely.
+
+### Principle 4: Success should be outcome-shaped
+
+Do not make the main success story:
+
+- “saved file to X”
+
+Prefer:
+
+- “connected Steam and collected your library and profile”
+
+Then include artifact paths as supporting detail.
+
+## MVP acceptance criteria by audience
+
+### Humans
+
+MVP is acceptable if:
+
+- they can connect one source from one command
+- they understand what was installed and where
+- they understand what data was collected
+- they can check status later without docs
+
+### Agents
+
+MVP is acceptable if:
+
+- they can run one command with stable machine output
+- they can detect missing setup cleanly
+- they can detect when input is required
+- they can distinguish local-only success from Personal Server ingest success
+
+## Conclusion
+
+The key to serving both audiences is not building two interfaces.
+
+It is:
+
+- one command model
+- one lifecycle
+- one explicit mode system
+- one strong machine-readable contract
+
+If those are done well, `vana connect` can feel polished for humans and efficient for coding agents at the same time.
diff --git a/docs/CLI-BEAUTY-AUDIT.md b/docs/CLI-BEAUTY-AUDIT.md
new file mode 100644
index 00000000..0ceedf2c
--- /dev/null
+++ b/docs/CLI-BEAUTY-AUDIT.md
@@ -0,0 +1,76 @@
+# CLI Beauty Audit
+
+_March 17, 2026_
+
+Three-axis design audit comparing every vana CLI surface against
+best-in-class CLIs (GitHub CLI, Vercel, Railway, Stripe).
+
+## What was fixed (this commit)
+
+| Fix | Before | After |
+| ------------------------------------- | ----------------------------------------------- | ------------------------------------ |
+| Implementation detail in descriptions | "...using Playwright browser automation" | Stripped from all displays |
+| Redundant "Flow:" lines | "Flow: prompts in this terminal..." | Removed (badge already says it) |
+| "Tip:" prefix | "Tip: Run `vana server set-url`..." | "Save with `vana server set-url`..." |
+| Excessive next steps | 5-6 bullet points per command | Max 3, most relevant only |
+| Session message | "Saved session available for faster reconnects" | "Session cached." |
+| Blank line before first section | 3 blank lines after title | 1 blank line |
+| Key-value label width | 17 characters (wide gaps) | 14 characters (tighter) |
+| Help text organization | Flat command list | Task-oriented groups |
+| Bare `vana server` | Status only | Status + subcommand hints |
+
+Net result: -46 lines of output across the CLI.
+
+## What needs Tim's input
+
+### 1. `status` vs `doctor` consolidation
+
+The audit found significant overlap. Recommendation: make `status` a
+3-5 line health check ("Is my system ready?"), move diagnostics to
+`doctor` ("What's wrong?"). Currently both try to be comprehensive.
+
+### 2. `[legacy]` relabeling
+
+"Legacy" sounds broken when it really means "browser-required auth."
+Options under consideration:
+
+- `[browser auth]` — describes what happens
+- `[manual login]` — describes user action
+- Remove the label entirely and explain in description
+
+### 3. Default `vana` behavior
+
+Currently shows `--help`. Audit suggests: show a guided onboarding
+message when no sources are connected, or show status.
+
+### 4. Connector descriptions upstream
+
+Descriptions like "Exports your X using Playwright browser automation"
+come from `data-connectors/registry.json`. We strip "using Playwright"
+at display time, but fixing upstream would be cleaner.
+
+### 5. `vana data show` hardcoded schema
+
+`summarizeResultData()` has hardcoded field names (`profile.username`,
+`repositories`, etc.) that don't scale to new connectors. Needs
+architectural decision: use connector metadata scopes, JSON schema
+introspection, or generic object walking.
+
+## Audit methodology
+
+Three parallel agents evaluated the CLI along independent axes:
+
+1. **Copy quality** — tone, outcome language, specificity, density
+2. **Visual structure** — spacing, color semantics, hierarchy, symbols
+3. **Progressive disclosure** — first run, empty states, detail-on-demand
+
+Each agent compared our transcripts against GitHub CLI, Vercel CLI,
+Railway CLI, and Stripe CLI, producing line-level findings with
+specific rewrites.
+
+## Remaining polish items (no Tim input needed)
+
+- Connect flow transcript pacing audit (spinner stacking)
+- `vana sources` should distinguish "why" between auth modes
+- Success output should feel more celebratory (outcome-first structure)
+- `vana data show` next steps should not suggest circular navigation
diff --git a/docs/CLI-BEAUTY-IMPLEMENTATION-PLAN.md b/docs/CLI-BEAUTY-IMPLEMENTATION-PLAN.md
new file mode 100644
index 00000000..454669d5
--- /dev/null
+++ b/docs/CLI-BEAUTY-IMPLEMENTATION-PLAN.md
@@ -0,0 +1,936 @@
+# `vana-connect` CLI Beauty Implementation Plan
+
+_As of March 14, 2026_
+
+## Purpose
+
+This document turns the CLI beauty research into an execution plan for the real
+`vana` product.
+
+It is not a brainstorm. It is the recommended implementation path for making
+the human-facing CLI feel best-in-class while preserving:
+
+- the existing command grammar
+- the existing agent/machine contract
+- the current runtime architecture direction
+- broad terminal compatibility
+
+It should be read together with:
+
+- [CLI-UX-QUALITY-BAR.md](CLI-UX-QUALITY-BAR.md)
+- [CLI-AUDIENCE-CONTRACT.md](CLI-AUDIENCE-CONTRACT.md)
+- [CLI-ONBOARDING-COPY.md](CLI-ONBOARDING-COPY.md)
+- [CLI-UX-SIMULATION.md](CLI-UX-SIMULATION.md)
+- [CLI-EXECUTION-PLAYBOOK.md](CLI-EXECUTION-PLAYBOOK.md)
+- Terminal CLI Beauty Memo (data-connectors/skills/vana-connect/docs/cli-beauty-research/terminal-cli-visual-and-emotional-beauty-memo.md)
+
+## Final Recommendation
+
+Build beauty as a **human-mode presentation layer** over the existing command
+model and event model.
+
+Do **not** turn `vana` into a full-screen TUI.
+
+Do **not** let beauty mutate the `--json` contract.
+
+Do **not** optimize for the most dramatic terminal techniques. Optimize for:
+
+- clean hierarchy
+- confident pacing
+- spinner-to-checkmark payoff
+- stronger success moments
+- small, composable read surfaces
+- compatibility-first richness
+
+The target feel is closer to:
+
+- Vercel for narrative pacing
+- `gh` for restraint
+- `@clack/prompts` for interactive continuity
+
+Not:
+
+- a Bubble Tea-style full-screen app
+- a coding-agent TUI clone
+- a flashy branded terminal demo
+
+Important:
+
+This plan by itself is **necessary but not sufficient** for best-in-class
+quality.
+
+Beauty work can make the CLI feel premium, but it does not automatically close:
+
+- cold-install trust
+- degraded/manual-flow excellence
+- post-success payoff quality
+- public-artifact truth
+
+Those are tracked explicitly in:
+
+- [CLI-EXECUTION-PLAYBOOK.md](CLI-EXECUTION-PLAYBOOK.md)
+ under **Batch 8A: Best-In-Class Finish**
+
+## Brand palette anchor
+
+The CLI should not invent its own theme. It should derive its semantic color
+choices from the shared Vana product palette in:
+
+- [shadcn.css](https://github.com/vana-com/vana-app/blob/main/packages/ui/src/styles/shadcn.css)
+
+Important tokens in that file:
+
+- neutral black/white base
+- `--accent`: Vana blue (`#4141fc`)
+- `--destructive`: product red
+- `--success`: product green
+- muted neutrals like `--foreground-secondary` and `--frostgray`
+
+In terminal form, this should be a **semantic downsampling**, not a literal
+translation.
+
+Use the palette like this:
+
+- headings and primary labels: neutral bold
+- active step / selected source / key emphasis: Vana blue when color allows it
+- success: green
+- warning: yellow or amber terminal-safe equivalent
+- error: red
+- supporting text: dim neutral
+
+Important:
+
+- do not make the whole CLI blue
+- do not require truecolor to feel like Vana
+- the brand should show up mostly in emphasis and pacing, not saturation
+
+## What The Research Implies
+
+The research points to five conclusions that matter here.
+
+### 1. Most beauty is not decoration
+
+The highest-leverage improvements are in:
+
+- spacing
+- section hierarchy
+- calm progress
+- spinner/checkmark transitions
+- clear start and end states
+
+Not in banners, gradients, or dense terminal chrome.
+
+### 2. Temporal design matters more than static styling
+
+`vana connect` is a long-running, state-changing command. That means the user
+experience is primarily about:
+
+- what appears first
+- what changes over time
+- whether progress feels trustworthy
+- whether success lands with a payoff moment
+
+The transition from “working” to “done” matters more than adding more color.
+
+### 3. Node has a practical ceiling
+
+The right architecture is **not** a custom terminal framework.
+
+For this CLI, the right stack is:
+
+- simple output rendering
+- render-in-place where needed
+- careful prompt selection
+- no alternate screen buffer
+- no heavy React/Ink dependency for v1 beauty
+
+### 4. Compatibility is part of the product
+
+The beauty layer must degrade cleanly across:
+
+- Homebrew installs on macOS
+- Linux terminals
+- tmux / SSH
+- CI / non-TTY
+- users with `NO_COLOR`
+
+This excludes a lot of terminal theatrics from the default path.
+
+### 5. Beauty should preserve composability
+
+Best-in-class for `vana` is not just “looks nice.”
+
+It also means:
+
+- `--json` stays deterministic
+- output can still be piped when explicitly requested
+- humans get a better default
+- advanced users can still compose with tools like `jq`
+
+## README demo strategy
+
+The README should become a reliable progress surface for the team, not just a
+reference page.
+
+To support that, add **VHS-based terminal recordings early**, not at the end.
+
+Why:
+
+- they make progress visible to people who are not running the branch locally
+- they force the human-mode CLI to stay coherent across releases
+- they create a deterministic review artifact instead of ad hoc screenshots
+- they make README updates a meaningful indicator of product quality
+
+The right model is:
+
+- checked-in `.tape` files under a dedicated folder such as `docs/vhs/`
+- deterministic fixture data and temp-home setup
+- generated SVG assets committed to the repo for README use
+- CI verification that the tapes still render
+
+Important:
+
+- do not rely on live credentials or real connector runs for README demos
+- do not make VHS the source of truth for behavior; tests remain the source of truth
+- do treat VHS as a first-class product artifact that should stay current
+
+## Product Decision
+
+The CLI should have **two rendering layers** over one command surface:
+
+### Layer 1: Machine mode
+
+Unaffected by beauty work.
+
+Rules:
+
+- `--json` remains pure structured output
+- no decorative lines
+- no spinners
+- no extra prose
+- stable field names and exit codes
+
+### Layer 2: Human mode
+
+This is where beauty work happens.
+
+Rules:
+
+- readable hierarchy
+- visually distinct phases
+- clear trust framing
+- minimal but meaningful motion
+- satisfying success summary
+- one useful next step
+
+## Non-Negotiable Constraints
+
+These are product constraints, not preferences.
+
+1. `vana connect <source>` remains the canonical first command.
+2. Human mode must remain calm and serious, not playful or noisy.
+3. `--json` mode must remain stable and decoration-free.
+4. Beauty must not depend on full-screen terminal control.
+5. Beauty must not materially degrade performance.
+6. No output design should assume Nerd Fonts, emoji correctness, or truecolor.
+7. Artifact paths remain supporting detail, not the story.
+8. A successful run must feel like an outcome, not a file write.
+
+## What To Build
+
+## 1. A real terminal presentation system
+
+Introduce a dedicated human-rendering layer inside the CLI.
+
+Suggested internal shape:
+
+- `src/cli/render/capabilities.ts`
+- `src/cli/render/theme.ts`
+- `src/cli/render/symbols.ts`
+- `src/cli/render/format.ts`
+- `src/cli/render/progress.ts`
+- `src/cli/render/prompts.ts`
+
+This should own:
+
+- color decisions
+- symbol decisions
+- section spacing
+- phase rendering
+- spinners / transitions
+- fallback behavior
+
+The command handlers should describe **meaning**, not styling:
+
+- phase started
+- phase completed
+- success
+- warning
+- failure
+- supporting detail
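+
+A minimal sketch of that separation, assuming a small semantic event union
+(the names here are illustrative, not the real event model):
+
+```typescript
+// Handlers emit meaning; the render layer decides how each event looks.
+type RenderEvent =
+  | { kind: "phase-start"; label: string }
+  | { kind: "phase-complete"; label: string }
+  | { kind: "success"; message: string }
+  | { kind: "warning"; message: string }
+  | { kind: "failure"; message: string }
+  | { kind: "detail"; message: string };
+
+// A plain-tier renderer: one ASCII-safe line per event.
+function renderPlain(event: RenderEvent): string {
+  switch (event.kind) {
+    case "phase-start":
+      return `... ${event.label}`;
+    case "phase-complete":
+      return `ok  ${event.label}`;
+    case "success":
+      return event.message;
+    case "warning":
+      return `warning: ${event.message}`;
+    case "failure":
+      return `error: ${event.message}`;
+    case "detail":
+      return `  ${event.message}`;
+  }
+}
+```
+
+Richer tiers would map the same events to color and spinners; the handlers
+themselves never change.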
+
+## 2. Capability-aware rendering
+
+Define explicit rendering tiers.
+
+### Tier A: Plain
+
+Used for:
+
+- non-TTY
+- CI
+- `NO_COLOR`
+- future screen-reader/plain modes
+
+Rules:
+
+- no spinner animation
+- no render-in-place
+- ASCII-safe symbols only
+- plain line-by-line output
+
+### Tier B: Standard interactive
+
+Default target for most users.
+
+Rules:
+
+- a basic 8/16-color (3/4-bit ANSI) hierarchy
+- Unicode symbols with ASCII fallback
+- simple render-in-place spinners
+- no exotic OSC features required
+
+### Tier C: Rich interactive
+
+Optional enhancement when capabilities support it.
+
+Rules:
+
+- slightly richer color hierarchy
+- optional hyperlinks later
+- same layout as Tier B, not a different product
+
+Important:
+
+`vana` should **not** require truecolor to feel polished.
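+
+A hedged sketch of how tier selection could work. Inputs are injected so the
+logic stays pure and testable; real code would read `process.stdout.isTTY`
+and `process.env`:
+
+```typescript
+type Tier = "plain" | "standard" | "rich";
+
+interface TermInfo {
+  isTTY: boolean;
+  env: Record<string, string | undefined>;
+}
+
+function detectTier({ isTTY, env }: TermInfo): Tier {
+  // Tier A: non-TTY, CI, NO_COLOR, or dumb terminals get plain output.
+  if (!isTTY || env.CI || env.NO_COLOR !== undefined || env.TERM === "dumb") {
+    return "plain";
+  }
+  // Tier C only when the terminal advertises truecolor; never required.
+  if (env.COLORTERM === "truecolor" || env.COLORTERM === "24bit") {
+    return "rich";
+  }
+  // Tier B is the default interactive target.
+  return "standard";
+}
+```
+
+The exact downgrade conditions are an assumption; the key property is that an
+uncertain environment always falls back to plain.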
+
+## 3. A semantic theme, not ad hoc styling
+
+Do not scatter raw color calls through command logic.
+
+Define semantic tokens like:
+
+- `accent`
+- `muted`
+- `success`
+- `warning`
+- `error`
+- `heading`
+- `dim`
+
+Define semantic symbols like:
+
+- `success`
+- `error`
+- `warning`
+- `info`
+- `bullet`
+- `arrow`
+- `spinner`
+
+Map those semantic tokens back to the shared Vana palette first, then downgrade
+them per capability tier.
+
+Do not require external icon sets.
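+
+For the symbol side, a minimal sketch (a subset of the list above; the exact
+glyphs are illustrative):
+
+```typescript
+// One symbol table per capability level, same semantic keys in both.
+type SymbolSet = Record<
+  "success" | "error" | "warning" | "info" | "bullet" | "arrow",
+  string
+>;
+
+const unicodeSymbols: SymbolSet = {
+  success: "✓",
+  error: "✗",
+  warning: "!",
+  info: "i",
+  bullet: "•",
+  arrow: "→",
+};
+
+// ASCII fallback for Tier A / uncertain terminals.
+const asciiSymbols: SymbolSet = {
+  success: "+",
+  error: "x",
+  warning: "!",
+  info: "i",
+  bullet: "-",
+  arrow: ">",
+};
+
+function symbolsFor(unicodeOk: boolean): SymbolSet {
+  return unicodeOk ? unicodeSymbols : asciiSymbols;
+}
+```
+
+Command code asks for `symbols.success`, never for a literal glyph.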
+
+## 4. An event-to-progress bridge
+
+The current CLI is still underpowered here.
+
+Beauty work should **not** parse log files for progress. It should render from
+structured runtime events.
+
+That means the runtime event model should be extended to support human progress.
+
+Minimum new event types:
+
+- `phase-start`
+- `phase-update`
+- `phase-complete`
+- `status-update`
+- `count-update`
+
+Example payloads:
+
+```json
+{"type":"phase-start","source":"github","phase":{"key":"auth","label":"Signing in"}}
+{"type":"phase-update","source":"github","phase":{"key":"collect","label":"Collecting"},"message":"Fetched 2 repositories","count":2}
+{"type":"phase-complete","source":"github","phase":{"key":"collect","label":"Collecting"}}
+```
+
+This is the backbone for:
+
+- calm progress
+- spinner/checkmark transitions
+- meaningful counts
+- better success summaries
+
+Without this, the beauty layer will remain shallow.
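+
+One hedged way to type those payloads and derive summary counts from them
+(field names follow the examples above; the reducer is illustrative):
+
+```typescript
+interface PhaseEvent {
+  type:
+    | "phase-start"
+    | "phase-update"
+    | "phase-complete"
+    | "status-update"
+    | "count-update";
+  source: string;
+  phase?: { key: string; label: string };
+  message?: string;
+  count?: number;
+}
+
+// Fold a stream of events into the latest count per phase, for the
+// success summary ("Repositories: 2").
+function latestCounts(events: PhaseEvent[]): Record<string, number> {
+  const counts: Record<string, number> = {};
+  for (const e of events) {
+    if (e.phase && typeof e.count === "number") {
+      counts[e.phase.key] = e.count; // keep the most recent count
+    }
+  }
+  return counts;
+}
+```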
+
+## 5. A stronger prompt model
+
+The current prompt layer is serviceable, not great.
+
+Recommended direction:
+
+- use `@clack/prompts` for human interactive flows
+- keep `--json` / `--no-input` behavior unchanged
+
+Why:
+
+- vertical narrative continuity is a high-value, low-risk beauty technique
+- prompt flows are part of the emotional arc
+- the current inquirer-based experience is functionally fine but visually thin
+
+Use it for:
+
+- `vana connect` source picker
+- setup confirmation
+- credential prompts
+- 2FA prompts
+
+Do not use it to create a mini-app. Use it to make existing interaction feel
+cohesive.
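+
+A sketch of the source picker using `@clack/prompts` (the `SourceInfo` shape
+is illustrative, not the real registry schema):
+
+```typescript
+import { select, isCancel, cancel } from "@clack/prompts";
+
+interface SourceInfo {
+  id: string;
+  label: string;
+  maturity: string; // e.g. "stable" / "beta" — hypothetical badge text
+}
+
+async function pickSource(sources: SourceInfo[]): Promise<string | null> {
+  const choice = await select({
+    message: "Which platform do you want to connect?",
+    options: sources.map((s) => ({
+      value: s.id,
+      label: s.label,
+      hint: s.maturity,
+    })),
+  });
+  if (isCancel(choice)) {
+    // Graceful cancel: no exception stack, one clear line.
+    cancel("Cancelled. Run `vana connect` again when ready.");
+    return null;
+  }
+  return choice;
+}
+```
+
+The cancel path matters as much as the happy path: acceptance criteria below
+require that cancelling never dumps a stack trace.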
+
+## 6. Better static surfaces
+
+Before richer motion, the static command surfaces need to feel more deliberate.
+
+### `vana connect`
+
+Requirements:
+
+- `vana connect` with no source becomes a guided entrypoint
+- no raw Commander argument error
+- source picker shows:
+ - name
+ - auth maturity badge
+ - one-line description
+
+### `vana sources`
+
+Requirements:
+
+- more legible list hierarchy
+- install state and auth maturity stay visible
+- optional grouping later:
+ - connected
+ - available
+ - legacy/manual
+
+### `vana status`
+
+Requirements:
+
+- compact but more deliberate section styling
+- runtime / Personal Server / sources clearly separated
+- source lines should scan in one pass
+- one detail line max beneath a source unless expanded later
+
+### `vana data`
+
+This command family should exist as the first read surface for collected data.
+
+Minimum scope:
+
+- `vana data list`
+- `vana data show <source>`
+- `vana data path <source>`
+
+This is not only about utility. It creates the post-success payoff loop:
+
+- connect
+- inspect
+- trust
+
+## 7. A better connect narrative
+
+Human `vana connect <source>` should feel like one narrative with visible phase
+changes.
+
+Recommended flow:
+
+1. connector resolution
+2. setup/runtime preparation if needed
+3. trust framing before auth
+4. auth/input collection
+5. collection progress
+6. sync/local-save outcome
+7. success summary
+
+Each phase should:
+
+- start cleanly
+- not spam intermediate output
+- end with a visible state transition
+
+This is where spinners should live, but with restraint.
+
+## 8. A real success moment
+
+The current success state is still too weak.
+
+The human success moment should include:
+
+- source name
+- what was collected
+- where it went
+- one next step
+
+Recommended shape:
+
+```text
+Connected GitHub.
+
+Collected:
+- Profile: tnunamak
+- Repositories: 2
+- Starred: 0
+
+Saved locally:
+- /Users/tim/.vana/last-result.json
+
+Next:
+- Run `vana status`
+- Or inspect the data with `vana data show github`
+```
+
+Important:
+
+- the summary is the trophy moment
+- the artifact path is supporting detail
+- counts and examples are better than generic “done”
+
+## 9. Clear error beauty
+
+Failure states should get the same visual discipline as success states.
+
+Rules:
+
+- one-line diagnosis first
+- one actionable next step second
+- log path third
+- no stack traces in normal human mode
+
+This especially matters for:
+
+- setup failure
+- missing source
+- auth failure
+- legacy/manual source flows
+- connector broke / site changed
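+
+The ordering rule above can be captured in one small formatter (a sketch; the
+`FailureInfo` shape is hypothetical):
+
+```typescript
+interface FailureInfo {
+  diagnosis: string; // one-line cause
+  nextStep: string; // one actionable command or instruction
+  logPath: string; // where the full detail lives
+}
+
+// Diagnosis first, next step second, log path third — never a stack trace.
+function formatFailure(f: FailureInfo): string[] {
+  return [
+    `error: ${f.diagnosis}`,
+    `  try: ${f.nextStep}`,
+    `  log: ${f.logPath}`,
+  ];
+}
+```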
+
+## Explicit Non-Goals For v1 Beauty
+
+Do not build these into the first beauty pass:
+
+- full-screen TUI
+- alternate screen buffer
+- React/Ink interface
+- theme system exposed to users
+- ASCII banners / logo art
+- emoji as core semantics
+- truecolor-only visuals
+- OSC 8 hyperlinks as required UX
+- heavy box drawing everywhere
+- animated success banners
+
+These either increase risk, increase maintenance cost, or violate the quality
+bar.
+
+## Recommended Implementation Stack
+
+### Keep
+
+- current command handlers
+- current runtime/event architecture
+- current `--json` contract
+
+### Add
+
+- `@clack/prompts` for interactive human prompts
+- `picocolors` for lightweight color styling
+- `ora` for simple spinner/checkmark transitions
+
+### Build in-house
+
+- capability detection and downgrade rules
+- semantic theme
+- symbols with ASCII fallback
+- connect flow renderer
+- static section formatter
+- success summary formatter
+
+### Do not add now
+
+- Ink
+- Listr2
+- Blessed
+- alternate screen renderer
+
+## Why this stack
+
+`@clack/prompts` gives narrative continuity for prompts.
+
+`picocolors` keeps styling small and fast.
+
+`ora` gives the right primitive for spinner-to-checkmark transitions without
+dragging in a larger rendering framework.
+
+The rest should be custom because the product surface is small and the quality
+bar is specific.
+
+## Compatibility Strategy
+
+The beauty layer should respect:
+
+- `process.stdout.isTTY`
+- `NO_COLOR`
+- `CI`
+- `TERM=dumb`
+
+Later, consider explicit flags:
+
+- `--color=auto|always|never`
+- `--plain`
+- `--screen-reader`
+
+But do not block the first beauty pass on adding them.
+
+The important rule is:
+
+- if the environment is uncertain, downgrade gracefully
+
+## Performance Rules
+
+These are hard rules.
+
+1. No spinner for operations that finish below perception threshold.
+2. No progress update on every tiny event.
+3. No render loop driven by timers alone when there is meaningful event data.
+4. Prefer event-driven updates to artificial animation.
+5. Never let progress rendering materially slow setup/connect.
+
+Specific rule of thumb:
+
+- if an operation completes in under ~250ms, print nothing but the resulting
+ state
+- if it lasts longer, show one spinner or one phase line
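+
+A sketch of that threshold rule: arm a notice on a timer and cancel it if the
+operation finishes first (the helper name and injected writer are illustrative):
+
+```typescript
+async function withDeferredNotice<T>(
+  label: string,
+  op: () => Promise<T>,
+  write: (line: string) => void,
+  thresholdMs = 250,
+): Promise<T> {
+  // Only render the phase line if the operation outlives the threshold.
+  const timer = setTimeout(() => write(`... ${label}`), thresholdMs);
+  try {
+    return await op();
+  } finally {
+    clearTimeout(timer); // fast operations print nothing but their result
+  }
+}
+```
+
+The same shape works for a real spinner: start it in the timer callback,
+stop it in the `finally`.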
+
+## Acceptance Criteria
+
+The beauty pass is not done until all of these are true.
+
+### Human mode
+
+1. `vana connect` without a source is a graceful, guided entrypoint.
+2. `vana connect github` has visible but calm phase transitions.
+3. A successful run ends with a strong success summary.
+4. A local-only success clearly differs from a Personal Server sync success.
+5. `vana status` is more legible without becoming verbose.
+6. `vana data show github` feels like a real payoff surface.
+7. Cancelling a prompt does not dump an exception stack.
+
+### Machine mode
+
+1. `--json` output is unchanged except for intentional schema additions.
+2. No decorative lines or ANSI output appears in `--json`.
+3. Exit codes remain stable.
+
+### Compatibility
+
+1. Output remains readable with color disabled.
+2. Output remains readable when piped or in CI.
+3. The CLI does not require truecolor or Nerd Fonts.
+
+### Performance
+
+1. The beauty layer does not introduce visible lag in common flows.
+2. No progress rendering causes measurable regressions similar to the npm
+ progress-bar failure mode.
+
+## Execution Order
+
+This is the recommended sequence.
+
+### Current Branch State And Revised Sequencing
+
+As of branch head `0afda69`, the plan above is no longer hypothetical.
+Substantial parts of the foundation are already present:
+
+- a human renderer/theme layer exists
+- `status`, `sources`, and `data` have been upgraded materially
+- `vana connect` has a guided no-source entrypoint with clearer cancellation
+ and direct-command copy
+- structured runtime `status-update` and `progress-update` events now exist
+- deterministic success demos, transcripts, and README-facing VHS assets are
+ publishing from CI
+- successful connects now land with a stronger payoff moment, including saved
+ session messaging
+
+That means the next phase should **not** start from Phase 1 again and should
+not repeat the already-finished product-truth work.
+The correct sequence from here is:
+
+### Batch 1: Product-truth and demo-proofing follow-through
+
+This batch is mostly complete. The remaining work should be follow-through only:
+
+- keep tightening any remaining rough edges surfaced by real acceptance tests
+- broaden acceptance coverage across:
+ - migrated/requestInput connectors
+ - legacy/manual connectors
+ - unsupported sources
+ - saved-session reuse cases
+- keep README, transcripts, and published demo assets aligned with the current
+ canary
+
+Important:
+
+- this is now a cleanup/follow-through lane, not the main product frontier
+- it should still be pushed as larger release cycles, not many tiny deployment
+ cycles
+
+### Batch 2: Deep beauty, static-first
+
+This is now the main frontier. Focus first on the surfaces that are already
+semantically stable.
+
+- refine spacing, hierarchy, and semantic color usage across:
+ - `status`
+ - `sources`
+ - `data list`
+ - `data show`
+ - guided `connect`
+- make the renderer feel distinctly Vana without saturating the terminal
+- improve line rhythm, section headings, bullets, and emphasis
+- upgrade README VHS assets so they reflect this calmer, more deliberate visual
+ language
+
+This is where the CLI should start to feel clearly above the current baseline,
+but without touching the machine contract.
+
+### Batch 3: Deep beauty, connect narrative
+
+After static surfaces are strong, apply the beauty work to the long-running
+human `connect` flow itself.
+
+- phase transitions that feel calm and intentional
+- better pacing from prepare -> connect -> continue -> success/failure
+- stronger trust framing before auth/input collection
+- tasteful spinner/checkmark transitions where terminal capabilities allow them
+- cleaner cancellation language
+- cleaner local-only vs Personal Server success distinction
+
+This batch should make `vana connect <source>` feel like a product journey, not
+just a sequence of log lines.
+
+### Operational polish after beauty
+
+Do not let the beauty work crowd out the less visible CLI quality bar.
+
+After the static and connect-narrative beauty batches are stable, run an
+explicit operational-polish pass covering:
+
+- `vana --version` / `vana version`
+- help quality
+- a diagnostics surface, likely `vana doctor`
+- exit-code matrix review
+- JSON contract audit
+- upgrade/uninstall/channel clarity
+
+This is part of the best-in-class bar even though it is not primarily visual.
+
+### Batch 4: Runtime event enrichment for beauty
+
+Only after the human connect narrative is visibly better should we deepen the
+runtime event model further.
+
+- add any remaining phase/count/completion metadata needed for better summaries
+- avoid log scraping entirely for human rendering
+- preserve a pristine `--json` contract while making human progress richer
+
+This keeps the event model in service of product quality rather than speculative
+framework-building.
+
+### Batch 5: Public polish and release hardening
+
+- keep README demos, transcripts, installer paths, and Homebrew output aligned
+ to the same canary
+- acceptance-test the published artifact as if discovering the project cold
+- only then decide what is ready to graduate from canary to a more stable lane
+
+### Batching rule from here
+
+The branch should now prefer:
+
+- larger locally validated batches of product/UI work
+- fewer deployment cycles
+- deployment-triggering pushes only when the batch is worth external proof
+
+Break that rule only for:
+
+- a release-path regression
+- a platform-specific packaging failure
+- a public artifact problem that needs immediate isolation
+
+This is the right optimization now that the runtime and installer paths are
+substantially real.
+
+### Deployment streamlining guidance
+
+The current bottleneck is no longer local implementation speed. It is repeated
+publish/verify latency.
+
+The branch should therefore optimize for:
+
+- one larger, coherent product/UI batch per release cycle
+- one canonical local preflight before pushing
+- one canonical post-publish verification path
+
+Recommended operating model:
+
+1. Local preflight should become one command that runs the entire release-ready
+ guardrail set:
+ - tests
+ - lint/format
+ - build
+ - transcript capture
+ - demo rendering verification
+ - release-asset assertions
+2. The publish path should become one watcher-driven flow:
+ - wait for CI/canary
+ - sync Homebrew
+ - verify hosted installer
+ - verify demo assets
+3. Avoid pushing while a release lane is still proving the previous batch unless
+ the current head is blocked by:
+ - a release-path failure
+ - a packaging/platform regression
+ - a public artifact issue
+
+In other words:
+
+- local work should continue optimistically
+- release validation should be automated
+- publish-triggering pushes should be less frequent and more substantial
+
+### Phase 1: Foundation
+
+- add capability detection
+- add semantic theme and symbol layer
+- add human renderer primitives
+- preserve exact `--json` behavior
+- add VHS scaffolding and one README-quality demo tape
+
+### Phase 2: Static surface upgrade
+
+- `sources`
+- `status`
+- `data list`
+- `data show`
+- `connect` no-source guided entrypoint
+- embed at least one generated VHS SVG in the README
+
+### Phase 3: Connect flow narrative
+
+- phase rendering
+- spinner/checkmark transitions
+- trust framing
+- prompt migration
+- success summary and next-step polish
+
+### Phase 4: Runtime event enrichment
+
+- structured progress events
+- count updates
+- richer completion metadata for summaries
+
+### Phase 5: Hardening and review
+
+- compatibility downgrade pass
+- transcript tests
+- VHS recordings for review
+- acceptance testing on:
+ - macOS Homebrew install
+ - Linux installer path
+ - SSH/tmux if possible
+
+## Testing Strategy
+
+Beauty work should be tested at three levels.
+
+### 1. Snapshot/transcript tests
+
+Golden transcript tests for:
+
+- `status`
+- `sources`
+- `connect` happy path
+- `connect` local-only success
+- `connect` needs input
+- `connect` legacy/manual
+- `data show`
+
+### 2. JSON contract tests
+
+Explicit regression tests that beauty work does not affect:
+
+- event names
+- event fields
+- exit codes
+
+### 3. Recorded review artifacts
+
+Use terminal recordings for human review:
+
+- VHS or equivalent
+- one tape per key journey
+
+This matters because terminal beauty is hard to review from code alone.
+
+## The Next Four Concrete Tasks
+
+From the current branch state, do these next:
+
+1. Add a deterministic successful `connect` fixture and README-quality tape
+ that shows real progress and a real success moment.
+2. Make the post-success loop feel complete:
+ - strengthen `connect` success summary
+ - strengthen `vana data show`
+ - strengthen `vana status`
+3. Add broader transcript and acceptance coverage across migrated, legacy, and
+ unsupported connector flows.
+4. Then start the deep beauty pass on static surfaces before touching the
+ long-running `connect` narrative again.
+
+That sequence gives the highest user-visible value from the current branch
+state while keeping release cycles efficient.
+
+## Final Standard
+
+The final beauty bar is not:
+
+- “the CLI looks fancy”
+
+It is:
+
+- the first command feels obvious
+- the connect flow feels calm and trustworthy
+- success feels earned
+- failure feels understandable
+- the CLI remains composable
+- the machine contract stays pristine
+
+If those are true, `vana` will feel much closer to the best references than it
+does now, without turning into a fragile terminal toy.
diff --git a/docs/CLI-BUILD-PLAN.md b/docs/CLI-BUILD-PLAN.md
new file mode 100644
index 00000000..717b6e12
--- /dev/null
+++ b/docs/CLI-BUILD-PLAN.md
@@ -0,0 +1,249 @@
+# `vana-connect` CLI Build Plan
+
+_As of March 12, 2026_
+
+## Purpose
+
+This document turns the v1 spec into a concrete implementation sequence and assigns work to the right repositories.
+
+The goal is to move from strategy to build with minimal confusion.
+
+## Repo responsibilities
+
+### `vana-connect`
+
+Primary implementation home for:
+
+- the new shared core
+- the new CLI
+- the runtime orchestration SDK layer
+- future monorepo structure
+
+Why:
+
+- it already has TypeScript SDK infrastructure
+- it already uses `pnpm` workspaces
+- it is already the public-facing `vana-connect` product repo
+
+Relevant files:
+
+- [package.json](../package.json)
+- [pnpm-workspace.yaml](../pnpm-workspace.yaml)
+
+### `data-connectors`
+
+Remain the source of truth for:
+
+- connector scripts
+- connector registry
+- schemas
+- current skill docs
+- current bootstrap/helper scripts until migrated or wrapped
+
+Relevant files:
+
+- [registry.json](https://github.com/vana-com/data-connectors/blob/main/registry.json)
+- [skills/vana-connect/scripts](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts)
+
+### `data-connect`
+
+Reference implementation and integration source for:
+
+- Personal Server ingest behavior
+- desktop app runtime assumptions
+- current local execution/ingest patterns
+
+Relevant file:
+
+- `data-connect/src/services/personalServerIngest.ts`
+
+## Practical conclusion
+
+Build the CLI in `vana-connect`.
+
+Do not move connectors or the skill immediately.
+
+In v1, the new CLI should wrap and reuse parts of the current `data-connectors` runtime flow where sensible, while establishing cleaner package boundaries in `vana-connect`.
+
+## Proposed package shape in `vana-connect`
+
+Initial target:
+
+- `src/core`
+ - shared types, outcomes, events, errors
+- `src/cli`
+ - command handlers, formatters, prompts
+- `src/runtime`
+ - runtime abstraction and v1 Playwright runtime adapter
+- `src/connectors`
+ - registry/source resolution helpers
+- `src/personal-server`
+ - target detection and ingest helpers
+
+If needed later, this can become true workspace packages:
+
+- `packages/connect-core`
+- `packages/connect-sdk`
+- `packages/connect-cli`
+- `packages/connect-runtime-playwright`
+
+For speed, v1 can start inside the existing `src/` tree and split into packages after the flow is proven.
+
+## Why not force workspace packages immediately
+
+Because the real risk is UX and command behavior, not package boundaries.
+
+Start with clean module boundaries inside `vana-connect`. Split into multiple publishable packages only when it helps materially.
+
+## Build phases
+
+### Phase 1: Core contracts
+
+Implement in `vana-connect`:
+
+- outcome types
+- event types for `--json`
+- state types
+- error model
+- config paths and log-path helpers
+
+This should be the first code layer because the CLI and SDK will both depend on it.
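+
+For the config-path piece, a minimal sketch following the `~/.vana` layout used
+elsewhere in this plan (function names are hypothetical):
+
+```typescript
+import * as os from "node:os";
+import * as path from "node:path";
+
+// Root of all CLI state on disk.
+function vanaHome(): string {
+  return path.join(os.homedir(), ".vana");
+}
+
+// File-backed logs live under ~/.vana/logs, one file per source.
+function logPath(source: string): string {
+  return path.join(vanaHome(), "logs", `${source}.log`);
+}
+```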
+
+### Phase 2: Runtime adapter
+
+Implement a v1 runtime adapter around the current Playwright-based flow.
+
+Scope:
+
+- runtime install check
+- runtime setup invocation
+- connector run invocation
+- event normalization from current runner output
+- headed fallback support preserved
+
+This layer can initially shell out to existing scripts in `data-connectors` / existing runner artifacts if needed.
+
+### Phase 3: Connector resolution
+
+Implement:
+
+- source lookup
+- list command data source
+- connector fetch/wrap behavior
+
+For v1, this can reuse existing registry and download logic rather than replacing it from scratch.
+
+### Phase 4: Personal Server target detection and ingest
+
+Implement:
+
+- Personal Server availability detection
+- ingest attempt when target is available
+- explicit local-only vs ingested outcomes
+
+For v1, keep this small and honest.
+
+### Phase 5: CLI command handlers
+
+Implement:
+
+- `vana connect <source>`
+- `vana connect status`
+- `vana connect list`
+- `vana connect setup`
+
+Plus:
+
+- `--json`
+- `--no-input`
+- `--yes`
+
+### Phase 6: Human-mode polish
+
+Implement:
+
+- onboarding copy from the copy doc
+- compact progress formatting
+- log-path surfacing
+- concise success/failure summaries
+
+This phase matters more than it sounds. It is where the CLI stops feeling like a wrapper.
+
+### Phase 7: Tests
+
+Add tests for:
+
+- command parsing
+- outcome/event normalization
+- `--json` output contract
+- `--no-input` behavior
+- local-only vs ingested outcome distinction
+
+Focus on contract tests first, not deep end-to-end coverage.
+
+## Concrete first implementation slice
+
+The first shippable slice should be:
+
+1. `vana connect steam`
+2. inline setup if missing
+3. connector resolution
+4. run via existing runtime
+5. JSON event output
+6. human summary
+7. `vana connect status`
+
+That slice is enough to validate the architecture and the first-run UX.
+
+## Likely technical choices in `vana-connect`
+
+- TypeScript
+- Node runtime
+- `commander` for command grammar
+- `@inquirer/prompts` for interactive input
+- `zod` for event/state validation
+- file-backed logs under `~/.vana/logs`
+
+These choices match the current system boundary and optimize for speed.
+
+## What not to build first
+
+Do not start with:
+
+- a TUI
+- multi-runtime plugin marketplace
+- full package splitting ceremony
+- connect-all onboarding
+- scheduling
+- broad environment management
+
+Those would slow down the MVP and dilute the first-run quality bar.
+
+## Immediate next coding steps
+
+1. Create CLI/core/runtime module skeleton in `vana-connect`.
+2. Add command parser and placeholder commands.
+3. Implement core types and event contract.
+4. Wrap current runtime/setup/connector resolution behavior.
+5. Implement `vana connect <source>` happy path.
+6. Implement `status`.
+7. Add tests for the contract and modes.
+
+## Decision summary
+
+- Build in `vana-connect`
+- Reuse `data-connectors` assets rather than moving them now
+- Keep `data-connect` as reference for Personal Server ingest behavior
+- Optimize for one excellent source-connect flow first
+- Delay broader repo/package reorganization until the flow is proven
+
+## Conclusion
+
+The next work should be code in `vana-connect`, not more product exploration.
+
+The MVP path is:
+
+- thin architecture
+- strong contracts
+- one polished connect flow
+- one honest status command
diff --git a/docs/CLI-COMMAND-MODEL.md b/docs/CLI-COMMAND-MODEL.md
new file mode 100644
index 00000000..34fdebbd
--- /dev/null
+++ b/docs/CLI-COMMAND-MODEL.md
@@ -0,0 +1,342 @@
+# `vana-connect` CLI Command Model
+
+_As of March 12, 2026_
+
+## Purpose
+
+This document defines the intended command surface for the first version of the `vana connect` CLI.
+
+It is optimized for:
+
+- excellent first-run onboarding
+- one command model for humans and coding agents
+- fast MVP delivery using existing runtime primitives
+- future support for local and cloud Personal Server environments
+
+## Product stance
+
+The CLI should feel like one coherent product, not a bundle of repo scripts.
+
+That means:
+
+- public commands should be user-journey shaped
+- internal scripts should remain implementation details
+- the command surface should be small in v1
+
+## Command namespace
+
+Assume the CLI command family is:
+
+```bash
+vana connect ...
+```
+
+This keeps the door open for a broader `vana` command family while making “connect” the product surface for data portability.
+
+## MVP top-level commands
+
+For v1, the public surface should be limited to:
+
+- `vana connect <source>`
+- `vana connect list`
+- `vana connect status`
+- `vana connect setup`
+
+Optional for v1 if it is cheap:
+
+- `vana connect inspect <source>`
+
+Not v1:
+
+- large TUI mode
+- scheduling
+- “connect everything” as the default first-run path
+- extensive admin/config subtrees
+
+## Canonical first command
+
+The canonical first command should be:
+
+```bash
+vana connect <source>
+```
+
+Example:
+
+```bash
+vana connect steam
+```
+
+Why:
+
+- it matches the user’s intent directly
+- it reduces onboarding ceremony
+- it allows the CLI to inline setup when safe
+- it creates a more Vercel/`uv`-like first impression than forcing `setup`
+
+## Command details
+
+### `vana connect <source>`
+
+Primary job:
+
+- connect one source end-to-end
+
+Expected behavior:
+
+1. validate CLI/runtime prerequisites
+2. if needed, explain and perform setup
+3. fetch or resolve the connector
+4. check existing auth/session state
+5. run collection
+6. request input if needed
+7. ingest to the active Personal Server if available
+8. summarize outcome
+
+Expected output in human mode:
+
+- what is happening
+- what, if anything, is being installed
+- whether existing session is being reused
+- what data was collected
+- whether data was ingested or only stored locally
+- what to do next
+
+Expected output in machine mode:
+
+- stable event objects
+- stable outcome state
+- stable indication of whether ingest occurred
+
+### `vana connect list`
+
+Primary job:
+
+- show what can be connected
+
+Expected behavior:
+
+- list known/supported sources
+- indicate installed/not installed
+- optionally indicate previously connected
+
+This command should stay simple in v1.
+
+### `vana connect status`
+
+Primary job:
+
+- show the current health and state of the local setup
+
+Expected behavior:
+
+- report runtime installation status
+- report active Personal Server target status
+- report installed connectors
+- report saved session presence
+- report recent run outcomes
+- report local-only vs ingested state when known
+
+This is the key trust/recovery command.
+
+### `vana connect setup`
+
+Primary job:
+
+- explicit bootstrap / repair / preinstall
+
+Expected behavior:
+
+- install runtime prerequisites
+- verify expected artifacts exist
+- explain what was installed
+
+Important:
+
+- this should exist
+- but it should not be the default onboarding path
+
+## Optional v1 command
+
+### `vana connect inspect `
+
+Primary job:
+
+- inspect detailed status for one source
+
+Possible output:
+
+- connector installed?
+- session present?
+- last successful run?
+- last error?
+- last known scopes collected?
+- last result path?
+
+This is useful, but `status` matters more.
+
+## Mode model
+
+The mode model should be flag-based, not command-based.
+
+### Required flags
+
+- `--json`
+- `--no-input`
+- `--yes`
+
+Likely useful:
+
+- `--quiet`
+
+### Behavior expectations
+
+#### Human default
+
+- concise human-readable output
+- safe prompts allowed
+- progress visible
+- summarized outcome
+
+#### `--json`
+
+- no decorative output
+- stable machine-readable events/results
+
+#### `--no-input`
+
+- do not prompt
+- fail clearly if input is required
+
+#### `--yes`
+
+- auto-approve safe setup/install confirmations
+
+## Environment model
+
+The CLI should be environment-aware from the beginning, even if v1 only supports a narrow subset.
+
+### Concept
+
+Users should think:
+
+- “I have a Personal Server”
+
+Not:
+
+- “I have to reason about desktop app internals vs cloud infra internals”
+
+### Minimal MVP environment vocabulary
+
+The CLI should understand:
+
+- active Personal Server target available or unavailable
+- local-only data collection vs ingested data
+
+Possible future targets:
+
+- local desktop-bundled Personal Server
+- self-hosted Personal Server
+- cloud-hosted Personal Server
+
+For v1, do not expose a large environment-management surface. Just make sure commands and status output do not assume localhost forever.
+
+## State model expectations
+
+The command surface depends on a small number of visible states.
+
+### Runtime state
+
+- installed
+- missing
+- unhealthy
+
+### Source state
+
+- known
+- installed
+- authenticated/session-present
+- needs re-auth
+- run succeeded
+- run failed
+
+### Data state
+
+- collected locally
+- ingested to Personal Server
+- ingest unavailable
+- ingest failed
+
+These states should show up explicitly in status and machine-readable output.
+
+## Outcome model for `connect`
+
+This is important for both UX and SDK design.
+
+### Success classes
+
+- `connected_and_ingested`
+- `connected_local_only`
+
+### Recoverable failure classes
+
+- `needs_input`
+- `setup_required`
+- `personal_server_unavailable`
+- `auth_failed`
+- `connector_unavailable`
+- `ingest_failed`
+
+### Hard failure classes
+
+- `runtime_error`
+- `invalid_connector`
+- `unexpected_internal_error`
+
+These names do not need to be final, but the CLI should think in this shape.
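+
+Typed directly from the class lists above, the shape could look like this
+(the `isRecoverable` helper is illustrative):
+
+```typescript
+type SuccessOutcome = "connected_and_ingested" | "connected_local_only";
+
+type RecoverableFailure =
+  | "needs_input"
+  | "setup_required"
+  | "personal_server_unavailable"
+  | "auth_failed"
+  | "connector_unavailable"
+  | "ingest_failed";
+
+type HardFailure =
+  | "runtime_error"
+  | "invalid_connector"
+  | "unexpected_internal_error";
+
+type Outcome = SuccessOutcome | RecoverableFailure | HardFailure;
+
+// Recoverable failures get a next-step suggestion; hard failures get a log path.
+function isRecoverable(o: Outcome): o is RecoverableFailure {
+  return [
+    "needs_input",
+    "setup_required",
+    "personal_server_unavailable",
+    "auth_failed",
+    "connector_unavailable",
+    "ingest_failed",
+  ].includes(o);
+}
+```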
+
+## How this maps to current primitives
+
+The existing scripts already provide useful internals:
+
+- setup bootstrap
+- connector fetch
+- connector run lifecycle
+- validation
+
+The CLI should wrap those behaviors, not expose them raw.
+
+Rough mapping:
+
+- `vana connect setup`
+ - wraps current setup script behavior
+- `vana connect `
+ - wraps connector fetch + run + ingest
+- `vana connect status`
+ - wraps local state inspection
+- `vana connect list`
+ - wraps registry/discovery behavior
+
+## What to defer
+
+To keep MVP sharp, defer:
+
+- multi-step onboarding wizards
+- full source marketplace UX
+- scheduling and daemonization
+- bulk connect-all default flow
+- broad config management trees
+- TUI-first interaction
+
+These may be useful later, but they are not necessary to create a strong first impression now.
+
+## Conclusion
+
+The MVP command surface should be intentionally small:
+
+- connect one source
+- list sources
+- inspect status
+- run setup explicitly when desired
+
+That is enough to ship quickly while still creating a product surface that feels deliberate, modern, and extensible.
diff --git a/docs/CLI-CONNECT-EXECUTION-PLAN.md b/docs/CLI-CONNECT-EXECUTION-PLAN.md
new file mode 100644
index 00000000..455c06bd
--- /dev/null
+++ b/docs/CLI-CONNECT-EXECUTION-PLAN.md
@@ -0,0 +1,179 @@
+# Connect Flow Execution Plan
+
+_March 17, 2026_
+
+Implementation plan for transforming `vana connect <source>`.
+
+Reference: CLI-DESIGN-SKILL.md for principles, CLI-CONNECT-FLOW-DESIGN.md
+for the full path tree.
+
+## Phase 1: ConnectRenderer (the new rendering primitive)
+
+Build `src/cli/render/connect-renderer.ts` — a phase-aware renderer
+specific to the connect flow.
+
+```typescript
+interface ConnectRenderer {
+ title(source: string): void;
+ scopeActive(scope: string): void;
+ scopeDone(scope: string, detail?: string): void;
+ scopeFailed(scope: string, error: string): void;
+ success(message: string): void;
+ detail(message: string): void;
+ fail(message: string): void;
+ bell(): void;
+ cleanup(): void;
+}
+```
+
+Implementation:
+
+- Uses `ora` for the active scope spinner (already a dependency)
+- Scope lines rendered below the spinner via ANSI cursor control
+ (`\x1b[{n}A` to move up, `\x1b[2K` to clear)
+- On non-TTY: falls back to simple `console.log` per line
+- Track line count for cursor management
+- ~80–100 lines of code. No new dependencies.
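The non-TTY fallback mentioned above can be sketched directly against the interface; the method bodies here are illustrative, not the eventual implementation (the interface is repeated so the sketch is self-contained):

```typescript
// Interface repeated from above so this sketch is self-contained.
interface ConnectRenderer {
  title(source: string): void;
  scopeActive(scope: string): void;
  scopeDone(scope: string, detail?: string): void;
  scopeFailed(scope: string, error: string): void;
  success(message: string): void;
  detail(message: string): void;
  fail(message: string): void;
  bell(): void;
  cleanup(): void;
}

// Non-TTY fallback: every method degrades to a plain line (or to nothing).
function plainRenderer(log: (line: string) => void): ConnectRenderer {
  return {
    title: (source) => log(`Connecting ${source}...`),
    scopeActive: () => {}, // no spinner without a TTY
    scopeDone: (scope, detail) =>
      log(detail ? `✓ ${scope} — ${detail}` : `✓ ${scope}`),
    scopeFailed: (scope, error) => log(`✗ ${scope} — ${error}`),
    success: (message) => log(`✓ ${message}`),
    detail: (message) => log(`  ${message}`),
    fail: (message) => log(`✗ ${message}`),
    bell: () => {}, // suppress the bell when piped
    cleanup: () => {},
  };
}
```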
+
+Test: unit test that captures stdout and verifies line content
+and order for happy path, failure, and mixed scenarios.
+
+## Phase 2: Refactor runConnect()
+
+Rewrite the rendering path of `runConnect()` to use ConnectRenderer
+instead of the generic emitter.
+
+Keep:
+
+- All event emission (unchanged)
+- All state updates (unchanged)
+- All `--json` behavior (unchanged)
+- The runtime event loop (unchanged)
+
+Change:
+
+- Title: `renderer.title("GitHub")` instead of `emit.title()` and
+  `emit.section("Preparing")`
+- Phase transitions: remove `emit.section("Connecting")` and
+  `emit.info("Collecting your data...")`
+- Scope progress: map `progress-update` events to
+ `renderer.scopeActive()` / `renderer.scopeDone()`
+- Success: `renderer.success("Connected GitHub.")` +
+ `renderer.detail(...)` instead of multi-section output
+- Bell: `renderer.bell()` on completion
+- Failure: `renderer.fail()` with recovery command
+
+Phases that should produce NO output when fast:
+
+- Runtime check (<100ms when installed)
+- Connector fetch (<1s when cached)
+- Session reuse check
+
+Phases that DO produce output:
+
+- Runtime install (needs user confirmation — use existing prompt)
+- Credential prompts (use @clack/prompts text/password)
+- Collection progress (scope manifest via ConnectRenderer)
+- Success/failure summary
+
+## Phase 3: Styled prompts
+
+Add `@clack/prompts` as a dependency. Use it ONLY for the interactive
+input components: text(), password(), confirm(), select().
+
+Wire into the connector runtime's `requestInput` callback:
+
+- Map connector field requests to clack text/password prompts
+- Map setup confirmation to clack confirm
+- Map source picker to clack select
+
+Keep our visual framing (no clack intro/outro/bars). Just the input
+components.
+
+## Phase 4: Polish details
+
+- Bold on title and success line only
+- Green ✓, red ✗, blue spinner, muted counts
+- One blank line before success (the pause)
+- Terminal bell on completion
+- "Next:" line uses journey-aware suggestion logic
+- Failure auto-retries connector update before giving up
+- Cancellation renders "Cancelled." and exits cleanly
+
+## Phase 5: Path tree coverage
+
+Test every branch from CLI-CONNECT-FLOW-DESIGN.md:
+
+Happy paths:
+
+- [ ] Session reuse, PS available, all scopes sync
+- [ ] Session reuse, PS available, partial sync
+- [ ] Session reuse, no PS
+- [ ] First time, credentials needed, PS available
+- [ ] First time, 2FA needed
+- [ ] First time, runtime setup needed
+
+Failure paths:
+
+- [ ] Source not in registry
+- [ ] Connector download fails
+- [ ] Checksum mismatch (auto-retry with latest)
+- [ ] Browser auth, no display available
+- [ ] Collection fails mid-way (auto-retry)
+- [ ] Collection timeout
+- [ ] --no-input + needs input
+- [ ] User cancels prompt
+- [ ] User cancels with ctrl+c
+
+Edge cases:
+
+- [ ] Non-TTY output (CI, piped)
+- [ ] --json mode unchanged
+- [ ] --quiet mode
+- [ ] Narrow terminal (<80 cols)
+
+## Phase 6: Consistency pass
+
+After connect is transformed, audit other commands for consistency:
+
+- `vana collect` should use the same scope manifest
+- `vana server sync` should use similar progress rendering
+- `vana sources` and `vana status` should use the same symbol
+ and color vocabulary (but NOT the ConnectRenderer — they're
+ static, not temporal)
+- Help text and error messages should follow the copy principles
+
+## Phase 7: Transcripts and demos
+
+- Update CLI-TRANSCRIPTS.md with new connect output
+- Update VHS tapes for connect flows
+- Update CLI-REVIEW-SURFACE.md
+- Run `pnpm validate`
+
+## Phase 8: Release
+
+- Commit as one coherent batch
+- Push to feat/connect-cli-v1
+- Verify canary release
+- Test via `pnpm dlx @opendatalabs/connect@canary connect github`
+
+## What NOT to change
+
+- Other commands' emitter (status, doctor, sources, data)
+- The event model
+- The state model
+- --json output
+- Exit codes
+- The CLI-DESIGN-SKILL.md principles
+
+## Estimated scope
+
+- ConnectRenderer: ~100 lines new code
+- runConnect refactor: ~200 lines changed (net reduction likely)
+- @clack/prompts integration: ~50 lines
+- Polish details: ~30 lines
+- Tests: ~150 lines new
+- Transcripts/docs: updates only
+
+This is a focused refactor of one command's rendering layer.
+The architecture, event model, and machine contract are untouched.
diff --git a/docs/CLI-CONNECT-FLOW-DESIGN.md b/docs/CLI-CONNECT-FLOW-DESIGN.md
new file mode 100644
index 00000000..1e884801
--- /dev/null
+++ b/docs/CLI-CONNECT-FLOW-DESIGN.md
@@ -0,0 +1,546 @@
+# `vana connect` Flow Design
+
+_March 17, 2026_
+
+Deep design document for transforming `vana connect <source>` into a
+best-in-class CLI experience. Informed by:
+
+- CLI-BEAUTY-IMPLEMENTATION-PLAN.md (temporal design > static styling)
+- CLI-UX-SIMULATION.md (approved success shapes)
+- CLI-ONBOARDING-COPY.md (tone and trust principles)
+- CLI-UX-QUALITY-BAR.md (beauty = clarity, restraint, confidence, pacing)
+- Donella Meadows' leverage points framework
+- Prior art from Vercel, gh, Railway, Stripe, Elm, Cargo, pnpm
+
+## The leverage point
+
+The CLI's beauty problem is at leverage point #5 (system rules): the
+emitter architecture forces all output through uniform primitives
+(`section`, `detail`, `keyValue`). This produces the same visual shape
+regardless of whether the moment calls for anticipation, vulnerability,
+progress, or celebration.
+
+The transformation: the connect flow's renderer should be
+**phase-aware**, not **line-aware**. Each phase has its own temporal
+behavior — some are instantaneous, some are long waits, some require
+user action. The rendering should reflect that.
+
+## Full path tree
+
+```
+vana connect [source]
+│
+├─ No source specified
+│ └─ Guided picker → user selects → continue with source
+│
+├─ Source not in registry
+│ └─ Error: "{source} is not available."
+│ Next: "vana sources" to see what's available
+│
+├─ Source found
+│ │
+│ ├─ Phase 1: Runtime check (<100ms if installed)
+│ │ ├─ Installed → silent, continue
+│ │ ├─ Missing + interactive
+│ │ │ ├─ User confirms → install flow (10-60s)
+│ │ │ │ ├─ Install succeeds → continue
+│ │ │ │ └─ Install fails → error + "vana setup"
+│ │ │ └─ User declines → clean exit
+│ │ ├─ Missing + --yes → auto-install
+│ │ └─ Missing + --no-input → fail exit
+│ │
+│ ├─ Phase 2: Connector fetch (<1s if cached)
+│ │ ├─ Cached + valid → silent, continue
+│ │ ├─ Needs download → download (1-5s)
+│ │ │ ├─ Download succeeds → continue
+│ │ │ ├─ Checksum mismatch → error
+│ │ │ └─ Network error → error
+│ │ └─ Not available → error: connector unavailable
+│ │
+│ ├─ Phase 3: Pre-connection validation
+│ │ ├─ Interactive auth → continue to prompts
+│ │ ├─ Browser auth + display available → continue to browser
+│ │ ├─ Browser auth + no display → error: needs display
+│ │ └─ Existing session found → try reuse (skip auth)
+│ │
+│ ├─ Phase 4: Authentication (0s if session reuse, 5-30s if manual)
+│ │ ├─ Session reuse succeeds → continue silently
+│ │ ├─ Session expired → prompt for re-auth
+│ │ ├─ First time → prompt for credentials
+│ │ │ ├─ User enters credentials → continue
+│ │ │ ├─ 2FA required → prompt for code
+│ │ │ │ ├─ Code accepted → continue
+│ │ │ │ └─ Code rejected → error
+│ │ │ └─ User cancels → clean exit
+│ │ ├─ Browser auth → browser opens
+│ │ │ ├─ User completes in browser → continue
+│ │ │ ├─ User doesn't complete → timeout
+│ │ │ └─ Browser can't open → error
+│ │ └─ --no-input + needs input → fail exit with needs_input
+│ │
+│ ├─ Phase 5: Collection (5-60s)
+│ │ ├─ All scopes collected → continue
+│ │ ├─ Partial collection → continue with warning
+│ │ ├─ Site changed / extraction fails → error
+│ │ ├─ Timeout → error
+│ │ └─ Runtime crash → error
+│ │
+│ ├─ Phase 6: Ingest (0-5s)
+│ │ ├─ PS available + all scopes synced → full success
+│ │ ├─ PS available + partial sync → qualified success
+│ │ ├─ PS available + all fail → local success + sync warning
+│ │ ├─ PS not available → local success
+│ │ └─ PS not configured → local success
+│ │
+│ └─ Phase 7: Summary
+│ ├─ Full success → trophy moment
+│ ├─ Local success → success + PS guidance
+│ ├─ Partial → qualified success + retry guidance
+│ └─ Failure → error + recovery guidance
+```
+
+## Emotional journey map
+
+Each moment in the flow has an emotional quality that the rendering
+should serve.
+
+| Moment | User feeling | Design goal | Duration |
+| ---------------------- | -------------------- | ------------------------------------------ | ---------------- |
+| Types command | Expectation | Acknowledge immediately | 0ms |
+| Runtime check | Mild anxiety | Invisible if fast, calm if slow | <100ms or 10-60s |
+| Connector fetch | Mild anxiety | Invisible if cached, brief if downloading | <1s or 1-5s |
+| Trust decision (setup) | Vulnerability | Explain clearly, respect the decision | User-paced |
+| Auth prompt | Vulnerability | Minimal, precise, local-first framing | User-paced |
+| 2FA prompt | Time pressure | Fast, no lag between prompt and submission | User-paced |
+| Collection start | Anticipation | Show something is happening | 0s |
+| Collection progress | Patience/relief | Show meaningful progress, not chatter | 5-60s |
+| Collection complete | Satisfaction | Clear transition from "working" to "done" | 0s |
+| Sync | Background | Mention only if noteworthy | 0-5s |
+| Success | Pride/accomplishment | Outcome-shaped, not task-shaped | 0s |
+| What now | Agency | One clear next action | 0s |
+
+### The critical insight: duration determines rendering
+
+- **<100ms**: Don't show anything. No spinner, no text. Just continue.
+- **100ms-1s**: Brief inline text, no spinner. "Connector ready."
+- **1-10s**: Spinner. One line, updating in place.
+- **10s+**: Spinner with progress detail. Show what's happening.
+- **User-paced**: Prompt. No spinner. Wait calmly.
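The duration rules above are mechanical enough to sketch as a single dispatch function (names illustrative, assuming each phase carries a rough duration estimate):

```typescript
type RenderMode = "silent" | "inline" | "spinner" | "spinner-detail" | "prompt";

// Map an estimated phase duration (ms) to the rendering treatment above.
// "user-paced" covers prompts, where no spinner should run.
function renderModeFor(estimate: number | "user-paced"): RenderMode {
  if (estimate === "user-paced") return "prompt";
  if (estimate < 100) return "silent"; // imperceptible: show nothing
  if (estimate < 1_000) return "inline"; // brief text, no spinner
  if (estimate < 10_000) return "spinner"; // one line, updating in place
  return "spinner-detail"; // show what's happening
}
```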
+
+This means most phases should be INVISIBLE in the happy path.
+Runtime check? Invisible (<100ms). Connector fetch? Invisible (cached).
+Session reuse? Invisible. The only visible phases are:
+
+1. Collection progress (the long wait)
+2. Success summary (the payoff)
+
+Everything else should be silent unless it takes time or needs attention.
+
+## Prior art for each moment
+
+### The long wait (collection progress)
+
+**Vercel deploy**: Shows build steps appearing one at a time. Each step
+is a line that appears, works briefly, then gets a checkmark. The active
+step has a spinner. Completed steps stay visible but dimmed.
+
+**Cargo build**: Shows crate names as they compile. Fast crates flash
+by. Slow crates show a progress indicator. The pacing feels productive
+because real work is visible.
+
+**pnpm install**: Resolution bar fills, then packages appear as they
+download. The visual density communicates "a lot is happening."
+
+**What we should do**: Show scope names as they complete. Not a spinner
+with changing text — actual lines appearing as data arrives:
+
+```
+  ✓ Profile
+  ✓ Repositories — 8 found
+  ✓ Starred
+```
+
+Each line appears when that scope starts, gets a spinner, then resolves
+to a checkmark with a count. This is the phase list from version 1 of
+the design, but ONLY for the collection phase (not runtime, not
+connector fetch — those are invisible).
+
+### The payoff (success summary)
+
+**Vercel deploy**: `✅ Production: https://my-app.vercel.app` — one
+line with the thing you wanted (the URL). Then `Inspect:` link.
+
+**gh pr create**: `https://github.com/org/repo/pull/123` — just the
+URL. The thing you wanted.
+
+**Railway deploy**: `Deployed to https://...` — the result.
+
+Pattern: **the success line contains the ONE THING the user wanted**.
+For Vercel, it's the URL. For gh, it's the PR link. For vana connect,
+it's: "your data is connected and accessible."
+
+**What we should do** (per the UX simulation):
+
+```
+Connected GitHub.
+Collected your GitHub data and synced it to your Personal Server.
+```
+
+Two lines. The outcome. Not a data table, not a file path, not a list
+of scopes. If the user wants details, `vana data show github`.
+
+### The trust moment (setup/auth)
+
+**macOS permission dialogs**: Explain what will happen, what stays
+private, let the user decide. No pressure. The CLI-ONBOARDING-COPY.md
+already has this right:
+
+```
+This will install:
+- the connector runner
+- a Chromium browser engine
+- local runtime files under ~/.vana/
+
+Your credentials stay on this machine.
+Continue? [Y/n]
+```
+
+**ssh-keygen**: Simple prompt, no preamble. "Enter passphrase:"
+No explanation of what a passphrase is or why you need one.
+
+The right approach depends on whether it's the user's first time or
+a re-run. First time: explain. Re-run: just prompt.
+
+### Failure recovery
+
+**Elm compiler**: The gold standard. Errors explain WHAT went wrong,
+WHY it's wrong, and HOW to fix it. Each error is a mini-tutorial.
+
+**Rust/Cargo**: Similar — errors include the fix inline.
+
+**gh**: "To authenticate, run: gh auth login" — one sentence, one
+command.
+
+**What we should do**: Every failure ends with exactly ONE command
+the user can run to move forward. Not three suggestions. Not "check
+the docs." One command.
+
+### Cancellation
+
+**ctrl+c in any best-in-class CLI**: Clean exit. No stack trace. No
+partial state corruption. Brief message: "Cancelled." or nothing.
+
+**What we should do**: If user cancels during collection, say:
+"Cancelled. No data was saved." If they cancel during auth: "Cancelled."
+One word.
+
+## The design
+
+### Principle: invisible unless noteworthy
+
+Most phases should produce NO output in the happy path. The user types
+`vana connect github` and sees:
+
+```
+Connecting GitHub...
+```
+
+One line. A spinner. That's it until something noteworthy happens.
+
+If the runtime needs setup (noteworthy — requires user decision):
+the spinner stops, the setup prompt appears. After setup, the spinner
+resumes.
+
+If credentials are needed (noteworthy — requires user action):
+the spinner stops, the prompt appears. After auth, the spinner resumes.
+
+If collection is progressing (noteworthy — takes time):
+scope lines appear below the spinner as they complete:
+
+```
+Connecting GitHub...
+ ✓ Profile
+ ✓ Repositories — 8 found
+ ● Starred
+```
+
+When everything completes, the spinner line resolves and the success
+summary appears:
+
+```
+✓ Connected GitHub.
+ Collected your GitHub data and synced it to your Personal Server.
+
+ Next: vana data show github
+```
+
+### The full happy path (session reuse, PS available)
+
+```
+$ vana connect github
+Connecting GitHub...
+ ✓ Profile
+ ✓ Repositories — 8 found
+ ✓ Starred — 0 found
+
+✓ Connected GitHub.
+ Collected your GitHub data and synced it to your Personal Server.
+
+ Next: vana data show github
+```
+
+Total visible output: 8 lines. The phase progression IS the collection
+progress. The success IS the outcome.
+
+### First-time with setup
+
+```
+$ vana connect github
+
+Vana Connect needs a local browser runtime.
+
+This will install:
+ • Connector runner
+ • Chromium browser engine
+ • Local files under ~/.vana/
+
+Your credentials stay on this machine.
+
+Continue? [Y/n] y
+
+Installing runtime...
+✓ Runtime ready.
+
+Connecting GitHub...
+ ✓ Profile
+ ✓ Repositories — 8 found
+ ✓ Starred — 0 found
+
+✓ Connected GitHub.
+ Collected your GitHub data and synced it to your Personal Server.
+
+ Next: vana data show github
+```
+
+Setup is a separate visual block. Once done, the connect flow
+continues as normal.
+
+### With credential prompt
+
+```
+$ vana connect github
+Connecting GitHub...
+
+GitHub needs your login.
+
+Username: alice
+Password: ********
+
+Connecting GitHub...
+ ✓ Profile
+ ✓ Repositories — 12 found
+ ✓ Starred — 3 found
+
+✓ Connected GitHub.
+ Collected your GitHub data and saved it locally.
+
+ Next: vana data show github
+```
+
+The spinner pauses for the prompt, then resumes. The prompt is minimal
+— no trust copy mid-flow (that belongs in first-time setup, not
+re-auth).
+
+### With 2FA
+
+```
+$ vana connect github
+Connecting GitHub...
+
+Verification code: 123456
+
+Connecting GitHub...
+ ✓ Profile
+ ✓ Repositories — 12 found
+ ✓ Starred — 3 found
+
+✓ Connected GitHub.
+ Collected your GitHub data and saved it locally.
+
+ Next: vana data show github
+```
+
+One line prompt. No preamble.
+
+### Local-only success (no PS)
+
+```
+✓ Connected GitHub.
+ Collected your GitHub data and saved it locally.
+ Start your Personal Server to sync: vana server set-url
+
+ Next: vana data show github
+```
+
+### Partial sync
+
+```
+✓ Connected GitHub.
+ Collected your GitHub data. 2/3 scopes synced, 1 failed.
+ Retry: vana server sync
+
+ Next: vana data show github
+```
+
+### Source not available
+
+```
+$ vana connect steam
+
+Steam is not available. See what's ready: vana sources
+```
+
+One line. One fact. One action.
+
+### Collection failure
+
+```
+$ vana connect github
+Connecting GitHub...
+ ✓ Profile
+ ✗ Repositories — GitHub returned an unexpected page.
+
+ The connector may need updating.
+ Log: vana logs github
+```
+
+The phase list shows WHERE it failed. The error explains WHY. The log
+command shows HOW to debug.
+
+### Needs input in --no-input mode
+
+```
+$ vana connect github --no-input
+
+GitHub needs credentials. Run without --no-input to authenticate.
+```
+
+One line. The user knows exactly what to do.
+
+### Legacy auth, no display
+
+```
+$ vana connect shop
+
+Shop requires a browser window, but no display is available.
+Run this command in a desktop terminal.
+```
+
+Two lines. One fact. One action.
+
+### User cancels (ctrl+c or prompt cancel)
+
+```
+Cancelled.
+```
+
+One word.
+
+## Implementation requirements
+
+### New: Phase-aware renderer for connect flow
+
+A `ConnectRenderer` that manages the temporal experience:
+
+```typescript
+interface ConnectRenderer {
+ // Start the main spinner
+ start(message: string): void;
+
+ // Pause spinner, show a prompt block, resume after
+ pauseForPrompt(): void;
+ resumeAfterPrompt(): void;
+
+ // Show a scope line (appears below spinner)
+ scopeStarted(scope: string): void;
+ scopeCompleted(scope: string, detail?: string): void;
+ scopeFailed(scope: string, error: string): void;
+
+ // Resolve the spinner to success/failure
+ succeed(message: string): void;
+ fail(message: string): void;
+
+ // Add post-success lines
+ detail(message: string): void;
+}
+```
+
+This is NOT a general-purpose renderer. It's specific to the connect
+flow's temporal needs. Other commands continue using the existing
+emitter.
+
+### Technical approach
+
+Use `ora` (already a dependency) for the spinner. Use raw ANSI cursor
+control (`\x1b[{n}A` to move up, `\x1b[2K` to clear line) to manage
+scope lines below the spinner. On non-TTY, fall back to simple
+line-by-line output (no cursor control, no spinner).
+
+No new dependencies needed. The cursor control is ~10 lines of code.
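As a sketch of that cursor control (names illustrative), the renderer only needs to remember how many lines it drew last frame, move up over them, and repaint:

```typescript
// Repaints a block of scope lines in place using raw ANSI cursor control.
// Tracks how many lines were rendered so the next frame can move up over them.
class ScopeLineBuffer {
  private lines: string[] = [];
  private rendered = 0;

  // Update one line and return the escape sequence that repaints the block.
  set(index: number, text: string): string {
    this.lines[index] = text;
    const up = this.rendered > 0 ? `\x1b[${this.rendered}A` : "";
    this.rendered = this.lines.length;
    return up + this.lines.map((l) => `\x1b[2K${l}`).join("\n") + "\n";
  }
}
```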
+
+### What doesn't change
+
+- `--json` mode: unchanged, emits structured events
+- `--quiet` mode: unchanged, suppresses visual output
+- `--no-input` mode: unchanged, fails on input
+- The event model: unchanged, all phases still emit events
+- The state model: unchanged, state updates happen as before
+- Other commands: unchanged, use existing emitter
+
+## Sticking points to resolve during implementation
+
+1. **Spinner + scope lines interaction**: When scope lines appear
+ below the spinner, the spinner needs to know how many lines are
+ below it to manage cursor position. This requires the renderer to
+ track line count.
+
+2. **Prompt interruption**: When a prompt appears mid-flow, the
+ spinner must stop, the prompt must render cleanly, and the spinner
+ must resume after. `ora.stop()` → prompt → `ora.start()` handles
+ this, but the scope lines need to be re-rendered after the prompt.
+
+3. **Terminal resize**: If the terminal is resized during rendering,
+ the cursor positions may be wrong. Accept this limitation for v1.
+
+4. **Non-TTY degradation**: On CI/pipe, disable cursor control
+ entirely. Use simple line output: "Connecting GitHub...",
+ "✓ Profile", "✓ Repositories — 8 found", etc.
+
+5. **Long scope names**: If a scope name is very long, it could wrap
+ and break cursor calculations. Truncate scope names to terminal
+ width.
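Point 5 reduces to a small helper (a sketch; counting code units, so a real implementation would need to account for wide glyphs):

```typescript
// Truncate a scope name so the rendered line never wraps.
// Naive: counts code units, not display width; wide glyphs need more care.
function truncateToWidth(text: string, width: number): string {
  if (text.length <= width) return text;
  return text.slice(0, Math.max(0, width - 1)) + "…";
}
```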
+
+## Acceptance criteria
+
+1. Happy path (session reuse, PS available) produces ≤8 lines of output
+2. First-time setup feels like a separate, clear decision
+3. Auth prompts interrupt the flow cleanly, then flow resumes
+4. Collection shows per-scope progress as lines appearing
+5. Success is outcome-shaped: what was collected, where it went
+6. Every failure ends with exactly one recovery command
+7. Cancellation produces one word: "Cancelled."
+8. `--json` output is unchanged
+9. Non-TTY output is readable without cursor control
+10. The flow feels calm, not chatty
+
+## Prior art summary
+
+| Moment | Reference CLI | What they do |
+| -------------- | ----------------- | -------------------------------------- |
+| The long wait | Vercel deploy | Phase lines with spinners → checkmarks |
+| The payoff | gh pr create | One line with the thing you wanted |
+| Trust decision | macOS permissions | Clear, respectful, no pressure |
+| Auth prompt | ssh-keygen | Minimal, no preamble |
+| Failure | Elm compiler | Explains what, why, and how to fix |
+| Cancellation | Any good CLI | Clean exit, no garbage |
+| Overall pacing | Vercel deploy | Calm, confident, not chatty |
diff --git a/docs/CLI-DEMO-GUIDELINES.md b/docs/CLI-DEMO-GUIDELINES.md
new file mode 100644
index 00000000..84f8f58e
--- /dev/null
+++ b/docs/CLI-DEMO-GUIDELINES.md
@@ -0,0 +1,130 @@
+# CLI Demo Guidelines
+
+_March 16, 2026_
+
+Design guidelines for the VHS terminal demo GIFs in `docs/vhs/`.
+
+Based on research into what ships in production CLIs (Charm, Vercel, GitHub
+CLI, Railway, Starship) and the practices of Charm.sh — the team that built VHS
+specifically to solve this problem.
+
+## Principles
+
+1. **One concept per GIF.** Each GIF demonstrates a single command or flow.
+2. **Script everything.** Never record live. Tape files ensure zero typos,
+ consistent timing, and reproducibility across versions.
+3. **Choreograph the timing.** The rhythm matters more than realism:
+ - Type command at pleasant speed (50-100ms per keystroke)
+ - Brief pause before Enter (500ms) so viewer reads the command
+ - Hold on output long enough to absorb (2-3s short, 5s complex)
+4. **Size the terminal to the content.** Tight framing, no wasted space. Cap
+ height so text stays readable at display width. Let long output scroll
+ naturally — the GIF loops.
+5. **Visual polish.** Window chrome, rounded corners, and a consistent theme
+ make demos feel like a product, not a screenshot.
+6. **Hide the boring parts.** Use VHS `Hide`/`Show` to skip setup or cd
+ commands. Start the visible recording at the interesting moment.
+7. **Keep it short.** Aim for 5-15 seconds per GIF.
+8. **Consistent branding.** Same theme, font, margin style across all demos.
+ Each GIF should feel like part of a family.
+
+## Standard tape settings
+
+```tape
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set Height 600
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+```
+
+Adjust `Height` per demo to fit the **full expected output** without scrolling.
+Use the formula: `(output_lines + 4) * 28 + 70`. This ensures no content is
+cut off. For tall GIFs (height > 1000px), use `width="800"` in the `<img>` tag
+instead of the standard `width="600"` to keep text readable.
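The height formula is simple enough to sanity-check in code (a direct sketch of this document's formula; the constants are used as given, not derived):

```typescript
// Height formula from above: (output_lines + 4) * 28 + 70.
const tapeHeight = (outputLines: number): number => (outputLines + 4) * 28 + 70;
```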
+
+## Timing recipe
+
+```tape
+Hide
+# Any setup commands (cd, env, etc.)
+Show
+
+Type "vana status"
+Sleep 500ms
+Enter
+Sleep 3s
+```
+
+| Moment | Duration |
+| --------------------------- | -------- |
+| After typing, before Enter | 500ms |
+| Short output (< 15 lines) | 3-4s |
+| Complex output (15+ lines) | 4-6s |
+| Between commands (if multi) | 1-2s |
+
+**Minimum 3s post-Enter sleep.** Node.js cold start in VHS takes ~2s, so
+anything shorter risks the command output not appearing before the recording
+ends. Do **not** scale sleep proportionally to line count — execution time and
+reading time are different things.
+
+## Display in markdown
+
+Use `<img>` tags with explicit width for consistent sizing:
+
+```html
+<img
+  src="https://github.com/.../releases/download/canary/connect-github-success.gif"
+  width="600"
+  alt="vana connect github"
+/>
+```
+
+GIFs are rendered by CI and attached to the canary release as assets. Use
+release download URLs, not relative paths, so images are always fresh.
+
+Use `width="600"` for most GIFs. Use `width="800"` for GIFs with height
+exceeding 1000px (e.g., `help`, `status`, `doctor`, `connect-github-success`).
+Raw `![...](url)` markdown gives the browser full control over sizing, which makes
+tall GIFs render with tiny text.
+
+## Regeneration
+
+```bash
+pnpm demo:vhs
+```
+
+The render script (`scripts/render-vhs.mjs`) handles fixture preparation,
+environment variants, and ordering.
+
+## Tool chain
+
+- **VHS** (Charm) — scripted terminal recordings → GIF
+- **Fixture system** — deterministic demo data in `docs/vhs/fixtures/`
+- **Three environments**: seeded (has data), fresh (clean machine),
+ seeded-input (has data but no fast-success shortcut)
+
+## Sources
+
+- [Charm "This is How We Do It"](https://charm.land/blog/100k/) — why they
+ built VHS and their demo philosophy
+- [VHS README](https://github.com/charmbracelet/vhs) — canonical settings
+ reference
+- [Gum examples](https://github.com/charmbracelet/gum) — reference tape files
+ from Charm's own CLI tools
+- [agg (asciinema GIF generator)](https://github.com/asciinema/agg) —
+ alternative renderer defaults and theme guidance
diff --git a/docs/CLI-DESIGN-SKILL.md b/docs/CLI-DESIGN-SKILL.md
new file mode 100644
index 00000000..eeceacef
--- /dev/null
+++ b/docs/CLI-DESIGN-SKILL.md
@@ -0,0 +1,260 @@
+# CLI Design Skill
+
+_March 17, 2026_
+
+How to design and implement user-facing CLI surfaces for `vana`.
+
+Use this document when: building new commands, modifying existing
+output, evaluating CLI quality, or making design decisions about
+the terminal experience.
+
+## Thesis
+
+**The CLI's identity is the moment when personal data becomes visible
+and owned.**
+
+Everything else serves that moment. Test every design decision:
+does it serve the data moment? If yes, keep. If no, cut.
+
+## Leverage points (Donella Meadows)
+
+When improving the CLI, work at the highest leverage point possible.
+Lower-numbered points create more change with less effort.
+
+| # | Leverage point | CLI example | Effect |
+| --- | ---------------- | ------------------------------------------- | -------------- |
+| 12 | Parameters | Label text, padding widths | Almost none |
+| 11 | Buffer sizes | Line counts, truncation limits | Minimal |
+| 6 | Information flow | What the user sees and when | High |
+| 5 | Rules | The renderer/emitter architecture | High |
+| 3 | Goals | "Display information" vs "build confidence" | Transformative |
+| 2 | Paradigm | "Text printer" vs "temporal experience" | Foundational |
+
+If you're tweaking label text, you're at #12. Stop and ask whether
+the information should exist at all (#6) or whether the rendering
+model needs to change (#5).
+
+## Design process
+
+### 1. Map the path tree
+
+Before designing output, enumerate EVERY branch — not just the happy
+path. Each branch needs its own rendering. The hardest branches define
+the quality.
+
+### 2. Map the emotional journey
+
+For each moment in the flow, identify:
+
+- What the user is feeling (anxiety, anticipation, pride)
+- What the user needs to know (nothing, one fact, a decision)
+- How long the moment lasts (<100ms, seconds, user-paced)
+
+Duration determines rendering:
+
+- **<100ms**: Show nothing. Don't acknowledge what the user can't perceive.
+- **100ms–1s**: Brief text, no spinner.
+- **1–10s**: Spinner on the active line.
+- **10s+**: Spinner with meaningful progress detail.
+- **User-paced**: Prompt. No spinner. Wait calmly.
+
+### 3. Design from the goal down
+
+Start with: what should the user feel at the end?
+Then: what's the minimum information to produce that feeling?
+Then: what's the minimum rendering to present that information?
+
+Do NOT start with: what data do we have? How should we format it?
+
+### 4. Test against three criteria
+
+Every design must pass:
+
+1. **Thesis test**: Does it serve "data becomes visible and owned"?
+2. **Quality bar test**: Clarity, restraint, confidence, pacing,
+ signal-to-noise. (See CLI-UX-QUALITY-BAR.md)
+3. **Prior art test**: Compare to gh, Vercel, Stripe, Railway, Cargo.
+ Are we at least as good? Is our unique element (the scope manifest)
+ preserved?
+
+## Visual identity
+
+### Symbols
+
+- `✓` — completed (Vana green)
+- `✗` — failed (red)
+- Spinner on active line (accent blue, minimal frame set)
+- No other symbols in output. No arrows, diamonds, boxes, or bullets
+ outside of help text.
+
+### Color (5 decisions, no more)
+
+- `✓` in Vana green (#00D50B)
+- `✗` in Vana red (#E7000B)
+- Active spinner in Vana blue (#4141FC)
+- Supporting detail (counts, paths, labels) in muted gray
+- Everything else in default terminal color
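A sketch of these five decisions as 24-bit ANSI foreground sequences (the escape codes are standard; "muted gray" is shown here as bright black, which is an assumption):

```typescript
// The five color decisions, as ANSI foreground sequences.
const palette = {
  green: "\x1b[38;2;0;213;11m", // #00D50B — ✓
  red: "\x1b[38;2;231;0;11m", // #E7000B — ✗
  blue: "\x1b[38;2;65;65;252m", // #4141FC — active spinner
  muted: "\x1b[90m", // supporting detail: counts, paths, labels
  reset: "\x1b[0m", // back to default terminal color
} as const;
```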
+
+### Typography
+
+- Bold for two things only: the title ("Connect GitHub") and the
+ success line ("Connected GitHub."). Nothing else.
+- Muted for supporting detail (counts, paths, "Next:" label).
+- Default weight for everything else.
+- This creates a visual arc: bold intention → regular work → bold
+ resolution.
+
+### Spacing
+
+- One blank line between the title and the first content line.
+- No blank lines between scope manifest lines.
+- One blank line before the success line (the pause before resolution).
+- One blank line before "Next:" (separation of outcome from guidance).
+- Blank lines are design elements, not defaults. Every blank line
+ must justify its existence.
+
+## Copy principles
+
+### Tone
+
+- Calm, precise, concise.
+- Periods, not exclamation marks. Confidence is quiet.
+- Technically serious — outcomes, not mechanisms.
+
+### Rules
+
+- Never say "using Playwright browser automation" or any
+ implementation detail.
+- Never lead with file paths. Paths are supporting detail.
+- Never show more than one "Next:" suggestion (context-dependent).
+- Never hedge ("may need updating"). Either check or don't mention.
+- Never explain what the user already knows on re-run.
+ First time: explain. Second time: just do it.
+
+### Success messages
+
+The approved shape (from CLI-UX-SIMULATION.md):
+
+```
+Connected {Source}.
+Collected your {Source} data and synced it to your Personal Server.
+```
+
+Or local-only:
+
+```
+Connected {Source}.
+Collected your {Source} data and saved it locally.
+```
+
+Two lines. Outcome-shaped, not artifact-shaped.
+
+### Failure messages
+
+Every failure has three parts:
+
+1. What happened (one line)
+2. Why (one line, only if actionable)
+3. One recovery command
+
+No "check the docs." No multiple suggestions. One command.
+
+### Cancellation
+
+One word: `Cancelled.`
+
+## The scope manifest (our signature)
+
+The scope manifest is the CLI's unique visual element:
+
+```
+ ✓ Profile
+ ✓ Repositories — 8 found
+ ✓ Starred
+```
+
+Design rules for the manifest:
+
+- Lines appear as scopes complete (honest pacing).
+- Active scope has a spinner.
+- Completed scope has `✓` in green.
+- Failed scope has `✗` in red.
+- Counts follow the scope name with `—` separator when available.
+- Always show the manifest, even for fast collections. The lines
+ are a data inventory, not a progress indicator.
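A sketch of the line format these rules imply (function name illustrative):

```typescript
// Format one scope manifest line: two-space indent, symbol, name,
// and an optional count joined with the — separator.
function manifestLine(
  status: "done" | "failed",
  scope: string,
  count?: string,
): string {
  const symbol = status === "done" ? "✓" : "✗";
  return count ? `  ${symbol} ${scope} — ${count}` : `  ${symbol} ${scope}`;
}
```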
+
+## Prompt design
+
+Use `@clack/prompts` components for interactive inputs (text,
+password, select, confirm). They're genuinely better than readline
+for masking, validation, and visual quality.
+
+Frame prompts in our visual language, not clack's:
+
+- No vertical bars (`│`) wrapping the flow
+- No diamond symbols (`◆`, `◇`)
+- Prompts appear inline, minimal, like `ssh`:
+
+```
+ Username: alice
+ Password: ▪▪▪▪▪▪▪▪
+```
+
+For first-time setup (runtime install), explanation is warranted.
+For re-auth, just prompt.
+
+## Terminal bell
+
+Emit `\a` on completion of long-running operations (connect, collect).
+Users with notification-aware terminals get a system notification.
+Zero visual cost.
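
A minimal sketch of the bell emission. The injectable `write` parameter exists only to make the sketch testable; it is not a real CLI API.

```typescript
// Emit the BEL character ("\a" in C-style escapes, 0x07) so
// notification-aware terminals can surface a system notification.
function ringBell(
  write: (chunk: string) => void = (c) => { process.stdout.write(c); }
): void {
  write("\u0007");
}
```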
+
+## Degradation
+
+When capabilities are limited:
+
+- No TTY: no spinner, no color. Plain line-by-line output.
+- No color (`NO_COLOR`, `TERM=dumb`): symbols still work, just
+ uncolored.
+- CI: same as no TTY.
+- `--json`: no visual output at all. Structured events only.
+
+The CLI must be readable in all modes. Beauty degrades; function
+doesn't.
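
The degradation ladder above can be sketched as two small decision functions. Shapes and names are illustrative, not the CLI's actual capability model.

```typescript
type OutputMode = "rich" | "plain" | "json";

interface Capabilities {
  isTTY: boolean;
  env: Record<string, string | undefined>;
  jsonFlag: boolean;
}

// --json wins outright; CI and missing TTY fall back to plain
// line-by-line output; only an interactive TTY gets the rich mode.
function outputMode(c: Capabilities): OutputMode {
  if (c.jsonFlag) return "json";
  if (!c.isTTY || c.env.CI !== undefined) return "plain";
  return "rich";
}

// NO_COLOR and TERM=dumb disable color; symbols still render.
function useColor(c: Capabilities): boolean {
  if (c.jsonFlag) return false;
  if (c.env.NO_COLOR !== undefined || c.env.TERM === "dumb") return false;
  return c.isTTY;
}
```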
+
+## Anti-patterns
+
+Things that feel productive but don't improve beauty:
+
+- **Tweaking label text** — leverage point #12. Almost no effect.
+- **Adding more information** — violates restraint. Ask: does the
+ user need this HERE, NOW?
+- **Borrowing another CLI's visual identity** — clack bars, Vercel
+ triangles. Build our own.
+- **Decorating the success moment** — the data IS the decoration.
+ Don't add chrome to the scope manifest.
+- **Multiple "Next:" suggestions** — forces the user to choose.
+ Choose for them based on journey position.
+- **Trust copy on re-runs** — "Your credentials stay local" is
+ important the first time. On the 5th connect, it's noise.
+
+## Reference documents
+
+- [CLI-UX-QUALITY-BAR.md](CLI-UX-QUALITY-BAR.md) — beauty standards
+- [CLI-AUDIENCE-CONTRACT.md](CLI-AUDIENCE-CONTRACT.md) — human + agent
+- [CLI-UX-SIMULATION.md](CLI-UX-SIMULATION.md) — approved output shapes
+- [CLI-ONBOARDING-COPY.md](CLI-ONBOARDING-COPY.md) — tone and trust
+- [CLI-BEAUTY-IMPLEMENTATION-PLAN.md](CLI-BEAUTY-IMPLEMENTATION-PLAN.md) — execution plan
+- [CLI-CONNECT-FLOW-DESIGN.md](CLI-CONNECT-FLOW-DESIGN.md) — connect flow path tree
+- [CLI-BEAUTY-AUDIT.md](CLI-BEAUTY-AUDIT.md) — three-axis audit findings
+
+## Prior art
+
+| CLI | What to learn from it |
+| --------------- | ----------------------------------------------------- |
+| Vercel | Pacing. Deploy feels calm and inevitable. |
+| gh (GitHub CLI) | Restraint. Shows exactly what you need. |
+| Cargo (Rust) | Honest timing. Fast things flash, slow things linger. |
+| Elm compiler | Failure beauty. Errors teach, not blame. |
+| ssh | Prompt minimalism. `Password:` and nothing else. |
+| Stripe CLI | Factual tone. States facts, not feelings. |
diff --git a/docs/CLI-EXECUTION-PLAYBOOK.md b/docs/CLI-EXECUTION-PLAYBOOK.md
new file mode 100644
index 00000000..19743a70
--- /dev/null
+++ b/docs/CLI-EXECUTION-PLAYBOOK.md
@@ -0,0 +1,876 @@
+# `vana-connect` CLI Execution Playbook
+
+_As of March 14, 2026_
+
+This document turns the current CLI/runtime/release state into an execution
+playbook that lower-reasoning models can follow without needing to reconstruct
+all prior design context.
+
+It should be read after:
+
+- [CLI-FINAL-PRODUCT-SPEC.md](CLI-FINAL-PRODUCT-SPEC.md)
+- [CLI-BEAUTY-IMPLEMENTATION-PLAN.md](CLI-BEAUTY-IMPLEMENTATION-PLAN.md)
+- [CLI-RUNTIME-PORTABILITY-NOTES.md](CLI-RUNTIME-PORTABILITY-NOTES.md)
+
+If this document conflicts with casual conversational guidance, this document
+wins.
+
+## Current State
+
+Branch state at the time of this update:
+
+- local work has moved materially past the earlier canary checkpoints
+- use `git log --oneline --decorate -10` to confirm the exact local head before pushing
+
+Already true on this branch:
+
+- the in-process runtime is real
+- the installer and Homebrew paths are real
+- published canary assets work
+- `status`, `sources`, `data`, and guided `connect` have been materially
+ upgraded
+- `doctor`, `version`, and `logs` are now real first-class CLI surfaces
+- guided `connect` now has clearer entry, cancellation, and continuation copy
+- `status` now points users toward `vana data list` when that is the right next step
+- `data show` / `data path` JSON surfaces are more useful for shell tooling
+- `logs` now exposes stored run-log paths in both human and `--json` modes
+- `sources`, `status`, `data`, and success summaries now use more structured factual rows
+- source discovery surfaces now explain whether a source prompts in-terminal or requires a manual browser step
+- source discovery now exposes a recommended path in both JSON and human mode
+- `data` empty/missing states now include concrete next steps instead of dead-end copy
+- `sources --json`, `status --json`, and `data list --json` now expose more top-level guidance metadata
+- `doctor --json` now exposes runtime capabilities, lifecycle commands, summary counts, and recent source activity
+- successful connects now explicitly mention the saved browser session payoff
+- structured runtime `status-update` and `progress-update` events exist
+- display-path rendering is now centralized and regression-tested
+- CLI state writes now use a lock + atomic-write path with concurrency regression coverage
+- runtime footprint measurement now exists via `pnpm runtime:footprint`
+- README-facing VHS demos and transcripts are publishing from CI
+- local transcript/demo scripts now rebuild first so review artifacts cannot silently drift behind `dist`
+
+This means the next work is no longer “make the CLI exist.”
+It is:
+
+1. make the human product feel fully truthful and coherent
+2. deepen beauty on top of that stable surface
+3. keep release work efficient instead of churning many tiny deploy cycles
+
+## Immediate Local Sequence
+
+Until explicitly told otherwise, the next model should stay in the **local-only
+execution lane** and defer:
+
+- canary polling
+- Homebrew/tap sync
+- hosted installer verification
+- published artifact checks
+
+The immediate local sequence is:
+
+1. finish the remaining **Batch 8A: Best-In-Class Finish** local work
+2. continue any remaining **Batch 2 / Batch 3** connect-journey and static-surface polish
+3. continue local README / transcript / VHS alignment work
+4. continue bounded **Batch 5B** work only where it does **not** require external platform validation
+5. only switch to external validation once the known local backlog is genuinely exhausted
+
+When deciding what to do next locally, prefer this order:
+
+1. improve the human `connect` journey
+2. improve post-success payoff and `vana data`
+3. improve first-run/help/discovery coherence
+4. improve degraded/manual-flow grace
+5. improve operator affordances that are already justified by existing runtime data
+
+Do **not** start external validation just because it is possible.
+Only start it when it becomes an input to remaining work or when the local
+backlog is meaningfully exhausted.
+
+## Operating Rules
+
+These rules should govern all remaining work.
+
+### 1. Prefer larger local batches
+
+Do not push every small polish change.
+
+Preferred pattern:
+
+- do a coherent local batch
+- run the full local preflight
+- push once
+- let one publish/verification cycle prove the batch
+
+Break that rule only for:
+
+- release-path regressions
+- platform-specific packaging failures
+- a public artifact problem that needs immediate isolation
+
+### 2. Preserve machine mode
+
+Do not let human-mode beauty work alter:
+
+- `--json` field names
+- exit codes
+- JSON stdout cleanliness
+- machine-readable event contracts
+
+### 3. Treat the README as a product surface
+
+The README is not just documentation.
+It is a review surface for:
+
+- install quality
+- first-run quality
+- CLI beauty progress
+
+Visible demo commands must match what a real user would type.
+
+### 4. Keep demos deterministic
+
+Do not use live credentials or live external websites for README-facing demos.
+
+Use:
+
+- fixture homes
+- fake connector state
+- deterministic collected data
+- deterministic demo connectors where needed
+
+### 5. Primary-agent vs subagent boundary
+
+If Codex subagents are available, use them only for bounded slices with clear
+acceptance criteria.
+
+Keep these with the primary agent:
+
+- final product judgment
+- cross-batch sequencing changes
+- release orchestration
+- Homebrew / canary / installer publication decisions
+- any JSON contract changes
+
+Good subagent work:
+
+- fixture seeding
+- transcript tests
+- static surface rendering changes in one command area
+- README/demo asset plumbing
+- acceptance test harnesses
+
+### 6. Research before product judgment
+
+If a later batch depends on claims about:
+
+- best-in-class CLI prior art
+- current Playwright/Node/browser-install support
+- current platform behavior on Windows/macOS/Linux
+- public release-channel expectations or install norms
+
+then the model executing that batch should research current primary sources
+first instead of relying on memory.
+
+Examples:
+
+- official docs
+- maintained upstream repos
+- current release artifacts
+- current platform behavior observed directly
+
+Do not make "best-in-class" or portability decisions from stale assumptions.
+
+## Release Efficiency Lane
+
+This is a continuous lane, not a one-time batch.
+
+Goal:
+
+- reduce deployment tax
+- keep release validation automated
+- avoid idle waiting
+
+Tasks:
+
+1. Maintain one canonical local preflight command that runs:
+ - tests
+ - lint
+ - format check
+ - build
+ - transcript capture
+ - VHS/demo verification
+ - release asset assertions
+
+Current canonical local CLI preflight:
+
+- `pnpm preflight:cli`
+
+2. Maintain one watcher-driven post-publish flow that handles:
+ - CI/canary polling
+ - Homebrew sync
+ - hosted installer verification
+ - demo artifact verification
+3. Keep README/tap/published canary aligned to the same version.
+
+Good subagent fit:
+
+- script work in `scripts/`
+- transcript/demo verification harnesses
+- read-only release-asset audits
+
+Primary-agent responsibility:
+
+- deciding when a batch is large enough to justify a publish cycle
+- interpreting failed release jobs
+
+## Batch 1: Product-Truth And Demo-Proofing
+
+This is the next mandatory batch.
+Do not start deep beauty work before this is externally proven.
+
+### Goals
+
+- the human CLI should feel truthful after success
+- the README should be able to show a real successful connect flow
+- the public `connect` demo should end on visible payoff, not just progress
+ output or fallback guidance
+- `status`, `data`, and success summaries should agree with each other
+
+### Work items
+
+1. Add a deterministic successful `connect` demo fixture.
+ Likely files:
+ - `docs/vhs/fixtures/`
+ - `scripts/prepare-vhs-fixtures.mjs`
+ - a demo connector fixture under the fixture home
+
+2. Add a README-quality successful connect tape.
+ It should show a short success story, not just connector mechanics:
+ - `vana connect github`
+ - `vana data show github`
+ Likely files:
+ - `docs/vhs/*.tape`
+ - `README.md`
+ - `docs/vhs/README.md`
+
+3. Add transcript and regression coverage for that successful connect flow.
+ Likely files:
+ - `test/cli/index.test.ts`
+ - `docs/transcripts/`
+ - `scripts/capture-cli-transcripts.mjs`
+
+4. Tighten the final success summary.
+ It should consistently answer:
+ - what connected
+ - what was collected
+ - where it was saved or synced
+ - what to do next
+
+5. Improve `vana data show` and `vana data path`.
+ They should feel like the first payoff surface after success.
+
+6. Improve `vana status`.
+ It should better distinguish:
+ - runtime installed
+ - session present
+ - last successful collection
+ - local-only vs Personal Server state
+
+7. Broaden acceptance coverage across:
+ - migrated/requestInput connectors
+ - legacy/manual connectors
+ - unsupported source flows
+ - saved-session reuse cases
+
+### Exit criteria
+
+- README can show a deterministic successful connect demo
+- human success output feels complete without opening JSON
+- `status`, `data show`, and the success summary agree semantically
+- transcript and acceptance coverage lock the intended behavior
+
+### Good subagent slices
+
+1. Demo fixture and tape lane
+ Deliverables:
+ - new deterministic connect-success fixture
+ - new/updated `.tape`
+ - updated `docs/vhs/README.md`
+
+2. Transcript lane
+ Deliverables:
+ - transcript capture updates
+ - transcript assertions/tests
+
+3. `data` payoff lane
+ Deliverables:
+ - `data show` / `data path` polish
+ - tests for human and `--json` behavior
+
+4. `status` truth lane
+ Deliverables:
+ - richer state rendering
+ - tests for nuanced connected-state output
+
+Primary-agent integration:
+
+- final success-summary wording
+- deciding what is “truthful enough” for public README use
+- release push after local validation
+
+## Batch 2: Deep Beauty, Static Surfaces
+
+Start this only after Batch 1 is published and acceptance-tested from the
+real artifact path.
+
+### Goals
+
+- static CLI surfaces should feel deliberate, premium, and recognizably Vana
+- beauty should come from hierarchy and rhythm, not ornament
+
+### Work items
+
+1. Refine semantic theme usage.
+2. Improve section rhythm, spacing, and emphasis.
+3. Tighten `status`, `sources`, `data list`, `data show`, and guided `connect`.
+4. Make README demo assets reflect the new visual bar.
+
+### Exit criteria
+
+- static surfaces scan noticeably better than the current baseline
+- color-disabled and piped output remains readable
+- README demos visibly reflect the upgraded static language
+
+### Good subagent slices
+
+1. `status` surface refinement
+2. `sources` surface refinement
+3. `data` surface refinement
+4. theme/symbol cleanup in `src/cli/render/`
+5. README/demo embed refresh
+
+Primary-agent integration:
+
+- aesthetic consistency across commands
+- final judgment on whether the CLI feels more premium or just more decorated
+
+## Batch 3: Deep Beauty, Connect Narrative
+
+Only start after static surfaces are stable.
+
+### Goals
+
+- `vana connect <source>` should feel like one calm narrative
+- trust, progress, and success/failure pacing should feel intentional
+
+### Work items
+
+1. Improve phase transitions:
+ - prepare
+ - connect
+ - continue
+ - success/failure
+2. Improve trust framing before auth/input.
+3. Improve spinner/checkmark payoff where capabilities allow it.
+4. Improve cancellation language.
+5. Sharpen the distinction between:
+ - connected locally
+ - connected and synced
+ - manual/legacy flow
+
+### Exit criteria
+
+- connect runs feel like a product journey, not a stream of logs
+- human success and failure both land clearly
+- the machine contract remains untouched
+
+### Good subagent slices
+
+1. cancellation and interruption copy/tests
+2. success/failure summary rendering
+3. progress rendering utilities
+4. prompt continuity improvements
+
+Primary-agent integration:
+
+- overall connect flow pacing
+- deciding whether motion/spinners are helping or distracting
+
+## Batch 4: Runtime Event Enrichment For Beauty
+
+Only do this after Batch 3 exposes an actual event-model limitation.
+
+### Goals
+
+- make human rendering rely on structured events, not fallback heuristics
+- improve summaries and progress semantics without destabilizing JSON mode
+
+### Work items
+
+1. Add only the missing event metadata needed for better rendering.
+2. Avoid speculative framework-building.
+3. Keep event additions backward-compatible where possible.
+
+### Exit criteria
+
+- human rendering no longer needs ad hoc inference where better runtime events
+ would be more truthful
+- `--json` mode remains stable and test-covered
+
+### Good subagent slices
+
+1. runtime event type additions with tests
+2. renderer consumption of new event metadata
+3. JSON contract regression tests
+
+Primary-agent integration:
+
+- deciding which event additions are truly needed
+- preserving product semantics while changing internal event richness
+
+## Batch 5: Data Interaction And Composability
+
+This batch strengthens the CLI as a tool, not just a guided product surface.
+
+### Goals
+
+- `vana data` should feel useful in both human and shell workflows
+- the CLI should compose cleanly with tools like `jq`
+
+### Work items
+
+1. Strengthen `data list`, `data show`, and `data path`.
+2. Tighten JSON schemas and error behavior.
+3. Make human and machine modes both intentional.
+4. Consider compact summaries that are easy to skim and easy to pipe.
+
+### Exit criteria
+
+- `vana data` feels like a real read surface
+- `--json` output is stable and shell-friendly
+
+### Good subagent slices
+
+1. JSON-mode regression tests
+2. human-surface formatting
+3. transcript examples for shell composability
+
+## Batch 5A: Operational Polish And CLI Contract
+
+This batch exists to close the non-glamorous gaps that separate a strong CLI
+from a best-in-class one.
+
+### Goals
+
+- make versioning, diagnostics, and lifecycle operations obvious
+- make the shell contract explicit and reliable
+- improve help/discoverability without weakening the human product surface
+
+### Work items
+
+1. Add an explicit version surface:
+ - `vana --version`
+ - `vana version`
+ - version visibility in `vana --help`
+ - version in `status --json`
+2. Add a diagnostics surface:
+ - likely `vana doctor`
+ - runtime/browser/install checks
+ - actionable remediation output
+3. Define and verify the exit-code matrix:
+ - success
+ - cancel
+ - source required
+ - setup required
+ - needs input
+ - legacy/manual step required
+ - connector unavailable
+ - runtime/internal failure
+4. Audit and tighten the JSON contract:
+ - stable top-level shapes
+ - no noisy human output in `--json`
+ - predictable error payloads
+5. Improve lifecycle discoverability:
+ - upgrade instructions
+ - uninstall/cleanup instructions
+ - canary vs stable channel clarity
+6. Improve help quality:
+ - command descriptions
+ - examples
+ - first-step orientation
+
+### Exit criteria
+
+- a new user can discover version, help, diagnostics, and upgrade paths from the CLI itself
+- script authors have a documented and test-covered exit-code matrix
+- `--json` behavior is explicit, stable, and reviewed as a contract
+- uninstall/cleanup and channel guidance exist in docs
+
+### Good subagent slices
+
+1. version/help command work
+2. `doctor` command scaffolding and tests
+3. exit-code matrix tests
+4. JSON contract audit/tests
+5. install/upgrade/uninstall doc pass
+
+Primary-agent integration:
+
+- deciding what belongs in `doctor` vs `status`
+- deciding what version information belongs in normal human surfaces
+- protecting the CLI from “helpful” additions that bloat the contract
+
+## Batch 5B: Runtime And Portability Validation
+
+Do this after the main local feature/beauty work is coherent, but before stable
+promotion and before spending serious time on deployment polish.
+
+### Why this batch exists
+
+The current code review uncovered a few concerns that are more fundamental than
+copy or presentation:
+
+- `src/core/state-store.ts` currently does an uncoordinated read-modify-write of
+ `vana-connect-state.json`
+- `src/runtime/playwright/browser.ts` opportunistically shells out to
+ `sqlite3` for cookie import
+- `src/runtime/managed-playwright.ts` intentionally avoids user-facing `npx`,
+ but reaches into Playwright internals for browser installation
+
+Those should not derail the current CLI feature work, but they also should not
+be left to vague “later” follow-up.
+
+### Goals
+
+- validate correctness and portability risks before stable
+- distinguish real problems from speculative LLM concern
+- prefer bounded, defensible fixes over reactive dependency churn
+
+### Work items
+
+1. Lock the display-path invariant.
+ - Confirm that `~` is presentation-only.
+ - Add a narrow regression test or audit proving that display strings never
+ feed filesystem APIs.
+
+2. Add a concurrency regression for CLI state writes.
+ - Reproduce the failure mode, if any, against `updateSourceState(...)`.
+ - Choose the fix based on that reproduction.
+ - Prefer atomic-write discipline or a more fundamental state-model change
+ over reflexively adding a lockfile package.
+
+3. Audit `sqlite3` portability explicitly.
+ - Treat this as a Windows concern first.
+ - Current code already tolerates `sqlite3` absence, so the question is
+ product impact, not “does the CLI boot”.
+ - Decide whether opportunistic best-effort is acceptable for stable or
+ whether cookie import must move to an embedded JS/WASM path.
+
+4. Revalidate the Playwright browser-install strategy.
+ - Playwright’s official docs still present CLI-driven browser installation as
+ the normal path.
+ - Our current internal-registry approach may still be the right product
+ choice because we cannot require user-facing `npx`.
+ - Before stable, confirm whether a cleaner package-owned install path exists
+ on current Playwright.
+
+5. Measure browser/runtime asset growth before designing cleanup.
+ - Use actual size/update data.
+ - If cleanup is needed, design it as an intentional lifecycle feature, not a
+ reactionary installer workaround.
+
+### Exit criteria
+
+- the `~` concern is either dismissed with proof or fixed
+- state writes have a defended concurrency story
+- Windows/sqlite behavior is understood and intentionally accepted or replaced
+- the Playwright install path is defended for stable
+- size/bloat concerns are based on measurement, not guesswork
+
+### Good subagent slices
+
+1. path invariant audit + tests
+2. state-store concurrency reproduction
+3. Windows/sqlite portability audit
+4. Playwright install-strategy note with code references
+5. runtime/browser size measurement script or report
+
+Primary-agent integration:
+
+- deciding whether a concern changes architecture, needs a bounded fix, or can
+ remain an explicit non-goal
+
+## Batch 6: Debuggability And Operator Affordances
+
+This batch is for connector authors, agents, and support/debug workflows.
+
+### Goals
+
+- improve insight into what a run is doing without leaking raw browser objects
+
+### Work items
+
+1. expose more structured run-state inspection
+2. expose screenshot/state capture where already supported
+3. improve failure diagnostics and next-step guidance
+
+### Exit criteria
+
+- failed runs are easier to understand and recover from
+- operator workflows improve without contaminating the normal human path
+
+### Good subagent slices
+
+1. run-state reporting
+2. screenshot artifact plumbing
+3. failure transcript coverage
+
+## Batch 7: Source Discovery And Maturity UX
+
+### Goals
+
+- users should know what to expect before they connect
+
+### Work items
+
+1. improve maturity grouping and labeling in `sources`
+2. clarify expectations for automated vs manual flows
+3. consider better source ordering and recommendation cues
+
+### Exit criteria
+
+- users can tell which sources are smooth, manual, or legacy before connecting
+
+### Good subagent slices
+
+1. `sources` grouping/rendering
+2. maturity-label tests
+3. transcript updates
+
+## Batch 8: Public Surface Hardening
+
+### Goals
+
+- README, release assets, Homebrew, installer paths, and demos should all tell
+ the same story
+
+### Work items
+
+1. keep README demo embeds aligned to the current canary
+2. keep Homebrew and hosted installer paths aligned
+3. keep transcripts and demo assets current
+4. ensure “discovering this project cold” feels polished
+
+### Exit criteria
+
+- a cold-start user can discover the repo, install the CLI, and feel impressed
+ without extra context
+
+### Good subagent slices
+
+1. README polish
+2. demo asset sync
+3. install-doc consistency checks
+
+Primary-agent integration:
+
+- final public-facing quality bar
+- deciding when canary quality is good enough to promote
+
+## Batch 8A: Best-In-Class Finish
+
+This batch exists because "strong CLI" and "best-in-class CLI" are not the
+same thing.
+
+Earlier batches improve components:
+
+- command surfaces
+- machine contracts
+- diagnostics
+- demos
+- install paths
+
+But best-in-class quality only exists when those parts feel excellent **as one
+product**.
+
+### Goals
+
+- the installed CLI should feel premium to a cold user, not just correct
+- the human journey should feel great in both the happy path and the degraded path
+- the CLI should feel unusually complete compared with typical product CLIs
+
+### Work items
+
+1. Close the cold-start delight gap.
+ - install / first-run / first-value path should feel tight and intentional
+ - help, version, doctor, and status should reinforce trust immediately
+
+2. Close the connect-journey excellence gap.
+ - migrated/requestInput flows should feel calm and premium
+ - legacy/manual flows should feel gracefully supported, not second-class
+ - success, cancel, unavailable, and runtime-error states should all land well
+
+3. Close the post-success payoff gap.
+ - `vana data` should feel like a real reward surface, not just a path printer
+ - the first successful run should create obvious momentum for the second
+
+4. Close the public-artifact truth gap.
+ - Homebrew / installer / README / demos should reflect the same quality bar
+ - no meaningful gap should remain between local branch quality and published experience
+
+5. Compare against real best-in-class expectations, not just the old branch.
+ - use the existing beauty brief/research as a bar
+ - judge the CLI against `gh` / Vercel / Stripe-style expectations:
+ - confidence
+ - restraint
+ - clarity
+ - quality of degraded states
+
+### Exit criteria
+
+- a cold evaluator can discover, install, connect, inspect, and troubleshoot
+ without extra context and come away impressed
+- the CLI feels premium in:
+ - help
+ - connect
+ - status
+ - data inspection
+ - diagnostics
+- manual/legacy connectors are handled gracefully enough that they do not
+ materially undermine the product impression
+- published artifact quality matches local branch quality closely enough that
+ the README can be trusted as a live product surface
+
+### Good subagent slices
+
+1. cold-start acceptance script and checklist
+2. post-success payoff transcript/demo review
+3. degraded-state transcript review and polish
+4. README/demo/public-surface consistency review
+5. prior-art / official-doc research packet for final judgment
+
+Primary-agent integration:
+
+- deciding whether the CLI is merely "good" or actually "best-in-class"
+- deciding whether degraded paths are graceful enough
+- deciding when the product impression is strong enough to promote beyond canary
+
+## Batch 9: Stable-Release Readiness
+
+Do not start this early.
+This is the final readiness lane after the product feels right.
+
+### Goals
+
+- define and prove the criteria for promoting beyond canary
+
+### Work items
+
+1. lock a release-readiness checklist
+2. run a full acceptance matrix on published artifacts
+3. resolve downgrade/platform/documentation gaps
+
+### Exit criteria
+
+- there is a clear, defensible reason to promote beyond canary
+
+## Deferred Validation Concerns
+
+These are not active batch redirects. Revisit them deliberately once the main
+feature/UX work is complete or if current implementation work touches the same
+area.
+
+### 1. Display-path tilde handling
+
+Current read:
+
+- likely overstated as a current bug
+- functional paths already come from `os.homedir()`-backed helpers in
+ `src/core/paths.ts`
+- `~` currently appears mainly in human-facing display rendering via
+ `formatDisplayPath(...)`
+
+What to validate later:
+
+- confirm no filesystem write/read path is ever sourced from a display string
+- keep `~` strictly as a presentation concern
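
The invariant can be illustrated with a hypothetical formatter. The real helper is `formatDisplayPath(...)` in the codebase; this sketch only mirrors its intent, assumes POSIX-style separators, and its output must never feed filesystem APIs.

```typescript
import { homedir } from "node:os";

// Hypothetical presentation-only formatter: collapse the home prefix
// to "~" for display. Only the original absolute path is functional.
function toDisplayPath(absolute: string, home: string = homedir()): string {
  return absolute === home || absolute.startsWith(home + "/")
    ? "~" + absolute.slice(home.length)
    : absolute;
}
```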
+
+### 2. Concurrent state-file writes
+
+Current read:
+
+- real concern
+- today `updateSourceState(...)` does a read-modify-write on
+ `vana-connect-state.json` with no coordination
+- this can plausibly lose updates under concurrent CLI runs
+
+What to validate later:
+
+- write a concurrency regression test first
+- prefer deciding between atomic-write discipline, sharded state, or locking
+ based on the actual failure mode
+- do not add a lockfile dependency by reflex
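
The atomic-write half of that decision can be sketched as temp-file-plus-rename. This is a hypothetical sketch, not the CLI's state-store code, and on its own it only prevents torn reads; it does not prevent lost updates between two concurrent read-modify-write cycles.

```typescript
import { writeFileSync, renameSync } from "node:fs";
import { dirname, join } from "node:path";
import { randomBytes } from "node:crypto";

// Write the new state to a temp file in the same directory, then rename
// over the target. rename(2) is atomic on a single filesystem, so readers
// never observe a half-written JSON file.
function atomicWriteJson(path: string, value: unknown): void {
  const tmp = join(dirname(path), `.state-${randomBytes(6).toString("hex")}.tmp`);
  writeFileSync(tmp, JSON.stringify(value, null, 2));
  renameSync(tmp, path); // atomic replace of the visible file
}
```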
+
+### 3. Playwright/browser asset growth
+
+Current read:
+
+- concern may be real, but is unmeasured right now
+- not a reason to regress to user-visible `npx`/system-Node assumptions
+
+What to validate later:
+
+- measure installed size and update churn across a few releases
+- if bloat is real, prefer managed cache cleanup/lifecycle commands over
+ reinstall-heavy behavior
+
+### 4. External `sqlite3` dependency
+
+Current read:
+
+- real cross-platform portability concern
+- `src/runtime/playwright/browser.ts` opportunistically shells out to
+ `sqlite3` for cookie import
+- the code already tolerates failure, so this is not a universal blocker, but
+ it may create uneven behavior across machines
+
+What to validate later:
+
+- confirm actual behavior on Windows/macOS/Linux
+- decide whether opportunistic best-effort is acceptable or whether the feature
+ should move to an embedded JS/WASM approach
+
+### 5. Playwright browser-install API usage
+
+Current read:
+
+- worth re-validating before stable
+- current implementation intentionally avoids user-facing `npx`
+- it currently reaches into Playwright internals via the registry module in
+ `src/runtime/managed-playwright.ts`
+
+What to validate later:
+
+- confirm this remains the best supported path on current Playwright/Node
+- if Playwright exposes a cleaner package-owned install entrypoint, prefer that
+ over private internals
+- do not reintroduce user prerequisites just to become “more official”
+
+## Recommended Execution Pattern
+
+For lower-reasoning models:
+
+1. pick one bounded slice from the current batch
+2. state the assumption you are making
+3. implement only that slice
+4. run the relevant local tests/checks
+5. summarize:
+ - what changed
+ - what was verified
+ - what remains for integration
+
+Do not independently:
+
+- redesign the sequence of batches
+- change JSON contracts casually
+- trigger release work without a coherent batch ready
+- decide public README/demo quality alone
diff --git a/docs/CLI-EXIT-CODE-MATRIX.md b/docs/CLI-EXIT-CODE-MATRIX.md
new file mode 100644
index 00000000..7968a5db
--- /dev/null
+++ b/docs/CLI-EXIT-CODE-MATRIX.md
@@ -0,0 +1,89 @@
+# CLI Exit Code Matrix
+
+_March 14, 2026_
+
+This document defines the current `vana` exit-code contract.
+
+The guiding rule is simple:
+
+- `0` means the requested command completed successfully
+- `1` means the requested command did not complete successfully, including
+ guided/recoverable cases like missing source input, setup required, or manual
+ action still needed
+
+The CLI does **not** currently use a large family of bespoke nonzero exit
+codes. The machine-readable distinction comes from:
+
+- the JSON payload for command surfaces like `status`, `sources`, `data`, and
+ errors
+- streamed `outcome` / runtime events for connect flows
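
The binary contract above fits in a few lines. Outcome names follow the tables in this document where possible; the function itself is an illustrative sketch, not the CLI's code.

```typescript
type Outcome =
  | "success"
  | "cancelled"
  | "source_required"
  | "setup_required"
  | "needs_input"
  | "legacy_auth"
  | "connector_unavailable"
  | "runtime_error";

// 0 only for success; 1 for every non-success outcome.
// Finer-grained meaning travels in the JSON payload, not the exit code.
function exitCodeFor(outcome: Outcome): 0 | 1 {
  return outcome === "success" ? 0 : 1;
}
```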
+
+## Top-level Commands
+
+| Command | Success | Non-success |
+| --------------------- | ---------------------- | ----------- |
+| `vana` | `0` when help is shown | n/a |
+| `vana --help` | `0` | n/a |
+| `vana --version` | `0` | n/a |
+| `vana version` | `0` | n/a |
+| `vana version --json` | `0` | n/a |
+| `vana status` | `0` | n/a |
+| `vana status --json` | `0` | n/a |
+| `vana doctor` | `0` | n/a |
+| `vana doctor --json` | `0` | n/a |
+| `vana sources` | `0` | n/a |
+| `vana sources --json` | `0` | n/a |
+
+## Connect
+
+| Command / Outcome | Exit code | Notes |
+| ------------------------------------------------------ | --------- | ------------------------------------------------------ |
+| `vana connect <source>` success                        | `0`       | Includes local-only and synced success                 |
+| `vana connect` guided picker success | `0` | When a source is selected and the connect run succeeds |
+| `vana connect --json` without source | `1` | Returns `source_required` JSON |
+| `vana connect` without source in non-interactive shell | `1` | Prints guidance |
+| Guided picker cancelled | `1` | No connection was made |
+| Setup required / setup declined | `1` | Recoverable via `vana setup` or rerun |
+| `needs_input` in `--no-input` mode | `1` | Recoverable by rerunning without `--no-input` |
+| `legacy_auth` / manual browser step still required | `1` | Recoverable by rerunning interactively |
+| Connector unavailable | `1` | Recoverable if/when the connector becomes available |
+| Runtime/internal failure | `1` | Inspect logs / doctor output |
+
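+For an agent consuming `--json` output, the table above reduces to a remediation routing function. A hypothetical sketch (the outcome strings match the streamed connect outcomes; the returned guidance strings are illustrative, not CLI output):
+
+```ts
+// Hypothetical: map a non-success connect outcome to the recommended
+// next action from the Notes column above.
+function nextAction(outcome: string): string {
+  switch (outcome) {
+    case "setup_required":
+      return "run `vana setup`, then retry";
+    case "needs_input":
+    case "legacy_auth":
+      return "rerun interactively (without --no-input)";
+    case "connector_unavailable":
+      return "retry later or pick another source";
+    default:
+      return "inspect logs / doctor output";
+  }
+}
+```
+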
+## Data
+
+| Command / Outcome | Exit code | Notes |
+| ----------------------------------------- | --------- | ----------------------------------------------- |
+| `vana data` | `0` | Shows help |
+| `vana data list` | `0` | Even when no data exists yet |
+| `vana data list --json` | `0` | Returns an empty list when nothing is collected |
+| `vana data show <source>` success         | `0`       | Prints summary and next steps                   |
+| `vana data show <source> --json` success  | `0`       | Returns structured dataset payload              |
+| `vana data show <source>` missing dataset | `1`       | Recoverable via `vana connect <source>`         |
+| `vana data path <source>` success         | `0`       | Human mode prints the path only                 |
+| `vana data path <source>` missing dataset | `1`       | Recoverable via `vana connect <source>`         |
+
+## Logs
+
+| Command / Outcome | Exit code | Notes |
+| ----------------------------------- | --------- | ----------------------------------------- |
+| `vana logs` | `0` | Even when there are no stored logs yet |
+| `vana logs --json` | `0` | Returns an empty log list when none exist |
+| `vana logs <source>` success        | `0`       | Human mode prints the path only           |
+| `vana logs <source> --json` success | `0`       | Returns structured log metadata           |
+| `vana logs <source>` missing log    | `1`       | Recoverable by running the source again   |
+
+## Setup
+
+| Command / Outcome | Exit code | Notes |
+| -------------------------- | --------- | ------------------------------------------ |
+| `vana setup` success | `0` | Includes the already-installed case |
+| `vana setup --yes` success | `0` | Includes runtime install completion |
+| `vana setup` failure | `1` | Runtime could not be installed or repaired |
+
+## Design Notes
+
+- Help surfaces return `0` because they are a successful user outcome.
+- Guided/recoverable states still return `1` when the requested action did not
+ actually complete.
+- If the CLI later needs richer nonzero codes, it should add them
+ intentionally, document them here, and keep the JSON/event contract aligned.
diff --git a/docs/CLI-FINAL-PRODUCT-SPEC.md b/docs/CLI-FINAL-PRODUCT-SPEC.md
new file mode 100644
index 00000000..b0cd6ffb
--- /dev/null
+++ b/docs/CLI-FINAL-PRODUCT-SPEC.md
@@ -0,0 +1,732 @@
+# `vana-connect` Final Product Spec
+
+_As of March 13, 2026_
+
+If this document conflicts with earlier `CLI-*` planning docs, this document wins.
+
+## Purpose
+
+This document defines the target end state for:
+
+- the `vana` CLI
+- the local connector runtime
+- the installer and release pipeline
+- the skill/onboarding contract that should sit on top of the CLI
+
+This is not a transitional plan for user-visible intermediate phases.
+It is the implementation spec for the final product we want to ship.
+
+## Final Product From The User's Point Of View
+
+### What a human experiences
+
+1. They install `vana` with one obvious command.
+2. They do not need to know or care that the implementation uses Node/TypeScript.
+3. They run:
+
+```bash
+vana connect github
+```
+
+4. If a browser runtime is needed, `vana` handles it directly.
+5. If login or 2FA is needed, `vana` asks clearly.
+6. The user gets a concrete outcome:
+ - data connected locally
+ - data ingested to the Personal Server
+ - manual input needed
+ - source unavailable
+7. The user never has to install `node`, `npm`, or Playwright themselves.
+
+### What a coding agent experiences
+
+1. It can rely on an installed `vana` binary.
+2. It can run:
+
+```bash
+vana sources --json
+vana status --json
+vana connect --json --no-input
+```
+
+3. It gets stable structured events and outcomes.
+4. It can fall back to interactive reruns when the runtime emits `needs_input`.
+5. It can access explicit debugging capabilities when needed, without depending on raw Playwright internals.
+
+### What is explicitly not true in the final product
+
+- the user is not required to have system `node`
+- the user is not required to have system `npm`
+- `vana setup` does not run `npm install`
+- `vana connect` does not shell out to `node run-connector.cjs`
+- the CLI does not depend on copied runtime scripts under `~/.vana/playwright-runner/`
+
+## Product Decisions Locked By This Spec
+
+These are now the working decisions unless a later discovery proves they are wrong.
+
+1. The final product is one `vana` CLI plus an SDK/runtime core, not separate human and agent CLIs.
+2. The runtime rewrite should target the clean end state directly, not an incremental user-visible bridge.
+3. The first runtime host should be **in-process**, not worker-first.
+4. The runtime core must be **transport-agnostic** so a worker host or app host can be added later without redesigning the core.
+5. The runtime must expose **capabilities**, not raw Playwright objects.
+6. Existing connectors must continue to work during and after the rewrite.
+7. Headed fallback must remain supported.
+8. The default profile strategy remains **Vana-managed isolated profiles**.
+9. Chromium may still be a one-time managed download during setup.
+10. Installer/release work must only claim a standalone experience once runtime execution and setup no longer depend on external `node` or `npm`.
+
+## Non-Goals For This Rewrite
+
+These are intentionally out of scope for the final product defined here:
+
+- support for reusing the user's existing browser profiles
+- remote/cloud connector execution
+- embrowse or webview execution backends
+- a TUI-first CLI
+- a plugin system
+- a public browser automation SDK
+- redesigning connector authoring as a new format
+
+These futures must remain possible, but they are not required to complete this spec.
+
+## Critical Current Problems This Spec Must Eliminate
+
+The current system still has transitional behavior that is not acceptable in the final product:
+
+1. `ensureInstalled()` runs `npm install --ignore-scripts`.
+2. connector execution still depends on `run-connector.cjs`.
+3. the SEA binary still needs `node` on `PATH` for connector execution in some paths.
+4. the current runtime state model assumes a copied sidecar under `~/.vana/playwright-runner/`.
+5. installer/release work currently looks stronger than the runtime truth unless we finish the runtime rewrite.
+
+The final product is not done until all five are removed.
+
+## Final Architecture
+
+### High-level shape
+
+The system should end up as:
+
+- `connect-core`
+ - shared types, events, state, paths, errors
+- `connectors`
+ - registry resolution and connector discovery
+- `runtime-core`
+ - connector run contracts
+ - capability contracts
+ - event contracts
+ - state machine
+- `runtime-playwright`
+ - Playwright-based implementation of the runtime core
+- `cli`
+ - command grammar, prompts, output formatting, JSON mode
+- `install/release`
+ - artifact generation, checksums, installer scripts, release metadata
+
+The CLI should call the runtime core directly.
+The runtime core should not depend on CLI-specific assumptions.
+
+### Runtime host model
+
+The first final host should be **in-process**:
+
+- no external `node` child process
+- no deployed `run-connector.cjs`
+- no deployed copied `playwright-runner` package
+
+Important nuance:
+
+- the runtime core must still be written so a worker-based host can be added later
+- the host boundary is an internal implementation detail
+- the API surface is capability-based, not host-based
+
+### Why in-process first
+
+- simplest final product
+- easiest to test
+- easiest to make honestly standalone
+- avoids committing early to message-passing/orchestration complexity that may not be necessary
+- still compatible with adding a worker host later if isolation becomes necessary
+
+## Runtime Core Contract
+
+### Principle
+
+The runtime must expose **runs**, **events**, and **debug capabilities**.
+
+The runtime must not expose raw Playwright `Browser`, `Context`, or `Page` objects outside the runtime implementation.
+
+### Required runtime interfaces
+
+Illustrative shape:
+
+```ts
+interface ConnectorRuntime {
+  ensureReady(request: RuntimeSetupRequest): Promise<void>;
+  startRun(request: ConnectorRunRequest): Promise<ConnectorRunHandle>;
+  listCapabilities(source: string): Promise<RuntimeCapability[]>;
+}
+
+interface ConnectorRunHandle {
+  id: string;
+  events(): AsyncIterable<RuntimeEvent>;
+  provideInput(input: Record<string, unknown>): Promise<void>;
+  stop(reason?: string): Promise<void>;
+  getState(): Promise<RunState>;
+  takeScreenshot(): Promise<Screenshot>;
+  inspect(): Promise<RunInspection>;
+}
+```
+
+This interface is illustrative, not exact naming.
+The important constraint is the shape of the interaction:
+
+- create run
+- consume events
+- provide input
+- inspect or debug
+- stop
+
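+A hypothetical consumer of that interaction shape. The event names mirror the required runtime events; the payload fields, the `driveRun` helper, and the mock event stream are illustrative stand-ins for a real run handle:
+
+```ts
+// Hypothetical: drive a run by consuming its event stream and answering
+// needs-input requests. A real caller would get `events`/`provideInput`
+// from a ConnectorRunHandle; here a mock stream stands in.
+type RuntimeEvent =
+  | { type: "needs-input"; fields: string[] }
+  | { type: "collection-complete"; items: number }
+  | { type: "runtime-error"; message: string };
+
+async function driveRun(
+  events: AsyncIterable<RuntimeEvent>,
+  provideInput: (input: Record<string, string>) => Promise<void>,
+): Promise<string> {
+  for await (const event of events) {
+    switch (event.type) {
+      case "needs-input":
+        // A real CLI would prompt the user; here we supply canned values.
+        await provideInput(
+          Object.fromEntries(event.fields.map((f): [string, string] => [f, "test"])),
+        );
+        break;
+      case "collection-complete":
+        return `collected ${event.items} items`;
+      case "runtime-error":
+        throw new Error(event.message);
+    }
+  }
+  return "run ended without a terminal event";
+}
+
+// Minimal mock stream standing in for a real connector run.
+async function* mockEvents(): AsyncIterable<RuntimeEvent> {
+  yield { type: "needs-input", fields: ["otp"] };
+  yield { type: "collection-complete", items: 8 };
+}
+```
+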
+### Required runtime events
+
+The runtime core must produce events rich enough for:
+
+- CLI human output
+- CLI `--json` mode
+- skill orchestration
+- Desktop mediation later
+
+Required events:
+
+- `run-started`
+- `state-changed`
+- `needs-input`
+- `headed-required`
+- `legacy-auth`
+- `artifact-created`
+- `collection-complete`
+- `ingest-started`
+- `ingest-complete`
+- `runtime-error`
+- `run-stopped`
+
+Not every event must surface directly to end users, but they must exist in the internal contract.
+
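+One way to pin these names down is a closed union, so adding or renaming an event is a type-checked change rather than string drift. A hypothetical sketch; only the event `type` strings come from the list above, and which events count as terminal is an assumption for illustration:
+
+```ts
+// Hypothetical: the required event names as a closed union.
+const RUNTIME_EVENT_TYPES = [
+  "run-started",
+  "state-changed",
+  "needs-input",
+  "headed-required",
+  "legacy-auth",
+  "artifact-created",
+  "collection-complete",
+  "ingest-started",
+  "ingest-complete",
+  "runtime-error",
+  "run-stopped",
+] as const;
+
+type RuntimeEventType = (typeof RUNTIME_EVENT_TYPES)[number];
+
+// Assumed terminal events: nothing follows them on a run's event stream.
+function isTerminal(type: RuntimeEventType): boolean {
+  return type === "run-stopped" || type === "runtime-error";
+}
+```
+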
+### Required runtime debug capabilities
+
+These are required in the runtime API even if the first CLI only surfaces a subset:
+
+- `takeScreenshot`
+- `getCurrentUrl`
+- `getRunState`
+- `inspect`
+ - current step
+ - current page title if available
+ - current URL if available
+ - whether browser is headed/headless
+ - whether login/session is already established
+- `stopRun`
+
+Future-facing but not required in v1 CLI output:
+
+- limited DOM/visible text inspection
+- network capture inspection
+- explicit headed handoff
+
+### Why these debug capabilities matter
+
+They are needed for:
+
+- agent-assisted connector development
+- richer Desktop mediation
+- debugging failed auth flows
+- future recovery/inspection surfaces
+
+They are **not** a reason to expose raw Playwright objects outside the runtime.
+
+## Browser And Profile Strategy Contracts
+
+The runtime core must separate:
+
+- execution backend
+- browser strategy
+- profile strategy
+
+### Browser strategy
+
+Required shape:
+
+```ts
+interface BrowserStrategy {
+  ensureBrowserReady(): Promise<void>;
+  launch(request: LaunchRequest): Promise<LaunchedBrowser>;
+}
+```
+
+Initial implementation:
+
+- Playwright
+- managed Chromium download
+- headless by default
+- headed fallback supported
+
+### Profile strategy
+
+Required shape:
+
+```ts
+interface ProfileStrategy {
+  resolveProfile(run: ConnectorRunRequest): Promise<ResolvedProfile>;
+}
+```
+
+Initial implementation:
+
+- isolated Vana-managed profile under `~/.vana/browser-profiles/`
+
+The runtime must not hardcode assumptions that make these impossible later:
+
+- existing browser profile
+- ephemeral sandbox profile
+- connector-specific profile behavior
+
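+The initial isolated strategy can be sketched as a pure path resolution. This is hypothetical (the `ProfileResolution` shape and name-sanitizing rule are illustrative); only the `~/.vana/browser-profiles/` location comes from this spec:
+
+```ts
+import { posix } from "node:path";
+import { homedir } from "node:os";
+
+// Hypothetical sketch of the Vana-managed isolated profile strategy:
+// each source gets its own profile directory under ~/.vana/.
+interface ProfileResolution {
+  kind: "isolated";
+  dir: string;
+}
+
+function resolveIsolatedProfile(
+  source: string,
+  home: string = homedir(),
+): ProfileResolution {
+  // Normalize the source name so it is safe as a directory name.
+  const safe = source.toLowerCase().replace(/[^a-z0-9_-]/g, "-");
+  return {
+    kind: "isolated",
+    dir: posix.join(home, ".vana", "browser-profiles", safe),
+  };
+}
+```
+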
+## Connector Compatibility Requirements
+
+### Core requirement
+
+Existing connectors from `data-connectors` must continue to work without requiring mass rewrites.
+
+### What must be preserved
+
+- current connector registry resolution
+- current connector file loading model
+- current result shape
+- `requestInput` / input-driven flow support
+- legacy auth detection for connectors using older patterns
+- headed fallback capability
+
+### Compatibility adapter
+
+The runtime rewrite must provide a compatibility layer that adapts current connector expectations into the new runtime core.
+
+That adapter is responsible for:
+
+- loading connector modules
+- constructing the page/runtime API the connector expects
+- translating connector input requests into runtime `needs-input` events
+- translating legacy auth into `legacy-auth`
+- translating completion/errors into runtime events
+
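+The input translation is the trickiest of these responsibilities. A hypothetical sketch, assuming legacy connectors call a blocking `requestInput(fields)` and await the values (the bridge shape and names here are illustrative, not the real adapter):
+
+```ts
+// Hypothetical: bridge a legacy connector's blocking requestInput(fields)
+// call to the runtime's needs-input event / provideInput pair.
+type NeedsInputListener = (fields: string[]) => void;
+
+function makeInputBridge(onNeedsInput: NeedsInputListener) {
+  let pending: ((values: Record<string, string>) => void) | null = null;
+
+  return {
+    // Passed into the legacy connector as its requestInput implementation.
+    requestInput(fields: string[]): Promise<Record<string, string>> {
+      onNeedsInput(fields); // surfaces as a runtime `needs-input` event
+      return new Promise((resolve) => {
+        pending = resolve;
+      });
+    },
+    // Called by the runtime when the CLI or agent supplies values.
+    provideInput(values: Record<string, string>): void {
+      if (pending) {
+        const resolve = pending;
+        pending = null;
+        resolve(values);
+      }
+    },
+  };
+}
+```
+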
+### What is explicitly not allowed
+
+The compatibility adapter must not require:
+
+- shelling out to `run-connector.cjs`
+- spawning system `node`
+- copying connector runner source into the user home
+
+## Runtime Setup In The Final Product
+
+### Final meaning of `vana setup`
+
+`vana setup` should only do runtime work the user actually needs.
+
+Allowed responsibilities:
+
+- create required state directories
+- ensure browser cache location exists
+- ensure managed Chromium is installed
+- verify runtime health
+- optionally clean/repair caches
+
+Disallowed responsibilities:
+
+- `npm install`
+- `pnpm install`
+- copying `run-connector.cjs` into `~/.vana/`
+- copying `playwright-runner` into `~/.vana/`
+
+### Final runtime state check
+
+The runtime should be considered installed/healthy based on:
+
+- runtime core availability inside the binary/package
+- browser availability / browser install state
+- required state directories
+
+It must no longer depend on:
+
+- `~/.vana/playwright-runner/index.cjs`
+- `~/.vana/run-connector.cjs`
+
+### Files that should disappear from fresh installs
+
+Fresh installs of the final product should not create:
+
+- `~/.vana/playwright-runner/`
+- `~/.vana/run-connector.cjs`
+
+Required files/directories that may remain:
+
+- `~/.vana/connectors/`
+- `~/.vana/browser-profiles/`
+- `~/.vana/browsers/`
+- `~/.vana/logs/`
+- `~/.vana/last-result.json`
+- `~/.vana/vana-connect-state.json`
+
+## Chromium Installation In The Final Product
+
+### Decision
+
+Chromium remains a one-time managed download during setup.
+
+### Requirements
+
+- the download must be initiated by `vana` itself
+- the user must not need `npx playwright install`
+- the user must not need external `node`
+- the process must log clearly and recover cleanly
+
+### Implementation constraint
+
+Browser installation must be triggered through internal runtime code, not shelling out to:
+
+- `npx playwright install`
+- `npm exec playwright install`
+
+If using Playwright internals is required, that is acceptable.
+If a small vendored installer helper is required, that is acceptable.
+
+What is not acceptable is preserving an external Node/npm requirement for browser installation.
+
+## CLI Contract That Must Be Preserved
+
+The runtime rewrite must not change the public command model:
+
+```bash
+vana connect
+vana sources
+vana status
+vana setup
+```
+
+### Human-mode contract
+
+Must remain:
+
+- calm
+- concise
+- explicit about setup/downloads
+- explicit about local-only vs Personal Server ingest
+- explicit about next steps
+
+### Machine-mode contract
+
+The following commands must remain first-class:
+
+```bash
+vana sources --json
+vana status --json
+vana connect --json --no-input
+```
+
+### Required outcome preservation
+
+The runtime rewrite must preserve these user-visible outcomes:
+
+- `connected_and_ingested`
+- `connected_local_only`
+- `needs_input`
+- `legacy_auth`
+- `setup_required`
+- `connector_unavailable`
+- `runtime_error`
+- `unexpected_internal_error`
+
+The runtime implementation may change; the product contract should not.
+
+## Installer And Release Final Product
+
+### Final install channels
+
+The final product should ship with all of these:
+
+1. `install.sh`
+2. `install.ps1`
+3. GitHub Release assets
+4. Homebrew formula/tap
+5. Winget manifest
+
+The package-manager channels should be generated from the same release artifact truth, not maintained by hand indefinitely.
+
+### Artifact matrix
+
+Required release assets:
+
+- `vana-linux-x64.tar.gz`
+- `vana-darwin-x64.tar.gz`
+- `vana-darwin-arm64.tar.gz`
+- `vana-win32-x64.zip`
+- matching `.sha256` files for each
+
+### Installer contract
+
+Installers must:
+
+- resolve the correct release asset
+- download the asset
+- download the checksum
+- verify the checksum
+- install `vana` into a normal user location
+- print the next step
+
+Installers must not:
+
+- ask the user to install Node
+- install npm packages globally
+- expose `@opendatalabs/connect` as the primary user-facing concept
+
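+The checksum step is the part worth getting exactly right. A hypothetical sketch (the function name is illustrative; the real installers are shell/PowerShell, but the logic is the same — hash the downloaded asset and compare against the published `.sha256` content):
+
+```ts
+import { createHash } from "node:crypto";
+import { readFileSync } from "node:fs";
+
+// Hypothetical: verify a downloaded release asset against its checksum file.
+function verifySha256(assetPath: string, checksumFileContent: string): boolean {
+  const digest = createHash("sha256")
+    .update(readFileSync(assetPath))
+    .digest("hex");
+  // Checksum files conventionally contain "<hex>  <filename>"; take the hex.
+  const expected = checksumFileContent.trim().split(/\s+/)[0].toLowerCase();
+  return digest === expected;
+}
+```
+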
+### Install locations
+
+Default install targets:
+
+- macOS/Linux binary:
+ - `~/.local/bin/vana`
+- macOS/Linux runtime root:
+ - `~/.local/share/vana/`
+- Windows binary:
+ - user-level bin directory suitable for `PATH`
+- Windows runtime root:
+ - user-local app data under `Vana`
+
+### Upgrade contract
+
+Re-running the installer must:
+
+- install a new version under a versioned release directory
+- update the `current` pointer/symlink
+- preserve local data under `~/.vana/`
+
+### Uninstall contract
+
+The repo must document how to uninstall:
+
+- the `vana` binary
+- installed release directories
+
+It should not tell users to manually guess where files live.
+
+## Release Pipeline Requirements
+
+### Build truth
+
+The release pipeline must produce the same artifact shape used by installers and package-manager channels.
+
+### Build inputs
+
+Build-time use of Node 25 SEA is acceptable.
+Runtime dependence on user-installed Node is not.
+
+### Required release jobs
+
+1. build artifact matrix
+2. smoke-test each artifact at least minimally
+3. attach assets to GitHub releases
+4. publish npm package only as a secondary distribution channel
+5. generate/update package-manager metadata
+
+### Canary requirements
+
+Canary releases must still work, but canary should not be the main product install story once the final installer path is live.
+
+### Stable release requirements
+
+Stable release must not be cut until:
+
+- no external `node` is required at runtime
+- no external `npm` is required at setup time
+- installed `vana` works on a clean machine
+
+## Skill And Onboarding Requirements
+
+The skill and agent guidance should end at:
+
+1. prefer installed `vana`
+2. fall back to release-channel package only when appropriate
+3. use local dev path only for explicit debugging/development
+
+The skill should not point users at raw scripts once the final runtime rewrite is complete.
+
+The skill should continue to use:
+
+- `vana sources --json`
+- `vana status --json`
+- `vana connect --json --no-input`
+
+## Required Test Matrix
+
+### Runtime correctness tests
+
+Required automated tests:
+
+- registry discovery from repo and non-repo working directories
+- source listing from installed binary
+- missing connector outcome
+- `needs_input` outcome
+- `legacy_auth` outcome
+- successful local collection
+- successful Personal Server ingest when target exists
+- headed fallback path where applicable
+
+### Standalone truth tests
+
+Required proof before calling the product standalone:
+
+- installed `vana` runs with `node` absent from `PATH`
+- installed `vana` runs with `npm` absent from `PATH`
+- `vana status --json` works
+- `vana sources --json` works
+- `vana connect github --json --no-input` works and returns `needs_input`
+
+This must be enforced in CI, not just proven once manually.
+
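+The `node`/`npm`-absent precondition can be set up in CI by filtering `PATH` before invoking the installed binary. A hypothetical sketch (the helper name is illustrative; POSIX `PATH` semantics assumed, a Windows variant would use `;` and `.exe` suffixes):
+
+```ts
+import { posix } from "node:path";
+import { existsSync } from "node:fs";
+
+// Hypothetical CI helper: drop every PATH entry that contains a `node` or
+// `npm` executable, so the installed `vana` binary must stand alone.
+function pathWithoutTools(
+  path: string,
+  tools: string[] = ["node", "npm"],
+  exists: (p: string) => boolean = existsSync,
+): string {
+  return path
+    .split(":")
+    .filter((dir) => !tools.some((t) => exists(posix.join(dir, t))))
+    .join(":");
+}
+```
+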
+### Installer tests
+
+Required automated tests:
+
+- Unix installer smoke test
+- checksum verification failure test
+- upgrade test
+- Windows installer smoke test
+
+### Release artifact tests
+
+Per-platform smoke tests must verify:
+
+- artifact starts
+- `--help` works
+- `status --json` works
+
+At least one platform in CI must also verify a full installer path.
+
+## Required Deletions Before Marking Complete
+
+The following transitional mechanisms must be removed or retired from the product path:
+
+- `run-connector.cjs` as a user/runtime dependency
+- copied `playwright-runner` installation in the user home
+- runtime `npm install`
+- runtime `npx playwright install`
+- runtime execution through system `node`
+
+These files may remain temporarily in the repo for reference during migration, but they must not be part of the final product path.
+
+## Execution Order
+
+This is the order implementation should follow.
+
+### 1. Freeze the runtime contract
+
+Create the runtime core interfaces and event/capability contracts first.
+
+Required deliverables:
+
+- runtime contract types
+- run lifecycle state model
+- capability model
+- event model
+
+### 2. Port orchestration logic into TypeScript runtime modules
+
+Port the logic currently living in `run-connector.cjs` into runtime modules.
+
+Required deliverables:
+
+- direct orchestration module
+- no stdio JSON parsing in the main product path
+- direct in-memory input flow in the runtime core
+
+### 3. Port the Playwright runner into runtime modules
+
+Port the logic currently living in `playwright-runner/index.cjs`.
+
+Required deliverables:
+
+- Playwright host module
+- browser launch module
+- page API / connector compatibility adapter
+- debug capability implementation
+
+### 4. Replace CLI runtime calls
+
+Replace `ManagedPlaywrightRuntime.runConnector()` so it uses runtime modules directly.
+
+Required deliverables:
+
+- no `spawn(node, run-connector.cjs ...)`
+- preserved CLI events/outcomes
+
+### 5. Replace setup path
+
+Replace `ensureInstalled()` so it only manages:
+
+- browser availability
+- state directories
+- runtime health
+
+Required deliverables:
+
+- no `npm install`
+- no copied runner source
+
+### 6. Prove standalone truth
+
+Run installed-binary tests with `node` and `npm` absent from `PATH`.
+
+This is the gate that decides whether the installer/release story is honest.
+
+### 7. Finalize installer/release channels
+
+Only after standalone truth is proven:
+
+- installer scripts become the canonical story
+- Homebrew and winget metadata become required
+- skill/onboarding can fully prefer installed `vana`
+
+### 8. Remove transitional runtime assets from the product path
+
+After parity is proven, remove or retire:
+
+- script wrappers
+- copied runner assumptions
+- obsolete runtime state checks
+
+## What Must Not Happen During Execution
+
+To protect the final product quality, do not:
+
+- reintroduce external `node` as an implicit runtime dependency
+- reintroduce external `npm` as a setup dependency
+- hardcode Playwright internals into the public runtime contract
+- expose raw browser/page objects outside the runtime implementation
+- couple the runtime core to CLI prompt behavior
+- hardcode isolated-profile assumptions into the runtime core
+- change the public CLI grammar during the runtime rewrite
+
+## Completion Criteria
+
+The final product is done only when all of these are true:
+
+1. A clean machine can install `vana` without preinstalled Node/npm.
+2. `vana setup` completes without calling external package managers.
+3. `vana status --json` works from the installed binary.
+4. `vana sources --json` works from the installed binary.
+5. `vana connect github --json --no-input` returns `needs_input` from the installed binary on a clean machine.
+6. Existing connectors still work through the compatibility adapter.
+7. Headed fallback still exists.
+8. Installer scripts are real and verified.
+9. GitHub release assets, Homebrew, and winget all point at the same artifact truth.
+10. The skill can honestly prefer installed `vana` as the main path.
+
+Until all ten are true, the final product defined by this spec is not complete.
diff --git a/docs/CLI-FOUNDATION-ASSESSMENT.md b/docs/CLI-FOUNDATION-ASSESSMENT.md
new file mode 100644
index 00000000..19597831
--- /dev/null
+++ b/docs/CLI-FOUNDATION-ASSESSMENT.md
@@ -0,0 +1,472 @@
+# `vana-connect` CLI Foundation Assessment
+
+_As of March 12, 2026_
+
+## Executive summary
+
+The current `vana-connect` foundation is real. It is not just hand-wavy prototype text. There is a working setup path, a working runner interaction model, a working connector fetch path, a working validator, and a working output contract.
+
+But compared against the best prior art in CLI and SDK UX, the current state is closer to:
+
+- **strong internal tooling**
+- **promising runtime architecture**
+- **weak product shell**
+
+That is a good starting point for shipping an MVP quickly, because the engine is more real than the surface. The risk is not "this is fake." The risk is "if we ship this mostly as-is, it will feel improvised rather than world-class."
+
+## Rating summary
+
+Scored against the standards implied by `uv`, `gh`, Vercel CLI, Stripe DX, and the product goals in the PRD.
+
+| Component | Current rating | Why |
+| :--------------------------- | :------------- | :-------------------------------------------------------------------------- |
+| Setup flow | C+ | Functional, but reads like bootstrap ops |
+| Local state model | B | Coherent and understandable, good foundation |
+| Runner protocol | B+ | Strongest asset; supports both human and agent flows |
+| Validator | B- | Valuable internal quality gate, not yet polished user-facing diagnostics |
+| Connector discovery | C | Works, but weak discoverability and trust UX |
+| Result contract | B- | Directionally good, still under-specified as a public SDK contract |
+| Overall onboarding readiness | C+ | Can work, but not yet likely to create a `uv`/Vercel-level first impression |
+
+## What is genuinely real and reusable
+
+### 1. Runner protocol
+
+The best part of the current system is [run-connector.cjs](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts/run-connector.cjs).
+
+What is strong:
+
+- explicit event types on stdout
+- explicit exit codes
+- machine-readable output by default
+- optional human-readable output via `--pretty`
+- input continuation model using files instead of forced restarts
+
+Relevant code:
+
+- usage and event framing at [run-connector.cjs:3](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts/run-connector.cjs#L3)
+- output contract at [run-connector.cjs:13](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts/run-connector.cjs#L13)
+- human-readable formatting at [run-connector.cjs:72](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts/run-connector.cjs#L72)
+- request-input handling at [run-connector.cjs:196](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts/run-connector.cjs#L196)
+- file-based continuation at [run-connector.cjs:214](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts/run-connector.cjs#L214)
+
+Why this matters:
+
+- this is the core reason a single CLI can serve both humans and coding agents
+- it already encodes the right architectural instinct: one lifecycle, multiple presentations
+
+Compared to prior art:
+
+- not as refined as Stripe CLI or `gh`
+- but architecturally stronger than a lot of one-off CLIs because it already thinks in events, modes, and resumability
+
+Verdict:
+
+- keep this
+- formalize it
+- build the new CLI around it
+
+### 2. Local state model
+
+The current `~/.vana/` layout is a meaningful asset.
+
+What is good:
+
+- runner location
+- connector cache
+- persistent browser profiles
+- single obvious last-result artifact
+
+Relevant docs:
+
+- [SETUP.md](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/SETUP.md)
+
+Why this matters:
+
+- world-class CLIs make state legible
+- Vercel, Supabase, and Doppler all benefit from explicit local context
+
+Current limitations:
+
+- state is documented, but not yet exposed through a coherent CLI surface
+- there is no `status`, `doctor`, `inspect`, or `auth list` kind of command
+
+Verdict:
+
+- keep the state model
+- make it visible and inspectable
+
+### 3. Validator
+
+The validator is more substantial than a vibe-coded placeholder.
+
+Relevant code:
+
+- report model at [validate.cjs:23](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts/validate.cjs#L23)
+- metadata checks at [validate.cjs:185](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts/validate.cjs#L185)
+- script pattern checks at [validate.cjs:227](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts/validate.cjs#L227)
+
+Why it matters:
+
+- quality gates are one of the strongest things you can borrow from Stripe-style DX
+- the system already has a place where correctness can accumulate
+
+Current limitation:
+
+- it is aimed primarily at connector creators, not end users
+- it needs better categorization, remediation guidance, and friendlier summaries
+
+Verdict:
+
+- keep it
+- evolve it into the nucleus of `doctor`, `inspect`, and trust/debug surfaces
+
+## What currently feels improvised
+
+### 1. Setup flow
+
+[setup.sh](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts/setup.sh) is useful but nowhere near `uv` / Vercel quality.
+
+Relevant code:
+
+- bootstrap flow at [setup.sh:19](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts/setup.sh#L19)
+- cross-repo clone at [setup.sh:27](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts/setup.sh#L27)
+- dependency install at [setup.sh:35](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts/setup.sh#L35)
+- browser install at [setup.sh:38](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts/setup.sh#L38)
+
+What’s weak:
+
+- no single packaged install artifact
+- no versioned install / upgrade story
+- no environment detection beyond "try it"
+- no polished failure recovery
+- no clear explanation of local state after install
+- no post-install success verification beyond file existence
+
+Compared to prior art:
+
+- `uv` compresses install + usage into a nearly frictionless mental model
+- Vercel turns first run into a guided, trustworthy workflow
+- current setup feels like a repo maintenance script
+
+Verdict:
+
+- do not expose this as the final product onboarding
+- it is acceptable as an internal bootstrap while a real CLI installer is built
+
+### 2. Connector discovery
+
+[fetch-connector.cjs](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts/fetch-connector.cjs) works, but it is not yet a strong user-facing discovery experience.
+
+Relevant code:
+
+- raw GitHub registry fetch at [fetch-connector.cjs:24](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts/fetch-connector.cjs#L24)
+- partial match search at [fetch-connector.cjs:57](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts/fetch-connector.cjs#L57)
+- download flow at [fetch-connector.cjs:70](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/scripts/fetch-connector.cjs#L70)
+
+What’s weak:
+
+- ambiguous partial matching
+- no browse/list/search UX
+- no confidence signals
+- no versioning or channel model
+- no checksum or authenticity UX at the command surface
+
+Compared to prior art:
+
+- `gh` and Vercel make context clear before acting
+- this currently behaves more like a private helper utility
+
+Verdict:
+
+- keep the logic
+- replace the product surface around it
+
+### 3. Result contract
+
+The scoped result format is good enough to build on, but not fully mature.
+
+Strengths:
+
+- composable
+- reasonably simple
+- aligns with Personal Server storage by scope
+
+Weaknesses:
+
+- needs a tighter public contract
+- unclear semantics for partial success
+- unclear long-term metadata conventions
+- not yet expressed as a formal SDK boundary
+
+Verdict:
+
+- keep the scoped model
+- formalize types and lifecycle semantics in the SDK
+
+## Assessment against product goals
+
+### Fast to first value
+
+Current state: **partially met**
+
+Why:
+
+- the mechanics exist
+- the journey still has too much "read docs, run setup script, fetch connector, run wrapper, understand files"
+
+This is not a five-minute magic experience yet.
+
+### Invisible once running
+
+Current state: **not yet met**
+
+Why:
+
+- no first-class sync/schedule story
+- no polished re-auth loop
+- no status surface
+
+### Trustworthy data
+
+Current state: **partially met**
+
+Why:
+
+- local-first helps
+- validator helps
+- but the UX does not yet make provenance, freshness, and success status legible enough
+
+### Composable output
+
+Current state: **mostly met**
+
+Why:
+
+- structured JSON exists
+- scoped keys are useful
+- the runner already thinks in events
+
+This is one of the strongest areas.
+
+### Graceful failure
+
+Current state: **mixed**
+
+Why:
+
+- the protocol has the bones
+- the user-facing remediation layer is still thin
+
+## What `vana-com/vana-connect` changes
+
+Public repo reference:
+
+- https://github.com/vana-com/vana-connect
+
+As of March 12, 2026, the public `vana-connect` repo presents itself as a **Vana Connect SDK** focused on app-side session creation, grant handling, and Personal Server data access.
+
+What that means for this CLI work:
+
+- it is **not** the same thing as the local connector runner / headless scraping CLI
+- but it provides a very useful future boundary
+
+The likely product split is:
+
+- **local collection runtime / CLI**
+ - setup
+ - connector install
+ - connect / sync / auth / inspect
+ - local result and Personal Server population
+- **app-facing SDK**
+ - create sessions
+ - request scopes
+ - poll grants
+ - fetch granted data
+
+This is good news. It means the CLI does not need to absorb all SDK responsibilities. It can be excellent at collection and local orchestration while the public app SDK handles the developer integration side.
+
+The key caution:
+
+- do not let the existence of the SDK muddy the CLI’s first-run story
+- the CLI should still optimize around "get my data in locally fast"
+
+## How to get an MVP out quickly without blowing the first impression
+
+This is the most important practical question.
+
+### Core principle
+
+Do **not** try to make the MVP complete.
+
+Do make the MVP feel:
+
+- intentional
+- trustworthy
+- fast
+- legible
+
+That is enough to leave an excellent impression even if deeper features are missing.
+
+### The fastest credible MVP
+
+Build a thin product shell over the current primitives.
+
+That MVP should focus only on:
+
+- install
+- connect one source
+- list available sources
+- inspect local status
+- re-auth / reconnect
+- clear machine-readable mode
+
+Not on:
+
+- full TUI
+- full scheduling
+- full multi-environment sync
+- advanced Personal Server operations
+- long-range blockchain/token use cases
+
+### MVP qualities that matter disproportionately
+
+#### 1. One obvious first command
+
+Examples:
+
+- `vana-connect setup`
+- `vana-connect connect steam`
+
+The user should not need to learn the internal script model.
+
+#### 2. One excellent happy path
+
+The first run should be highly polished for:
+
+- install if missing
+- explain what will be installed
+- fetch connector
+- run connector
+- request credentials only if needed
+- summarize what was collected
+- say what to do next
+
+One journey polished deeply beats ten half-finished commands.
+
+#### 3. One strong machine-readable mode
+
+Agents need:
+
+- `--json`
+- stable event types
+- stable exit codes
+- no surprise prompts when non-interactive
+
+You already have much of this. Preserve it.
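+
+Concretely, "no surprise prompts" plus stable exit codes might behave like this sketch (the event shape and the exit code value are illustrative, not current behavior):
+
+```console
+$ vana-connect connect steam --json --no-input
+{"type":"needs_input","fields":["code"]}
+$ echo $?
+3
+```
+
+An agent can branch on the exit code alone without parsing prose.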
+
+#### 4. One clear trust message
+
+The first-run copy should make three things explicit:
+
+- credentials stay local
+- what gets installed locally
+- where the data is stored
+
+That alone will materially improve onboarding.
+
+#### 5. One basic diagnostics surface
+
+Even for MVP, ship a minimal:
+
+- `vana-connect status`
+
+It should answer:
+
+- installed or not
+- connectors present
+- sessions present
+- last run result
+- last error if known
+
+This is cheap and high leverage.
+
+## Recommended MVP line
+
+If speed matters, the target should be:
+
+### MVP v1
+
+- `vana-connect setup`
+- `vana-connect list`
+- `vana-connect connect <source>`
+- `vana-connect status`
+- `vana-connect inspect <source>` or `vana-connect logs <source>`
+- `--json`
+- `--yes`
+- `--no-input`
+
+Backed initially by the existing scripts and runtime.
+
+### Defer to v1.1+
+
+- scheduling
+- bulk connect all
+- richer auth/session management
+- doctor / repair
+- richer sync to Personal Server
+- interactive/TUI mode
+
+This gets you to a productized MVP quickly without pretending the system is complete.
+
+## Confidence: how much can we confidently initialize now?
+
+Quite a lot, if you are disciplined about scope.
+
+You can confidently initialize a world-class trajectory now by locking:
+
+- the first-run journey
+- the command grammar
+- the mode model for human vs agent
+- the trust copy
+- the local state visibility model
+
+You do **not** need to solve every future feature to do that well.
+
+In other words:
+
+- you can absolutely ship an MVP soon
+- and still strongly influence whether the product later feels `uv`/Vercel-like or forever patched together
+
+The risk is not moving too early. The risk is exposing raw primitives before the first-run experience is designed.
+
+## What to keep, what to replace
+
+Keep:
+
+- runner event model
+- requestInput continuation model
+- local state layout
+- validator core
+- scoped result approach
+
+Replace or wrap:
+
+- setup bootstrap UX
+- direct script-oriented command surface
+- connector discovery UX
+- user-facing diagnostics UX
+- first-run copy and help model
+
+## Conclusion
+
+The current foundation is strong enough to justify confidence, but not strong enough to ship as a polished CLI without a product pass.
+
+The correct strategy is:
+
+- treat the current scripts and contracts as the engine
+- build a thin but very intentional CLI shell for MVP
+- optimize the first-run journey ruthlessly
+- keep agent and human support within one command model via output and prompt modes
+
+That path is both the fastest route to MVP and the best route to a long-term world-class UX.
diff --git a/docs/CLI-ONBOARDING-COPY.md b/docs/CLI-ONBOARDING-COPY.md
new file mode 100644
index 00000000..34095429
--- /dev/null
+++ b/docs/CLI-ONBOARDING-COPY.md
@@ -0,0 +1,390 @@
+# `vana-connect` CLI Onboarding Copy
+
+_As of March 12, 2026_
+
+## Purpose
+
+This document captures the intended onboarding copy for the first version of the `vana connect` CLI.
+
+It is not final marketing copy. It is product copy for:
+
+- first-run trust
+- install prompts
+- auth prompts
+- success summaries
+- failure messages
+
+The goal is to make the first experience feel:
+
+- clear
+- fast
+- trustworthy
+- local-first
+- technically serious
+
+## Tone
+
+The CLI should sound:
+
+- calm
+- precise
+- concise
+
+It should not sound:
+
+- cute
+- overly corporate
+- vague
+- overly verbose
+
+## First-run principles
+
+Before the CLI installs or writes anything significant, it should explain:
+
+- what is missing
+- what it will install or create
+- where it will store it
+- that credentials stay local
+
+After success, it should explain:
+
+- what data was collected
+- whether it was ingested to the Personal Server
+- what the next useful action is
+
+## Canonical first-run example
+
+Command:
+
+```bash
+vana connect steam
+```
+
+### If runtime is missing
+
+Suggested copy:
+
+```text
+Vana Connect needs a local browser runtime before it can connect Steam.
+
+This will install:
+- the connector runner
+- a Chromium browser engine
+- local runtime files under ~/.vana/
+
+Your credentials stay on this machine. Nothing is sent anywhere except the platform you’re connecting to.
+
+Continue? [Y/n]
+```
+
+### If `--yes` is present
+
+Suggested copy:
+
+```text
+Installing local runtime for Vana Connect...
+```
+
+## Install progress copy
+
+Suggested copy:
+
+```text
+Installing runner...
+Installing browser engine...
+Preparing local runtime...
+Runtime ready.
+```
+
+This should stay short. It should not dump raw dependency noise unless the install fails or verbose mode is requested.
+
+## Connector fetch copy
+
+### Human mode
+
+Suggested copy:
+
+```text
+Finding a connector for Steam...
+Connector ready.
+```
+
+### If connector is not found
+
+Suggested copy:
+
+```text
+No connector is available for Steam yet.
+```
+
+Optional next step later:
+
+```text
+You can create one with Vana Connect tooling, but that is not part of the default connect flow.
+```
+
+For MVP, avoid dragging the user into connector creation unless that is the explicit task.
+
+## Session reuse copy
+
+### If a saved session may exist
+
+Suggested copy:
+
+```text
+Found an existing Steam session. Trying that first...
+```
+
+### If re-auth is needed
+
+Suggested copy:
+
+```text
+Your saved Steam session needs to be refreshed.
+```
+
+This should feel like a normal repair path, not a mysterious failure.
+
+## Auth prompt copy
+
+### Base trust message
+
+Suggested copy:
+
+```text
+To connect Steam, Vana Connect will open a local browser session on this machine.
+Your credentials stay local.
+```
+
+### Credentials request
+
+Suggested copy:
+
+```text
+Steam needs your login details to continue.
+Enter the requested fields below.
+```
+
+### Two-factor request
+
+Suggested copy:
+
+```text
+Steam asked for a verification code.
+Enter the current code to continue.
+```
+
+### If non-interactive mode blocks prompting
+
+Suggested copy:
+
+```text
+Steam needs additional input, but prompting is disabled in --no-input mode.
+Run again without --no-input, or provide the required inputs explicitly.
+```
+
+## Collection progress copy
+
+Human mode should communicate phases, not raw events.
+
+Suggested copy:
+
+```text
+Connecting to Steam...
+Collecting your data...
+Still working...
+```
+
+If counts are known:
+
+```text
+Collecting your data...
+Fetched 124 items so far...
+```
+
+Avoid fake precision. Only show counts when they are meaningful.
+
+## Personal Server copy
+
+The CLI should speak in terms of the user’s Personal Server, not internal app architecture.
+
+### If Personal Server is available
+
+Suggested copy:
+
+```text
+Personal Server detected. Syncing your Steam data...
+```
+
+### If Personal Server is unavailable
+
+Suggested copy:
+
+```text
+No Personal Server is available right now, so your Steam data was saved locally.
+```
+
+### If ingest fails
+
+Suggested copy:
+
+```text
+Your Steam data was collected, but syncing to your Personal Server did not complete.
+The local result was saved successfully.
+```
+
+This distinction is critical. The CLI should never blur local success and ingest success.
+
+## Success copy
+
+### Best-case success
+
+Suggested copy:
+
+```text
+Connected Steam.
+Collected your Steam data and synced it to your Personal Server.
+```
+
+### Local-only success
+
+Suggested copy:
+
+```text
+Connected Steam.
+Collected your Steam data and saved it locally.
+```
+
+### Supporting detail
+
+After the summary, provide one concise supporting line:
+
+```text
+Next: run `vana connect status` to inspect your current connection state.
+```
+
+Optional supporting detail:
+
+```text
+Local result: ~/.vana/last-result.json
+```
+
+Artifact paths should be supporting information, not the main success story.
+
+## Status copy
+
+Example shape for `vana connect status`:
+
+```text
+Vana Connect status
+
+Runtime: installed
+Personal Server: available
+
+Steam: connected, synced
+GitHub: connected, local only
+Spotify: not connected
+```
+
+If detail is needed, `--json` should carry the richer structure.
+
+## Failure copy
+
+### Install failed
+
+Suggested copy:
+
+```text
+Vana Connect could not finish installing the local runtime.
+Check your network connection and try `vana connect setup` again.
+```
+
+### Login failed
+
+Suggested copy:
+
+```text
+Steam login did not complete.
+Check your credentials and try again.
+```
+
+### Connector unavailable
+
+Suggested copy:
+
+```text
+Vana Connect does not have a Steam connector available right now.
+```
+
+### Site changed / extraction failed
+
+Suggested copy:
+
+```text
+Steam connected, but Vana Connect could not collect the expected data.
+The site may have changed since this connector was last updated.
+```
+
+### Personal Server unavailable
+
+Suggested copy:
+
+```text
+Your data was collected, but no Personal Server is currently available for sync.
+```
+
+Each failure should ideally include one clear next step, not a wall of debugging detail.
+
+## Machine-mode guidance
+
+In `--json` mode, the CLI should not emit this human copy as prose.
+
+Instead, it should emit structured events and outcomes corresponding to:
+
+- setup started / completed / failed
+- connector resolved / not found
+- needs input
+- collection started / progressed / completed
+- ingest started / completed / failed
+- final outcome
+
+The human copy in this document exists so the default mode feels excellent. Machine mode should express the same lifecycle structurally.
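+
+As a sketch of what "the same lifecycle structurally" could look like in `--json` mode — event names and fields are illustrative, not a defined contract:
+
+```json
+{"type":"setup","status":"completed"}
+{"type":"connector","status":"resolved","source":"steam"}
+{"type":"collection","status":"completed","items":124}
+{"type":"ingest","status":"failed","reason":"server_unavailable"}
+{"type":"outcome","status":"partial"}
+```
+
+Note the final event mirrors the copy rules: local collection succeeded, ingest did not, and the two are never blurred.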
+
+## Help copy principles
+
+Help output should stay compact.
+
+Example shape:
+
+```text
+vana connect <source>   Connect one data source
+vana connect list       List supported sources
+vana connect status     Show local and Personal Server status
+vana connect setup      Install or repair local runtime
+```
+
+Then flags:
+
+```text
+--json        Output machine-readable JSON
+--no-input    Fail instead of prompting for input
+--yes         Approve safe setup prompts automatically
+```
+
+## Copy rules to preserve
+
+- never lead with file paths
+- never imply cloud upload when only local save happened
+- never imply Personal Server sync when ingest failed or was skipped
+- never make credentials handling sound vague
+- never over-explain common success paths
+
+## Conclusion
+
+If the command model is the skeleton, this copy is the voice.
+
+For MVP, the copy should make three things feel true immediately:
+
+- this is safe
+- this is working
+- I know what happened
diff --git a/docs/CLI-ONBOARDING-WORKPLAN.md b/docs/CLI-ONBOARDING-WORKPLAN.md
new file mode 100644
index 00000000..ad2d8de0
--- /dev/null
+++ b/docs/CLI-ONBOARDING-WORKPLAN.md
@@ -0,0 +1,282 @@
+# `vana-connect` CLI Onboarding Workplan
+
+_As of March 12, 2026_
+
+## Why this exists
+
+The current `vana-connect` flow is operationally real, but the experience is still "skill-driven" rather than "product-driven." This document defines how to approach the CLI intentionally so it can serve two audiences well:
+
+- humans using the terminal directly
+- coding agents like Codex and Claude Code
+
+The goal is not to design two separate products. The goal is to design one command surface with excellent defaults, clear machine-readable behavior, and an onboarding path that reaches first value in under five minutes.
+
+## What we already have
+
+These are not small assets. They should be treated as the foundation, not throwaway prototypes.
+
+### Operational building blocks already present
+
+- one-shot setup flow in [SETUP.md](https://github.com/vana-com/data-connectors/blob/main/skills/vana-connect/SETUP.md)
+- connector discovery in `skills/vana-connect/scripts/fetch-connector.cjs`
+- execution wrapper in `skills/vana-connect/scripts/run-connector.cjs`
+- validator in `skills/vana-connect/scripts/validate.cjs`
+- scaffold / schema / register scripts for connector creation
+- a clear result artifact at `~/.vana/last-result.json`
+- a durable local state model in `~/.vana/`
+- a documented runner interaction model based on:
+ - progress events
+ - `requestInput`
+ - explicit exit codes
+ - file-based interactive continuation
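+
+That interaction model can be sketched as follows; the event shape and file path are illustrative, not the documented protocol:
+
+```text
+runner → stdout   {"type":"requestInput","fields":[{"name":"code","label":"Verification code"}]}
+runner            pauses and waits for a continuation file
+caller → disk     writes an answer file, e.g. {"code":"123456"}
+runner            reads the file, resumes, and continues collection
+```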
+
+### Product truths already visible in the current design
+
+- local-first matters
+- credentials staying on device matters
+- connectors are failure-prone and need diagnostics
+- data output needs to be composable
+- onboarding success is "data collected and understandable"
+
+This means the hard problem is no longer "can this work?" It is "what command model and first-run flow best expresses what already works?"
+
+## The central design problem
+
+You are not designing a generic scraping CLI. You are designing a local data ingestion runtime for builders.
+
+That creates a dual-audience requirement:
+
+- humans need trust, guidance, and visibility
+- agents need token-efficient commands, deterministic output, and minimal ambiguity
+
+The correct response is usually **one interface with two output behaviors**, not two completely separate interfaces.
+
+## Working thesis
+
+`vana-connect` should be:
+
+- SDK-first
+- CLI-second
+- interactive when useful
+- scriptable by default
+- local-state aware
+- optimized around onboarding and repeat sync
+
+The CLI should expose the same underlying lifecycle to both humans and agents. The difference should mostly be in:
+
+- output formatting
+- prompting behavior
+- verbosity level
+- serialization mode
+
+Not in command taxonomy.
+
+## First design decision to settle
+
+Before command names, settle the primary onboarding journey.
+
+### The ideal first-run journey
+
+For a new user, the CLI should answer these questions in order:
+
+1. What is this and why should I trust it?
+2. What will it install or write locally?
+3. Which sources can it connect?
+4. What is the shortest path to getting my first data in?
+5. Where did the data go?
+6. What do I do next with it?
+
+If those six questions are answered cleanly, the CLI will feel much more like `uv` or Vercel.
+
+## The main thoughtwork we need
+
+This is the minimum useful sequence. Do not jump straight to command implementation.
+
+### 1. Define the canonical user journeys
+
+Write the top 3 to 5 journeys in concrete step form.
+
+The most important ones are:
+
+- first-time setup via CLI
+- connect one source now
+- connect all defaults
+- inspect what was collected
+- re-run / sync later
+- recover from auth failure
+- add a non-default connector
+
+Each journey should specify:
+
+- starting state
+- user intent
+- happy path
+- failure branches
+- end state
+
+This should be the anchor artifact for everything else.
+
+### 2. Define the audience contract
+
+Write down exactly what must be true for:
+
+- humans
+- coding agents
+
+For humans, examples:
+
+- natural command discovery
+- clear progress
+- obvious trust boundaries
+- helpful next steps
+
+For agents, examples:
+
+- machine-readable output
+- stable field names
+- deterministic exit codes
+- low-token responses
+- no surprise prompts in automation mode
+
+This avoids accidental optimization for one audience at the expense of the other.
+
+### 3. Define the single command grammar
+
+Only after the user journeys are clear should command design start.
+
+The command grammar should answer:
+
+- what are the top-level nouns or verbs?
+- what is the default command?
+- what is the shortest common invocation?
+- where do advanced operations live?
+
+This is where `gh` and `uv` are good references.
+
+### 4. Define the mode model
+
+This is probably the real solution to the human/agent split.
+
+Instead of two CLIs, define modes such as:
+
+- default mode: human-friendly
+- `--json`: machine-readable
+- `--quiet`: minimal chatter
+- `--yes`: non-interactive approval
+- `--no-input`: fail instead of prompting
+
+Possibly also:
+
+- `--agent` if there is meaningful behavior beyond `--json`
+
+But this should only exist if it changes semantics in a useful, principled way. A dedicated `--agent` flag should not become a dumping ground for "make it weird for LLMs."
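+
+A sketch of how one command surface diverges only in output behavior (all output lines here are illustrative):
+
+```console
+$ vana-connect connect steam
+Connecting to Steam...
+Collecting your data...
+Connected Steam.
+
+$ vana-connect connect steam --json
+{"type":"collection","status":"progress","items":124}
+{"type":"outcome","status":"ok"}
+```
+
+Same command, same lifecycle; only formatting and prompting change.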
+
+### 5. Define onboarding states and artifacts
+
+Map the local state explicitly:
+
+- install state
+- auth/session state per connector
+- last run state
+- result artifact locations
+- logs / screenshots / debug artifacts
+
+Then design commands that expose this state simply.
+
+If users cannot answer "what has been installed, connected, and collected?" the CLI will not feel trustworthy.
+
+### 6. Define failure UX before polishing happy paths
+
+This product lives or dies on graceful failure.
+
+Document the primary failure classes:
+
+- missing setup
+- connector not found
+- login expired
+- site changed
+- anti-bot / CAPTCHA
+- partial collection
+- schema validation failure
+
+For each, define:
+
+- exit code
+- stderr message
+- suggested fix
+- whether retry is safe
+
+## Recommended organization
+
+Yes, use `.md` files. Keep them small, opinionated, and decision-oriented.
+
+Recommended working set:
+
+- `skills/vana-connect/CLI-ONBOARDING-WORKPLAN.md`
+ - this file
+- `skills/vana-connect/CLI-USER-JOURNEYS.md`
+ - concrete step-by-step journeys
+- `skills/vana-connect/CLI-AUDIENCE-CONTRACT.md`
+ - human vs agent requirements
+- `skills/vana-connect/CLI-COMMAND-MODEL.md`
+ - command grammar, modes, flags, output conventions
+- `skills/vana-connect/CLI-STATE-MODEL.md`
+ - local artifacts, sessions, logs, result files
+- `skills/vana-connect/CLI-ONBOARDING-COPY.md`
+ - actual first-run messages, prompts, help text, examples
+
+Optional later:
+
+- `skills/vana-connect/CLI-V1-SPEC.md`
+ - the implementation target after the exploratory docs settle
+
+## What to avoid
+
+Avoid these traps:
+
+- mapping each existing script directly into a public command surface
+- designing separate human and agent CLIs
+- over-indexing on distant future blockchain/token use cases
+- introducing a TUI before proving it helps first-run success
+- importing too much Personal Server scope before the connector flow is crisp
+
+Those are all likely to blur the product before the core loop is excellent.
+
+## Scope guidance for future context
+
+Some future context is useful. Too much is distracting.
+
+Useful now:
+
+- the Personal Server is the immediate downstream destination
+- permissions and sync will likely matter soon
+- local-first trust is central
+- richer operations may come later
+
+Not useful for v1 onboarding design:
+
+- tokenization / monetization flows
+- broad blockchain integration
+- speculative long-range ecosystem surfaces
+
+The scraping-transport work Kahtaf is doing is relevant at the SDK/runtime boundary, but probably not at the CLI grammar boundary. Treat it as a future execution backend, not a reason to redesign onboarding.
+
+## Recommended next sequence
+
+1. Lock the first-run user journeys.
+2. Lock the audience contract for human vs agent.
+3. Design one command model that serves both.
+4. Write the actual onboarding copy and help text.
+5. Only then design the SDK and CLI package layout.
+
+That order matters. If you skip the user journeys and jump to code structure, the resulting CLI will likely mirror the current scripts instead of the user experience you want.
+
+## Opinionated conclusion
+
+The right question is not "what commands should `vana-connect` have?"
+
+The right first question is:
+
+**"What exact first-run experience should make both a human and an agent feel: this is obvious, trustworthy, fast, and composable?"**
+
+Once that is written down concretely, the command model becomes much easier.
diff --git a/docs/CLI-OPEN-ISSUES.md b/docs/CLI-OPEN-ISSUES.md
new file mode 100644
index 00000000..50066847
--- /dev/null
+++ b/docs/CLI-OPEN-ISSUES.md
@@ -0,0 +1,496 @@
+# CLI Open Issues
+
+_March 16, 2026_
+
+Tracked issues for the CLI, organized by what kind of work each requires.
+
+**Task types:**
+
+- **Iterate** — we have enough docs/research; apply findings to code or copy
+- **Research** — need investigation, benchmarking, or code archaeology first
+- **Brainstorm** — open-ended design question; multiple plausible paths
+- **Tim + Claude** — needs a decision from Tim, possibly with Claude's input
+
+---
+
+## Iterate
+
+Issues where we already know enough to act. Existing docs
+(CLI-UX-QUALITY-BAR.md, CLI-DEMO-GUIDELINES.md, agent-friendly research)
+provide the reference frame.
+
+### Connect flow transcript quality audit
+
+The connect flow has the right overall intention but hasn't had close
+attention to details. Multiple paused spinner lines stack up line-by-line
+during `vana connect`, making it look like a log dump rather than a
+choreographed experience. But the spinners are just the symptom — the
+deeper issue is that the full connect transcript (preparing → connecting →
+progress → outcome → next steps) hasn't been reviewed line-by-line against
+what best-in-class CLIs produce.
+
+**What to do:** Take each connect transcript variant (success, no-input,
+legacy, unavailable) and compare them side-by-side against production CLI
+transcripts from Vercel deploy, Railway up, `gh run watch`, Stripe CLI.
+Look at: how do they handle multi-phase progress? How do completed steps
+render? What does the "waiting" state look like? How much context does each
+line earn its place?
+
+This is more than a spinner fix — it's a line-by-line quality pass on
+the most important user journey in the CLI.
+
+**Ref:** CLI-UX-QUALITY-BAR.md, CLI-TRANSCRIPTS.md (connect sections)
+
+### State labeling and the headed/headless/agent mental model
+
+The labels "needs attention", "legacy", and "manual step" in `vana status`
+and `vana sources` are confusing — but relabeling them is only the surface
+issue. The deeper question is: what mental model should users have for how
+connector states flow across different execution contexts?
+
+**The confusion:** "Legacy" means the connector doesn't call
+`requestInput` — so when it needs auth, it calls `showBrowser`/`promptUser`
+instead, which requires a headed display. This makes "legacy" functionally
+equivalent to `--no-input` in a headless environment. But "legacy" sounds
+like "old and broken" when it really means "browser-required auth flow."
+
+**Questions that need answers:**
+
+- When `vana status` shows "needs attention" for Shop, does that mean an
+ agent tried `--no-input` and it needed a browser? Will it auto-resume if
+ run again headed? What's the user's next action?
+- If a connector is "legacy" and the user is in a headed desktop session,
+ should the CLI just open the browser automatically? The label "manual
+ step" implies the user has to do something — but what, exactly?
+- How should an agent interpret these states? Can it recover, or does it
+ need to hand off to a human?
+- Are "legacy" and "interactive" permanent properties of a connector, or
+ can a connector support both modes?
+
+**What to do:** First, map out the actual state transitions across
+contexts (headed interactive, headless interactive, headless no-input,
+agent-driven). Then design labels that help users understand what happened
+and what to do next. The labels should be context-aware if needed.
+
+**Ref:** CLI-TRANSCRIPTS.md (status, connect-shop sections)
+
+### "What I would do next" specificity → **Done**
+
+Next-step suggestions now use specific source names and copy-pasteable
+commands throughout. See Done section.
+
+**Ref:** CLI-UX-QUALITY-BAR.md, CLI-TRANSCRIPTS.md
+
+### Agent demo GIFs/transcripts
+
+Show a coding agent (Claude Code) using the CLI end-to-end. The SKILL.md
+already exists in data-connectors. Script a VHS-style demo that shows an
+agent running `vana connect github --json --no-input` and processing the
+output.
+
+**Ref:** CLI-AGENT-FRIENDLY.md, data-connectors/skills/vana-connect/SKILL.md
+
+---
+
+## Research
+
+Issues where we need to investigate code, compare implementations, or gather
+data before we can act.
+
+### `vana data show` schema assumptions → **Research complete, needs design**
+
+`summarizeResultData()` in `src/cli/index.ts:3453-3507` is **entirely
+hardcoded**. It checks for specific field names: `profile.username`,
+`repositories`, `starred`, `orders`, `playlists`, and
+`exportSummary.details`. `summarizeNamedItems()` assumes array items have
+a `name` field. Zero runtime validation against schemas.
+
+**Findings:**
+
+- If a connector returns data in an unexpected shape, summary lines are
+ **silently skipped**. No warning to the user — they just see less info.
+- `data-connectors/schemas/` has 25+ JSON Schema files that define
+ connector output (e.g. `github.repositories.json` defines `name`, `url`,
+ `description`, etc.) — but the CLI doesn't read or validate against them.
+- Every new connector requires adding hardcoded field names to
+ `summarizeResultData()`. This won't scale.
+
+**Brittle assumptions (every instance):**
+
+| Line | Assumption | Risk |
+| ---- | --------------------------------------------- | ------------------------------ |
+| 3466 | `data.profile?.username` string | Silently skipped if missing |
+| 3470 | `data.repositories` is array | Silently skipped if wrong type |
+| 3472 | Repo items have `.name` string | Preview line disappears |
+| 3478 | `data.starred` is array | Silently skipped |
+| 3482 | `data.orders` is array | Silently skipped |
+| 3486 | `data.playlists` is array, items have `.name` | Silently skipped |
+| 3495 | `data.exportSummary?.details` string | Fallback only |
+
+**What needs to happen:** Design a mechanical summary system. Options:
+
+1. Use `exportSummary` from connectors (already returned by some)
+2. Read JSON Schema metadata to auto-generate summaries
+3. Define a `displayHints` field in registry.json per connector
+4. Walk the JSON generically (count arrays, show top-level keys)
+
+**This is now a design question, not a research question.** Moves to
+Tim + Claude for the approach decision.
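+
+To make option 3 concrete, a hypothetical `displayHints` entry in registry.json might look like this (all field names are invented for illustration):
+
+```json
+{
+  "id": "github",
+  "displayHints": {
+    "headline": "profile.username",
+    "countFields": ["repositories", "starred"],
+    "itemLabelField": "name"
+  }
+}
+```
+
+The summary renderer would then read hints instead of hardcoding field names per connector.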
+
+### Connector metadata utilization → **Mostly done**
+
+The CLI now fetches and uses connector metadata extensively. Scopes,
+versions, checksums, export frequency, and icons are used in
+`vana sources`, `vana sources --detail`, `vana status`, and the new
+`vana collect` command.
+
+**What's implemented:**
+
+| Field | Status | Where used |
+| ---------------------- | ------- | ------------------------------------------------------------ |
+| `scopes[].label` | Done | Sources detail view, status badges, Personal Server sync |
+| `scopes[].description` | Done | `vana sources --detail` view |
+| `version` | Done | Displayed in sources/status views |
+| `checksums` | Done | Displayed in sources views |
+| `iconURL` | Done | Rendered in sources/status views for capable terminals |
+| `exportFrequency` | Done | Shown in sources detail view |
+| `connectURL` | Not yet | Could open the correct login page during `vana connect` |
+| `connectSelector` | Not yet | Could verify login state before running the connector |
+| `runtime` | Not yet | Could validate runtime compatibility before attempting a run |
+
+**Remaining items:**
+
+- `connectURL` / `connectSelector` — smarter connect flows (pre-auth)
+- `runtime` — pre-run compatibility validation
+- Update-available detection (version comparison against installed)
+
+**Research docs:**
+
+- [Version Tracking](research/VERSION-TRACKING-RESEARCH.md)
+- [Freshness UX](research/FRESHNESS-UX-RESEARCH.md)
+- [Pre-Auth Patterns](research/PRE-AUTH-PATTERNS-RESEARCH.md)
+- [Scope Display](research/SCOPE-DISPLAY-RESEARCH.md)
+
+### Color palette verification → **Done (destructive aligned)**
+
+The CLI theme lives in `src/cli/render/theme.ts`. Brand colors were
+compared against `vana-app/packages/ui/src/styles/shadcn.css`.
+
+**Current state:**
+
+| Role | CLI hex | Brand hex | Match? |
+| ------------------- | --------- | ---------------- | -------------------------------------------------- |
+| Accent / primary | `#4141fc` | `#4141fc` | Yes |
+| Success | `#00d50b` | `#00d50b` | Yes |
+| Destructive / error | `#E7000B` | `#E7000B` | Yes — updated to Vana brand vivid red |
+| Warning | `#BA8B00` | _(not in brand)_ | Acceptable — functional color with no brand equiv. |
+
+- The **VHS Catppuccin Mocha** theme is a generic dark theme with no Vana
+ brand colors. It's fine for recording demos but shouldn't be cited as
+ "brand-accurate."
+
+**Remaining:** Decide whether warning needs a brand-sanctioned color or
+if `#BA8B00` is acceptable long-term.
+
+### `--no-input` vs input-up-front → **Research complete, gap confirmed**
+
+`--no-input` propagates through the CLI as `allowHeaded = !request.noInput`
+in the runtime. Three distinct code paths are affected:
+
+| API | With `--no-input` | Without |
+| ---------------- | ---------------------------------------------------- | -------------------------- |
+| `requestInput()` | Throws `NeedsInputError` | Prompts user interactively |
+| `promptUser()` | Emits `legacy-auth` event, returns early | Shows prompt |
+| `showBrowser()` | Emits `legacy-auth` event, returns `{headed: false}` | Opens browser |
+
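The fork can be sketched as a single flag derived from `--no-input`. This is a minimal sketch, not the CLI's real internals: `NeedsInputError` and the `legacy-auth` event come from the table above, but `createRuntime` and the surrounding shapes are assumptions.

```typescript
// Sketch: how one allowHeaded flag forks the three code paths in the table.
class NeedsInputError extends Error {}

type RuntimeEvent = { type: "legacy-auth"; api: string };

function createRuntime(noInput: boolean, emit: (e: RuntimeEvent) => void) {
  const allowHeaded = !noInput;
  return {
    requestInput(promptText: string): string {
      if (!allowHeaded) throw new NeedsInputError(promptText);
      return "<answer from interactive prompt>"; // stand-in for a real prompt
    },
    promptUser(): void {
      if (!allowHeaded) {
        emit({ type: "legacy-auth", api: "promptUser" });
        return; // returns early instead of showing a prompt
      }
      // ...show prompt...
    },
    showBrowser(): { headed: boolean } {
      if (!allowHeaded) {
        emit({ type: "legacy-auth", api: "showBrowser" });
        return { headed: false };
      }
      return { headed: true }; // would open a browser window
    },
  };
}
```
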
+**Key findings:**
+
+- **No mechanism exists for pre-supplying input.** There are no env vars,
+ flags, config files, or stdin pipes that let an agent provide credentials
+ ahead of time. The only options are "interactive" or "skip entirely."
+- **Auth mode is detected by regex on connector script content**, not by a
+ declared property. The runtime scans for `requestInput` calls to classify
+ connectors as "interactive" vs "legacy."
+- **The gap between "skip all prompts" and "provide answers ahead of time"
+ is real and unaddressed.** An agent that has GitHub credentials cannot
+ pass them to `vana connect github --no-input` — it can only fail.
+
+**This is now a product decision, not a research question.** See the
+Tim + Claude section for the product model discussion.
+
+### Personal server integration → **Partially done**
+
+Significant progress on the CLI's Personal Server integration. Scope-aware
+ingest, per-scope state tracking, honest sync badges, and a full
+`vana server` command group are now implemented. Auth and tunnel awareness
+remain as gaps.
+
+**What's implemented:**
+
+- Scope-aware ingest with proper scope resolver (maps connector output
+ fields to PS scopes)
+- Personal Server client with per-scope POST to `/v1/data/{scope}`
+- Per-scope sync state tracking (which scopes synced, which failed)
+- Honest sync badges in status views
+- `vana server status` — shows server URL (with source clarity: auto-detected
+ vs saved vs env var), connection state, and sync status
+- `vana server data` — shows what data is stored on the server
+- `vana server sync` — manual sync retry for previously collected data
+- Server status URL source labeling (auto-detected vs saved vs env var)
+
+**Remaining gaps:**
+
+| Gap | Why it matters |
+| ---------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| **No persistent URL config** | `VANA_PERSONAL_SERVER_URL` is env-only. Users must set it every session. No `~/.vana` config file stores the URL. |
+| **No auth on ingest** | POST `/v1/data/{scope}` is open on the server side today. For public/tunnel URLs, anyone who knows the URL can write data. The server supports Web3Signed auth (EIP-191) for reads and dev tokens for dev — the CLI uses neither. |
+| **No tunnel awareness** | DataConnect creates FRP tunnels and shows the public URL in its UI. The CLI can't discover or display this URL. The tunnel config lives at `~/.vana/personal-server/tunnel/frpc.toml`. |
+| **No gateway registration** | The personal server self-registers with the Data Gateway via EIP-712 signed messages through an account signing service. The CLI doesn't participate in this flow. |
+| **No grant management** | Can't view/revoke data access grants from CLI. |
+
+**Auth architecture (from personal-server-ts):**
+
+- **Web3Signed**: `Authorization: Web3Signed {base64url_payload}.{signature}` —
+ EIP-191 signed, includes audience/method/uri/bodyHash/expiry. Used for
+ read endpoints. The CLI would need a private key to sign requests.
+- **Dev token**: `Authorization: Bearer {token}` — 32-byte random hex,
+ generated per session, emitted to DataConnect UI. Bypasses Web3Signed.
+ Ephemeral, not persisted.
+- **Ingest endpoint**: Currently **no auth** on POST `/v1/data/{scope}`.
+ This needs to change before public URLs are standard.
+
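Assembling the `Web3Signed` header could look roughly like this. The payload field names come from the bullet above, but the exact payload schema and encoding in personal-server-ts are assumptions, and the EIP-191 signing step is left abstract (it would come from a wallet library):

```typescript
// Sketch only: payload shape and encoding are assumed, not confirmed
// against personal-server-ts. The signer is abstract (EIP-191, e.g. a wallet).
import { createHash } from "node:crypto";

interface Web3SignedPayload {
  audience: string;
  method: string;
  uri: string;
  bodyHash: string;
  expiry: number; // unix seconds
}

function bodyHash(body: string): string {
  return createHash("sha256").update(body).digest("hex");
}

function buildAuthHeader(
  payload: Web3SignedPayload,
  sign: (message: string) => string, // EIP-191 signer supplied by the caller
): string {
  const encoded = Buffer.from(JSON.stringify(payload)).toString("base64url");
  return `Web3Signed ${encoded}.${sign(encoded)}`;
}
```
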
+**What still needs to happen:**
+
+1. **Config persistence** — `vana server set-url <url>` that writes to
+ `~/.vana/personal-server-url` or similar. Fall back to env var,
+ then port scan.
+2. **Auth integration** — decide whether CLI uses Web3Signed (needs
+ private key management) or dev tokens (needs discovery from
+ DataConnect). For cloud-hosted, likely Web3Signed.
+3. **Tunnel URL discovery** — read from FRP config at
+ `~/.vana/personal-server/tunnel/frpc.toml`, or query the
+ running server for its public URL.
+
+**This is a Brainstorm → Tim + Claude pipeline.** The auth model and
+tunnel discovery need product decisions before implementation.
+
+---
+
+## Brainstorm
+
+Open-ended design questions with multiple plausible approaches. Need creative
+exploration before converging.
+
+### "Steam not available" — the extensibility experience
+
+This is one of the deepest UX questions for the CLI. When a user asks for
+a source we don't support yet, the current response is a dead end: "Steam
+is not available yet." That's the moment where a best-in-class CLI turns a
+limitation into an opportunity.
+
+**Why this matters:** The connector creation agent skill already exists
+(`data-connectors/skills/vana-connect/CREATE.md`) and can build a working
+connector from scratch. The infrastructure is there. The question is: how
+does the CLI bridge the gap between "we don't have this" and "let's make
+it happen right now"?
+
+**The design space:**
+
+1. **Agent-assisted creation (highest ambition).** Detect if the user has
+ a coding agent (Claude Code, Codex, etc.). If so, offer to launch
+ connector creation in a parallel terminal. The agent reads the SKILL.md,
+ scaffolds the connector, tests it, and the user comes back to a working
+ `vana connect steam`. Questions: how does the CLI hand off to the agent?
+ How does the agent signal completion? Can the user's original terminal
+ wait and resume? What if the agent needs interactive approval?
+
+2. **One-liner handoff.** Print a command the user can paste into another
+ terminal that kicks off the creation flow. The original terminal waits
+ or the user comes back when ready. Lower magic, higher transparency.
+
+3. **Request submission.** Collect the platform URL, desired data types,
+ and auth method, then submit a structured request (GitHub issue, API
+ call, local file). Someone or something builds the connector later.
+
+4. **File-based handoff.** Write a structured request file (JSON/YAML)
+ that any agent can pick up asynchronously. The user's next Claude Code
+ session could detect it and offer to build the connector.
+
+5. **Graceful degradation.** If no agent is available and the user doesn't
+ want to request, at minimum show what data the platform likely has and
+ what a connector would do, so the user understands the value.
+
+**Key constraints:**
+
+- User may or may not have a coding agent installed
+- Agent may need interactive approval from the user
+- Don't lose the user's progress in the current terminal
+- The SKILL.md + CREATE.md already define the full connector creation flow
+- Different user personas: developer (can code), power user (has an agent),
+ casual user (just wants their data)
+
+**What makes this best-in-class:** The CLI that turns "not supported" into
+"let's build it together" is qualitatively different from one that just
+says "sorry." This is a Vana differentiator — the data portability
+protocol is open, connectors are open, and the tooling to create them is
+agent-ready. The UX should reflect that.
+
+**Tim's input needed on:** Which persona to design for first. How much
+magic vs explicit user action. Whether this is a v1 or v2 feature.
+
+### Bundled skills / agent doc installation
+
+Should `vana` install a SKILL.md into the user's agent directory (e.g.
+`~/.claude/skills/`)? Or is hosting good agent docs online sufficient?
+
+**Possible paths:**
+
+- `vana setup` also installs the SKILL.md for detected agents
+- `vana agent-setup` as a separate command
+- Just host docs at a well-known URL and rely on llms.txt / web discovery
+- Ship the SKILL.md in the npm package, let agents find it
+
+**Ref:** CLI-AGENT-FRIENDLY.md (Tier 1a)
+
+---
+
+## Tim + Claude
+
+Decisions that need Tim's input. These block other work or set direction.
+
+### Source selection and multi-connect interaction patterns
+
+`vana connect` has a guided picker and `vana sources` lists what's
+available. But the interaction patterns for "I want to connect several
+things" haven't been designed.
+
+**Questions:**
+
+- Should `vana sources` support multi-select? What does that look like
+ in a terminal (checkboxes, space-to-toggle, like `gum choose --no-limit`)?
+- Should `vana connect` with no args offer to connect everything
+ available, or just pick one?
+- What do best-in-class CLIs do for multi-resource operations? (e.g.
+ `gh repo clone` doesn't batch, but `brew install` does)
+- How does multi-connect interact with the progress UX — parallel or
+ sequential? What if one source needs input and another doesn't?
+- For agents: should `vana connect --all --no-input` be a thing?
+
+**What needs to happen:** Research best-in-class multi-select and
+batch-operation patterns in production CLIs. Then Tim + Claude decide
+whether this is right for v1 or a later iteration, and if so, what the
+interaction model is.
+
+**Tim's input needed on:** Is this a real user need now, or premature
+complexity? What's the expected usage pattern — connect one source at
+a time, or batch onboarding?
+
+### `--no-input` vs providing input up front (product model)
+
+Even after the research agent reports back on what's technically possible,
+there's a product question: how should users think about the spectrum from
+fully interactive → fully automated?
+
+**The spectrum:**
+
+1. Headed + interactive (user watches browser, types credentials)
+2. Headless + interactive (CLI prompts for credentials, browser hidden)
+3. Headless + pre-supplied input (agent passes creds via flags/env/config)
+4. Headless + `--no-input` (fail if input needed)
+
+**What Tim needs to decide:** Do we want to support #3? If so, what's the
+interface — env vars, a config file, CLI flags, stdin JSON?
+
+### Connector description copy
+
+Current: "Exports your X using Playwright browser automation." This is
+verbose and leaks implementation details. But descriptions come from the
+**data-connectors registry**, not the CLI.
+
+**What Tim needs to decide:** Fix upstream in data-connectors? What's the
+right copy pattern? Suggestions:
+
+- "Your GitHub profile, repositories, and starred repos" (drop verb + method)
+- "Collects your GitHub data via Playwright" (shorter)
+- "GitHub profile, repos, and stars" (ultra-terse)
+
+---
+
+## Upstream dependencies
+
+Issues that require changes in other repos first.
+
+### Connector descriptions (data-connectors)
+
+Blocked on Tim's copy decision above. Once decided, change
+`registry.json` in data-connectors. Demo fixtures here will follow
+automatically via `prepare-vhs-fixtures.mjs`.
+
+### Personal server ingest idempotency (personal-server-ts)
+
+`POST /v1/data/{scope}` creates a new versioned file on every call
+(new `collectedAt` timestamp each time). If the CLI retries a sync
+for already-posted data, the server stores a duplicate version. The
+server needs a deduplication mechanism — e.g., accept a client-supplied
+`collectedAt` or content hash and return 200 instead of 201 if it
+already has that version.
+
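On the client side, a deduplication key could be a deterministic content hash, sketched below. Key sorting matters because `JSON.stringify` preserves insertion order; everything here is illustrative, not the server's actual mechanism.

```typescript
// Sketch: a deterministic content hash the CLI could send so the server
// can recognize retries of already-posted data.
import { createHash } from "node:crypto";

function canonicalize(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  if (value && typeof value === "object") {
    const entries = Object.entries(value as Record<string, unknown>)
      .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0)) // order-independent
      .map(([k, v]) => `${JSON.stringify(k)}:${canonicalize(v)}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

export function contentHash(data: unknown): string {
  return createHash("sha256").update(canonicalize(data)).digest("hex");
}
```
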
+---
+
+## Done
+
+- [x] ~~Purple box around GIFs~~ — removed MarginFill from all VHS tapes
+- [x] ~~GIF CI automation~~ — CI renders GIFs and attaches to canary release;
+ all markdown now uses release URLs
+- [x] ~~Personal Server integration~~ — scope-aware ingest with proper scope
+ resolver, PS client, per-scope state tracking, honest sync badges,
+ `vana server status/data/sync` commands (auth and tunnel gaps remain,
+ tracked above)
+- [x] ~~Connector metadata utilization~~ — scopes, versions, checksums, export
+ frequency, icons used in sources/status/detail views, `vana collect`
+ command (connectURL/connectSelector/runtime remain, tracked above)
+- [x] ~~Next-step specificity~~ — suggestions now use specific source names and
+ copy-pasteable commands
+- [x] ~~Color palette alignment~~ — destructive color updated to Vana brand
+ `#E7000B`
+- [x] ~~VHS demos for new commands~~ — collect, sources detail, server
+ status/sync/data tapes created
+- [x] ~~Server status URL source clarity~~ — auto-detected vs saved vs env var
+ labeled clearly
+- [x] ~~Clean error handling for command typos~~ — no stack traces on unknown
+ commands
+
+---
+
+## New (March 18, 2026)
+
+### Test MCP server with Claude Code
+
+Configure Claude Code to use `vana mcp` as an MCP server and verify: tools appear in the agent's tool list, `check_status` and `list_sources` return correct data, `connect_source` correctly rejects legacy sources and works for interactive sources via IPC, `show_data` returns collected datasets. Test with both local dev build and canary install.
+
+### Release pipeline: hosted installer and demo assets
+
+Homebrew formula auto-sync is now in CI. Remaining:
+
+- Hosted installer (`install.sh`) may reference stale versions
+- Demo transcripts/VHS are CI artifacts, not committed back
+
+### Checkpoint-resume for --ipc (v2)
+
+Current IPC mode requires background task coordination. A cleaner agent UX: exit on needs-input with special exit code, agent writes response, reruns with --resume. Deferred — current IPC works with proper skill instructions.
+
+### SEA binary stack trace on unknown commands
+
+`vana disconnect` shows a full `CommanderError` stack trace in the SEA (single executable) build. The local dev build handles this correctly. The SEA packaging needs its own top-level catch. (GitHub issue #59)
+
+### Stale checked-in transcripts
+
+`docs/CLI-TRANSCRIPTS.md` is stale. CI generates fresh transcripts as release artifacts but doesn't commit them back. Either auto-commit from CI or accept that checked-in transcripts are reference-only. (GitHub issue #60)
+
+### Mechanical date filtering for collected data
+
+To support `vana next` properly, need the ability to extract "what changed in the last 24 hours" mechanically. Approaches: diff between collection runs (requires result versioning), connector-level timestamp annotations, or upstream activity feeds. For now, the `next-prompt` skill teaches the agent to parse timestamps from raw data. (GitHub issue #57)
+
+### Data versioning and result history
+
+Currently each source stores only the latest result. For diffing and trend analysis, need historical snapshots with a retention policy. (GitHub issue #58)
diff --git a/docs/CLI-PERSONAL-SERVER-PLAN.md b/docs/CLI-PERSONAL-SERVER-PLAN.md
new file mode 100644
index 00000000..2c5ebd4a
--- /dev/null
+++ b/docs/CLI-PERSONAL-SERVER-PLAN.md
@@ -0,0 +1,335 @@
+# CLI ↔ Personal Server Integration Plan
+
+_March 17, 2026_
+
+## Context
+
+The CLI claims "Synced to Personal Server" after `vana connect`, but no data
+is actually sent. The `ingestResult()` function filters for dotted scope keys
+(`key.includes(".")`), but connectors output flat keys (`profile`,
+`repositories`). The loop runs zero iterations and reports success.
+
+DataConnect Desktop already runs this pipeline in production. This plan
+aligns the CLI with DataConnect's proven patterns and adds CLI-specific
+capabilities (verification, retry, introspection, config persistence).
+
+## Prior art: how DataConnect does it
+
+DataConnect's `personalServerIngest.ts` handles two connector output formats:
+
+1. **New format** — output keys are already dotted (`github.profile`) →
+ extract and POST each scope separately
+2. **Legacy format** — output keys are flat (`profile`) → use a platform
+ registry to look up the default scope, POST entire blob
+
+Auth: none (localhost trust). Verification: HTTP 2xx check only. Retry:
+queues pending exports, delivers on PS startup.
+
+The connector metadata (`github-playwright.json`) declares scopes like
+`github.profile`, `github.repositories` — this is the canonical mapping.
+
+## Phase 0: Fix ingest (stop lying)
+
+**Goal:** Data actually reaches the personal server after `vana connect`.
+
+### 0a. Scope resolver
+
+Create `src/personal-server/scope-resolver.ts`:
+
+```typescript
+export interface ScopeMapping {
+ scope: string; // "github.profile"
+ data: unknown; // the value from the connector output
+}
+
+export function resolveScopes(
+ source: string,
+ result: Record,
+ metadata: ConnectorMetadata | null,
+): ScopeMapping[];
+```
+
+Logic (matches DataConnect):
+
+1. Extract keys containing `.` — if any found, use them directly as scopes
+2. Otherwise, use connector metadata scopes to map: for each metadata scope
+ `{source}.{key}`, look for `key` in the result object
+3. If no metadata available, fall back to `{source}.{key}` for every
+ non-metadata key (exclude `exportSummary`, `timestamp`, `version`,
+ `platform`)
+
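The three steps above can be sketched as follows. The `ConnectorMetadata` shape (`{ scopes?: string[] }`) is assumed, not the CLI's real type:

```typescript
// Sketch of the resolver logic: dotted keys win, then metadata mapping,
// then a prefixed fallback over non-housekeeping keys.
const HOUSEKEEPING = new Set(["exportSummary", "timestamp", "version", "platform"]);

export interface ScopeMapping {
  scope: string;
  data: unknown;
}

export function resolveScopes(
  source: string,
  result: Record<string, unknown>,
  metadata: { scopes?: string[] } | null,
): ScopeMapping[] {
  // 1. Dotted keys are already scopes — use them directly.
  const dotted = Object.keys(result).filter((key) => key.includes("."));
  if (dotted.length > 0) {
    return dotted.map((scope) => ({ scope, data: result[scope] }));
  }
  // 2. Metadata scopes: "{source}.{key}" maps back to a flat result key.
  const metaScopes = (metadata?.scopes ?? []).filter((s) => s.startsWith(`${source}.`));
  if (metaScopes.length > 0) {
    return metaScopes
      .map((scope) => ({ scope, key: scope.slice(source.length + 1) }))
      .filter(({ key }) => key in result)
      .map(({ scope, key }) => ({ scope, data: result[key] }));
  }
  // 3. Fallback: prefix every non-housekeeping key with the source name.
  return Object.keys(result)
    .filter((key) => !HOUSEKEEPING.has(key))
    .map((key) => ({ scope: `${source}.${key}`, data: result[key] }));
}
```
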
+### 0b. Fix `ingestResult()`
+
+Replace the current scope-guessing logic with `resolveScopes()`. POST each
+resolved scope individually to `POST /v1/data/{scope}`.
+
+Track per-scope success/failure:
+
+```typescript
+interface IngestScopeResult {
+ scope: string;
+ status: "stored" | "failed";
+ error?: string;
+}
+```
+
+Return `ingest-complete` only if ALL scopes succeed. Return `ingest-partial`
+if some succeed. Return `ingest-failed` if all fail.
+
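That tri-state decision is small enough to pin down as a pure function (event names from this section; `IngestScopeResult` as defined above):

```typescript
// Sketch: derive the ingest event from the per-scope results.
type IngestScopeResult = { scope: string; status: "stored" | "failed"; error?: string };

function deriveIngestEvent(
  results: IngestScopeResult[],
): "ingest-complete" | "ingest-partial" | "ingest-failed" {
  const stored = results.filter((r) => r.status === "stored").length;
  if (results.length > 0 && stored === results.length) return "ingest-complete";
  if (stored > 0) return "ingest-partial";
  return "ingest-failed";
}
```
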
+### 0c. Update state model
+
+Add per-scope tracking to `StoredSourceState`:
+
+```typescript
+ingestScopes?: Array<{
+ scope: string;
+ status: "stored" | "failed";
+ syncedAt?: string;
+}>;
+```
+
+Replace the binary `ingested_personal_server` / `ingest_failed` with
+granular per-scope state. The top-level `dataState` remains for backward
+compat but is derived from the scope array.
+
+### 0d. Verification
+
+After posting all scopes, call `GET /v1/data?scopePrefix={source}` (requires
+builder auth or devToken) to confirm scopes appear in the server's index.
+
+If verification isn't possible (no auth configured), log a warning and trust
+the HTTP 2xx responses (matching DataConnect's behavior).
+
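A sketch of that verification call, assuming a `{ scopes: [...] }` response shape — the real PS response is not confirmed here:

```typescript
// Sketch: confirm posted scopes appear in the server's index, or fall back
// to trusting the 2xx responses when no auth is configured.
async function verifyScopes(
  baseUrl: string,
  source: string,
  expected: string[],
  authHeader?: string,
): Promise<"verified" | "unverified" | "missing"> {
  if (!authHeader) return "unverified"; // no auth configured: trust the 2xx responses
  const res = await fetch(
    `${baseUrl}/v1/data?scopePrefix=${encodeURIComponent(source)}`,
    { headers: { Authorization: authHeader } },
  );
  if (!res.ok) return "unverified";
  const body = (await res.json()) as { scopes?: string[] };
  const stored = new Set(body.scopes ?? []);
  return expected.every((s) => stored.has(s)) ? "verified" : "missing";
}
```
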
+**Files changed:**
+
+- New: `src/personal-server/scope-resolver.ts`
+- Modified: `src/personal-server/index.ts` (ingestResult)
+- Modified: `src/core/state-store.ts` (scope tracking)
+- Modified: `src/core/cli-types.ts` (new event types)
+- Tests: `test/personal-server/scope-resolver.test.ts`
+
+## Phase 1: Personal server client
+
+**Goal:** A proper HTTP client for the personal server, replacing raw fetch
+calls scattered through ingestResult.
+
+### 1a. `createPersonalServerClient()`
+
+Create `src/personal-server/client.ts`:
+
+```typescript
+export interface PersonalServerClient {
+  health(): Promise<HealthStatus>;
+  ingestScope(scope: string, data: unknown): Promise<IngestScopeResult>;
+  listScopes(prefix?: string): Promise<string[]>;
+  listVersions(scope: string): Promise<ScopeVersion[]>;
+}
+
+export function createPersonalServerClient(config: {
+ url: string;
+ auth?: { type: "devToken"; token: string } | { type: "none" };
+}): PersonalServerClient;
+```
+
+Auth strategy:
+
+- **Local (localhost):** no auth (matches DataConnect and PS design)
+- **DevToken:** `Authorization: Bearer {token}` for dev UI and tunneled
+ access
+- **Web3Signed:** future, for cloud-hosted PS — not in this plan
+
+### 1b. Update `ingestResult()` to use client
+
+Replace raw fetch with `client.ingestScope()`. The client handles retries,
+error normalization, and response parsing.
+
+**Files changed:**
+
+- New: `src/personal-server/client.ts`
+- Modified: `src/personal-server/index.ts`
+- Tests: `test/personal-server/client.test.ts`
+
+## Phase 2: Config persistence + `vana server` commands
+
+**Goal:** Users can configure, inspect, and manage their personal server
+from the CLI.
+
+### 2a. Persist PS URL
+
+Add `personalServerUrl` to the CLI config (already exists in state-store
+schema but only read from env var today). Add `vana server set-url `
+to persist it.
+
+Detection priority (matches current code):
+
+1. Persisted config (`~/.vana/vana-connect-state.json` → `config.personalServerUrl`)
+2. `VANA_PERSONAL_SERVER_URL` env var
+3. Localhost port scan (8080-8085)
+
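That resolution order can be sketched as below. The state-file layout and field name follow this doc; the `/health` probe path and timeout are assumptions:

```typescript
// Sketch of the detection priority: persisted config, then env var,
// then a localhost port scan.
import { readFileSync } from "node:fs";

async function resolveServerUrl(statePath: string): Promise<string | null> {
  // 1. Persisted config
  try {
    const state = JSON.parse(readFileSync(statePath, "utf8"));
    if (state?.config?.personalServerUrl) return state.config.personalServerUrl;
  } catch {
    // no state file yet — fall through
  }
  // 2. Environment variable
  if (process.env.VANA_PERSONAL_SERVER_URL) return process.env.VANA_PERSONAL_SERVER_URL;
  // 3. Localhost port scan
  for (let port = 8080; port <= 8085; port++) {
    try {
      const res = await fetch(`http://localhost:${port}/health`, {
        signal: AbortSignal.timeout(500),
      });
      if (res.ok) return `http://localhost:${port}`;
    } catch {
      // port closed — keep scanning
    }
  }
  return null;
}
```
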
+### 2b. `vana server status`
+
+Show:
+
+```
+Personal Server
+ URL: http://localhost:8080 (auto-detected)
+ Status: healthy
+ Version: 0.0.1-canary.93673d7
+ Owner: not configured
+ Scopes: 3 (github.profile, github.repositories, github.starred)
+```
+
+JSON mode returns the full health response + scope list.
+
+### 2c. `vana server data [scope]`
+
+Without argument — list all scopes with version counts:
+
+```
+github.profile 1 version collected 2h ago
+github.repositories 1 version collected 2h ago
+github.starred 1 version collected 2h ago
+```
+
+With argument — show scope detail:
+
+```
+github.profile
+ Versions: 1
+ Latest: 2026-03-17T06:47:14Z
+ Size: 1.2 KB
+```
+
+### 2d. `vana server sync`
+
+Retry failed/pending ingests from local data. Reads state to find sources
+with `dataState === "collected_local"` or partially-synced scope arrays.
+Re-runs `ingestResult()` with the stored `lastResultPath`.
+
+```
+Syncing 1 pending dataset...
+ github: 3/3 scopes synced ✓
+```
+
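The selection step can be sketched as a filter over stored source state — the state shape here is assumed from Phase 0c, not the CLI's real schema:

```typescript
// Sketch: find sources that `vana server sync` should retry.
type SourceState = {
  source: string;
  dataState?: string;
  ingestScopes?: { scope: string; status: "stored" | "failed" }[];
};

function pendingSync(states: SourceState[]): SourceState[] {
  return states.filter(
    (s) =>
      s.dataState === "collected_local" || // never synced
      (s.ingestScopes?.some((sc) => sc.status === "failed") ?? false), // partial sync
  );
}
```
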
+**Files changed:**
+
+- Modified: `src/cli/index.ts` (new commands)
+- Modified: `src/core/state-store.ts` (config persistence)
+- Modified: `src/personal-server/index.ts` (detection uses persisted config)
+
+## Phase 3: Honest UX
+
+**Goal:** The CLI never claims a state it can't prove.
+
+### 3a. Status labels
+
+| State | Badge | Meaning |
+| -------------------------- | ----------------------- | ------------------------------- |
+| All scopes synced | `synced` (green) | Every scope POSTed successfully |
+| Some scopes synced | `partial sync` (yellow) | Some scopes failed |
+| Sync attempted, all failed | `sync failed` (red) | POST errors on all scopes |
+| No PS detected | `local` (muted) | Data saved locally only |
+| Not yet collected | `new` (muted) | Never connected |
+
+### 3b. Post-connect messaging
+
+After `vana connect github`:
+
+**With PS available:**
+
+```
+Connected GitHub. 3 scopes synced to Personal Server.
+ github.profile ✓
+ github.repositories ✓
+ github.starred ✓
+```
+
+**Without PS:**
+
+```
+Connected GitHub. Data saved locally.
+ Path: ~/.vana/last-result.json
+ Run `vana server sync` after starting your Personal Server.
+```
+
+**Partial failure:**
+
+```
+Connected GitHub. 2/3 scopes synced, 1 failed.
+ github.profile ✓
+ github.repositories ✓
+ github.starred ✗ (400: schema not registered)
+ Run `vana server sync` to retry.
+```
+
+### 3c. `vana status` shows per-source sync detail
+
+```
+→ Connected (1)
+ GitHub [synced] 3/3 scopes · collected 2h ago
+
+→ Personal Server
+ http://localhost:8080 · healthy · 3 scopes stored
+```
+
+## Phase 4: Tests
+
+### Unit tests
+
+- `scope-resolver.test.ts` — dotted keys, flat keys with metadata, flat
+ keys without metadata, metadata-key mismatch, empty result
+- `client.test.ts` — health, ingestScope success/failure, listScopes,
+ listVersions, auth header generation
+- `ingestResult` integration — full pipeline with mocked client
+
+### E2E smoke test (manual)
+
+```bash
+# Start personal server
+cd ~/code/personal-server-ts && npm start
+
+# Connect and verify
+vana connect github
+vana server status
+vana server data
+vana server data github.profile
+```
+
+## Implementation order
+
+1. Phase 0a-0b — scope resolver + ingest fix (stops the lying)
+2. Phase 0c — per-scope state tracking
+3. Phase 1 — PS client extraction
+4. Phase 2a-2b — config persistence + server status
+5. Phase 3 — honest UX labels and messaging
+6. Phase 2c-2d — server data + server sync commands
+7. Phase 4 — tests throughout
+8. Phase 0d — verification (depends on auth story with PS team)
+
+## Open questions for PS team
+
+1. **Schema registration** — are all connector scopes registered on
+ Gateway? If `github.profile` isn't registered, `POST /v1/data/github.profile`
+ returns 400. Who registers schemas — connector authors or PS team?
+2. **Auth for non-localhost** — tunneled PS uses FRP with a public URL.
+ Should ingest require a devToken? Web3Signed? Or is the tunnel itself
+ the auth boundary?
+3. **Bulk ingest** — DataConnect POSTs scopes one at a time. For CLI with
+ many scopes, would a batch endpoint help? Or is per-scope fine?
+4. **Scope format** — confirm `{source}.{key}` is canonical. The PS regex
+ allows 2-3 segments (`a.b` or `a.b.c`). Do any connectors use 3?
+
+## Risk notes
+
+- **Gateway schema validation** is the biggest unknown. If schemas aren't
+ registered, ingest will fail even with correct scope names. We should
+ test against the live PS before shipping.
+- **The `as string` cast on `terminal-image` import** is a hack that
+ should be revisited when we decide the icon rendering story.
+- **Per-scope state increases state file complexity.** Keep the schema
+ additive — old state files without `ingestScopes` should still work.
+
+## Release
+
+Push to `feat/connect-cli-v1` for canary release after each phase.
+Phase 0 (stop lying) ships first, even before the polished UX.
diff --git a/docs/CLI-REVIEW-SURFACE.md b/docs/CLI-REVIEW-SURFACE.md
new file mode 100644
index 00000000..195c8a5e
--- /dev/null
+++ b/docs/CLI-REVIEW-SURFACE.md
@@ -0,0 +1,313 @@
+# CLI Review Surface
+
+_March 15, 2026_
+
+This document is the quickest way to review the current `vana` CLI as a
+product.
+
+Use it when you want to answer:
+
+- what commands exist?
+- what should a human see?
+- what should automation see?
+- which transcript or demo should I open?
+- which acceptance commands should I run first?
+
+## Core Command Tree
+
+Top level:
+
+- `vana`
+- `vana --help`
+- `vana --version`
+- `vana version`
+- `vana status`
+- `vana doctor`
+- `vana sources`
+- `vana sources <source>`
+- `vana connect`
+- `vana connect <source>`
+- `vana collect`
+- `vana collect <source>`
+- `vana data`
+- `vana data list`
+- `vana data show <source>`
+- `vana data path <source>`
+- `vana logs`
+- `vana logs <source>`
+- `vana setup`
+- `vana server`
+- `vana server status`
+- `vana server set-url <url>`
+- `vana server clear-url`
+- `vana server sync`
+- `vana server data`
+- `vana server data <scope>`
+
+JSON / agent-safe surfaces:
+
+- `vana version --json`
+- `vana status --json`
+- `vana doctor --json`
+- `vana sources --json`
+- `vana sources <source> --json`
+- `vana connect <source> --json --no-input`
+- `vana collect --json`
+- `vana collect <source> --json --no-input`
+- `vana data list --json`
+- `vana data show <source> --json`
+- `vana data path <source> --json`
+- `vana logs --json`
+- `vana logs <source> --json`
+- `vana server --json`
+- `vana server status --json`
+- `vana server set-url <url> --json`
+- `vana server clear-url --json`
+- `vana server sync --json`
+- `vana server data --json`
+- `vana server data <scope> --json`
+
+## Review Order
+
+If you only have a few minutes, review the
+[CLI transcripts](CLI-TRANSCRIPTS.md) in this order:
+
+1. [`vana --help`](CLI-TRANSCRIPTS.md#vana---help)
+2. [`vana doctor`](CLI-TRANSCRIPTS.md#vana-doctor)
+3. [`vana status`](CLI-TRANSCRIPTS.md#vana-status)
+4. [`vana sources`](CLI-TRANSCRIPTS.md#vana-sources)
+5. [Successful connect](CLI-TRANSCRIPTS.md#successful-interactive-path)
+6. [`vana collect`](CLI-TRANSCRIPTS.md#vana-collect)
+7. [`vana data show github`](CLI-TRANSCRIPTS.md#vana-data-show-github)
+8. [`vana server status`](CLI-TRANSCRIPTS.md#vana-server-status)
+9. [`vana server sync`](CLI-TRANSCRIPTS.md#vana-server-sync)
+10. [`vana logs`](CLI-TRANSCRIPTS.md#vana-logs)
+
+That sequence covers:
+
+- first impression
+- trust and diagnostics
+- source discovery
+- successful collection
+- re-collection of existing sources
+- post-success payoff
+- personal server integration
+- operator/debug follow-through
+
+## Human Review Surfaces
+
+All transcripts are in [CLI-TRANSCRIPTS.md](CLI-TRANSCRIPTS.md), organized by
+category: foundational, state/diagnostics, discovery, data surfaces, connect
+flows, collect flows, and server management.
+
+## Machine Review Surfaces
+
+Use these when reviewing shell composability and agent behavior:
+
+- `vana version --json`
+- `vana status --json`
+- `vana doctor --json`
+- `vana sources --json`
+- `vana sources github --json`
+- `vana data list --json`
+- `vana data show github --json`
+- `vana data path github --json`
+- `vana logs --json`
+- `vana connect github --json --no-input`
+- `vana connect shop --json --no-input`
+- `vana collect --json`
+- `vana collect github --json --no-input`
+- `vana server --json`
+- `vana server status --json`
+- `vana server set-url https://ps-abc123.server.vana.org --json`
+- `vana server clear-url --json`
+- `vana server sync --json`
+- `vana server data --json`
+- `vana server data github --json`
+
+Related contract docs:
+
+- [CLI-EXIT-CODE-MATRIX.md](CLI-EXIT-CODE-MATRIX.md)
+- [CLI-EXECUTION-PLAYBOOK.md](CLI-EXECUTION-PLAYBOOK.md)
+
+## Demo Media
+
+Animated recordings of every CLI surface. Regenerate with `pnpm demo:vhs`.
+GIFs are rendered by CI and attached to the
+[canary release](https://github.com/vana-com/vana-connect/releases/tag/canary-feat-connect-cli-v1).
+
+[release]: https://github.com/vana-com/vana-connect/releases/download/canary-feat-connect-cli-v1
+
+### Foundational
+
+#### `vana --help`
+
+
+
+#### `vana data --help`
+
+
+
+#### `vana setup`
+
+
+
+### State and diagnostics
+
+#### `vana status`
+
+
+
+#### `vana doctor`
+
+
+
+#### `vana logs`
+
+
+
+### Discovery
+
+#### `vana sources`
+
+
+
+#### `vana sources github`
+
+
+
+### Post-success data surfaces
+
+#### `vana data list`
+
+
+
+#### `vana data list` (clean machine)
+
+
+
+#### `vana data show github`
+
+
+
+#### `vana data show github` (missing)
+
+
+
+#### `vana data path github`
+
+
+
+### Connect flows
+
+#### Successful interactive path
+
+
+
+#### `--no-input` path (no session)
+
+
+
+#### `--no-input` path (session reuse attempt)
+
+
+
+#### Legacy/manual interactive path (Shop)
+
+
+
+#### Legacy/manual `--no-input` path (Shop)
+
+
+
+#### Unavailable connector (Steam)
+
+
+
+#### Unavailable connector `--no-input` (Steam)
+
+
+
+### Collect flows
+
+#### `vana collect`
+
+
+
+#### `vana collect github`
+
+
+
+### Server management
+
+#### `vana server status`
+
+
+
+#### `vana server sync`
+
+
+
+#### `vana server data`
+
+
+
+## Acceptance Commands
+
+Fast local review:
+
+```bash
+pnpm preflight:cli
+pnpm demo:transcripts
+```
+
+Fast human CLI spot-check:
+
+```bash
+vana --version
+vana doctor
+vana status
+vana sources
+vana sources github
+vana connect github
+vana collect github
+vana data show github
+vana server status
+vana server sync
+vana logs github
+```
+
+Fast machine CLI spot-check:
+
+```bash
+vana version --json | jq
+vana status --json | jq
+vana sources --json | jq '.summary, .recommendedSource'
+vana sources github --json | jq
+vana data show github --json | jq '.summary, .data.profile'
+vana connect github --json --no-input
+vana collect --json | jq
+vana server status --json | jq
+vana server sync --json | jq
+vana server data --json | jq
+```
+
+## Regeneration
+
+Refresh transcripts (updates [CLI-TRANSCRIPTS.md](CLI-TRANSCRIPTS.md) in place):
+
+```bash
+pnpm demo:transcripts
+```
+
+Render demo media:
+
+```bash
+pnpm demo:vhs
+```
+
+Watch the deployed canary lane:
+
+```bash
+pnpm release:watch
+```
diff --git a/docs/CLI-RUNTIME-PORTABILITY-NOTES.md b/docs/CLI-RUNTIME-PORTABILITY-NOTES.md
new file mode 100644
index 00000000..206b0c36
--- /dev/null
+++ b/docs/CLI-RUNTIME-PORTABILITY-NOTES.md
@@ -0,0 +1,124 @@
+# CLI Runtime Portability Notes
+
+_March 14, 2026_
+
+This note records the current local conclusions for Batch 5B.
+
+It is intentionally narrower than the full execution playbook. Its purpose is
+to answer: "what portability/correctness concerns have already been checked,
+what changed in code, and what still remains pre-stable?"
+
+## Confirmed Now
+
+### Display-path invariant
+
+- Human-facing `~` rendering is now centralized in
+ [src/cli/render/format.ts](../src/cli/render/format.ts).
+- Functional paths still come from
+ [src/core/paths.ts](../src/core/paths.ts)
+ via `os.homedir()` + `path.join(...)`.
+- Regression coverage now exists in
+ [test/cli/render-format.test.ts](../test/cli/render-format.test.ts).
+
+Conclusion:
+
+- `~` is currently a presentation concern, not a filesystem API input.
+
+### Concurrent CLI state writes
+
+- `updateSourceState(...)` previously did an uncoordinated read-modify-write.
+- It now uses:
+ - a bounded lock file (`vana-connect-state.json.lock`)
+ - stale-lock cleanup
+ - atomic temp-file write + rename
+- Regression coverage now exists in
+ [test/core/state-store.test.ts](../test/core/state-store.test.ts).
+
+Conclusion:
+
+- the local CLI now has a defended cross-process story for the single shared
+ state file without introducing a third-party lock dependency.
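+
+A minimal sketch of that lock-then-atomic-write sequence, assuming illustrative
+names for the helper, stale bound, and poll interval (the real implementation
+lives in the state store, not here):
+
+```ts
+import { promises as fs } from "node:fs";
+
+// Bounded lock file + atomic temp-file write + rename, as described above.
+export async function updateJsonState(
+  file: string,
+  update: (state: Record<string, unknown>) => Record<string, unknown>,
+  staleMs = 10_000,
+): Promise<void> {
+  const lock = `${file}.lock`;
+  for (;;) {
+    try {
+      // "wx" fails if the lock already exists, so exactly one process wins.
+      await fs.writeFile(lock, String(process.pid), { flag: "wx" });
+      break;
+    } catch {
+      // Stale-lock cleanup: discard locks older than the bound, then retry.
+      const stat = await fs.stat(lock).catch(() => null);
+      if (stat && Date.now() - stat.mtimeMs > staleMs) {
+        await fs.rm(lock, { force: true });
+      } else {
+        await new Promise((resolve) => setTimeout(resolve, 25));
+      }
+    }
+  }
+  try {
+    const raw = await fs.readFile(file, "utf8").catch(() => "{}");
+    const next = update(JSON.parse(raw));
+    // Atomic replace: write a sibling temp file, then rename over the target.
+    const tmp = `${file}.${process.pid}.tmp`;
+    await fs.writeFile(tmp, JSON.stringify(next, null, 2));
+    await fs.rename(tmp, file);
+  } finally {
+    await fs.rm(lock, { force: true });
+  }
+}
+```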
+
+## Still Intentionally Best-Effort
+
+### External `sqlite3` for cookie import
+
+- [src/runtime/playwright/browser.ts](../src/runtime/playwright/browser.ts)
+ still shells out to `sqlite3` when importing cookies from an existing system
+ Chrome profile.
+- That path is opportunistic:
+ - only used for system Chrome profile import
+ - skipped entirely for downloaded Chromium
+ - swallowed if `sqlite3` is unavailable
+ - enabled by default only on macOS
+ - gated behind `VANA_ENABLE_SYSTEM_COOKIE_IMPORT=1` on Linux/Windows for
+ explicit validation only
+
+Conclusion:
+
+- this is not a core product feature
+- the core supported path is Vana-managed browser state plus manual/headed
+ login when needed
+- macOS keeps the enhancement enabled by default because the original
+ `data-connect` implementation was explicitly designed around that platform
+- Linux/Windows should remain opt-in until explicitly validated on real
+ desktop environments
+
+Validation handle:
+
+```bash
+pnpm test:runtime:cookie-import
+```
+
+This covers the current product contract:
+
+- macOS enhancement remains available by default
+- Linux/Windows skip system-profile cookie import by default
+- Linux/Windows can still exercise the import path explicitly with
+ `VANA_ENABLE_SYSTEM_COOKIE_IMPORT=1` during targeted validation
+
+### Playwright browser installation
+
+- [src/runtime/managed-playwright.ts](../src/runtime/managed-playwright.ts)
+ still uses Playwright's internal registry API rather than a user-facing CLI
+ invocation
+- that is deliberate:
+ - avoids imposing `npx`/system Node assumptions on users
+ - keeps `vana setup` as the single product surface
+
+Conclusion:
+
+- acceptable for now
+- still a pre-stable validation item, because it relies on internals rather
+ than an explicitly blessed public install API
+
+## Measurement Tooling
+
+Use:
+
+```bash
+pnpm runtime:footprint
+```
+
+This reports:
+
+- `~/.vana` size
+- browser cache size
+- browser profile size
+- connector cache size
+- log size
+- installed package runtime size for:
+ - `playwright`
+ - `playwright-core`
+ - `chromium-bidi`
+
+This is meant to replace guesswork when discussing runtime/bundle size.
+
+## Current Position
+
+The remaining portability work before stable is now narrower:
+
+1. explicitly validate Linux/Windows behavior for opt-in system-Chrome cookie
+ import using `VANA_ENABLE_SYSTEM_COOKIE_IMPORT=1`
+2. re-validate the Playwright internal install path on the final stable track
+3. collect actual footprint numbers before designing cleanup/GC behavior
diff --git a/docs/CLI-SDK-ARCHITECTURE.md b/docs/CLI-SDK-ARCHITECTURE.md
new file mode 100644
index 00000000..7a0bbe9a
--- /dev/null
+++ b/docs/CLI-SDK-ARCHITECTURE.md
@@ -0,0 +1,276 @@
+# `vana-connect` CLI / SDK Architecture
+
+_As of March 12, 2026_
+
+## Purpose
+
+This document defines the intended architecture for the first version of `vana-connect`, with enough structure to support:
+
+- a strong MVP CLI
+- a shared SDK/core
+- future runtime backends beyond Playwright
+- future Personal Server targets beyond local desktop
+
+It is intentionally focused on what matters now.
+
+## Architectural goals
+
+- one product surface
+- one shared core
+- one CLI built on top of that core
+- runtime abstraction from the start
+- simple defaults in v1
+
+## Product shape
+
+`vana-connect` should be treated as one system with layered responsibilities:
+
+- core types, state, and contracts
+- runtime orchestration SDK
+- CLI presentation layer
+
+This can live comfortably in a monorepo later, but the architecture should be clean regardless of repo layout.
+
+## Recommended package boundaries
+
+### `connect-core`
+
+Purpose:
+
+- shared types
+- state models
+- outcome models
+- event schemas
+- config parsing
+- error types
+
+This package should have no scraping-backend dependency.
+
+### `connect-sdk`
+
+Purpose:
+
+- source resolution
+- connect lifecycle orchestration
+- status inspection
+- Personal Server target detection
+- ingest orchestration
+- runtime selection
+
+This is the main programmatic surface the CLI should use.
+
+### `connect-cli`
+
+Purpose:
+
+- command grammar
+- prompts
+- human-readable formatting
+- JSON mode output
+- log-path surfacing
+
+The CLI should be thin. It should not own core business logic.
+
+### Later runtime packages
+
+Potential future packages:
+
+- `connect-runtime-playwright`
+- `connect-runtime-embrowse`
+- `connect-runtime-native-webview`
+
+These should remain internal or semi-internal at first unless there is a strong reason to expose them publicly.
+
+## v1 runtime architecture
+
+### Core principle
+
+The CLI/SDK should depend on a connector execution interface, not directly on Playwright-specific product semantics.
+
+### v1 default runtime
+
+The v1 runtime should be:
+
+- managed by Vana
+- Playwright-based
+- isolated by default
+- headless-first
+- capable of escalating to headed mode when needed
+
+### Why this is the right default
+
+- best reproducibility
+- best supportability
+- easiest first-run experience
+- easiest to reason about local state
+
+## Runtime interface
+
+The runtime abstraction should answer:
+
+- can this source run?
+- can this source request input?
+- can this source escalate to headed mode?
+- where are logs and artifacts?
+- what profile strategy is active?
+
+Illustrative shape:
+
+```ts
+interface ConnectorRuntime {
+ kind: string;
+  ensureInstalled(): Promise<void>;
+  isInstalled(): Promise<boolean>;
+  run(request: RunRequest): AsyncIterable<RunEvent>;
+ supportsHeaded(): boolean;
+ supportsManagedProfiles(): boolean;
+}
+```
+
+This is not final API design. It is the architectural boundary that matters.
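+
+As a usage sketch, the CLI layer would consume that boundary roughly as
+follows. The `RunRequest`/`RunEvent` shapes here are placeholder assumptions,
+not the real contracts:
+
+```ts
+// Placeholder shapes; the real types belong in connect-core.
+interface RunRequest {
+  source: string;
+}
+interface RunEvent {
+  kind: string;
+  message: string;
+}
+interface Runtime {
+  isInstalled(): Promise<boolean>;
+  ensureInstalled(): Promise<void>;
+  run(request: RunRequest): AsyncIterable<RunEvent>;
+}
+
+// The CLI renders events; the runtime itself never prints.
+export async function runSource(
+  runtime: Runtime,
+  request: RunRequest,
+): Promise<string[]> {
+  if (!(await runtime.isInstalled())) {
+    await runtime.ensureInstalled();
+  }
+  const lines: string[] = [];
+  for await (const event of runtime.run(request)) {
+    lines.push(`[${event.kind}] ${event.message}`);
+  }
+  return lines;
+}
+```
+
+Because `run` yields an async stream, the same loop can serve spinner-style
+human output and line-delimited `--json` events without changing the runtime.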
+
+## Profile strategy
+
+This should be treated as a distinct concern from runtime.
+
+### v1 default
+
+- Vana-managed isolated profile
+
+### Why
+
+- avoids cross-browser lock issues
+- avoids corrupting user profiles
+- keeps sessions reproducible
+- easier to debug and support
+
+### Future profile strategies
+
+- existing browser profile
+- ephemeral isolated profile
+- hardened sandbox profile
+
+These should be future extension points, not v1 onboarding decisions.
+
+## Headed vs headless
+
+### v1 stance
+
+- default: headless
+- fallback: headed when required or explicitly requested
+
+### Why
+
+- some login flows require visible interaction
+- anti-bot / CAPTCHA flows may need it
+- preserving this capability avoids over-constraining the product
+
+### Product implication
+
+The CLI should be able to express:
+
+- running headless
+- escalating to headed mode
+- failing in `--no-input` mode when headed/manual interaction is required
+
+## Dependency model
+
+### v1 recommendation
+
+Do not make Playwright a user-facing peer dependency.
+
+The user should not have to reason about:
+
+- installing Playwright themselves
+- matching Playwright versions
+- wiring runner dependencies manually
+
+### Instead
+
+- Vana manages runtime provisioning
+- setup/install flow provisions the required runtime
+- CLI and SDK treat the runtime as an implementation detail
+
+### Why
+
+- peer dependencies are bad onboarding
+- they leak internals into the product surface
+- they make the first-run experience feel unfinished
+
+## Existing system compatibility
+
+The architecture should preserve room for:
+
+- current Playwright runner flow
+- existing connector scripts
+- current local browser-profile/session model
+- future Personal Server ingest support
+
+It should also avoid making future compatibility impossible with:
+
+- alternative execution backends
+- agent-driven browser environments
+- hosted scraping environments
+
+## Personal Server architecture boundary
+
+The CLI and SDK should treat the Personal Server as a target environment, not a scraping runtime concern.
+
+That means:
+
+- collection runtime decides how data is extracted
+- ingest layer decides how data is delivered to the active Personal Server
+- status layer reports whether ingest succeeded
+
+This keeps the model clean as Personal Server targets evolve:
+
+- local desktop-bundled
+- self-hosted
+- cloud-hosted
+
+## Logging and artifact strategy
+
+The system should preserve full logs and artifacts even when primary output is compact.
+
+### v1 rule
+
+- compact primary output
+- full logs available on disk
+- stable paths surfaced when useful
+
+This should apply to:
+
+- setup
+- connector runs
+- validation
+- ingest attempts
+
+This serves both humans and agents without polluting the main UX.
+
+## What is fixed in v1
+
+- managed runtime
+- Playwright-backed execution
+- isolated profile by default
+- headless-first behavior
+- headed fallback capability
+
+## What is intentionally left flexible
+
+- alternate runtimes
+- alternate profile strategies
+- local vs cloud Personal Server targets
+- future richer interactive surfaces
+
+## Conclusion
+
+The right v1 architecture is:
+
+- TypeScript-based
+- core + SDK + CLI layered cleanly
+- runtime-abstracted
+- managed-runtime by default
+- isolated-profile by default
+- headless-first with headed fallback
+
+That gives you the fastest path to a strong MVP without trapping the product in Playwright-specific or desktop-only assumptions.
diff --git a/docs/CLI-STATE-MODEL.md b/docs/CLI-STATE-MODEL.md
new file mode 100644
index 00000000..7724116d
--- /dev/null
+++ b/docs/CLI-STATE-MODEL.md
@@ -0,0 +1,303 @@
+# `vana-connect` CLI State Model
+
+_As of March 12, 2026_
+
+## Purpose
+
+This document defines the state model that the `vana connect` CLI should expose and reason about.
+
+The goal is not to document every internal file. The goal is to make the product legible:
+
+- what is installed?
+- what is connected?
+- where is data stored?
+- is a Personal Server available?
+- what happened last?
+
+This state model is the foundation for:
+
+- `vana connect status`
+- first-run trust
+- reconnect behavior
+- future local/cloud Personal Server support
+
+## Product stance
+
+Users should not need to understand implementation internals to answer basic questions.
+
+The CLI should make these things obvious:
+
+- runtime state
+- source state
+- data state
+- Personal Server state
+
+## State domains
+
+For MVP, the CLI should think in four domains:
+
+- runtime
+- sources
+- data artifacts
+- Personal Server target
+
+## 1. Runtime state
+
+This answers: can `vana connect` actually run connectors on this machine?
+
+### Required runtime states
+
+- `installed`
+- `missing`
+- `unhealthy`
+
+### What runtime includes
+
+- runner binary or runner directory
+- browser/runtime dependencies
+- expected local runtime files
+
+### Current foundation
+
+The existing skill uses `~/.vana/playwright-runner/` and `~/.vana/run-connector.cjs` as key runtime artifacts.
+
+### CLI expectations
+
+`vana connect status` should be able to say:
+
+- runtime installed or not
+- path to active runtime
+- whether runtime looks healthy enough to execute connectors
+
+## 2. Source state
+
+This answers: what is the state of a specific connector/source such as Steam or GitHub?
+
+### Required source states
+
+- `unknown`
+- `available`
+- `installed`
+- `session_present`
+- `needs_auth`
+- `last_run_succeeded`
+- `last_run_failed`
+
+These do not all need to be stored as a single field. They are observable states the CLI should be able to infer or record.
+
+### What source state should capture
+
+- whether the source is known in the registry
+- whether the connector is present locally
+- whether saved session/auth state exists
+- whether the source has ever been run successfully
+- whether the last known attempt failed
+
+### Session state
+
+For MVP, the important question is not “what exact auth backend are we using?”
+
+It is:
+
+- can we likely reuse a saved session?
+- if not, will the user need to authenticate again?
+
+### CLI expectations
+
+`vana connect status` and an optional `inspect` command should be able to show:
+
+- source known or not
+- connector installed or not
+- session likely reusable or not
+- last run outcome
+
+## 3. Data state
+
+This answers: where did the user’s data actually end up?
+
+This is one of the most important trust questions.
+
+### Required data states
+
+- `none`
+- `collected_local`
+- `ingested_personal_server`
+- `ingest_unavailable`
+- `ingest_failed`
+
+### Why this matters
+
+The CLI must not blur:
+
+- “we successfully scraped the data”
+- and
+- “your Personal Server has it now”
+
+Those are related but distinct outcomes.
+
+### Current foundation
+
+Today the headless flow clearly produces local result artifacts. The desktop app has explicit Personal Server ingestion logic. The CLI must make that distinction explicit instead of pretending they are the same thing.
+
+### CLI expectations
+
+After `vana connect <source>`, the user should understand:
+
+- what data was collected
+- whether it is only local
+- whether it was ingested successfully
+- where to look next
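+
+That distinction reduces to a small mapping. An illustrative sketch using the
+data states listed above (the input facts are assumptions about what the CLI
+can observe):
+
+```ts
+type DataState =
+  | "none"
+  | "collected_local"
+  | "ingested_personal_server"
+  | "ingest_unavailable"
+  | "ingest_failed";
+
+// null ingest outcome means ingest was never attempted.
+export function dataState(facts: {
+  collected: boolean;
+  serverReachable: boolean;
+  ingestSucceeded: boolean | null;
+}): DataState {
+  if (!facts.collected) return "none";
+  if (facts.ingestSucceeded === true) return "ingested_personal_server";
+  if (facts.ingestSucceeded === false) return "ingest_failed";
+  // Collected but never ingested: distinguish "no server" from "not yet".
+  return facts.serverReachable ? "collected_local" : "ingest_unavailable";
+}
+```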
+
+## 4. Personal Server target state
+
+This answers: is there an active Personal Server target that the CLI can use?
+
+### Product principle
+
+The user should think in terms of:
+
+- “my Personal Server”
+
+Not:
+
+- “some local app on localhost”
+- “some cloud instance”
+- “some protocol participant”
+
+Implementation may vary. Product language should stay stable.
+
+### Required target states for MVP
+
+- `available`
+- `unavailable`
+- `unknown`
+
+### What target state should eventually represent
+
+- local desktop-bundled target
+- self-hosted target
+- cloud-hosted target
+
+For MVP, the CLI does not need a big environment-management surface. It just needs to know enough to say:
+
+- Personal Server reachable
+- Personal Server not reachable
+- ingest attempted or not attempted
+
+### CLI expectations
+
+`vana connect status` should ideally show:
+
+- whether a Personal Server target is detected
+- whether it appears reachable
+- whether recent ingest succeeded
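+
+Reachability itself can stay simple in MVP. A hypothetical probe; the health
+path and timeout are assumptions, not the confirmed Personal Server API:
+
+```ts
+export async function probePersonalServer(
+  baseUrl: string,
+): Promise<"available" | "unavailable"> {
+  try {
+    // Any fast, cheap endpoint works; "/health" is an assumed path.
+    const res = await fetch(new URL("/health", baseUrl), {
+      signal: AbortSignal.timeout(2000),
+    });
+    return res.ok ? "available" : "unavailable";
+  } catch {
+    // Refused connections and timeouts both mean "unavailable" to the user.
+    return "unavailable";
+  }
+}
+```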
+
+## Persisted vs derived state
+
+Not all state needs its own database or manifest.
+
+For MVP:
+
+- prefer deriving state from existing files and runtime checks where possible
+- record only the minimum additional metadata needed for a good UX
+
+### Likely derived from existing artifacts
+
+- runtime installed
+- connector installed
+- session folder exists
+- last result file exists
+- Personal Server reachable
+
+### Likely worth recording explicitly
+
+- last run outcome per source
+- last ingest outcome per source
+- last error summary per source
+- timestamps for last successful collection and ingest
+
+This can be very simple at first.
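+
+Most of the derived checks reduce to file existence. An illustrative sketch;
+the paths are assumptions based on the artifacts listed earlier, not the real
+paths module:
+
+```ts
+import { existsSync } from "node:fs";
+import { homedir } from "node:os";
+import { join } from "node:path";
+
+const vanaHome = join(homedir(), ".vana");
+
+// Each fact is derived on demand; nothing here needs to be persisted.
+export function deriveSourceFacts(source: string) {
+  return {
+    runtimeInstalled: existsSync(join(vanaHome, "playwright-runner")),
+    connectorInstalled: existsSync(join(vanaHome, "connectors", source)),
+    sessionPresent: existsSync(join(vanaHome, "browser-profiles", source)),
+    resultPresent: existsSync(join(vanaHome, `${source}-result.json`)),
+  };
+}
+```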
+
+## Suggested MVP state record
+
+If the CLI needs its own lightweight state file, it should be small and outcome-oriented.
+
+Possible location:
+
+- under `~/.vana/`
+
+Possible shape:
+
+```json
+{
+ "version": 1,
+ "sources": {
+ "steam": {
+ "connectorInstalled": true,
+ "sessionPresent": true,
+ "lastRunAt": "2026-03-12T12:00:00Z",
+ "lastRunOutcome": "connected_local_only",
+ "lastIngestAt": null,
+ "lastIngestOutcome": "not_attempted",
+ "lastError": null
+ }
+ }
+}
+```
+
+This is illustrative, not final.
+
+## What `status` should summarize in MVP
+
+At minimum, `vana connect status` should answer:
+
+- is runtime installed?
+- is a Personal Server target available?
+- what sources are installed locally?
+- which sources appear to have saved sessions?
+- what was the last outcome for each source?
+- is each source local-only or ingested?
+
+This command should aim to answer the user’s practical question:
+
+**“Can I trust that my data is connected and usable right now?”**
+
+## How this supports future cloud support
+
+This state model is intentionally environment-agnostic.
+
+That matters because future Personal Server targets may be:
+
+- local
+- self-hosted
+- cloud-hosted
+
+If the CLI is built around:
+
+- target availability
+- target reachability
+- ingest outcome
+
+then it can support those futures without changing the user’s mental model.
+
+## What not to do
+
+Avoid these mistakes:
+
+- surfacing raw internal file structure as the product state model
+- treating “collected locally” and “ingested” as the same outcome
+- forcing users to know which runtime implementation is active
+- inventing a large config or environment system before MVP needs it
+
+## Conclusion
+
+The MVP state model should stay small and explicit.
+
+The CLI needs to make four things legible:
+
+- runtime health
+- source connection state
+- data location/outcome
+- Personal Server availability
+
+If those are clear, the product will feel much more trustworthy and much less improvised.
diff --git a/docs/CLI-TRANSCRIPTS.md b/docs/CLI-TRANSCRIPTS.md
new file mode 100644
index 00000000..071474f9
--- /dev/null
+++ b/docs/CLI-TRANSCRIPTS.md
@@ -0,0 +1,651 @@
+# CLI Transcripts
+
+_March 15, 2026_
+
+Generated review artifacts for the human-mode CLI. These are deterministic
+fixture-based captures — not live runs.
+
+Refresh with:
+
+```bash
+pnpm demo:transcripts
+```
+
+For the full review index, see [CLI-REVIEW-SURFACE.md](CLI-REVIEW-SURFACE.md).
+
+---
+
+## Foundational
+
+### `vana --help`
+
+
+
+```
+$ vana --help
+
+Usage: vana [options] [command]
+
+Connect sources, collect data, and inspect it locally.
+
+Options:
+ -v, --version Print CLI version
+ -h, --help display help for command
+
+Commands:
+ version [options] Print CLI version
+ connect [options] [source] Connect a source and collect data
+ sources [options] [source] List supported sources, or show detail for one
+ source
+ collect [options] [source] Re-collect data from a previously connected source
+ status [options] Show runtime and Personal Server status
+ doctor [options] Inspect local CLI, runtime, and install health
+ setup [options] Install or repair the local runtime
+ data Inspect collected datasets, paths, and summaries
+ logs [options] [source] Inspect stored connector run logs
+ server [options] Manage Personal Server connection
+ help [command] display help for command
+
+Quick start:
+ vana connect Connect a source and collect data
+ vana sources Browse available sources
+ vana status Check system health
+
+Data:
+ vana data list List collected datasets
+  vana data show <source>   Inspect a dataset
+
+Server:
+ vana server Personal Server status and management
+
+More:
+ vana doctor Detailed diagnostics
+ vana logs [source] View run logs
+ vana setup Install or repair runtime
+```
+
+
+
+### `vana data --help`
+
+
+
+```
+$ vana data --help
+
+Usage: vana data [options] [command]
+
+Inspect collected datasets, paths, and summaries
+
+Options:
+ -h, --help display help for command
+
+Commands:
+ list [options] List locally available collected datasets
+  show [options] <source>   Show a collected dataset
+  path [options] <source>   Print the local path for a collected dataset
+
+Examples:
+ vana data list
+ vana data show github
+ vana data path github --json
+```
+
+
+
+### `vana setup`
+
+
+
+```
+$ vana setup
+
+Vana Connect setup
+Runtime
+The local runtime is already installed.
+ Browser: /opt/playwright/chromium-1208/chrome-linux64/chrome
+
+ Next: `vana connect github`
+```
+
+
+
+---
+
+## State and diagnostics
+
+### `vana status`
+
+
+
+```
+$ vana status
+
+Vana Connect
+
+ Runtime: installed
+ Personal Server: not connected
+ Sources: 0 connected, 1 needs attention
+
+ Next: Browse available sources with `vana sources`.
+```
+
+
+
+### `vana doctor`
+
+
+
+```
+$ vana doctor
+
+Vana Connect doctor
+Summary
+ CLI: 0.8.1
+ Channel: stable
+ Install: Development checkout
+ Runtime: installed
+ Personal Server: available
+ Tracked sources: 1
+ Attention: 1
+ Connected: 0
+ Headed sessions: Unavailable
+ Managed profiles: Available
+ Screenshots: Available
+
+Checks
+ CLI: Version 0.8.1
+ Runtime: Browser available at /opt/playwright/chromium-1208/chrome-linux64/chrome
+ Personal Server: http://localhost:8080
+ Executable: Present at /usr/local/bin/node
+ Data home: Present at ~/.vana
+ State file: Present at ~/.vana/vana-connect-state.json
+ Connector cache: Present at ~/.vana/connectors
+ Browser profiles: Missing at ~/.vana/browser-profiles
+ Logs: Present at ~/.vana/logs
+ Tracked sources: 1 source in local state
+ Latest issue: GitHub: Checksum mismatch
+
+Needs attention
+GitHub unavailable
+ Checksum mismatch for GitHub connector script.
+ Updated:
+ Run log: ~/.vana/logs/fetch-github-.log
+
+Paths
+ Executable: /usr/local/bin/node
+ Data home: ~/.vana
+ State file: ~/.vana/vana-connect-state.json
+ Connector cache: ~/.vana/connectors
+ Browser profiles: ~/.vana/browser-profiles
+ Logs: ~/.vana/logs
+
+Lifecycle
+ Upgrade: git pull && pnpm install && pnpm build
+ Uninstall: Remove the local checkout and any generated ~/.vana state.
+
+ Next: Check overall status with `vana status`.
+```
+
+
+
+### `vana logs`
+
+
+
+```
+$ vana logs
+
+Run logs (1)
+
+Need attention (1)
+
+Needs attention (1)
+GitHub unavailable
+ Path: ~/.vana/logs/fetch-github-.log
+ Updated:
+
+ Next: Inspect the latest issue log with `vana logs github`.
+```
+
+
+
+---
+
+## Discovery
+
+### `vana sources`
+
+
+
+```
+$ vana sources
+
+Available sources (9)
+
+Ready now (1) · Browser login (8)
+
+Ready now (1)
+GitHub recommended
+ Your GitHub profile, repositories, and starred repositories.
+
+Browser login (8)
+ChatGPT
+ Your email, memories, and all conversations from ChatGPT.
+Instagram
+ Your Instagram profile, posts, and ad interests.
+LinkedIn
+ Your LinkedIn profile, experience, education, skills, languages, and connections.
+Oura Ring
+ Your Oura Ring readiness scores, sleep data, and daily activity.
+Shop
+ Your Shop app order history.
+Spotify
+ Your Spotify library, playlists, listening history, and preferences.
+Uber
+ Your Uber trip history and receipts.
+YouTube
+ Your YouTube profile, subscriptions, playlists, playlist items, liked videos, watch later list, and recent watch history.
+
+ Next: `vana connect github`
+```
+
+
+
+### `vana sources github`
+
+
+
+```
+$ vana sources github
+
+GitHub new
+
+Your GitHub profile, repositories, and starred repositories.
+
+ Version: 1.1.3
+ Export frequency: unknown
+ Auth mode: terminal
+ Company: github
+
+ Next: `vana connect github`
+```
+
+
+
+---
+
+## Post-success data surfaces
+
+### `vana data list`
+
+
+
+```
+$ vana data list
+
+Collected data (2)
+
+GitHub [synced]
+ Profile: tnunamak
+ Repositories: 2
+ Latest repos: vana-connect, data-connectors
+ Starred: 0
+ Updated: Mar 14, 2026, 8:10 AM
+ Path: ~/.vana/last-result.json
+
+Spotify [local]
+ Profile: tnunamak
+ Playlists: 2
+ Playlists: Data Portability, Build Flow
+ Updated: Mar 13, 2026, 4:23 PM
+ Path: ~/.vana/spotify-result.json
+
+ Next: `vana data show github`
+```
+
+
+
+### `vana data list` (clean machine)
+
+
+
+```
+$ vana data list
+
+Collected data
+
+ No datasets yet.
+
+ Next: `vana connect github`
+```
+
+
+
+### `vana data show github`
+
+
+
+```
+$ vana data show github
+
+GitHub data
+
+ Profile: tnunamak
+ Repositories: 2
+ Latest repos: vana-connect, data-connectors
+ Starred: 0
+
+ Path: ~/.vana/last-result.json
+ Updated: Mar 14, 2026, 8:10 AM
+
+ Next: `vana data path github`
+```
+
+
+
+### `vana data show github` (missing)
+
+
+
+```
+$ vana data show github
+
+No collected dataset found for GitHub. Run `vana connect github` first.
+
+ Next: `vana connect github`
+```
+
+
+
+### `vana data path github`
+
+
+
+```
+$ vana data path github
+
+~/.vana/last-result.json
+```
+
+
+
+---
+
+## Connect flows
+
+### Successful interactive path
+
+
+
+```
+$ vana connect github
+
+ Connect GitHub
+
+ ✓ Signed in
+ ✓ Profile
+ ✓ Repositories — 8 found
+ ✓ Starred — 0 found
+
+ ✓ Connected GitHub.
+ Collected your GitHub data and synced it to your Personal Server.
+
+ Next: vana data show github
+```
+
+
+
+### Interactive-required / `--no-input` path
+
+
+
+```
+$ vana connect github --no-input
+
+ Connect GitHub
+
+
+ ✕ GitHub needs credentials. Run without --no-input to authenticate.
+```
+
+
+
+### `--json --no-input` path
+
+
+
+```
+$ vana connect github --json --no-input
+
+{"type":"setup-check","runtime":"installed"}
+{"type":"outcome","status":"connector_unavailable","source":"github","reason":"..."}
+```
+
+
+
+### Unavailable connector
+
+
+
+```
+$ vana connect steam
+
+ Connect Steam
+
+
+ ✕ Steam is not available.
+ See what's ready: vana sources
+```
+
+
+
+### Unavailable connector `--no-input` path
+
+
+
+```
+$ vana connect steam --no-input
+
+ Connect Steam
+
+
+ ✕ Steam is not available.
+ See what's ready: vana sources
+```
+
+
+
+### Runtime error
+
+
+
+```
+$ vana connect github
+
+ Connect GitHub
+
+
+ ✕ Problem connecting GitHub.
+ Connector run failed.
+ Retry: vana connect github
+```
+
+
+
+### Legacy/manual step required
+
+
+
+```
+$ vana connect shop
+
+ Connect Shop
+
+
+ ✕ Manual step required for Shop.
+ Complete the browser step locally, then rerun vana connect shop.
+```
+
+
+
+---
+
+## Collect flows
+
+### `vana collect`
+
+
+
+```
+$ vana collect
+
+No sources are due for collection.
+```
+
+
+
+### `vana collect github`
+
+
+
+```
+$ vana collect github
+
+Source "github" has not been connected yet. Run `vana connect github` first.
+```
+
+
+
+---
+
+## Server management
+
+### `vana server --help`
+
+
+
+```
+$ vana server --help
+
+Usage: vana server [options] [command]
+
+Manage Personal Server connection
+
+Options:
+ --json Output machine-readable JSON
+ -h, --help display help for command
+
+Commands:
+ status [options] Show Personal Server status
+  set-url [options] <url>   Save a Personal Server URL
+ clear-url [options] Remove the saved Personal Server URL
+ sync [options] Sync all local-only datasets to your Personal Server
+ data [options] [scope] List scopes stored in your Personal Server
+
+Examples:
+ vana server
+ vana server set-url http://localhost:8080
+ vana server set-url https://ps-abc123.server.vana.org
+ vana server clear-url
+```
+
+
+
+### `vana server status`
+
+
+
+```
+$ vana server status
+
+Personal Server
+
+ URL: http://localhost:8080 (auto-detected)
+ Status: healthy
+ Version: 0.0.1
+ Uptime: 15h 22m
+ Owner: 0x2AC93684679a5bdA03C6160def908CdB8D46792f
+
+ Save with `vana server set-url http://localhost:8080`.
+
+ More: `vana server sync` | `vana server data` | `vana server --help`
+```
+
+
+
+### `vana server status` (not connected)
+
+
+
+```
+$ vana server status
+
+Personal Server
+
+ Status: Not connected
+
+ Set a URL: `vana server set-url <url>`
+ Or set VANA_PERSONAL_SERVER_URL environment variable
+ Or start a Personal Server on localhost:8080
+```
+
+
+
+### `vana server sync`
+
+
+
+```
+$ vana server sync
+
+github:
+ github.profile ✓
+ github.repositories ✗ (HTTP 400)
+ github.starred ✗ (HTTP 400)
+spotify:
+ spotify.profile ✓
+ spotify.playlists ✗ (HTTP 400)
+Synced 2 dataset(s).
+```
+
+
+
+### `vana server sync` (nothing pending)
+
+
+
+```
+$ vana server sync
+
+No pending datasets to sync.
+```
+
+
+
+### `vana server data`
+
+
+
+```
+$ vana server data
+
+ github.profile: 1 version
+ spotify.profile: 1 version
+
+ Showing locally-known scopes. Connect your Personal Server for live data.
+```
+
+
+
+### `vana server data` (empty)
+
+
+
+```
+$ vana server data
+
+No scopes found.
+```
+
+
diff --git a/docs/CLI-USER-JOURNEYS.md b/docs/CLI-USER-JOURNEYS.md
new file mode 100644
index 00000000..6efef068
--- /dev/null
+++ b/docs/CLI-USER-JOURNEYS.md
@@ -0,0 +1,266 @@
+# `vana-connect` CLI User Journeys
+
+_As of March 12, 2026_
+
+## Purpose
+
+This document defines the canonical user journeys that should shape the first version of the `vana connect` CLI.
+
+The goal is not to cover every future capability. The goal is to make the first impression feel:
+
+- fast
+- trustworthy
+- obvious
+- composable
+
+These journeys should drive command design, output modes, and sequencing.
+
+## Product stance
+
+For MVP:
+
+- the canonical command form is `vana connect ...`
+- the canonical first command is `vana connect `
+- `setup` exists, but should not be the primary onboarding path
+- the CLI should auto-detect missing runtime/setup during `connect`
+- the ideal success state is data synced into the Personal Server
+- the acceptable MVP success state is data collected locally with a clear status and next step
+
+## Journey 1: First-time connect of one source
+
+This is the most important journey in the product. It should receive disproportionate design effort.
+
+### Scenario
+
+The user has heard about Vana Connect and wants to connect one source quickly from the terminal.
+
+Example:
+
+```bash
+vana connect steam
+```
+
+### Starting state
+
+- `vana connect` CLI is installed or otherwise runnable
+- local runtime may or may not be installed
+- no assumption that Data Connect desktop is installed
+- user may or may not have a Personal Server running
+
+### Happy path
+
+1. User runs `vana connect steam`
+2. CLI checks local prerequisites
+3. If runtime is missing, CLI explains what it will install locally and asks once
+4. CLI installs or verifies runtime automatically
+5. CLI fetches the Steam connector
+6. CLI checks for existing session state
+7. CLI runs the connector
+8. If credentials or 2FA are needed, CLI asks cleanly
+9. CLI collects data
+10. If Personal Server is available, CLI ingests data
+11. CLI prints a human summary of what was collected
+12. CLI prints the next useful step
+
+### Success state
+
+Preferred:
+
+- source is connected
+- data is stored in the Personal Server
+- user understands what was collected
+
+Acceptable MVP fallback:
+
+- source is connected
+- data is stored locally
+- user understands where it is and how to use it next
+
+### What makes this journey feel world-class
+
+- user does not need to run `setup` manually
+- trust boundaries are explicit
+- install actions are obvious before they happen
+- auth prompts are minimal and legible
+- success is summarized in user terms, not file terms
+- next step is obvious
+
+### Failure branches that must be designed
+
+- runtime install fails
+- connector does not exist
+- login fails
+- site flow changed
+- Personal Server unavailable
+- partial data collected
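+
+Each branch should map to a distinct machine-readable outcome so `--json`
+callers can branch without parsing prose. An illustrative taxonomy; the status
+names are assumptions, not a final contract:
+
+```ts
+// Local-only success is modeled as connected with ingested: false.
+type ConnectOutcome =
+  | { status: "connected"; ingested: boolean }
+  | { status: "runtime_install_failed" }
+  | { status: "connector_unavailable" }
+  | { status: "auth_failed" }
+  | { status: "site_flow_changed" }
+  | { status: "partial"; collectedScopes: string[] };
+
+// Scripts need a stable exit-code mapping, not string matching on output.
+export function exitCode(outcome: ConnectOutcome): number {
+  return outcome.status === "connected" ? 0 : 1;
+}
+```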
+
+## Journey 2: Inspect current status
+
+This is the most important trust and recovery journey after first run.
+
+### Scenario
+
+The user wants to know what is installed, connected, and usable.
+
+Example:
+
+```bash
+vana connect status
+```
+
+### Starting state
+
+- runtime may or may not be installed
+- some connectors may be installed
+- some sessions may exist
+- some results may only be local
+- Personal Server may or may not be available
+
+### Happy path
+
+The command should answer, at minimum:
+
+- is runtime installed?
+- is Personal Server reachable?
+- which connectors are installed?
+- which connectors have saved auth/session state?
+- what was the last successful run per source?
+- was data only collected locally or also ingested?
+
+### Success state
+
+The user can tell, without reading docs:
+
+- whether the system is healthy
+- whether a source is connected
+- whether data is already usable
+- what needs attention
+
+### Why this matters
+
+Without this command, the product will feel fragile even if the underlying runtime is good.
+
+## Journey 3: Reconnect or re-auth a source
+
+### Scenario
+
+A source was connected before, but the session expired or sync failed.
+
+Example:
+
+```bash
+vana connect steam
+```
+
+or later:
+
+```bash
+vana connect reconnect steam
+```
+
+### Starting state
+
+- connector is already installed
+- cached session may exist but be invalid
+- user wants the shortest path back to healthy
+
+### Happy path
+
+1. CLI detects installed connector and existing state
+2. CLI attempts reuse of saved session
+3. If session is invalid, CLI explains that re-auth is needed
+4. CLI prompts only for the missing auth input
+5. CLI reruns collection
+6. CLI reports the new success state
+
+### Success state
+
+- user is back to a working connected state quickly
+- re-auth feels like repair, not a full restart
+
+### Design note
+
+For MVP, this can be the same command as `vana connect <source>`. It does not need a separate noun on day one if the behavior is clear.
+
+## Journey 4: Discover what can be connected
+
+### Scenario
+
+The user wants to see which sources are supported.
+
+Example:
+
+```bash
+vana connect list
+```
+
+### Starting state
+
+- user may know one source or none
+
+### Happy path
+
+The command shows:
+
+- supported connectors
+- whether each is installed locally
+- whether each has been connected before
+- possibly a small “recommended/common” grouping
+
+### Success state
+
+- user can choose what to connect next
+- discovery feels intentional rather than hidden in docs
+
+### Design note
+
+For MVP, this can be simple. It does not need marketplace-level polish, but it does need to exist.
+
+## Journey 5: Use the data after connection
+
+### Scenario
+
+The user has connected a source and wants to know what to do next.
+
+### Starting state
+
+- data has been collected locally or ingested to Personal Server
+
+### Happy path
+
+The CLI should make one of these next states obvious:
+
+- data is available in the Personal Server
+- data is available in a local result artifact
+- data can be inspected or exported
+
+### Success state
+
+The user feels that connection was not the end goal; it unlocked an immediate next action.
+
+### Design note
+
+For MVP, a short post-success hint is enough. This does not require a large feature surface yet.
+
+## What is explicitly not a core MVP journey
+
+These matter later, but should not dominate v1 design:
+
+- connect every possible source automatically
+- scheduling and background sync
+- full TUI experience
+- multi-machine sync
+- advanced app permission workflows
+- blockchain / token use cases
+
+Those are important future surfaces, but they should not dilute the first-run experience.
+
+## MVP priority order
+
+1. First-time connect of one source
+2. Inspect current status
+3. Reconnect / re-auth
+4. Discover available sources
+5. Post-connect next step
+
+If the first two are excellent, the MVP can leave a very strong impression even if the rest are still thin.
diff --git a/docs/CLI-UX-QUALITY-BAR.md b/docs/CLI-UX-QUALITY-BAR.md
new file mode 100644
index 00000000..69a8e675
--- /dev/null
+++ b/docs/CLI-UX-QUALITY-BAR.md
@@ -0,0 +1,185 @@
+# `vana-connect` CLI UX Quality Bar
+
+_As of March 12, 2026_
+
+## Purpose
+
+This document defines what “beautiful” should mean for the `vana connect` CLI.
+
+The goal is to make beauty actionable. For this product, beauty should not mean decoration or terminal spectacle. It should mean:
+
+- clarity
+- restraint
+- confidence
+- strong pacing
+- high signal-to-noise
+
+## Core principle
+
+A beautiful CLI feels:
+
+- obvious to start
+- calm while running
+- precise when it fails
+- satisfying when it succeeds
+
+The best beauty benchmark is not “does this look flashy?”
+
+It is:
+
+**“Does this feel inevitable, polished, and lighter than it should be?”**
+
+## What beauty means here
+
+### 1. Command beauty
+
+Commands should feel guessable.
+
+Good:
+
+- `vana connect steam`
+- `vana connect status`
+- `vana connect list`
+
+Bad:
+
+- commands shaped around internal scripts
+- commands that require reading docs before first use
+
+### 2. Output beauty
+
+Output should have:
+
+- clean hierarchy
+- short line lengths
+- obvious state transitions
+- minimal clutter
+
+The user should be able to scan and understand:
+
+- what is happening
+- whether it succeeded
+- what to do next
+
+### 3. Copy beauty
+
+Copy should be:
+
+- concise
+- technically serious
+- specific
+
+It should avoid:
+
+- filler
+- hype
+- vague reassurance
+- “friendly” noise
+
+### 4. Progress beauty
+
+Progress should feel smooth, not chatty.
+
+Good progress:
+
+- meaningful step changes
+- occasional counts when they matter
+- calm updates during long operations
+
+Bad progress:
+
+- constant noisy logging
+- fake precision
+- overwhelming dependency output
+
+### 5. Success beauty
+
+A successful run should feel like an outcome, not a file write.
+
+Good:
+
+- “Connected Steam. Collected your Steam data and synced it to your Personal Server.”
+
+Bad:
+
+- “Saved result to ~/.vana/last-result.json”
+
+Artifact paths are supporting detail, not the story.
+
+### 6. Failure beauty
+
+A failure should feel understandable and recoverable.
+
+Good failure:
+
+- one-sentence problem statement
+- one useful next step
+- enough specificity to trust the message
+
+Bad failure:
+
+- wall of raw subprocess output
+- vague “something went wrong”
+- making the user infer whether retry is safe
+
+### 7. Machine beauty
+
+`--json` mode should also feel beautiful.
+
+For machine mode, beauty means:
+
+- stable event names
+- stable field names
+- no clutter
+- no decorative output
+- strong predictability
+
+This matters because coding agents are users too.
+
+## What beauty does not require
+
+Not required:
+
+- heavy ANSI art
+- custom TUI chrome
+- animations
+- excessive color
+- terminal gimmicks
+
+These can easily reduce polish rather than increase it.
+
+## Implementation filters
+
+Every command and output path should be evaluated against these questions:
+
+- Is this the shortest obvious command?
+- Can a user scan this in two seconds?
+- Is any line here doing unnecessary work?
+- Does the success message describe an outcome rather than an artifact?
+- Does this message preserve trust?
+- Would this still feel clean on the 100th use?
+
+## Beauty standards for v1
+
+For v1, the CLI should meet these standards:
+
+- first run feels obvious
+- install prompts are crisp
+- progress is calm
+- success is outcome-shaped
+- local-only vs ingested is elegant and unmistakable
+- status is compact and useful
+- `--json` mode is clean and deterministic
+
+## Conclusion
+
+For `vana connect`, beauty is a valid requirement.
+
+It should be understood as:
+
+- taste
+- compression
+- confidence
+- legibility
+
+Not terminal theatrics.
diff --git a/docs/CLI-UX-SIMULATION.md b/docs/CLI-UX-SIMULATION.md
new file mode 100644
index 00000000..dfb7937c
--- /dev/null
+++ b/docs/CLI-UX-SIMULATION.md
@@ -0,0 +1,350 @@
+# `vana-connect` CLI UX Simulation
+
+_As of March 12, 2026_
+
+## Purpose
+
+This is a lightweight internal simulation pass for the `vana connect` CLI.
+
+It exists to pressure-test:
+
+- the first-run flow
+- the trust model
+- the human vs agent mode split
+- the distinction between local collection and Personal Server ingest
+
+This is not a polished prototype. It is a fast decision tool before locking the v1 spec.
+
+## What we are testing
+
+The main questions are:
+
+- does the canonical first command feel right?
+- does setup inline cleanly?
+- does success feel meaningful?
+- is local-only vs ingested unmistakable?
+- does `--json` mode feel agent-safe?
+
+## Transcript 1: First-time human run, happy path with Personal Server available
+
+Command:
+
+```bash
+vana connect steam
+```
+
+Output:
+
+```text
+Vana Connect needs a local browser runtime before it can connect Steam.
+
+This will install:
+- the connector runner
+- a Chromium browser engine
+- local runtime files under ~/.vana/
+
+Your credentials stay on this machine. Nothing is sent anywhere except Steam itself.
+
+Continue? [Y/n] y
+
+Installing runner...
+Installing browser engine...
+Preparing local runtime...
+Runtime ready.
+
+Finding a connector for Steam...
+Connector ready.
+
+Found an existing Steam session. Trying that first...
+
+Connecting to Steam...
+Collecting your data...
+
+Personal Server detected. Syncing your Steam data...
+
+Connected Steam.
+Collected your Steam data and synced it to your Personal Server.
+
+Next: run `vana connect status` to inspect your current connection state.
+```
+
+Assessment:
+
+- strong start
+- install copy feels clear
+- outcome is meaningful
+- likely good enough for MVP
+
+Main risk:
+
+- “Found an existing Steam session” should only appear when true
+
+## Transcript 2: First-time human run, no Personal Server available
+
+Command:
+
+```bash
+vana connect steam
+```
+
+Output:
+
+```text
+Finding a connector for Steam...
+Connector ready.
+
+Connecting to Steam...
+Collecting your data...
+
+No Personal Server is available right now, so your Steam data was saved locally.
+
+Connected Steam.
+Collected your Steam data and saved it locally.
+
+Next: run `vana connect status` to inspect your current connection state.
+Local result: ~/.vana/last-result.json
+```
+
+Assessment:
+
+- the distinction between local and ingested is clear
+- still feels like success, which is important
+
+Main risk:
+
+- if this happens too often, users may feel they only got a partial product
+
+Implication:
+
+- if Personal Server availability is part of the intended default experience, the CLI should probably attempt to detect and message it very early
+
+## Transcript 3: Human run requiring credentials
+
+Command:
+
+```bash
+vana connect steam
+```
+
+Output:
+
+```text
+Finding a connector for Steam...
+Connector ready.
+
+To connect Steam, Vana Connect will open a local browser session on this machine.
+Your credentials stay local.
+
+Steam needs your login details to continue.
+Enter the requested fields below.
+
+Email or username: alice@example.com
+Password: ********
+
+Collecting your data...
+
+Connected Steam.
+Collected your Steam data and saved it locally.
+```
+
+Assessment:
+
+- good enough for MVP
+- trust copy is doing useful work
+
+Main risk:
+
+- if the connector actually uses a file-based continuation path under the hood, the interactive experience needs to hide that complexity unless it fails
+
+## Transcript 4: Human run with 2FA
+
+Command:
+
+```bash
+vana connect steam
+```
+
+Output:
+
+```text
+Finding a connector for Steam...
+Connector ready.
+
+To connect Steam, Vana Connect will open a local browser session on this machine.
+Your credentials stay local.
+
+Steam needs your login details to continue.
+Enter the requested fields below.
+
+Email or username: alice@example.com
+Password: ********
+
+Steam asked for a verification code.
+Enter the current code to continue.
+
+Verification code: 123456
+
+Collecting your data...
+
+Connected Steam.
+Collected your Steam data and saved it locally.
+```
+
+Assessment:
+
+- simple and legible
+
+Main risk:
+
+- time-sensitive codes mean the actual implementation must not introduce lag between prompt and submission
+
+## Transcript 5: Agent run in JSON mode, success with local-only result
+
+Command:
+
+```bash
+vana connect steam --json --no-input
+```
+
+Output:
+
+```json
+{"type":"setup-check","runtime":"installed"}
+{"type":"connector-resolved","source":"steam","connectorPath":"/home/user/.vana/connectors/valve/steam-playwright.js"}
+{"type":"run-started","source":"steam"}
+{"type":"collection-complete","source":"steam","resultPath":"/home/user/.vana/last-result.json"}
+{"type":"ingest-skipped","reason":"personal_server_unavailable"}
+{"type":"outcome","status":"connected_local_only","source":"steam","resultPath":"/home/user/.vana/last-result.json"}
+```
+
+Assessment:
+
+- this is the right general shape
+- concise and deterministic
+
+Main risk:
+
+- event names and field names must be stabilized in the spec, not invented per command
+
+## Transcript 6: Agent run in JSON mode, input required
+
+Command:
+
+```bash
+vana connect steam --json --no-input
+```
+
+Output:
+
+```json
+{"type":"setup-check","runtime":"installed"}
+{"type":"connector-resolved","source":"steam","connectorPath":"/home/user/.vana/connectors/valve/steam-playwright.js"}
+{"type":"needs-input","source":"steam","fields":["username","password"],"message":"Steam needs credentials"}
+{"type":"outcome","status":"needs_input","source":"steam"}
+```
+
+Assessment:
+
+- good for agents
+- avoids hanging
+
+Main risk:
+
+- if the underlying runtime produces richer field schemas, the CLI should preserve them in machine mode rather than collapsing too much
+
+## Transcript 7: Status command
+
+Command:
+
+```bash
+vana connect status
+```
+
+Output:
+
+```text
+Vana Connect status
+
+Runtime: installed
+Personal Server: available
+
+Steam: connected, synced
+GitHub: connected, local only
+Spotify: not connected
+```
+
+Assessment:
+
+- compact and useful
+- enough for MVP if backed by real state
+
+Main risk:
+
+- status can easily become misleading if state is inferred too loosely
+
+Implication:
+
+- the implementation should prefer conservative status language over pretending certainty
+
+## Transcript 8: Setup failure
+
+Command:
+
+```bash
+vana connect steam
+```
+
+Output:
+
+```text
+Vana Connect could not finish installing the local runtime.
+Check your network connection and try `vana connect setup` again.
+```
+
+Assessment:
+
+- clean fallback
+- preserves `setup` as repair path
+
+Main risk:
+
+- if the real failure is more specific, the CLI should include a short cause line when available
+
+## Decisions after simulation
+
+The simulation supports these decisions:
+
+- `vana connect <source>` is the right canonical first command
+- inline setup is better than forcing `setup` first
+- the product must distinguish:
+ - connected and ingested
+ - connected local only
+- `--json --no-input` is enough for the core agent mode in MVP
+- `status` is essential for trust and recovery
+
+## Small adjustments suggested by the simulation
+
+### 1. Make Personal Server state visible early
+
+The user should not be surprised at the end to learn that their data is only local.
+
+The CLI should likely detect target availability during connect and be ready to message it clearly.
+
+### 2. Stabilize machine event names in the v1 spec
+
+The JSON transcripts feel right, but the exact event contract needs to be locked.
+
+### 3. Keep human output extremely compact
+
+Anything more verbose than these transcripts will likely reduce the sense of polish.
+
+## Conclusion
+
+The current design direction survives a lightweight transcript test.
+
+It is now reasonable to write the v1 spec with confidence, focusing on:
+
+- command behavior
+- event contract
+- outcome states
+- state inspection rules
diff --git a/docs/CLI-V1-SPEC.md b/docs/CLI-V1-SPEC.md
new file mode 100644
index 00000000..a74b20e1
--- /dev/null
+++ b/docs/CLI-V1-SPEC.md
@@ -0,0 +1,424 @@
+# `vana-connect` CLI v1 Spec
+
+_As of March 12, 2026_
+
+## Purpose
+
+This document defines the implementation target for the first version of the `vana connect` CLI.
+
+It is intended to be:
+
+- small enough to ship quickly
+- strong enough to leave an excellent first impression
+- explicit enough to guide implementation decisions
+
+## Product goal
+
+Ship an MVP CLI that feels intentional and trustworthy for both:
+
+- humans using the terminal directly
+- coding agents using machine-readable mode
+
+The v1 CLI does not need to feel broad or mature. It does need to feel:
+
+- fast to first value
+- clear about what happened
+- clear about where data went
+- reliable enough to trust
+
+## Product stance
+
+### Core command family
+
+The CLI command family is:
+
+```bash
+vana connect ...
+```
+
+### Canonical first command
+
+The canonical first command is:
+
+```bash
+vana connect <source>
+```
+
+Example:
+
+```bash
+vana connect steam
+```
+
+### Core philosophy
+
+- one command model
+- one underlying lifecycle
+- human-friendly default mode
+- machine-readable mode via flags
+
+## v1 command surface
+
+The public v1 commands are:
+
+- `vana connect <source>`
+- `vana connect list`
+- `vana connect status`
+- `vana connect setup`
+
+Optional if cheap:
+
+- `vana connect inspect `
+
+No additional top-level command surfaces are part of v1.
+
+## v1 flags
+
+Required:
+
+- `--json`
+- `--no-input`
+- `--yes`
+
+Optional if cheap:
+
+- `--quiet`
+
+## Command behavior
+
+### `vana connect <source>`
+
+#### Goal
+
+Connect one source end-to-end with the shortest possible path to value.
+
+#### Required behavior
+
+1. Check runtime availability.
+2. If runtime is missing:
+ - explain what will be installed
+ - ask for confirmation unless `--yes` is present
+ - perform setup inline
+3. Resolve the requested source connector.
+4. Check for reusable saved session/auth state.
+5. Run collection.
+6. Prompt for input if required, unless `--no-input` is present.
+7. Detect whether a Personal Server target is available.
+8. If available, attempt ingest.
+9. Print a concise outcome summary.
+
+#### Human-mode output requirements
+
+Must communicate:
+
+- what is being installed, if anything
+- what source is being connected
+- whether an existing session is being reused
+- whether data was collected
+- whether data was ingested or saved locally only
+- what to do next
+
+#### Machine-mode requirements
+
+Must emit structured events and a final outcome object.
+
+#### Success outcomes
+
+- `connected_and_ingested`
+- `connected_local_only`
+
+#### Recoverable outcomes
+
+- `needs_input`
+- `setup_required`
+- `personal_server_unavailable`
+- `auth_failed`
+- `connector_unavailable`
+- `ingest_failed`
+
+#### Hard failure outcomes
+
+- `runtime_error`
+- `invalid_connector`
+- `unexpected_internal_error`
+
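+The outcome names above can be collapsed into one type so the CLI and its callers classify results the same way. A minimal TypeScript sketch (the grouping mirrors the three lists above; none of this is a committed API):
+
+```typescript
+// Outcome statuses exactly as named in this spec.
+type SuccessOutcome = "connected_and_ingested" | "connected_local_only";
+type RecoverableOutcome =
+  | "needs_input"
+  | "setup_required"
+  | "personal_server_unavailable"
+  | "auth_failed"
+  | "connector_unavailable"
+  | "ingest_failed";
+type HardFailureOutcome =
+  | "runtime_error"
+  | "invalid_connector"
+  | "unexpected_internal_error";
+type Outcome = SuccessOutcome | RecoverableOutcome | HardFailureOutcome;
+
+// Classify an outcome so a caller can decide whether retrying is sensible.
+function classify(outcome: Outcome): "success" | "recoverable" | "hard" {
+  const success = ["connected_and_ingested", "connected_local_only"];
+  const hard = ["runtime_error", "invalid_connector", "unexpected_internal_error"];
+  if (success.includes(outcome)) return "success";
+  if (hard.includes(outcome)) return "hard";
+  return "recoverable";
+}
+```
+
+Exit-code conventions could hang off this classification, but exit codes are deliberately left unspecified here.
+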
+### `vana connect list`
+
+#### Goal
+
+Show the sources the user can connect.
+
+#### Required behavior
+
+- list supported sources
+- indicate whether each source is installed locally when known
+- stay compact in human mode
+- return structured output in `--json` mode
+
+### `vana connect status`
+
+#### Goal
+
+Answer the question:
+
+**“Is my setup healthy, and is my data connected and usable?”**
+
+#### Required behavior
+
+Report at minimum:
+
+- runtime installed or not
+- Personal Server target available/unavailable/unknown
+- installed sources
+- likely session presence per source
+- last known outcome per source
+- local-only vs ingested state per source when known
+
+#### Output requirement
+
+Human mode should be compact and summary-first.
+
+Machine mode should be structured and stable.
+
+### `vana connect setup`
+
+#### Goal
+
+Provide explicit bootstrap and repair.
+
+#### Required behavior
+
+- install runtime prerequisites
+- verify key runtime artifacts exist
+- summarize what was installed
+- exit cleanly if already healthy
+
+#### Product role
+
+This command exists for:
+
+- explicit install
+- repair
+- CI/bootstrap
+
+It is not the intended first-run entrypoint for humans.
+
+## Mode behavior
+
+### Default mode
+
+- human-readable
+- concise
+- prompts allowed
+- no raw JSON
+
+### `--json`
+
+- structured output only
+- no decorative formatting
+- stable event and outcome objects
+
+### `--no-input`
+
+- do not prompt
+- fail with a structured `needs_input` outcome if input is required
+
+### `--yes`
+
+- auto-approve safe setup/install prompts
+
+### `--quiet`
+
+- reduce non-essential human chatter
+- keep warnings and errors
+
+## State model requirements
+
+The CLI must reason about four state domains:
+
+- runtime
+- sources
+- data
+- Personal Server target
+
+### Runtime states
+
+- `installed`
+- `missing`
+- `unhealthy`
+
+### Source states
+
+- `unknown`
+- `available`
+- `installed`
+- `session_present`
+- `needs_auth`
+- `last_run_succeeded`
+- `last_run_failed`
+
+### Data states
+
+- `none`
+- `collected_local`
+- `ingested_personal_server`
+- `ingest_unavailable`
+- `ingest_failed`
+
+### Personal Server target states
+
+- `available`
+- `unavailable`
+- `unknown`
+
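+The four domains compose into a single status snapshot. The state names below are the ones listed above; the field layout is only an assumption for illustration, not the locked `status` schema:
+
+```typescript
+type RuntimeState = "installed" | "missing" | "unhealthy";
+type SourceState =
+  | "unknown" | "available" | "installed" | "session_present"
+  | "needs_auth" | "last_run_succeeded" | "last_run_failed";
+type DataState =
+  | "none" | "collected_local" | "ingested_personal_server"
+  | "ingest_unavailable" | "ingest_failed";
+type TargetState = "available" | "unavailable" | "unknown";
+
+// Hypothetical snapshot shape the status command could reason over.
+interface StatusSnapshot {
+  runtime: RuntimeState;
+  personalServer: TargetState;
+  sources: Record<string, { state: SourceState; data: DataState }>;
+}
+
+const snapshot: StatusSnapshot = {
+  runtime: "installed",
+  personalServer: "available",
+  sources: {
+    steam: { state: "last_run_succeeded", data: "ingested_personal_server" },
+    github: { state: "session_present", data: "collected_local" },
+  },
+};
+```
+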
+## Data-location rule
+
+The CLI must never blur:
+
+- successful local collection
+- successful Personal Server ingest
+
+This distinction must appear in:
+
+- human success copy
+- `status`
+- `--json` output
+
+## Personal Server model
+
+In v1, the CLI should speak in terms of:
+
+- the user’s Personal Server
+
+It should not force users to reason about:
+
+- desktop app internals
+- localhost implementation details
+- cloud-vs-local backend distinctions
+
+The implementation may detect:
+
+- local target
+- future self-hosted target
+- future cloud-hosted target
+
+But that is an implementation concern, not the user-facing model.
+
+## Machine-readable event contract
+
+v1 should formalize a small event set instead of exposing arbitrary internal messages.
+
+### Required event categories
+
+- setup
+- connector resolution
+- run lifecycle
+- input required
+- collection result
+- ingest result
+- final outcome
+
+### Example event shapes
+
+Illustrative only:
+
+```json
+{"type":"setup-check","runtime":"installed"}
+{"type":"connector-resolved","source":"steam","connectorPath":"..."}
+{"type":"run-started","source":"steam"}
+{"type":"needs-input","source":"steam","fields":["username","password"]}
+{"type":"collection-complete","source":"steam","resultPath":"..."}
+{"type":"ingest-complete","source":"steam","target":"personal_server"}
+{"type":"outcome","status":"connected_and_ingested","source":"steam"}
+```
+
+The exact field set should be locked during implementation and kept intentionally small.
+
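+Because each event is one JSON object per line, an agent-side consumer reduces to a line splitter plus a check on `type`. A sketch against the event names shown above (the final field set may differ):
+
+```typescript
+interface CliEvent {
+  type: string;
+  [key: string]: unknown;
+}
+
+// Scan an NDJSON event stream and return the final outcome status, if any.
+function finalOutcome(stream: string): string | undefined {
+  let status: string | undefined;
+  for (const line of stream.split("\n")) {
+    const trimmed = line.trim();
+    if (!trimmed) continue;
+    const event = JSON.parse(trimmed) as CliEvent;
+    if (event.type === "outcome" && typeof event.status === "string") {
+      status = event.status;
+    }
+  }
+  return status;
+}
+
+const transcript = [
+  '{"type":"setup-check","runtime":"installed"}',
+  '{"type":"run-started","source":"steam"}',
+  '{"type":"outcome","status":"connected_and_ingested","source":"steam"}',
+].join("\n");
+// finalOutcome(transcript) returns "connected_and_ingested"
+```
+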
+## Copy requirements
+
+The CLI copy must:
+
+- explain installs before performing them
+- state that credentials stay local
+- summarize outcomes in user terms
+- keep file paths as supporting detail
+- clearly distinguish local-only vs ingested outcomes
+
+The copy should stay:
+
+- calm
+- concise
+- technically serious
+
+## Existing foundations to reuse
+
+The v1 CLI should build on the current primitives rather than replace them outright.
+
+Reuse:
+
+- setup bootstrap behavior
+- connector fetch behavior
+- runner lifecycle
+- request-input continuation model
+- local state directory
+- validator core where helpful
+
+Wrap:
+
+- raw script names
+- raw bootstrap UX
+- raw helper output
+
+## MVP non-goals
+
+These are explicitly not required for v1:
+
+- connect-all as the default onboarding flow
+- scheduling or daemonized background sync
+- TUI-first interaction
+- large configuration trees
+- advanced Personal Server environment management
+- full cloud orchestration UX
+- blockchain / token use cases
+
+## Implementation priority
+
+Build in this order:
+
+1. `vana connect <source>`
+2. `vana connect status`
+3. `vana connect list`
+4. `vana connect setup`
+5. optional `inspect`
+
+This order matches user impact.
+
+## Acceptance criteria
+
+### Human acceptance
+
+v1 succeeds if a human can:
+
+- run `vana connect steam`
+- get setup handled inline if missing
+- understand what was installed
+- understand what data was collected
+- understand whether it was ingested or only saved locally
+- run `vana connect status` later and understand their current state
+
+### Agent acceptance
+
+v1 succeeds if an agent can:
+
+- run `vana connect steam --json --no-input`
+- detect whether setup is missing
+- detect whether input is required
+- detect whether collection succeeded
+- detect whether ingest succeeded
+- distinguish local-only from ingested outcomes
+
+## Conclusion
+
+v1 should be intentionally small and disproportionately focused on first-run quality.
+
+If the CLI can connect one source beautifully, report status honestly, and serve both humans and agents through one stable command model, it will be a strong MVP.
diff --git a/docs/vhs/README.md b/docs/vhs/README.md
new file mode 100644
index 00000000..c84c7ab1
--- /dev/null
+++ b/docs/vhs/README.md
@@ -0,0 +1,78 @@
+# VHS demos
+
+This directory holds deterministic terminal demo assets for the `vana` CLI.
+
+The goal is to make the README a reliable progress surface for the team without
+depending on live credentials or live connector runs.
+
+The visible commands in these tapes should match what a real user would type.
+Fixture seeding, `HOME`, and any other harness setup should stay hidden in the
+rendering scripts.
+
+## Fixture model
+
+The demo tapes should use:
+
+- a temp or fixture `HOME`
+- `VANA_DATA_CONNECTORS_DIR` pointing at a deterministic fixture connector repo
+- seeded `~/.vana/` state and result files
+
+Prepare the fixture home with:
+
+```bash
+pnpm demo:vhs:fixtures
+```
+
+That creates:
+
+- `docs/vhs/fixtures/demo-home/.vana/vana-connect-state.json`
+- fake installed connector files
+- `docs/vhs/fixtures/demo-data-connectors/` with deterministic demo connectors
+- a fake downloaded Chromium path so `vana status` reads as installed
+- sample collected result files for `vana data ...`
+
+## Current tapes
+
+- `status-and-sources.tape`
+- `data-inspection.tape`
+- `connect-success.tape`
+
+The public `connect-success` tape should end on user value, not only progress
+output. In practice that means a successful `vana connect github` run followed
+by `vana data show github`.
+
+## Rendering
+
+The preferred renderer is `vhs` from Charm.
+
+One command path:
+
+```bash
+pnpm demo:vhs
+```
+
+That command:
+
+- reseeds the fixture home
+- renders all checked-in `.tape` files
+- writes GIF assets next to the tapes
+
+It will use a local `vhs` binary if present, or Docker if available.
+By default the scripts prefer the deterministic fixture connector repo generated
+under `docs/vhs/fixtures/demo-data-connectors/`, but you can override that with
+`VANA_DATA_CONNECTORS_DIR=/path/to/data-connectors`.
+
+CI also renders the tapes on Linux in the `demo-preview` job and uploads the
+resulting GIFs and transcripts as a workflow artifact so the branch always has a
+current review surface.
+
+Typical usage once `vhs` is available locally:
+
+```bash
+HOME="$PWD/docs/vhs/fixtures/demo-home" \
+VANA_DATA_CONNECTORS_DIR="/path/to/data-connectors" \
+vhs docs/vhs/status-and-sources.tape
+```
+
+Generated GIF assets should be committed once they are stable enough for the
+README.
diff --git a/docs/vhs/collect-github.tape b/docs/vhs/collect-github.tape
new file mode 100644
index 00000000..82ac000c
--- /dev/null
+++ b/docs/vhs/collect-github.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/collect-github.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 350
+
+Type "vana collect github"
+Sleep 500ms
+Enter
+Sleep 4000ms
diff --git a/docs/vhs/collect.tape b/docs/vhs/collect.tape
new file mode 100644
index 00000000..2cfbc8c1
--- /dev/null
+++ b/docs/vhs/collect.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/collect.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 350
+
+Type "vana collect"
+Sleep 500ms
+Enter
+Sleep 4000ms
diff --git a/docs/vhs/connect-github-no-input.tape b/docs/vhs/connect-github-no-input.tape
new file mode 100644
index 00000000..53d12ab8
--- /dev/null
+++ b/docs/vhs/connect-github-no-input.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/connect-github-no-input.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 770
+
+Type "vana connect github --no-input"
+Sleep 500ms
+Enter
+Sleep 4000ms
diff --git a/docs/vhs/connect-github-session-reuse-no-input.tape b/docs/vhs/connect-github-session-reuse-no-input.tape
new file mode 100644
index 00000000..aa8f279d
--- /dev/null
+++ b/docs/vhs/connect-github-session-reuse-no-input.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/connect-github-session-reuse-no-input.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 798
+
+Type "vana connect github --no-input"
+Sleep 500ms
+Enter
+Sleep 4000ms
diff --git a/docs/vhs/connect-github-success.tape b/docs/vhs/connect-github-success.tape
new file mode 100644
index 00000000..5fdd416f
--- /dev/null
+++ b/docs/vhs/connect-github-success.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/connect-github-success.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 1218
+
+Type "vana connect github"
+Sleep 500ms
+Enter
+Sleep 6000ms
diff --git a/docs/vhs/connect-shop-no-input.tape b/docs/vhs/connect-shop-no-input.tape
new file mode 100644
index 00000000..7423b7d5
--- /dev/null
+++ b/docs/vhs/connect-shop-no-input.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/connect-shop-no-input.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 770
+
+Type "vana connect shop --no-input"
+Sleep 500ms
+Enter
+Sleep 4000ms
diff --git a/docs/vhs/connect-shop.tape b/docs/vhs/connect-shop.tape
new file mode 100644
index 00000000..5e7416cf
--- /dev/null
+++ b/docs/vhs/connect-shop.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/connect-shop.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 686
+
+Type "vana connect shop"
+Sleep 500ms
+Enter
+Sleep 3000ms
diff --git a/docs/vhs/connect-steam-no-input.tape b/docs/vhs/connect-steam-no-input.tape
new file mode 100644
index 00000000..0a454362
--- /dev/null
+++ b/docs/vhs/connect-steam-no-input.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/connect-steam-no-input.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 518
+
+Type "vana connect steam --no-input"
+Sleep 500ms
+Enter
+Sleep 3000ms
diff --git a/docs/vhs/connect-steam.tape b/docs/vhs/connect-steam.tape
new file mode 100644
index 00000000..59fec94d
--- /dev/null
+++ b/docs/vhs/connect-steam.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/connect-steam.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 518
+
+Type "vana connect steam"
+Sleep 500ms
+Enter
+Sleep 3000ms
diff --git a/docs/vhs/data-help.tape b/docs/vhs/data-help.tape
new file mode 100644
index 00000000..caf27fc4
--- /dev/null
+++ b/docs/vhs/data-help.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/data-help.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 658
+
+Type "vana data"
+Sleep 500ms
+Enter
+Sleep 3000ms
diff --git a/docs/vhs/data-list-empty.tape b/docs/vhs/data-list-empty.tape
new file mode 100644
index 00000000..7c7c8287
--- /dev/null
+++ b/docs/vhs/data-list-empty.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/data-list-empty.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 378
+
+Type "vana data list"
+Sleep 500ms
+Enter
+Sleep 3000ms
diff --git a/docs/vhs/data-list.tape b/docs/vhs/data-list.tape
new file mode 100644
index 00000000..30fd16a7
--- /dev/null
+++ b/docs/vhs/data-list.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/data-list.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 882
+
+Type "vana data list"
+Sleep 500ms
+Enter
+Sleep 4000ms
diff --git a/docs/vhs/data-path-github.tape b/docs/vhs/data-path-github.tape
new file mode 100644
index 00000000..7b7916c5
--- /dev/null
+++ b/docs/vhs/data-path-github.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/data-path-github.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 350
+
+Type "vana data path github"
+Sleep 500ms
+Enter
+Sleep 4000ms
diff --git a/docs/vhs/data-show-github-missing.tape b/docs/vhs/data-show-github-missing.tape
new file mode 100644
index 00000000..e38c5929
--- /dev/null
+++ b/docs/vhs/data-show-github-missing.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/data-show-github-missing.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 350
+
+Type "vana data show github"
+Sleep 500ms
+Enter
+Sleep 3000ms
diff --git a/docs/vhs/data-show-github.tape b/docs/vhs/data-show-github.tape
new file mode 100644
index 00000000..72d15692
--- /dev/null
+++ b/docs/vhs/data-show-github.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/data-show-github.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 714
+
+Type "vana data show github"
+Sleep 500ms
+Enter
+Sleep 3000ms
diff --git a/docs/vhs/demo.tape b/docs/vhs/demo.tape
new file mode 100644
index 00000000..38081eea
--- /dev/null
+++ b/docs/vhs/demo.tape
@@ -0,0 +1,43 @@
+# Hero demo for the README.
+# Shows the core workflow: connect a source, then inspect the collected data.
+#
+# Render with the existing infrastructure:
+# pnpm demo:vhs
+#
+# Or manually (requires seeded fixture HOME):
+# HOME="$PWD/docs/vhs/fixtures/demo-home" \
+# VANA_DEMO_FAST_SUCCESS=1 \
+# vhs docs/vhs/demo.tape
+
+Output docs/assets/demo.gif
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set Height 800
+Set Framerate 24
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set WindowBarSize 40
+Set Padding 20
+Set BorderRadius 10
+
+# Connect GitHub. Output streams over ~5-6s from the fixture connector.
+Type "vana connect github"
+Sleep 500ms
+Enter
+Sleep 7000ms
+
+# Let the viewer read the success message and path.
+Sleep 2000ms
+
+# Inspect what was collected.
+Type "vana data show github"
+Sleep 500ms
+Enter
+Sleep 3000ms
+
+# Hold on the summary so the viewer can read it before the GIF loops.
+Sleep 4000ms
diff --git a/docs/vhs/doctor.tape b/docs/vhs/doctor.tape
new file mode 100644
index 00000000..b0fbef4a
--- /dev/null
+++ b/docs/vhs/doctor.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/doctor.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 1694
+
+Type "vana doctor"
+Sleep 500ms
+Enter
+Sleep 5000ms
diff --git a/docs/vhs/fixtures/demo-data-connectors/connectors/github/github-playwright.js b/docs/vhs/fixtures/demo-data-connectors/connectors/github/github-playwright.js
new file mode 100644
index 00000000..4edb1579
--- /dev/null
+++ b/docs/vhs/fixtures/demo-data-connectors/connectors/github/github-playwright.js
@@ -0,0 +1,53 @@
+const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
+
+(async () => {
+ await page.setData("status", "Checking GitHub login...");
+
+ if (process.env.VANA_DEMO_FAST_SUCCESS !== "1") {
+ await page.requestInput({
+ message: "Log in to GitHub",
+ schema: {
+ type: "object",
+ properties: {
+ username: { type: "string" },
+ password: { type: "string", format: "password" },
+ },
+ },
+ });
+ }
+
+ const demoDelay = process.env.VANA_DEMO_FAST_SUCCESS === "1" ? 800 : 120;
+ await delay(demoDelay);
+ await page.setData(
+ "status",
+ "Login confirmed. Collecting data in background...",
+ );
+ await page.setProgress({
+ phase: { step: 1, total: 3, label: "Profile" },
+ message: "Fetching profile...",
+ });
+ await delay(demoDelay);
+ await page.setProgress({
+ phase: { step: 2, total: 3, label: "Repositories" },
+ message: "Fetched 2 repositories",
+ count: 2,
+ });
+ await delay(demoDelay);
+ await page.setProgress({
+ phase: { step: 3, total: 3, label: "Starred" },
+ message: "Fetched 0 starred repositories",
+ count: 0,
+ });
+ await delay(demoDelay);
+
+ return {
+ profile: { username: "tnunamak" },
+ repositories: [{ name: "vana-connect" }, { name: "data-connectors" }],
+ starred: [],
+ exportSummary: {
+ count: 2,
+ label: "items",
+ details: "2 repositories, 0 starred",
+ },
+ };
+})();
diff --git a/docs/vhs/fixtures/demo-data-connectors/connectors/shop/shop-playwright.js b/docs/vhs/fixtures/demo-data-connectors/connectors/shop/shop-playwright.js
new file mode 100644
index 00000000..4dcbcda9
--- /dev/null
+++ b/docs/vhs/fixtures/demo-data-connectors/connectors/shop/shop-playwright.js
@@ -0,0 +1,8 @@
+(async () => {
+ await page.showBrowser("https://shop.app/account/order-history");
+ await page.promptUser(
+ "Finish signing in to Shop in the browser window.",
+ async () => false,
+ 1,
+ );
+})();
diff --git a/docs/vhs/fixtures/demo-data-connectors/connectors/spotify/spotify-playwright.js b/docs/vhs/fixtures/demo-data-connectors/connectors/spotify/spotify-playwright.js
new file mode 100644
index 00000000..99c5effa
--- /dev/null
+++ b/docs/vhs/fixtures/demo-data-connectors/connectors/spotify/spotify-playwright.js
@@ -0,0 +1,16 @@
+(async () => {
+ await page.requestInput({
+ message: "Connect Spotify",
+ schema: {
+ type: "object",
+ properties: {
+ email: { type: "string" },
+ },
+ },
+ });
+
+ return {
+ profile: { username: "tnunamak" },
+ playlists: [{ name: "Data Portability" }, { name: "Build Flow" }],
+ };
+})();
diff --git a/docs/vhs/fixtures/demo-data-connectors/registry.json b/docs/vhs/fixtures/demo-data-connectors/registry.json
new file mode 100644
index 00000000..732a0cc6
--- /dev/null
+++ b/docs/vhs/fixtures/demo-data-connectors/registry.json
@@ -0,0 +1,31 @@
+{
+ "connectors": [
+ {
+ "id": "github",
+ "name": "GitHub",
+ "company": "github",
+ "description": "Exports your GitHub profile, repositories, and starred repositories using Playwright browser automation.",
+ "files": {
+ "script": "connectors/github/github-playwright.js"
+ }
+ },
+ {
+ "id": "shop",
+ "name": "Shop",
+ "company": "shop",
+ "description": "Exports your Shop app order history using Playwright browser automation.",
+ "files": {
+ "script": "connectors/shop/shop-playwright.js"
+ }
+ },
+ {
+ "id": "spotify",
+ "name": "Spotify",
+ "company": "spotify",
+ "description": "Exports your Spotify playlists using Playwright browser automation.",
+ "files": {
+ "script": "connectors/spotify/spotify-playwright.js"
+ }
+ }
+ ]
+}
diff --git a/docs/vhs/fixtures/demo-home/.vana/browsers/chromium-1200/chrome-linux64/chrome b/docs/vhs/fixtures/demo-home/.vana/browsers/chromium-1200/chrome-linux64/chrome
new file mode 100755
index 00000000..e69de29b
diff --git a/docs/vhs/fixtures/demo-home/.vana/connectors/github/github-playwright.js b/docs/vhs/fixtures/demo-home/.vana/connectors/github/github-playwright.js
new file mode 100644
index 00000000..7605c970
--- /dev/null
+++ b/docs/vhs/fixtures/demo-home/.vana/connectors/github/github-playwright.js
@@ -0,0 +1 @@
+// demo fixture
diff --git a/docs/vhs/fixtures/demo-home/.vana/connectors/shop/shop-playwright.js b/docs/vhs/fixtures/demo-home/.vana/connectors/shop/shop-playwright.js
new file mode 100644
index 00000000..7605c970
--- /dev/null
+++ b/docs/vhs/fixtures/demo-home/.vana/connectors/shop/shop-playwright.js
@@ -0,0 +1 @@
+// demo fixture
diff --git a/docs/vhs/fixtures/demo-home/.vana/connectors/spotify/spotify-playwright.js b/docs/vhs/fixtures/demo-home/.vana/connectors/spotify/spotify-playwright.js
new file mode 100644
index 00000000..7605c970
--- /dev/null
+++ b/docs/vhs/fixtures/demo-home/.vana/connectors/spotify/spotify-playwright.js
@@ -0,0 +1 @@
+// demo fixture
diff --git a/docs/vhs/fixtures/demo-home/.vana/last-result.json b/docs/vhs/fixtures/demo-home/.vana/last-result.json
new file mode 100644
index 00000000..a7259568
--- /dev/null
+++ b/docs/vhs/fixtures/demo-home/.vana/last-result.json
@@ -0,0 +1,14 @@
+{
+ "profile": {
+ "username": "tnunamak"
+ },
+ "repositories": [
+ {
+ "name": "vana-connect"
+ },
+ {
+ "name": "data-connectors"
+ }
+ ],
+ "starred": []
+}
diff --git a/docs/vhs/fixtures/demo-home/.vana/spotify-result.json b/docs/vhs/fixtures/demo-home/.vana/spotify-result.json
new file mode 100644
index 00000000..a5ec06c1
--- /dev/null
+++ b/docs/vhs/fixtures/demo-home/.vana/spotify-result.json
@@ -0,0 +1,13 @@
+{
+ "profile": {
+ "username": "tnunamak"
+ },
+ "playlists": [
+ {
+ "name": "Data Portability"
+ },
+ {
+ "name": "Build Flow"
+ }
+ ]
+}
diff --git a/docs/vhs/fixtures/demo-home/.vana/vana-connect-state.json b/docs/vhs/fixtures/demo-home/.vana/vana-connect-state.json
new file mode 100644
index 00000000..77f55387
--- /dev/null
+++ b/docs/vhs/fixtures/demo-home/.vana/vana-connect-state.json
@@ -0,0 +1,61 @@
+{
+ "version": 1,
+ "sources": {
+ "github": {
+ "sessionPresent": true,
+ "lastRunAt": "2026-03-14T13:10:03.677Z",
+ "lastRunOutcome": "connected_local_only",
+ "dataState": "ingested_personal_server",
+ "lastResultPath": "/home/tnunamak/code/vana-connect-cli-pr/docs/vhs/fixtures/demo-home/.vana/last-result.json",
+ "lastLogPath": "/home/tnunamak/code/vana-connect-cli-pr/docs/vhs/fixtures/demo-home/.vana/logs/run-github-demo.log",
+ "ingestScopes": [
+ {
+ "scope": "github.profile",
+ "status": "stored",
+ "syncedAt": "2026-03-17T07:49:42.184Z"
+ },
+ {
+ "scope": "github.repositories",
+ "status": "failed",
+ "error": "HTTP 400: {\"error\":\"INVALID_BODY\",\"message\":\"Request body must be a JSON object\"}"
+ },
+ {
+ "scope": "github.starred",
+ "status": "failed",
+ "error": "HTTP 400: {\"error\":\"INVALID_BODY\",\"message\":\"Request body must be a JSON object\"}"
+ }
+ ]
+ },
+ "shop": {
+ "lastRunAt": "2026-03-14T13:11:10.000Z",
+ "lastRunOutcome": "legacy_auth",
+ "dataState": "none",
+ "lastLogPath": "/home/tnunamak/code/vana-connect-cli-pr/docs/vhs/fixtures/demo-home/.vana/logs/run-shop-demo.log"
+ },
+ "steam": {
+ "lastRunAt": "2026-03-14T13:12:00.000Z",
+ "lastRunOutcome": "connector_unavailable",
+ "dataState": "none",
+ "lastError": "No connector is available for steam right now.",
+ "lastLogPath": "/home/tnunamak/code/vana-connect-cli-pr/docs/vhs/fixtures/demo-home/.vana/logs/fetch-steam-demo.log"
+ },
+ "spotify": {
+ "lastRunAt": "2026-03-13T21:23:00.000Z",
+ "lastRunOutcome": "connected_local_only",
+ "dataState": "ingested_personal_server",
+ "lastResultPath": "/home/tnunamak/code/vana-connect-cli-pr/docs/vhs/fixtures/demo-home/.vana/spotify-result.json",
+ "ingestScopes": [
+ {
+ "scope": "spotify.profile",
+ "status": "stored",
+ "syncedAt": "2026-03-17T07:49:42.272Z"
+ },
+ {
+ "scope": "spotify.playlists",
+ "status": "failed",
+ "error": "HTTP 400: {\"error\":\"INVALID_BODY\",\"message\":\"Request body must be a JSON object\"}"
+ }
+ ]
+ }
+ }
+}
diff --git a/docs/vhs/help.tape b/docs/vhs/help.tape
new file mode 100644
index 00000000..98259268
--- /dev/null
+++ b/docs/vhs/help.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/help.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 1162
+
+Type "vana"
+Sleep 500ms
+Enter
+Sleep 5000ms
diff --git a/docs/vhs/logs.tape b/docs/vhs/logs.tape
new file mode 100644
index 00000000..1c3a3e20
--- /dev/null
+++ b/docs/vhs/logs.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/logs.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 770
+
+Type "vana logs"
+Sleep 500ms
+Enter
+Sleep 4000ms
diff --git a/docs/vhs/server-data.tape b/docs/vhs/server-data.tape
new file mode 100644
index 00000000..2873c739
--- /dev/null
+++ b/docs/vhs/server-data.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/server-data.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 350
+
+Type "vana server data"
+Sleep 500ms
+Enter
+Sleep 4000ms
diff --git a/docs/vhs/server-status.tape b/docs/vhs/server-status.tape
new file mode 100644
index 00000000..862a4c13
--- /dev/null
+++ b/docs/vhs/server-status.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/server-status.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 406
+
+Type "vana server status"
+Sleep 500ms
+Enter
+Sleep 4000ms
diff --git a/docs/vhs/server-sync.tape b/docs/vhs/server-sync.tape
new file mode 100644
index 00000000..9ded19d1
--- /dev/null
+++ b/docs/vhs/server-sync.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/server-sync.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 462
+
+Type "vana server sync"
+Sleep 500ms
+Enter
+Sleep 5000ms
diff --git a/docs/vhs/setup.tape b/docs/vhs/setup.tape
new file mode 100644
index 00000000..fca9f465
--- /dev/null
+++ b/docs/vhs/setup.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/setup.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 434
+
+Type "vana setup"
+Sleep 500ms
+Enter
+Sleep 3000ms
diff --git a/docs/vhs/sources-github.tape b/docs/vhs/sources-github.tape
new file mode 100644
index 00000000..fa86595b
--- /dev/null
+++ b/docs/vhs/sources-github.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/sources-github.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 518
+
+Type "vana sources github"
+Sleep 500ms
+Enter
+Sleep 4000ms
diff --git a/docs/vhs/sources.tape b/docs/vhs/sources.tape
new file mode 100644
index 00000000..4c46cd36
--- /dev/null
+++ b/docs/vhs/sources.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/sources.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 770
+
+Type "vana sources"
+Sleep 500ms
+Enter
+Sleep 4000ms
diff --git a/docs/vhs/status.tape b/docs/vhs/status.tape
new file mode 100644
index 00000000..7297cd64
--- /dev/null
+++ b/docs/vhs/status.tape
@@ -0,0 +1,19 @@
+Output docs/vhs/status.gif
+Require vana
+
+Set Shell "bash"
+Set FontSize 22
+Set Width 1200
+Set TypingSpeed 50ms
+Set CursorBlink false
+Set Theme "Catppuccin Mocha"
+Set WindowBar Colorful
+Set Padding 20
+Set BorderRadius 10
+Set LoopOffset 50%
+Set Height 1190
+
+Type "vana status"
+Sleep 500ms
+Enter
+Sleep 5000ms
diff --git a/install/install.ps1 b/install/install.ps1
new file mode 100644
index 00000000..2fa99ab0
--- /dev/null
+++ b/install/install.ps1
@@ -0,0 +1,182 @@
+$ErrorActionPreference = "Stop"
+
+$Repo = if ($env:VANA_RELEASE_REPO) { $env:VANA_RELEASE_REPO } else { "vana-com/vana-connect" }
+$Version = if ($env:VANA_VERSION) { $env:VANA_VERSION } else { "" }
+$BinDir = if ($env:VANA_INSTALL_BIN_DIR) { $env:VANA_INSTALL_BIN_DIR } else { Join-Path $HOME "AppData\Local\Microsoft\WinGet\Links" }
+$InstallRoot = if ($env:VANA_INSTALL_ROOT) { $env:VANA_INSTALL_ROOT } else { Join-Path $HOME "AppData\Local\Vana" }
+$ReleaseApiUrl = if ($env:VANA_RELEASE_API_URL) { $env:VANA_RELEASE_API_URL } else { "https://api.github.com/repos/$Repo/releases/latest" }
+$ReleaseBaseUrl = if ($env:VANA_RELEASE_BASE_URL) { $env:VANA_RELEASE_BASE_URL } else { "https://github.com/$Repo/releases/download" }
+
+for ($i = 0; $i -lt $args.Length; $i++) {
+ switch ($args[$i]) {
+ "--version" {
+ $Version = $args[$i + 1]
+ $i++
+ }
+ "--bin-dir" {
+ $BinDir = $args[$i + 1]
+ $i++
+ }
+ "--install-root" {
+ $InstallRoot = $args[$i + 1]
+ $i++
+ }
+ "--repo" {
+ $Repo = $args[$i + 1]
+ $i++
+ }
+ default {
+ throw "Unknown argument: $($args[$i])"
+ }
+ }
+}
+
+$TargetArch = switch ([System.Runtime.InteropServices.RuntimeInformation]::OSArchitecture.ToString()) {
+ "Arm64" { "arm64" }
+ "X64" { "x64" }
+ default { throw "Unsupported architecture: $($_)" }
+}
+
+if (-not $Version) {
+ $Attempts = 5
+ for ($Attempt = 1; $Attempt -le $Attempts; $Attempt++) {
+ try {
+ $Release = Invoke-RestMethod -Uri $ReleaseApiUrl
+ break
+ }
+ catch {
+ if ($Attempt -eq $Attempts) {
+ throw
+ }
+ Start-Sleep -Seconds 2
+ }
+ }
+ $Version = $Release.tag_name
+}
+
+if (-not $Version) {
+ throw "Unable to resolve a release version for $Repo"
+}
+
+$AssetBase = "vana-win32-$TargetArch"
+$ArchiveName = "$AssetBase.zip"
+$ChecksumName = "$ArchiveName.sha256"
+$UseRemoteReleaseBase = $ReleaseBaseUrl -match '^(https?|file)://'
+if ($UseRemoteReleaseBase) {
+ $DownloadBase = "$ReleaseBaseUrl/$Version"
+ $ArchiveUrl = "$DownloadBase/$ArchiveName"
+ $ChecksumUrl = "$DownloadBase/$ChecksumName"
+}
+else {
+ $DownloadBase = Join-Path $ReleaseBaseUrl $Version
+ $ArchiveUrl = Join-Path $DownloadBase $ArchiveName
+ $ChecksumUrl = Join-Path $DownloadBase $ChecksumName
+}
+
+$TempDir = Join-Path ([System.IO.Path]::GetTempPath()) ("vana-install-" + [System.Guid]::NewGuid().ToString("N"))
+New-Item -ItemType Directory -Path $TempDir | Out-Null
+
+function Copy-VanaAsset {
+ param(
+ [Parameter(Mandatory = $true)]
+ [string]$Source,
+ [Parameter(Mandatory = $true)]
+ [string]$Destination
+ )
+
+ if ($Source -match '^(https?)://') {
+ $Attempts = 8
+ for ($Attempt = 1; $Attempt -le $Attempts; $Attempt++) {
+ try {
+ Invoke-WebRequest -Uri $Source -OutFile $Destination
+ return
+ }
+ catch {
+ if ($Attempt -eq $Attempts) {
+ throw
+ }
+ Start-Sleep -Seconds 2
+ }
+ }
+ return
+ }
+
+ $ResolvedPath = $Source
+ if ($Source.StartsWith('file://')) {
+ $ResolvedPath = ([System.Uri]$Source).LocalPath
+ }
+
+ Copy-Item -Path $ResolvedPath -Destination $Destination -Force
+}
+
+try {
+ Write-Host "Installing $AssetBase from $Version"
+ $ArchivePath = Join-Path $TempDir $ArchiveName
+ $ChecksumPath = Join-Path $TempDir $ChecksumName
+
+ Copy-VanaAsset -Source $ArchiveUrl -Destination $ArchivePath
+ Copy-VanaAsset -Source $ChecksumUrl -Destination $ChecksumPath
+
+ $Expected = (Get-Content $ChecksumPath).Split(" ", [System.StringSplitOptions]::RemoveEmptyEntries)[0].Trim()
+ $Actual = (Get-FileHash -Path $ArchivePath -Algorithm SHA256).Hash.ToLowerInvariant()
+ if ($Expected.ToLowerInvariant() -ne $Actual) {
+ throw "Checksum verification failed"
+ }
+
+ $ReleaseDir = Join-Path $InstallRoot "releases\$Version"
+ Expand-Archive -Path $ArchivePath -DestinationPath $TempDir -Force
+ $ExtractedDir = Join-Path $TempDir $AssetBase
+ if (-not (Test-Path $ExtractedDir)) {
+ throw "Unexpected archive layout: missing $ExtractedDir"
+ }
+
+ if (Test-Path $ReleaseDir) {
+ Remove-Item $ReleaseDir -Recurse -Force
+ }
+ Copy-Item -Path $ExtractedDir -Destination $ReleaseDir -Recurse
+
+ $CurrentDir = Join-Path $InstallRoot "current"
+ if (Test-Path $CurrentDir) {
+ Remove-Item $CurrentDir -Recurse -Force
+ }
+ Copy-Item -Path $ReleaseDir -Destination $CurrentDir -Recurse
+
+ New-Item -ItemType Directory -Force -Path $BinDir | Out-Null
+ $WrapperPath = Join-Path $BinDir "vana.cmd"
+ @(
+ "@echo off"
+ "`"$CurrentDir\vana.exe`" %*"
+ ) | Set-Content -Path $WrapperPath -Encoding ASCII
+
+ $ExePath = Join-Path $BinDir "vana.exe"
+ if (Test-Path $ExePath) {
+ Remove-Item $ExePath -Force
+ }
+
+ $StatusOutput = & $WrapperPath status --json 2>&1
+ if ($LASTEXITCODE -ne 0) {
+ throw "Installed vana failed a post-install self-check. The release payload may be incomplete.`n$StatusOutput"
+ }
+
+ $UserPath = [Environment]::GetEnvironmentVariable("Path", "User")
+ $PathEntries = @()
+ if ($UserPath) {
+ $PathEntries = $UserPath.Split(";")
+ }
+ if ($PathEntries -notcontains $BinDir) {
+ $NewPath = if ($UserPath) { "$BinDir;$UserPath" } else { $BinDir }
+ [Environment]::SetEnvironmentVariable("Path", $NewPath, "User")
+ Write-Host ""
+ Write-Host "Added $BinDir to your user PATH. Open a new terminal if 'vana' is not available yet."
+ }
+
+ Write-Host ""
+ Write-Host "Installed vana to $WrapperPath"
+ Write-Host "Next step:"
+ Write-Host " vana status"
+}
+finally {
+ if (Test-Path $TempDir) {
+ Remove-Item $TempDir -Recurse -Force
+ }
+}
diff --git a/install/install.sh b/install/install.sh
new file mode 100644
index 00000000..e1713e20
--- /dev/null
+++ b/install/install.sh
@@ -0,0 +1,153 @@
+#!/usr/bin/env sh
+set -eu
+
+REPO="${VANA_RELEASE_REPO:-vana-com/vana-connect}"
+VERSION="${VANA_VERSION:-}"
+BIN_DIR="${VANA_INSTALL_BIN_DIR:-$HOME/.local/bin}"
+INSTALL_ROOT="${VANA_INSTALL_ROOT:-$HOME/.local/share/vana}"
+RELEASE_API_URL="${VANA_RELEASE_API_URL:-https://api.github.com/repos/$REPO/releases/latest}"
+RELEASE_BASE_URL="${VANA_RELEASE_BASE_URL:-https://github.com/$REPO/releases/download}"
+
+while [ "$#" -gt 0 ]; do
+ case "$1" in
+ --version)
+ VERSION="$2"
+ shift 2
+ ;;
+ --bin-dir)
+ BIN_DIR="$2"
+ shift 2
+ ;;
+ --install-root)
+ INSTALL_ROOT="$2"
+ shift 2
+ ;;
+ --repo)
+ REPO="$2"
+ shift 2
+ ;;
+ *)
+ echo "Unknown argument: $1" >&2
+ exit 1
+ ;;
+ esac
+done
+
+need_cmd() {
+ if ! command -v "$1" >/dev/null 2>&1; then
+ echo "Missing required command: $1" >&2
+ exit 1
+ fi
+}
+
+need_cmd curl
+need_cmd tar
+
+download_asset() {
+ url="$1"
+ destination="$2"
+ curl --retry 8 --retry-delay 2 --retry-all-errors -fsSL "$url" -o "$destination"
+}
+
+OS="$(uname -s)"
+ARCH="$(uname -m)"
+
+case "$OS" in
+ Linux) PLATFORM="linux" ;;
+ Darwin) PLATFORM="darwin" ;;
+ *)
+ echo "Unsupported operating system: $OS" >&2
+ exit 1
+ ;;
+esac
+
+case "$ARCH" in
+ x86_64|amd64) TARGET_ARCH="x64" ;;
+ arm64|aarch64) TARGET_ARCH="arm64" ;;
+ *)
+ echo "Unsupported architecture: $ARCH" >&2
+ exit 1
+ ;;
+esac
+
+if [ -z "$VERSION" ]; then
+ VERSION="$(
+ curl --retry 5 --retry-delay 2 --retry-all-errors -fsSL "$RELEASE_API_URL" |
+ sed -n 's/.*"tag_name":[[:space:]]*"\([^"]*\)".*/\1/p' |
+ head -n 1
+ )"
+fi
+
+if [ -z "$VERSION" ]; then
+ echo "Unable to resolve a release version for $REPO" >&2
+ exit 1
+fi
+
+ASSET_BASE="vana-$PLATFORM-$TARGET_ARCH"
+ARCHIVE_NAME="$ASSET_BASE.tar.gz"
+CHECKSUM_NAME="$ARCHIVE_NAME.sha256"
+DOWNLOAD_BASE="$RELEASE_BASE_URL/$VERSION"
+ARCHIVE_URL="$DOWNLOAD_BASE/$ARCHIVE_NAME"
+CHECKSUM_URL="$DOWNLOAD_BASE/$CHECKSUM_NAME"
+
+TMP_DIR="$(mktemp -d)"
+cleanup() {
+ rm -rf "$TMP_DIR"
+}
+trap cleanup EXIT INT TERM
+
+echo "Installing $ASSET_BASE from $VERSION"
+download_asset "$ARCHIVE_URL" "$TMP_DIR/$ARCHIVE_NAME"
+download_asset "$CHECKSUM_URL" "$TMP_DIR/$CHECKSUM_NAME"
+
+if command -v sha256sum >/dev/null 2>&1; then
+ (cd "$TMP_DIR" && sha256sum -c "$CHECKSUM_NAME")
+elif command -v shasum >/dev/null 2>&1; then
+ EXPECTED="$(awk '{print $1}' "$TMP_DIR/$CHECKSUM_NAME")"
+ ACTUAL="$(shasum -a 256 "$TMP_DIR/$ARCHIVE_NAME" | awk '{print $1}')"
+ if [ "$EXPECTED" != "$ACTUAL" ]; then
+ echo "Checksum verification failed" >&2
+ exit 1
+ fi
+else
+ echo "Missing checksum verifier: expected sha256sum or shasum" >&2
+ exit 1
+fi
+
+mkdir -p "$INSTALL_ROOT/releases/$VERSION" "$BIN_DIR"
+RELEASE_DIR="$INSTALL_ROOT/releases/$VERSION"
+EXTRACTED_DIR="$TMP_DIR/$ASSET_BASE"
+
+rm -rf "$RELEASE_DIR"
+tar -xzf "$TMP_DIR/$ARCHIVE_NAME" -C "$TMP_DIR"
+
+if [ ! -d "$EXTRACTED_DIR" ]; then
+ echo "Unexpected archive layout: missing $EXTRACTED_DIR" >&2
+ exit 1
+fi
+
+mkdir -p "$RELEASE_DIR"
+cp -R "$EXTRACTED_DIR/." "$RELEASE_DIR"
+
+ln -sfn "$INSTALL_ROOT/releases/$VERSION" "$INSTALL_ROOT/current"
+ln -sfn "$INSTALL_ROOT/current/vana" "$BIN_DIR/vana"
+
+if ! HOME="${HOME}" VANA_APP_ROOT="$INSTALL_ROOT/current/app" "$BIN_DIR/vana" status --json >/dev/null 2>&1; then
+ echo "Installed vana failed a post-install self-check. The release payload may be incomplete." >&2
+ exit 1
+fi
+
+echo "Installed vana to $BIN_DIR/vana"
+case ":$PATH:" in
+ *":$BIN_DIR:"*) ;;
+ *)
+ echo ""
+ echo "$BIN_DIR is not on your PATH."
+ echo "Add this line to your shell profile:"
+ echo " export PATH=\"$BIN_DIR:\$PATH\""
+ ;;
+esac
+
+echo ""
+echo "Next step:"
+echo " vana status"
diff --git a/package.json b/package.json
index 570b6d6b..8c893dfc 100644
--- a/package.json
+++ b/package.json
@@ -16,6 +16,9 @@
"publishConfig": {
"access": "public"
},
+ "bin": {
+ "vana": "./dist/cli/bin.js"
+ },
"files": [
"dist"
],
@@ -35,12 +38,39 @@
"./core": {
"types": "./dist/core/index.d.ts",
"import": "./dist/core/index.js"
+ },
+ "./runtime": {
+ "types": "./dist/runtime/index.d.ts",
+ "import": "./dist/runtime/index.js"
+ },
+ "./connectors": {
+ "types": "./dist/connectors/index.d.ts",
+ "import": "./dist/connectors/index.js"
+ },
+ "./cli": {
+ "types": "./dist/cli/main.d.ts",
+ "import": "./dist/cli/main.js"
}
},
"scripts": {
"build": "tsc --build",
- "clean": "tsc --build --clean",
+ "build:sea": "npx -y node@25 ./scripts/build-sea.mjs",
+ "build:sea:smoke": "npx -y node@25 ./scripts/build-sea.mjs --smoke",
+ "clean": "node ./scripts/clean-build.mjs",
+ "cli": "node dist/cli/bin.js",
+ "demo:vhs": "pnpm build && node ./scripts/render-vhs.mjs",
+ "demo:vhs:fixtures": "node ./scripts/prepare-vhs-fixtures.mjs",
+ "demo:transcripts": "pnpm build && node ./scripts/capture-cli-transcripts.mjs",
+ "pack:check": "node ./scripts/assert-pack-contents.mjs",
+ "sea:check": "node ./scripts/assert-sea-artifact.mjs",
+ "homebrew:check": "node ./scripts/assert-homebrew-formula-sync.mjs",
+ "release:check-demo-assets": "node ./scripts/assert-release-demo-assets.mjs",
+ "release:watch": "node ./scripts/watch-release-lane.mjs",
"skills:sync": "node scripts/sync-skills.js",
+ "test:install:unix": "sh ./scripts/test-install-unix.sh",
+ "test:install:windows": "pwsh -File ./scripts/test-install-windows.ps1",
+ "test:install:github-release": "sh ./scripts/test-install-github-release.sh",
+ "test:runtime:cookie-import": "vitest run test/runtime/browser.test.ts",
"test": "vitest run",
"test:watch": "vitest",
"test:e2e": "vitest run --config test/e2e/vitest.config.ts",
@@ -49,6 +79,10 @@
"lint:eslint:fix": "eslint --fix 'src/**/*.ts' 'src/**/*.tsx'",
"format": "prettier --write .",
"format:check": "prettier --check .",
+ "package-managers:generate": "node ./scripts/generate-package-manager-metadata.mjs",
+ "preflight:cli": "pnpm test test/core/state-store.test.ts test/cli/render-format.test.ts test/runtime/in-process-run.test.ts test/cli/index.test.ts && pnpm build && pnpm format:check && pnpm demo:transcripts",
+ "release:collect-assets": "node ./scripts/collect-release-assets.mjs",
+ "runtime:footprint": "node ./scripts/report-runtime-footprint.mjs",
"validate": "npm run lint && npm run lint:eslint && npm run format:check && npm test",
"prepare": "husky"
},
@@ -64,9 +98,19 @@
"engines": {
"node": ">=20.0.0"
},
+ "dependencies": {
+ "@inquirer/prompts": "8.3.0",
+ "@modelcontextprotocol/sdk": "^1.27.1",
+ "chromium-bidi": "15.0.0",
+ "commander": "14.0.3",
+ "ora": "^8.2.0",
+ "picocolors": "^1.1.1",
+ "playwright": "1.58.2",
+ "zod": "4.3.6"
+ },
"peerDependencies": {
- "viem": "^2.0.0",
- "react": ">=18.0.0"
+ "react": ">=18.0.0",
+ "viem": "^2.0.0"
},
"peerDependenciesMeta": {
"react": {
@@ -84,6 +128,7 @@
"@types/node": "^25.2.0",
"@types/react": "^19.0.0",
"conventional-changelog-conventionalcommits": "^9.1.0",
+ "esbuild": "^0.27.4",
"eslint": "^9.39.2",
"eslint-config-prettier": "^10.1.8",
"husky": "^9.1.7",
diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml
index e21446dd..5fd48937 100644
--- a/pnpm-lock.yaml
+++ b/pnpm-lock.yaml
@@ -6,6 +6,31 @@ settings:
importers:
.:
+ dependencies:
+ "@inquirer/prompts":
+ specifier: 8.3.0
+ version: 8.3.0(@types/node@25.2.2)
+ "@modelcontextprotocol/sdk":
+ specifier: ^1.27.1
+ version: 1.27.1(zod@4.3.6)
+ chromium-bidi:
+ specifier: 15.0.0
+ version: 15.0.0(devtools-protocol@0.0.1596832)
+ commander:
+ specifier: 14.0.3
+ version: 14.0.3
+ ora:
+ specifier: ^8.2.0
+ version: 8.2.0
+ picocolors:
+ specifier: ^1.1.1
+ version: 1.1.1
+ playwright:
+ specifier: 1.58.2
+ version: 1.58.2
+ zod:
+ specifier: 4.3.6
+ version: 4.3.6
devDependencies:
"@commitlint/cli":
specifier: ^20.4.1
@@ -37,6 +62,9 @@ importers:
conventional-changelog-conventionalcommits:
specifier: ^9.1.0
version: 9.1.0
+ esbuild:
+ specifier: ^0.27.4
+ version: 0.27.4
eslint:
specifier: ^9.39.2
version: 9.39.2(jiti@2.6.1)
@@ -849,235 +877,235 @@ packages:
integrity: sha512-dFoMUuQA20zvtVTuxZww6OHoJYgrzfKM1t52mVySDJnMSEa08ruEvdYQbhvyu6soU+NeLVd3yKfTfT0NeV6qGg==,
}
- "@esbuild/aix-ppc64@0.27.3":
+ "@esbuild/aix-ppc64@0.27.4":
resolution:
{
- integrity: sha512-9fJMTNFTWZMh5qwrBItuziu834eOCUcEqymSH7pY+zoMVEZg3gcPuBNxH1EvfVYe9h0x/Ptw8KBzv7qxb7l8dg==,
+ integrity: sha512-cQPwL2mp2nSmHHJlCyoXgHGhbEPMrEEU5xhkcy3Hs/O7nGZqEpZ2sUtLaL9MORLtDfRvVl2/3PAuEkYZH0Ty8Q==,
}
engines: { node: ">=18" }
cpu: [ppc64]
os: [aix]
- "@esbuild/android-arm64@0.27.3":
+ "@esbuild/android-arm64@0.27.4":
resolution:
{
- integrity: sha512-YdghPYUmj/FX2SYKJ0OZxf+iaKgMsKHVPF1MAq/P8WirnSpCStzKJFjOjzsW0QQ7oIAiccHdcqjbHmJxRb/dmg==,
+ integrity: sha512-gdLscB7v75wRfu7QSm/zg6Rx29VLdy9eTr2t44sfTW7CxwAtQghZ4ZnqHk3/ogz7xao0QAgrkradbBzcqFPasw==,
}
engines: { node: ">=18" }
cpu: [arm64]
os: [android]
- "@esbuild/android-arm@0.27.3":
+ "@esbuild/android-arm@0.27.4":
resolution:
{
- integrity: sha512-i5D1hPY7GIQmXlXhs2w8AWHhenb00+GxjxRncS2ZM7YNVGNfaMxgzSGuO8o8SJzRc/oZwU2bcScvVERk03QhzA==,
+ integrity: sha512-X9bUgvxiC8CHAGKYufLIHGXPJWnr0OCdR0anD2e21vdvgCI8lIfqFbnoeOz7lBjdrAGUhqLZLcQo6MLhTO2DKQ==,
}
engines: { node: ">=18" }
cpu: [arm]
os: [android]
- "@esbuild/android-x64@0.27.3":
+ "@esbuild/android-x64@0.27.4":
resolution:
{
- integrity: sha512-IN/0BNTkHtk8lkOM8JWAYFg4ORxBkZQf9zXiEOfERX/CzxW3Vg1ewAhU7QSWQpVIzTW+b8Xy+lGzdYXV6UZObQ==,
+ integrity: sha512-PzPFnBNVF292sfpfhiyiXCGSn9HZg5BcAz+ivBuSsl6Rk4ga1oEXAamhOXRFyMcjwr2DVtm40G65N3GLeH1Lvw==,
}
engines: { node: ">=18" }
cpu: [x64]
os: [android]
- "@esbuild/darwin-arm64@0.27.3":
+ "@esbuild/darwin-arm64@0.27.4":
resolution:
{
- integrity: sha512-Re491k7ByTVRy0t3EKWajdLIr0gz2kKKfzafkth4Q8A5n1xTHrkqZgLLjFEHVD+AXdUGgQMq+Godfq45mGpCKg==,
+ integrity: sha512-b7xaGIwdJlht8ZFCvMkpDN6uiSmnxxK56N2GDTMYPr2/gzvfdQN8rTfBsvVKmIVY/X7EM+/hJKEIbbHs9oA4tQ==,
}
engines: { node: ">=18" }
cpu: [arm64]
os: [darwin]
- "@esbuild/darwin-x64@0.27.3":
+ "@esbuild/darwin-x64@0.27.4":
resolution:
{
- integrity: sha512-vHk/hA7/1AckjGzRqi6wbo+jaShzRowYip6rt6q7VYEDX4LEy1pZfDpdxCBnGtl+A5zq8iXDcyuxwtv3hNtHFg==,
+ integrity: sha512-sR+OiKLwd15nmCdqpXMnuJ9W2kpy0KigzqScqHI3Hqwr7IXxBp3Yva+yJwoqh7rE8V77tdoheRYataNKL4QrPw==,
}
engines: { node: ">=18" }
cpu: [x64]
os: [darwin]
- "@esbuild/freebsd-arm64@0.27.3":
+ "@esbuild/freebsd-arm64@0.27.4":
resolution:
{
- integrity: sha512-ipTYM2fjt3kQAYOvo6vcxJx3nBYAzPjgTCk7QEgZG8AUO3ydUhvelmhrbOheMnGOlaSFUoHXB6un+A7q4ygY9w==,
+ integrity: sha512-jnfpKe+p79tCnm4GVav68A7tUFeKQwQyLgESwEAUzyxk/TJr4QdGog9sqWNcUbr/bZt/O/HXouspuQDd9JxFSw==,
}
engines: { node: ">=18" }
cpu: [arm64]
os: [freebsd]
- "@esbuild/freebsd-x64@0.27.3":
+ "@esbuild/freebsd-x64@0.27.4":
resolution:
{
- integrity: sha512-dDk0X87T7mI6U3K9VjWtHOXqwAMJBNN2r7bejDsc+j03SEjtD9HrOl8gVFByeM0aJksoUuUVU9TBaZa2rgj0oA==,
+ integrity: sha512-2kb4ceA/CpfUrIcTUl1wrP/9ad9Atrp5J94Lq69w7UwOMolPIGrfLSvAKJp0RTvkPPyn6CIWrNy13kyLikZRZQ==,
}
engines: { node: ">=18" }
cpu: [x64]
os: [freebsd]
- "@esbuild/linux-arm64@0.27.3":
+ "@esbuild/linux-arm64@0.27.4":
resolution:
{
- integrity: sha512-sZOuFz/xWnZ4KH3YfFrKCf1WyPZHakVzTiqji3WDc0BCl2kBwiJLCXpzLzUBLgmp4veFZdvN5ChW4Eq/8Fc2Fg==,
+ integrity: sha512-7nQOttdzVGth1iz57kxg9uCz57dxQLHWxopL6mYuYthohPKEK0vU0C3O21CcBK6KDlkYVcnDXY099HcCDXd9dA==,
}
engines: { node: ">=18" }
cpu: [arm64]
os: [linux]
- "@esbuild/linux-arm@0.27.3":
+ "@esbuild/linux-arm@0.27.4":
resolution:
{
- integrity: sha512-s6nPv2QkSupJwLYyfS+gwdirm0ukyTFNl3KTgZEAiJDd+iHZcbTPPcWCcRYH+WlNbwChgH2QkE9NSlNrMT8Gfw==,
+ integrity: sha512-aBYgcIxX/wd5n2ys0yESGeYMGF+pv6g0DhZr3G1ZG4jMfruU9Tl1i2Z+Wnj9/KjGz1lTLCcorqE2viePZqj4Eg==,
}
engines: { node: ">=18" }
cpu: [arm]
os: [linux]
- "@esbuild/linux-ia32@0.27.3":
+ "@esbuild/linux-ia32@0.27.4":
resolution:
{
- integrity: sha512-yGlQYjdxtLdh0a3jHjuwOrxQjOZYD/C9PfdbgJJF3TIZWnm/tMd/RcNiLngiu4iwcBAOezdnSLAwQDPqTmtTYg==,
+ integrity: sha512-oPtixtAIzgvzYcKBQM/qZ3R+9TEUd1aNJQu0HhGyqtx6oS7qTpvjheIWBbes4+qu1bNlo2V4cbkISr8q6gRBFA==,
}
engines: { node: ">=18" }
cpu: [ia32]
os: [linux]
- "@esbuild/linux-loong64@0.27.3":
+ "@esbuild/linux-loong64@0.27.4":
resolution:
{
- integrity: sha512-WO60Sn8ly3gtzhyjATDgieJNet/KqsDlX5nRC5Y3oTFcS1l0KWba+SEa9Ja1GfDqSF1z6hif/SkpQJbL63cgOA==,
+ integrity: sha512-8mL/vh8qeCoRcFH2nM8wm5uJP+ZcVYGGayMavi8GmRJjuI3g1v6Z7Ni0JJKAJW+m0EtUuARb6Lmp4hMjzCBWzA==,
}
engines: { node: ">=18" }
cpu: [loong64]
os: [linux]
- "@esbuild/linux-mips64el@0.27.3":
+ "@esbuild/linux-mips64el@0.27.4":
resolution:
{
- integrity: sha512-APsymYA6sGcZ4pD6k+UxbDjOFSvPWyZhjaiPyl/f79xKxwTnrn5QUnXR5prvetuaSMsb4jgeHewIDCIWljrSxw==,
+ integrity: sha512-1RdrWFFiiLIW7LQq9Q2NES+HiD4NyT8Itj9AUeCl0IVCA459WnPhREKgwrpaIfTOe+/2rdntisegiPWn/r/aAw==,
}
engines: { node: ">=18" }
cpu: [mips64el]
os: [linux]
- "@esbuild/linux-ppc64@0.27.3":
+ "@esbuild/linux-ppc64@0.27.4":
resolution:
{
- integrity: sha512-eizBnTeBefojtDb9nSh4vvVQ3V9Qf9Df01PfawPcRzJH4gFSgrObw+LveUyDoKU3kxi5+9RJTCWlj4FjYXVPEA==,
+ integrity: sha512-tLCwNG47l3sd9lpfyx9LAGEGItCUeRCWeAx6x2Jmbav65nAwoPXfewtAdtbtit/pJFLUWOhpv0FpS6GQAmPrHA==,
}
engines: { node: ">=18" }
cpu: [ppc64]
os: [linux]
- "@esbuild/linux-riscv64@0.27.3":
+ "@esbuild/linux-riscv64@0.27.4":
resolution:
{
- integrity: sha512-3Emwh0r5wmfm3ssTWRQSyVhbOHvqegUDRd0WhmXKX2mkHJe1SFCMJhagUleMq+Uci34wLSipf8Lagt4LlpRFWQ==,
+ integrity: sha512-BnASypppbUWyqjd1KIpU4AUBiIhVr6YlHx/cnPgqEkNoVOhHg+YiSVxM1RLfiy4t9cAulbRGTNCKOcqHrEQLIw==,
}
engines: { node: ">=18" }
cpu: [riscv64]
os: [linux]
- "@esbuild/linux-s390x@0.27.3":
+ "@esbuild/linux-s390x@0.27.4":
resolution:
{
- integrity: sha512-pBHUx9LzXWBc7MFIEEL0yD/ZVtNgLytvx60gES28GcWMqil8ElCYR4kvbV2BDqsHOvVDRrOxGySBM9Fcv744hw==,
+ integrity: sha512-+eUqgb/Z7vxVLezG8bVB9SfBie89gMueS+I0xYh2tJdw3vqA/0ImZJ2ROeWwVJN59ihBeZ7Tu92dF/5dy5FttA==,
}
engines: { node: ">=18" }
cpu: [s390x]
os: [linux]
- "@esbuild/linux-x64@0.27.3":
+ "@esbuild/linux-x64@0.27.4":
resolution:
{
- integrity: sha512-Czi8yzXUWIQYAtL/2y6vogER8pvcsOsk5cpwL4Gk5nJqH5UZiVByIY8Eorm5R13gq+DQKYg0+JyQoytLQas4dA==,
+ integrity: sha512-S5qOXrKV8BQEzJPVxAwnryi2+Iq5pB40gTEIT69BQONqR7JH1EPIcQ/Uiv9mCnn05jff9umq/5nqzxlqTOg9NA==,
}
engines: { node: ">=18" }
cpu: [x64]
os: [linux]
- "@esbuild/netbsd-arm64@0.27.3":
+ "@esbuild/netbsd-arm64@0.27.4":
resolution:
{
- integrity: sha512-sDpk0RgmTCR/5HguIZa9n9u+HVKf40fbEUt+iTzSnCaGvY9kFP0YKBWZtJaraonFnqef5SlJ8/TiPAxzyS+UoA==,
+ integrity: sha512-xHT8X4sb0GS8qTqiwzHqpY00C95DPAq7nAwX35Ie/s+LO9830hrMd3oX0ZMKLvy7vsonee73x0lmcdOVXFzd6Q==,
}
engines: { node: ">=18" }
cpu: [arm64]
os: [netbsd]
- "@esbuild/netbsd-x64@0.27.3":
+ "@esbuild/netbsd-x64@0.27.4":
resolution:
{
- integrity: sha512-P14lFKJl/DdaE00LItAukUdZO5iqNH7+PjoBm+fLQjtxfcfFE20Xf5CrLsmZdq5LFFZzb5JMZ9grUwvtVYzjiA==,
+ integrity: sha512-RugOvOdXfdyi5Tyv40kgQnI0byv66BFgAqjdgtAKqHoZTbTF2QqfQrFwa7cHEORJf6X2ht+l9ABLMP0dnKYsgg==,
}
engines: { node: ">=18" }
cpu: [x64]
os: [netbsd]
- "@esbuild/openbsd-arm64@0.27.3":
+ "@esbuild/openbsd-arm64@0.27.4":
resolution:
{
- integrity: sha512-AIcMP77AvirGbRl/UZFTq5hjXK+2wC7qFRGoHSDrZ5v5b8DK/GYpXW3CPRL53NkvDqb9D+alBiC/dV0Fb7eJcw==,
+ integrity: sha512-2MyL3IAaTX+1/qP0O1SwskwcwCoOI4kV2IBX1xYnDDqthmq5ArrW94qSIKCAuRraMgPOmG0RDTA74mzYNQA9ow==,
}
engines: { node: ">=18" }
cpu: [arm64]
os: [openbsd]
- "@esbuild/openbsd-x64@0.27.3":
+ "@esbuild/openbsd-x64@0.27.4":
resolution:
{
- integrity: sha512-DnW2sRrBzA+YnE70LKqnM3P+z8vehfJWHXECbwBmH/CU51z6FiqTQTHFenPlHmo3a8UgpLyH3PT+87OViOh1AQ==,
+ integrity: sha512-u8fg/jQ5aQDfsnIV6+KwLOf1CmJnfu1ShpwqdwC0uA7ZPwFws55Ngc12vBdeUdnuWoQYx/SOQLGDcdlfXhYmXQ==,
}
engines: { node: ">=18" }
cpu: [x64]
os: [openbsd]
- "@esbuild/openharmony-arm64@0.27.3":
+ "@esbuild/openharmony-arm64@0.27.4":
resolution:
{
- integrity: sha512-NinAEgr/etERPTsZJ7aEZQvvg/A6IsZG/LgZy+81wON2huV7SrK3e63dU0XhyZP4RKGyTm7aOgmQk0bGp0fy2g==,
+ integrity: sha512-JkTZrl6VbyO8lDQO3yv26nNr2RM2yZzNrNHEsj9bm6dOwwu9OYN28CjzZkH57bh4w0I2F7IodpQvUAEd1mbWXg==,
}
engines: { node: ">=18" }
cpu: [arm64]
os: [openharmony]
- "@esbuild/sunos-x64@0.27.3":
+ "@esbuild/sunos-x64@0.27.4":
resolution:
{
- integrity: sha512-PanZ+nEz+eWoBJ8/f8HKxTTD172SKwdXebZ0ndd953gt1HRBbhMsaNqjTyYLGLPdoWHy4zLU7bDVJztF5f3BHA==,
+ integrity: sha512-/gOzgaewZJfeJTlsWhvUEmUG4tWEY2Spp5M20INYRg2ZKl9QPO3QEEgPeRtLjEWSW8FilRNacPOg8R1uaYkA6g==,
}
engines: { node: ">=18" }
cpu: [x64]
os: [sunos]
- "@esbuild/win32-arm64@0.27.3":
+ "@esbuild/win32-arm64@0.27.4":
resolution:
{
- integrity: sha512-B2t59lWWYrbRDw/tjiWOuzSsFh1Y/E95ofKz7rIVYSQkUYBjfSgf6oeYPNWHToFRr2zx52JKApIcAS/D5TUBnA==,
+ integrity: sha512-Z9SExBg2y32smoDQdf1HRwHRt6vAHLXcxD2uGgO/v2jK7Y718Ix4ndsbNMU/+1Qiem9OiOdaqitioZwxivhXYg==,
}
engines: { node: ">=18" }
cpu: [arm64]
os: [win32]
- "@esbuild/win32-ia32@0.27.3":
+ "@esbuild/win32-ia32@0.27.4":
resolution:
{
- integrity: sha512-QLKSFeXNS8+tHW7tZpMtjlNb7HKau0QDpwm49u0vUp9y1WOF+PEzkU84y9GqYaAVW8aH8f3GcBck26jh54cX4Q==,
+ integrity: sha512-DAyGLS0Jz5G5iixEbMHi5KdiApqHBWMGzTtMiJ72ZOLhbu/bzxgAe8Ue8CTS3n3HbIUHQz/L51yMdGMeoxXNJw==,
}
engines: { node: ">=18" }
cpu: [ia32]
os: [win32]
- "@esbuild/win32-x64@0.27.3":
+ "@esbuild/win32-x64@0.27.4":
resolution:
{
- integrity: sha512-4uJGhsxuptu3OcpVAzli+/gWusVGwZZHTlS63hh++ehExkVT8SgiEf7/uC/PclrPPkLhZqGgCTjd0VWLo6xMqA==,
+ integrity: sha512-+knoa0BDoeXgkNvvV1vvbZX4+hizelrkwmGJBdT17t8FNPwG2lKemmuMZlmaNQ3ws3DKKCxpb4zRZEIp3UxFCg==,
}
engines: { node: ">=18" }
cpu: [x64]
@@ -1558,6 +1586,25 @@ packages:
}
engines: { node: ">=18" }
+ "@inquirer/ansi@2.0.3":
+ resolution:
+ {
+ integrity: sha512-g44zhR3NIKVs0zUesa4iMzExmZpLUdTLRMCStqX3GE5NT6VkPcxQGJ+uC8tDgBUC/vB1rUhUd55cOf++4NZcmw==,
+ }
+ engines: { node: ">=23.5.0 || ^22.13.0 || ^21.7.0 || ^20.12.0" }
+
+ "@inquirer/checkbox@5.1.0":
+ resolution:
+ {
+ integrity: sha512-/HjF1LN0a1h4/OFsbGKHNDtWICFU/dqXCdym719HFTyJo9IG7Otr+ziGWc9S0iQuohRZllh+WprSgd5UW5Fw0g==,
+ }
+ engines: { node: ">=23.5.0 || ^22.13.0 || ^21.7.0 || ^20.12.0" }
+ peerDependencies:
+ "@types/node": ">=18"
+ peerDependenciesMeta:
+ "@types/node":
+ optional: true
+
"@inquirer/confirm@5.1.21":
resolution:
{
@@ -1570,6 +1617,18 @@ packages:
"@types/node":
optional: true
+ "@inquirer/confirm@6.0.8":
+ resolution:
+ {
+ integrity: sha512-Di6dgmiZ9xCSUxWUReWTqDtbhXCuG2MQm2xmgSAIruzQzBqNf49b8E07/vbCYY506kDe8BiwJbegXweG8M1klw==,
+ }
+ engines: { node: ">=23.5.0 || ^22.13.0 || ^21.7.0 || ^20.12.0" }
+ peerDependencies:
+ "@types/node": ">=18"
+ peerDependenciesMeta:
+ "@types/node":
+ optional: true
+
"@inquirer/core@10.3.2":
resolution:
{
@@ -1582,6 +1641,54 @@ packages:
"@types/node":
optional: true
+ "@inquirer/core@11.1.5":
+ resolution:
+ {
+ integrity: sha512-QQPAX+lka8GyLcZ7u7Nb1h6q72iZ/oy0blilC3IB2nSt1Qqxp7akt94Jqhi/DzARuN3Eo9QwJRvtl4tmVe4T5A==,
+ }
+ engines: { node: ">=23.5.0 || ^22.13.0 || ^21.7.0 || ^20.12.0" }
+ peerDependencies:
+ "@types/node": ">=18"
+ peerDependenciesMeta:
+ "@types/node":
+ optional: true
+
+ "@inquirer/editor@5.0.8":
+ resolution:
+ {
+ integrity: sha512-sLcpbb9B3XqUEGrj1N66KwhDhEckzZ4nI/W6SvLXyBX8Wic3LDLENlWRvkOGpCPoserabe+MxQkpiMoI8irvyA==,
+ }
+ engines: { node: ">=23.5.0 || ^22.13.0 || ^21.7.0 || ^20.12.0" }
+ peerDependencies:
+ "@types/node": ">=18"
+ peerDependenciesMeta:
+ "@types/node":
+ optional: true
+
+ "@inquirer/expand@5.0.8":
+ resolution:
+ {
+ integrity: sha512-QieW3F1prNw3j+hxO7/NKkG1pk3oz7pOB6+5Upwu3OIwADfPX0oZVppsqlL+Vl/uBHHDSOBY0BirLctLnXwGGg==,
+ }
+ engines: { node: ">=23.5.0 || ^22.13.0 || ^21.7.0 || ^20.12.0" }
+ peerDependencies:
+ "@types/node": ">=18"
+ peerDependenciesMeta:
+ "@types/node":
+ optional: true
+
+ "@inquirer/external-editor@2.0.3":
+ resolution:
+ {
+ integrity: sha512-LgyI7Agbda74/cL5MvA88iDpvdXI2KuMBCGRkbCl2Dg1vzHeOgs+s0SDcXV7b+WZJrv2+ERpWSM65Fpi9VfY3w==,
+ }
+ engines: { node: ">=23.5.0 || ^22.13.0 || ^21.7.0 || ^20.12.0" }
+ peerDependencies:
+ "@types/node": ">=18"
+ peerDependenciesMeta:
+ "@types/node":
+ optional: true
+
"@inquirer/figures@1.0.15":
resolution:
{
@@ -1589,6 +1696,97 @@ packages:
}
engines: { node: ">=18" }
+ "@inquirer/figures@2.0.3":
+ resolution:
+ {
+ integrity: sha512-y09iGt3JKoOCBQ3w4YrSJdokcD8ciSlMIWsD+auPu+OZpfxLuyz+gICAQ6GCBOmJJt4KEQGHuZSVff2jiNOy7g==,
+ }
+ engines: { node: ">=23.5.0 || ^22.13.0 || ^21.7.0 || ^20.12.0" }
+
+ "@inquirer/input@5.0.8":
+ resolution:
+ {
+ integrity: sha512-p0IJslw0AmedLEkOU+yrEX3Aj2RTpQq7ZOf8nc1DIhjzaxRWrrgeuE5Kyh39fVRgtcACaMXx/9WNo8+GjgBOfw==,
+ }
+ engines: { node: ">=23.5.0 || ^22.13.0 || ^21.7.0 || ^20.12.0" }
+ peerDependencies:
+ "@types/node": ">=18"
+ peerDependenciesMeta:
+ "@types/node":
+ optional: true
+
+ "@inquirer/number@4.0.8":
+ resolution:
+ {
+ integrity: sha512-uGLiQah9A0F9UIvJBX52m0CnqtLaym0WpT9V4YZrjZ+YRDKZdwwoEPz06N6w8ChE2lrnsdyhY9sL+Y690Kh9gQ==,
+ }
+ engines: { node: ">=23.5.0 || ^22.13.0 || ^21.7.0 || ^20.12.0" }
+ peerDependencies:
+ "@types/node": ">=18"
+ peerDependenciesMeta:
+ "@types/node":
+ optional: true
+
+ "@inquirer/password@5.0.8":
+ resolution:
+ {
+ integrity: sha512-zt1sF4lYLdvPqvmvHdmjOzuUUjuCQ897pdUCO8RbXMUDKXJTTyOQgtn23le+jwcb+MpHl3VAFvzIdxRAf6aPlA==,
+ }
+ engines: { node: ">=23.5.0 || ^22.13.0 || ^21.7.0 || ^20.12.0" }
+ peerDependencies:
+ "@types/node": ">=18"
+ peerDependenciesMeta:
+ "@types/node":
+ optional: true
+
+ "@inquirer/prompts@8.3.0":
+ resolution:
+ {
+ integrity: sha512-JAj66kjdH/F1+B7LCigjARbwstt3SNUOSzMdjpsvwJmzunK88gJeXmcm95L9nw1KynvFVuY4SzXh/3Y0lvtgSg==,
+ }
+ engines: { node: ">=23.5.0 || ^22.13.0 || ^21.7.0 || ^20.12.0" }
+ peerDependencies:
+ "@types/node": ">=18"
+ peerDependenciesMeta:
+ "@types/node":
+ optional: true
+
+ "@inquirer/rawlist@5.2.4":
+ resolution:
+ {
+ integrity: sha512-fTuJ5Cq9W286isLxwj6GGyfTjx1Zdk4qppVEPexFuA6yioCCXS4V1zfKroQqw7QdbDPN73xs2DiIAlo55+kBqg==,
+ }
+ engines: { node: ">=23.5.0 || ^22.13.0 || ^21.7.0 || ^20.12.0" }
+ peerDependencies:
+ "@types/node": ">=18"
+ peerDependenciesMeta:
+ "@types/node":
+ optional: true
+
+ "@inquirer/search@4.1.4":
+ resolution:
+ {
+ integrity: sha512-9yPTxq7LPmYjrGn3DRuaPuPbmC6u3fiWcsE9ggfLcdgO/ICHYgxq7mEy1yJ39brVvgXhtOtvDVjDh9slJxE4LQ==,
+ }
+ engines: { node: ">=23.5.0 || ^22.13.0 || ^21.7.0 || ^20.12.0" }
+ peerDependencies:
+ "@types/node": ">=18"
+ peerDependenciesMeta:
+ "@types/node":
+ optional: true
+
+ "@inquirer/select@5.1.0":
+ resolution:
+ {
+ integrity: sha512-OyYbKnchS1u+zRe14LpYrN8S0wH1vD0p2yKISvSsJdH2TpI87fh4eZdWnpdbrGauCRWDph3NwxRmM4Pcm/hx1Q==,
+ }
+ engines: { node: ">=23.5.0 || ^22.13.0 || ^21.7.0 || ^20.12.0" }
+ peerDependencies:
+ "@types/node": ">=18"
+ peerDependenciesMeta:
+ "@types/node":
+ optional: true
+
"@inquirer/type@3.0.10":
resolution:
{
@@ -1601,6 +1799,18 @@ packages:
"@types/node":
optional: true
+ "@inquirer/type@4.0.3":
+ resolution:
+ {
+ integrity: sha512-cKZN7qcXOpj1h+1eTTcGDVLaBIHNMT1Rz9JqJP5MnEJ0JhgVWllx7H/tahUp5YEK1qaByH2Itb8wLG/iScD5kw==,
+ }
+ engines: { node: ">=23.5.0 || ^22.13.0 || ^21.7.0 || ^20.12.0" }
+ peerDependencies:
+ "@types/node": ">=18"
+ peerDependenciesMeta:
+ "@types/node":
+ optional: true
+
"@isaacs/balanced-match@4.0.1":
resolution:
{
@@ -1828,6 +2038,19 @@ packages:
"@cfworker/json-schema":
optional: true
+ "@modelcontextprotocol/sdk@1.27.1":
+ resolution:
+ {
+ integrity: sha512-sr6GbP+4edBwFndLbM60gf07z0FQ79gaExpnsjMGePXqFcSSb7t6iscpjk9DhFhwd+mTEQrzNafGP8/iGGFYaA==,
+ }
+ engines: { node: ">=18" }
+ peerDependencies:
+ "@cfworker/json-schema": ^4.1.1
+ zod: ^3.25 || ^4.0
+ peerDependenciesMeta:
+ "@cfworker/json-schema":
+ optional: true
+
"@msgpack/msgpack@3.1.2":
resolution:
{
@@ -5664,6 +5887,12 @@ packages:
}
engines: { node: ">=10" }
+ chardet@2.1.1:
+ resolution:
+ {
+ integrity: sha512-PsezH1rqdV9VvyNhxxOW32/d75r01NY7TQCmOqomRo15ZSOKbpTFVsfjghxo6JloQUCGnH4k1LGu0R4yCLlWQQ==,
+ }
+
charenc@0.0.2:
resolution:
{
@@ -5677,6 +5906,14 @@ packages:
}
engines: { node: ">= 20.19.0" }
+ chromium-bidi@15.0.0:
+ resolution:
+ {
+ integrity: sha512-ESWZM1u85CoeSozBXXG9M73S5tH0EjkqnFJoQ6F3MHs2YGe0CLVMaRvhGxetLP6w4GVR59+/cpWvDLUpLvJXLQ==,
+ }
+ peerDependencies:
+ devtools-protocol: "*"
+
class-variance-authority@0.7.1:
resolution:
{
@@ -6299,6 +6536,12 @@ packages:
integrity: sha512-ypdmJU/TbBby2Dxibuv7ZLW3Bs1QEmM7nHjEANfohJLvE0XVujisn1qPJcZxg+qDucsr+bP6fLD1rPS3AhJ7EQ==,
}
+ devtools-protocol@0.0.1596832:
+ resolution:
+ {
+ integrity: sha512-IwRVIiCa4mpaKeLcZ2cmGpG0hP8ls3zj3zg87Z/JwULm2xYmhOcMrwdeHos6xaANQHGEXzSCzji+6kEuZu873A==,
+ }
+
diff@8.0.3:
resolution:
{
@@ -6526,10 +6769,10 @@ packages:
integrity: sha512-C+d6UdsYDk0lMebHNR4S2NybQMMngAOnOwYBQjTOiv0MkoJMP0Myw2mgpDLBcpfCmRLxyFqYhS/CfOENq4SJhQ==,
}
- esbuild@0.27.3:
+ esbuild@0.27.4:
resolution:
{
- integrity: sha512-8VwMnyGCONIs6cWue2IdpHxHnAjzxnw2Zr7MkVxB2vjmQ2ivqGFb4LEG3SMnv0Gb2F/G/2yA8zUaiL1gywDCCg==,
+ integrity: sha512-Rq4vbHnYkK5fws5NF7MYTU68FPRE1ajX7heQ/8QXXWqNgqqJ/GkmmyxIzUnf2Sr/bakf8l54716CcMGHYhMrrQ==,
}
engines: { node: ">=18" }
hasBin: true
@@ -6871,12 +7114,30 @@ packages:
integrity: sha512-wpYMUmFu5f00Sm0cj2pfivpmawLZ0NKdviQ4w9zJeR8JVtOpOxHmLaJuj0vxvGqMJQWyP/COUkF75/57OKyRag==,
}
+ fast-string-truncated-width@3.0.3:
+ resolution:
+ {
+ integrity: sha512-0jjjIEL6+0jag3l2XWWizO64/aZVtpiGE3t0Zgqxv0DPuxiMjvB3M24fCyhZUO4KomJQPj3LTSUnDP3GpdwC0g==,
+ }
+
+ fast-string-width@3.0.2:
+ resolution:
+ {
+ integrity: sha512-gX8LrtNEI5hq8DVUfRQMbr5lpaS4nMIWV+7XEbXk2b8kiQIizgnlr12B4dA3ZEx3308ze0O4Q1R+cHts8kyUJg==,
+ }
+
fast-uri@3.1.0:
resolution:
{
integrity: sha512-iPeeDKJSWf4IEOasVVrknXpaBV0IApz/gp7S2bb7Z4Lljbl2MGJRqInZiUrQwV16cpzw/D3S5j5Julj/gT52AA==,
}
+ fast-wrap-ansi@0.2.0:
+ resolution:
+ {
+ integrity: sha512-rLV8JHxTyhVmFYhBJuMujcrHqOT2cnO5Zxj37qROj23CP39GXubJRBUFF0z8KFK77Uc0SukZUf7JZhsVEQ6n8w==,
+ }
+
fastq@1.20.1:
resolution:
{
@@ -7075,6 +7336,14 @@ packages:
}
engines: { node: ">=14.14" }
+ fsevents@2.3.2:
+ resolution:
+ {
+ integrity: sha512-xiqMQR4xAeHTuB9uWm+fFRcIOgKBMiOBP+eXiyT7jsgVCq1bkVygt00oASowB7EdtpOHaaPgKt812P9ab+DDKA==,
+ }
+ engines: { node: ^8.16.0 || ^10.6.0 || >=11.0.0 }
+ os: [darwin]
+
fsevents@2.3.3:
resolution:
{
@@ -8552,6 +8821,12 @@ packages:
typescript:
optional: true
+ mitt@3.0.1:
+ resolution:
+ {
+ integrity: sha512-vKivATfr97l2/QBCYAkXYDbrIWPM2IIKEl7YPhjCvKlG3kE2gm+uBo6nEXK3M5/Ffh/FLpKExzOQ3JJoJGFKBw==,
+ }
+
motion-dom@12.34.0:
resolution:
{
@@ -8619,6 +8894,13 @@ packages:
}
engines: { node: ^18.17.0 || >=20.5.0 }
+ mute-stream@3.0.0:
+ resolution:
+ {
+ integrity: sha512-dkEJPVvun4FryqBmZ5KhDo0K9iDXAwn08tMLDinNdRBNPcYEDiWYysLcc6k3mjTMlbP9KyylvRpd4wFtwrT9rw==,
+ }
+ engines: { node: ^20.17.0 || >=22.9.0 }
+
mz@2.7.0:
resolution:
{
@@ -9454,6 +9736,22 @@ packages:
}
engines: { node: ">=4" }
+ playwright-core@1.58.2:
+ resolution:
+ {
+ integrity: sha512-yZkEtftgwS8CsfYo7nm0KE8jsvm6i/PTgVtB8DL726wNf6H2IMsDuxCpJj59KDaxCtSnrWan2AeDqM7JBaultg==,
+ }
+ engines: { node: ">=18" }
+ hasBin: true
+
+ playwright@1.58.2:
+ resolution:
+ {
+ integrity: sha512-vA30H8Nvkq/cPBnNw4Q8TWz1EJyqgpuinBcHET0YVJVFldr8JDNiU9LaWAE1KqSkRYazuaBhTpB5ZzShOezQ6A==,
+ }
+ engines: { node: ">=18" }
+ hasBin: true
+
pngjs@5.0.0:
resolution:
{
@@ -11899,7 +12197,7 @@ snapshots:
"@csstools/css-color-parser": 4.0.1(@csstools/css-parser-algorithms@4.0.0(@csstools/css-tokenizer@4.0.0))(@csstools/css-tokenizer@4.0.0)
"@csstools/css-parser-algorithms": 4.0.0(@csstools/css-tokenizer@4.0.0)
"@csstools/css-tokenizer": 4.0.0
- lru-cache: 11.2.5
+ lru-cache: 11.2.6
"@asamuzakjp/dom-selector@6.8.1":
dependencies:
@@ -12433,82 +12731,82 @@ snapshots:
"@emotion/unitless@0.10.0": {}
- "@esbuild/aix-ppc64@0.27.3":
+ "@esbuild/aix-ppc64@0.27.4":
optional: true
- "@esbuild/android-arm64@0.27.3":
+ "@esbuild/android-arm64@0.27.4":
optional: true
- "@esbuild/android-arm@0.27.3":
+ "@esbuild/android-arm@0.27.4":
optional: true
- "@esbuild/android-x64@0.27.3":
+ "@esbuild/android-x64@0.27.4":
optional: true
- "@esbuild/darwin-arm64@0.27.3":
+ "@esbuild/darwin-arm64@0.27.4":
optional: true
- "@esbuild/darwin-x64@0.27.3":
+ "@esbuild/darwin-x64@0.27.4":
optional: true
- "@esbuild/freebsd-arm64@0.27.3":
+ "@esbuild/freebsd-arm64@0.27.4":
optional: true
- "@esbuild/freebsd-x64@0.27.3":
+ "@esbuild/freebsd-x64@0.27.4":
optional: true
- "@esbuild/linux-arm64@0.27.3":
+ "@esbuild/linux-arm64@0.27.4":
optional: true
- "@esbuild/linux-arm@0.27.3":
+ "@esbuild/linux-arm@0.27.4":
optional: true
- "@esbuild/linux-ia32@0.27.3":
+ "@esbuild/linux-ia32@0.27.4":
optional: true
- "@esbuild/linux-loong64@0.27.3":
+ "@esbuild/linux-loong64@0.27.4":
optional: true
- "@esbuild/linux-mips64el@0.27.3":
+ "@esbuild/linux-mips64el@0.27.4":
optional: true
- "@esbuild/linux-ppc64@0.27.3":
+ "@esbuild/linux-ppc64@0.27.4":
optional: true
- "@esbuild/linux-riscv64@0.27.3":
+ "@esbuild/linux-riscv64@0.27.4":
optional: true
- "@esbuild/linux-s390x@0.27.3":
+ "@esbuild/linux-s390x@0.27.4":
optional: true
- "@esbuild/linux-x64@0.27.3":
+ "@esbuild/linux-x64@0.27.4":
optional: true
- "@esbuild/netbsd-arm64@0.27.3":
+ "@esbuild/netbsd-arm64@0.27.4":
optional: true
- "@esbuild/netbsd-x64@0.27.3":
+ "@esbuild/netbsd-x64@0.27.4":
optional: true
- "@esbuild/openbsd-arm64@0.27.3":
+ "@esbuild/openbsd-arm64@0.27.4":
optional: true
- "@esbuild/openbsd-x64@0.27.3":
+ "@esbuild/openbsd-x64@0.27.4":
optional: true
- "@esbuild/openharmony-arm64@0.27.3":
+ "@esbuild/openharmony-arm64@0.27.4":
optional: true
- "@esbuild/sunos-x64@0.27.3":
+ "@esbuild/sunos-x64@0.27.4":
optional: true
- "@esbuild/win32-arm64@0.27.3":
+ "@esbuild/win32-arm64@0.27.4":
optional: true
- "@esbuild/win32-ia32@0.27.3":
+ "@esbuild/win32-ia32@0.27.4":
optional: true
- "@esbuild/win32-x64@0.27.3":
+ "@esbuild/win32-x64@0.27.4":
optional: true
"@eslint-community/eslint-utils@4.9.1(eslint@9.39.2(jiti@2.6.1))":
@@ -12761,6 +13059,17 @@ snapshots:
"@inquirer/ansi@1.0.2": {}
+ "@inquirer/ansi@2.0.3": {}
+
+ "@inquirer/checkbox@5.1.0(@types/node@25.2.2)":
+ dependencies:
+ "@inquirer/ansi": 2.0.3
+ "@inquirer/core": 11.1.5(@types/node@25.2.2)
+ "@inquirer/figures": 2.0.3
+ "@inquirer/type": 4.0.3(@types/node@25.2.2)
+ optionalDependencies:
+ "@types/node": 25.2.2
+
"@inquirer/confirm@5.1.21(@types/node@20.19.33)":
dependencies:
"@inquirer/core": 10.3.2(@types/node@20.19.33)
@@ -12776,6 +13085,13 @@ snapshots:
"@types/node": 25.2.2
optional: true
+ "@inquirer/confirm@6.0.8(@types/node@25.2.2)":
+ dependencies:
+ "@inquirer/core": 11.1.5(@types/node@25.2.2)
+ "@inquirer/type": 4.0.3(@types/node@25.2.2)
+ optionalDependencies:
+ "@types/node": 25.2.2
+
"@inquirer/core@10.3.2(@types/node@20.19.33)":
dependencies:
"@inquirer/ansi": 1.0.2
@@ -12803,8 +13119,105 @@ snapshots:
"@types/node": 25.2.2
optional: true
+ "@inquirer/core@11.1.5(@types/node@25.2.2)":
+ dependencies:
+ "@inquirer/ansi": 2.0.3
+ "@inquirer/figures": 2.0.3
+ "@inquirer/type": 4.0.3(@types/node@25.2.2)
+ cli-width: 4.1.0
+ fast-wrap-ansi: 0.2.0
+ mute-stream: 3.0.0
+ signal-exit: 4.1.0
+ optionalDependencies:
+ "@types/node": 25.2.2
+
+ "@inquirer/editor@5.0.8(@types/node@25.2.2)":
+ dependencies:
+ "@inquirer/core": 11.1.5(@types/node@25.2.2)
+ "@inquirer/external-editor": 2.0.3(@types/node@25.2.2)
+ "@inquirer/type": 4.0.3(@types/node@25.2.2)
+ optionalDependencies:
+ "@types/node": 25.2.2
+
+ "@inquirer/expand@5.0.8(@types/node@25.2.2)":
+ dependencies:
+ "@inquirer/core": 11.1.5(@types/node@25.2.2)
+ "@inquirer/type": 4.0.3(@types/node@25.2.2)
+ optionalDependencies:
+ "@types/node": 25.2.2
+
+ "@inquirer/external-editor@2.0.3(@types/node@25.2.2)":
+ dependencies:
+ chardet: 2.1.1
+ iconv-lite: 0.7.2
+ optionalDependencies:
+ "@types/node": 25.2.2
+
"@inquirer/figures@1.0.15": {}
+ "@inquirer/figures@2.0.3": {}
+
+ "@inquirer/input@5.0.8(@types/node@25.2.2)":
+ dependencies:
+ "@inquirer/core": 11.1.5(@types/node@25.2.2)
+ "@inquirer/type": 4.0.3(@types/node@25.2.2)
+ optionalDependencies:
+ "@types/node": 25.2.2
+
+ "@inquirer/number@4.0.8(@types/node@25.2.2)":
+ dependencies:
+ "@inquirer/core": 11.1.5(@types/node@25.2.2)
+ "@inquirer/type": 4.0.3(@types/node@25.2.2)
+ optionalDependencies:
+ "@types/node": 25.2.2
+
+ "@inquirer/password@5.0.8(@types/node@25.2.2)":
+ dependencies:
+ "@inquirer/ansi": 2.0.3
+ "@inquirer/core": 11.1.5(@types/node@25.2.2)
+ "@inquirer/type": 4.0.3(@types/node@25.2.2)
+ optionalDependencies:
+ "@types/node": 25.2.2
+
+ "@inquirer/prompts@8.3.0(@types/node@25.2.2)":
+ dependencies:
+ "@inquirer/checkbox": 5.1.0(@types/node@25.2.2)
+ "@inquirer/confirm": 6.0.8(@types/node@25.2.2)
+ "@inquirer/editor": 5.0.8(@types/node@25.2.2)
+ "@inquirer/expand": 5.0.8(@types/node@25.2.2)
+ "@inquirer/input": 5.0.8(@types/node@25.2.2)
+ "@inquirer/number": 4.0.8(@types/node@25.2.2)
+ "@inquirer/password": 5.0.8(@types/node@25.2.2)
+ "@inquirer/rawlist": 5.2.4(@types/node@25.2.2)
+ "@inquirer/search": 4.1.4(@types/node@25.2.2)
+ "@inquirer/select": 5.1.0(@types/node@25.2.2)
+ optionalDependencies:
+ "@types/node": 25.2.2
+
+ "@inquirer/rawlist@5.2.4(@types/node@25.2.2)":
+ dependencies:
+ "@inquirer/core": 11.1.5(@types/node@25.2.2)
+ "@inquirer/type": 4.0.3(@types/node@25.2.2)
+ optionalDependencies:
+ "@types/node": 25.2.2
+
+ "@inquirer/search@4.1.4(@types/node@25.2.2)":
+ dependencies:
+ "@inquirer/core": 11.1.5(@types/node@25.2.2)
+ "@inquirer/figures": 2.0.3
+ "@inquirer/type": 4.0.3(@types/node@25.2.2)
+ optionalDependencies:
+ "@types/node": 25.2.2
+
+ "@inquirer/select@5.1.0(@types/node@25.2.2)":
+ dependencies:
+ "@inquirer/ansi": 2.0.3
+ "@inquirer/core": 11.1.5(@types/node@25.2.2)
+ "@inquirer/figures": 2.0.3
+ "@inquirer/type": 4.0.3(@types/node@25.2.2)
+ optionalDependencies:
+ "@types/node": 25.2.2
+
"@inquirer/type@3.0.10(@types/node@20.19.33)":
optionalDependencies:
"@types/node": 20.19.33
@@ -12814,6 +13227,10 @@ snapshots:
"@types/node": 25.2.2
optional: true
+ "@inquirer/type@4.0.3(@types/node@25.2.2)":
+ optionalDependencies:
+ "@types/node": 25.2.2
+
"@isaacs/balanced-match@4.0.1": {}
"@isaacs/brace-expansion@5.0.1":
@@ -13062,6 +13479,28 @@ snapshots:
transitivePeerDependencies:
- supports-color
+ "@modelcontextprotocol/sdk@1.27.1(zod@4.3.6)":
+ dependencies:
+ "@hono/node-server": 1.19.9(hono@4.11.9)
+ ajv: 8.17.1
+ ajv-formats: 3.0.1(ajv@8.17.1)
+ content-type: 1.0.5
+ cors: 2.8.6
+ cross-spawn: 7.0.6
+ eventsource: 3.0.7
+ eventsource-parser: 3.0.6
+ express: 5.2.1
+ express-rate-limit: 8.2.1(express@5.2.1)
+ hono: 4.11.9
+ jose: 6.1.3
+ json-schema-typed: 8.0.2
+ pkce-challenge: 5.0.1
+ raw-body: 3.0.2
+ zod: 4.3.6
+ zod-to-json-schema: 3.25.1(zod@4.3.6)
+ transitivePeerDependencies:
+ - supports-color
+
"@msgpack/msgpack@3.1.2": {}
"@mswjs/interceptors@0.41.2":
@@ -15754,14 +16193,14 @@ snapshots:
chai: 6.2.2
tinyrainbow: 3.0.3
- "@vitest/mocker@4.0.18(msw@2.12.10(@types/node@20.19.33)(typescript@5.7.3))(vite@7.3.1(@types/node@20.19.33)(jiti@2.6.1)(lightningcss@1.30.2)(yaml@2.8.2))":
+ "@vitest/mocker@4.0.18(msw@2.12.10(@types/node@20.19.33)(typescript@5.7.3))(vite@7.3.1(@types/node@25.2.2)(jiti@2.6.1)(lightningcss@1.30.2)(yaml@2.8.2))":
dependencies:
"@vitest/spy": 4.0.18
estree-walker: 3.0.3
magic-string: 0.30.21
optionalDependencies:
msw: 2.12.10(@types/node@20.19.33)(typescript@5.7.3)
- vite: 7.3.1(@types/node@20.19.33)(jiti@2.6.1)(lightningcss@1.30.2)(yaml@2.8.2)
+ vite: 7.3.1(@types/node@25.2.2)(jiti@2.6.1)(lightningcss@1.30.2)(yaml@2.8.2)
"@vitest/mocker@4.0.18(msw@2.12.10(@types/node@25.2.2)(typescript@5.7.3))(vite@7.3.1(@types/node@25.2.2)(jiti@2.6.1)(lightningcss@1.30.2)(yaml@2.8.2))":
dependencies:
@@ -17001,7 +17440,7 @@ snapshots:
base-x@3.0.11:
dependencies:
- safe-buffer: 5.1.2
+ safe-buffer: 5.2.1
base-x@5.0.1: {}
@@ -17133,12 +17572,20 @@ snapshots:
char-regex@1.0.2: {}
+ chardet@2.1.1: {}
+
charenc@0.0.2: {}
chokidar@5.0.0:
dependencies:
readdirp: 5.0.0
+ chromium-bidi@15.0.0(devtools-protocol@0.0.1596832):
+ dependencies:
+ devtools-protocol: 0.0.1596832
+ mitt: 3.0.1
+ zod: 3.25.76
+
class-variance-authority@0.7.1:
dependencies:
clsx: 2.1.1
@@ -17357,7 +17804,7 @@ snapshots:
"@asamuzakjp/css-color": 4.1.2
"@csstools/css-syntax-patches-for-csstree": 1.0.27
css-tree: 3.1.0
- lru-cache: 11.2.5
+ lru-cache: 11.2.6
csstype@3.2.3: {}
@@ -17439,6 +17886,8 @@ snapshots:
detect-node-es@1.1.0: {}
+ devtools-protocol@0.0.1596832: {}
+
diff@8.0.3: {}
dijkstrajs@1.0.3: {}
@@ -17558,34 +18007,34 @@ snapshots:
dependencies:
es6-promise: 4.2.8
- esbuild@0.27.3:
+ esbuild@0.27.4:
optionalDependencies:
- "@esbuild/aix-ppc64": 0.27.3
- "@esbuild/android-arm": 0.27.3
- "@esbuild/android-arm64": 0.27.3
- "@esbuild/android-x64": 0.27.3
- "@esbuild/darwin-arm64": 0.27.3
- "@esbuild/darwin-x64": 0.27.3
- "@esbuild/freebsd-arm64": 0.27.3
- "@esbuild/freebsd-x64": 0.27.3
- "@esbuild/linux-arm": 0.27.3
- "@esbuild/linux-arm64": 0.27.3
- "@esbuild/linux-ia32": 0.27.3
- "@esbuild/linux-loong64": 0.27.3
- "@esbuild/linux-mips64el": 0.27.3
- "@esbuild/linux-ppc64": 0.27.3
- "@esbuild/linux-riscv64": 0.27.3
- "@esbuild/linux-s390x": 0.27.3
- "@esbuild/linux-x64": 0.27.3
- "@esbuild/netbsd-arm64": 0.27.3
- "@esbuild/netbsd-x64": 0.27.3
- "@esbuild/openbsd-arm64": 0.27.3
- "@esbuild/openbsd-x64": 0.27.3
- "@esbuild/openharmony-arm64": 0.27.3
- "@esbuild/sunos-x64": 0.27.3
- "@esbuild/win32-arm64": 0.27.3
- "@esbuild/win32-ia32": 0.27.3
- "@esbuild/win32-x64": 0.27.3
+ "@esbuild/aix-ppc64": 0.27.4
+ "@esbuild/android-arm": 0.27.4
+ "@esbuild/android-arm64": 0.27.4
+ "@esbuild/android-x64": 0.27.4
+ "@esbuild/darwin-arm64": 0.27.4
+ "@esbuild/darwin-x64": 0.27.4
+ "@esbuild/freebsd-arm64": 0.27.4
+ "@esbuild/freebsd-x64": 0.27.4
+ "@esbuild/linux-arm": 0.27.4
+ "@esbuild/linux-arm64": 0.27.4
+ "@esbuild/linux-ia32": 0.27.4
+ "@esbuild/linux-loong64": 0.27.4
+ "@esbuild/linux-mips64el": 0.27.4
+ "@esbuild/linux-ppc64": 0.27.4
+ "@esbuild/linux-riscv64": 0.27.4
+ "@esbuild/linux-s390x": 0.27.4
+ "@esbuild/linux-x64": 0.27.4
+ "@esbuild/netbsd-arm64": 0.27.4
+ "@esbuild/netbsd-x64": 0.27.4
+ "@esbuild/openbsd-arm64": 0.27.4
+ "@esbuild/openbsd-x64": 0.27.4
+ "@esbuild/openharmony-arm64": 0.27.4
+ "@esbuild/sunos-x64": 0.27.4
+ "@esbuild/win32-arm64": 0.27.4
+ "@esbuild/win32-ia32": 0.27.4
+ "@esbuild/win32-x64": 0.27.4
escalade@3.2.0: {}
@@ -17808,7 +18257,7 @@ snapshots:
extension-port-stream@3.0.0:
dependencies:
- readable-stream: 3.6.2
+ readable-stream: 4.7.0
webextension-polyfill: 0.10.0
eyes@0.1.8: {}
@@ -17841,8 +18290,18 @@ snapshots:
fast-stable-stringify@1.0.0: {}
+ fast-string-truncated-width@3.0.3: {}
+
+ fast-string-width@3.0.2:
+ dependencies:
+ fast-string-truncated-width: 3.0.3
+
fast-uri@3.1.0: {}
+ fast-wrap-ansi@0.2.0:
+ dependencies:
+ fast-string-width: 3.0.2
+
fastq@1.20.1:
dependencies:
reusify: 1.1.0
@@ -17958,6 +18417,9 @@ snapshots:
jsonfile: 6.2.0
universalify: 2.0.1
+ fsevents@2.3.2:
+ optional: true
+
fsevents@2.3.3:
optional: true
@@ -18690,6 +19152,8 @@ snapshots:
optionalDependencies:
typescript: 5.7.3
+ mitt@3.0.1: {}
+
motion-dom@12.34.0:
dependencies:
motion-utils: 12.29.2
@@ -18764,6 +19228,8 @@ snapshots:
mute-stream@2.0.0: {}
+ mute-stream@3.0.0: {}
+
mz@2.7.0:
dependencies:
any-promise: 1.3.0
@@ -19277,6 +19743,14 @@ snapshots:
find-up: 2.1.0
load-json-file: 4.0.0
+ playwright-core@1.58.2: {}
+
+ playwright@1.58.2:
+ dependencies:
+ playwright-core: 1.58.2
+ optionalDependencies:
+ fsevents: 2.3.2
+
pngjs@5.0.0: {}
pony-cause@2.1.11: {}
@@ -20525,7 +20999,7 @@ snapshots:
vite@7.3.1(@types/node@20.19.33)(jiti@2.6.1)(lightningcss@1.30.2)(yaml@2.8.2):
dependencies:
- esbuild: 0.27.3
+ esbuild: 0.27.4
fdir: 6.5.0(picomatch@4.0.3)
picomatch: 4.0.3
postcss: 8.5.6
@@ -20540,7 +21014,7 @@ snapshots:
vite@7.3.1(@types/node@25.2.2)(jiti@2.6.1)(lightningcss@1.30.2)(yaml@2.8.2):
dependencies:
- esbuild: 0.27.3
+ esbuild: 0.27.4
fdir: 6.5.0(picomatch@4.0.3)
picomatch: 4.0.3
postcss: 8.5.6
@@ -20556,7 +21030,7 @@ snapshots:
vitest@4.0.18(@types/node@20.19.33)(jiti@2.6.1)(jsdom@28.1.0(@noble/hashes@1.8.0))(lightningcss@1.30.2)(msw@2.12.10(@types/node@20.19.33)(typescript@5.7.3))(yaml@2.8.2):
dependencies:
"@vitest/expect": 4.0.18
- "@vitest/mocker": 4.0.18(msw@2.12.10(@types/node@20.19.33)(typescript@5.7.3))(vite@7.3.1(@types/node@20.19.33)(jiti@2.6.1)(lightningcss@1.30.2)(yaml@2.8.2))
+ "@vitest/mocker": 4.0.18(msw@2.12.10(@types/node@20.19.33)(typescript@5.7.3))(vite@7.3.1(@types/node@25.2.2)(jiti@2.6.1)(lightningcss@1.30.2)(yaml@2.8.2))
"@vitest/pretty-format": 4.0.18
"@vitest/runner": 4.0.18
"@vitest/snapshot": 4.0.18
@@ -20910,6 +21384,10 @@ snapshots:
dependencies:
zod: 3.25.76
+ zod-to-json-schema@3.25.1(zod@4.3.6):
+ dependencies:
+ zod: 4.3.6
+
zod@3.22.4: {}
zod@3.25.76: {}
diff --git a/research/async-cli/daemon-design-rounds.md b/research/async-cli/daemon-design-rounds.md
new file mode 100644
index 00000000..e151f5e3
--- /dev/null
+++ b/research/async-cli/daemon-design-rounds.md
@@ -0,0 +1,442 @@
+# Daemon/Detach Design: Three Rounds
+
+Informed by 105 findings across auth expiry patterns (Plaid, Stripe, MX,
+Google, Salesforce, Strava), daemon architectures (PM2, Turborepo, Docker,
+Homebrew services), and CLI async patterns (Vercel, Railway, Fly, Docker,
+GitHub, AWS).
+
+---
+
+## Round 1: Brainstorm
+
+### The problem statement
+
+`vana connect chatgpt` takes 30 minutes. Users can't wait at their
+terminal. Sessions/cookies expire unpredictably. Scheduled re-collection
+needs to handle auth failures gracefully when the user isn't available.
+
+### What the research says
+
+**Auth expiry:** Plaid uses a three-tier health model: HEALTHY,
+DEGRADED (will expire soon), ERROR (expired/revoked). They send
+advance warning webhooks (PENDING_DISCONNECTION) 7 days before
+UK consent expiry. Re-auth uses an abbreviated flow (not full
+re-setup). Google revokes tokens after 6 months of inactivity.
+Strava rotates refresh tokens on every use (6-hour access tokens).
+
+**Daemon architecture:** PM2's God Daemon over dual Unix sockets
+(RPC + pub/sub) is the gold standard for Node.js. Turborepo spawns
+a daemon on-demand that degrades gracefully if it crashes. Both use
+PID files + socket liveness checks. PM2 generates OS-specific init
+scripts for auto-start on boot.
+
+**CLI async patterns:** Block by default, `--detach` for opt-out.
+When detached, print a session ID. Server-side operations survive
+terminal close (Heroku, Vercel, Railway). Client-side ones don't
+(Stripe listen).
+
+### Design options
+
+1. **`--detach` only (no daemon)**
+ `vana connect github --detach` forks a child process, returns immediately.
+ `vana status` shows running connections. Terminal bell on complete.
+ No scheduling. No auth health tracking.
+
+2. **Embedded daemon (PM2-lite)**
+ `vana daemon start` spawns a background process that manages scheduled
+ collections, auth health, and notifications. Communicates via Unix
+ socket. `vana daemon stop` shuts it down. `vana status` queries the
+ daemon.
+
+3. **OS service registration**
+ `vana service install` creates a launchd plist (macOS) or systemd unit
+ (Linux). The CLI itself is the service binary. OS handles lifecycle.
+ Like Homebrew services.
+
+4. **Hybrid: detach for one-off, daemon for scheduled**
+ `--detach` for ad-hoc background connects. `vana daemon` for ongoing
+ scheduled collection with auth health tracking.
+
+### Round 1 Design
+
+Option 4 (hybrid) is the right answer. The research shows two distinct
+use cases:
+
+**One-off background:** "I want to connect ChatGPT but not wait 30
+minutes." → `--detach`
+
+**Ongoing scheduled:** "Keep my data fresh daily, handle auth failures."
+→ `vana daemon`
+
+### Connection health model (from Plaid research)
+
+Each source has a connection health status:
+
+```
+healthy → last collection succeeded, session valid
+degraded → session may be expiring soon (heuristic)
+needs_reauth → collection failed due to auth, user must re-login
+error → collection failed for non-auth reason
+disconnected → user has not connected this source
+```
+
+The daemon tracks this per source and acts on it:
+
+- `healthy`: collect on schedule
+- `degraded`: collect but warn user
+- `needs_reauth`: pause collection, notify user
+- `error`: retry with backoff, notify after N failures
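
The health-to-action mapping above can be sketched as a pure function. This is illustrative only: the health states come from the model above, but the action shapes and the name `nextAction` are assumptions.

```typescript
type Health = "healthy" | "degraded" | "needs_reauth" | "error" | "disconnected";

type Action =
  | { kind: "collect"; warn?: boolean }
  | { kind: "pause"; notify: true }
  | { kind: "retry"; notifyAfterFailures: number };

// What the daemon does for a source on each scheduled tick, given its health.
function nextAction(health: Health): Action | null {
  switch (health) {
    case "healthy":
      return { kind: "collect" };
    case "degraded":
      return { kind: "collect", warn: true }; // collect but warn the user
    case "needs_reauth":
      return { kind: "pause", notify: true }; // stop until the user re-logs-in
    case "error":
      return { kind: "retry", notifyAfterFailures: 3 }; // backoff, notify after N
    case "disconnected":
      return null; // nothing to do until the user connects the source
  }
}
```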
+
+### Notification model (from research)
+
+Plaid notifies developers via webhooks. For a CLI, the equivalent is:
+
+- Terminal bell (`\a`) — if terminal is open
+- Desktop notification (node-notifier) — if at computer
+- ntfy.sh push — if away from computer
+- File-based status — always (queryable via `vana status`)
+
+Default: terminal bell + status file. User can configure push.
+
+### Round 1 Plan
+
+Phase 1: `--detach` flag
+
+- `child_process.fork()` with detached: true, stdio: 'ignore'
+- Write session to `~/.vana/sessions/{source}-{timestamp}.json`
+- Print session ID on detach
+- `vana status` shows active sessions
+- Bell on complete
+
+Phase 2: Connection health model
+
+- Add `connectionHealth` to source state: healthy/degraded/needs_reauth/error
+- Update health after each collection attempt
+- `vana status` shows health per source
+
+Phase 3: `vana daemon`
+
+- Forked background process, PID file at `~/.vana/daemon.pid`
+- Unix socket at `~/.vana/daemon.sock` for CLI queries
+- Schedule: configurable per source in `~/.vana/config.json`
+- On auth failure: set needs_reauth, notify, pause source
+- On success: set healthy, update lastCollectedAt
+
+Phase 4: Notifications
+
+- Terminal bell (default, free)
+- Desktop notification via node-notifier (opt-in)
+- ntfy.sh push (opt-in, configurable)
+
+### Round 1 Critical Assessment
+
+**What's right:**
+
+- Hybrid matches real use cases
+- Connection health model mirrors Plaid's proven pattern
+- Notification tiers cover different user states (at desk, away, on phone)
+
+**What's wrong:**
+
+1. **The daemon is complex.** Unix sockets, PID files, process forking,
+ OS service registration. That's a lot of infrastructure for a CLI
+ that currently has zero background processes.
+2. **`child_process.fork()` in SEA binaries is fragile.** We know this
+ from Round 1 of the previous design session.
+3. **node-notifier is 12MB.** Heavy dependency for optional feature.
+4. **The health model requires heuristics for "degraded."** Browser
+ cookies don't have expiry dates exposed to us. We can only detect
+ failure after it happens, not predict it.
+5. **Schedule configuration adds a new config surface.** Now we have
+   `~/.vana/config.json`, `~/.vana/next-prompt.md`, and the state file.
+
+---
+
+## Round 2: Brainstorm (informed by Round 1 assessment)
+
+Let me reconsider. What if the daemon is simpler than PM2?
+
+The Turborepo pattern: daemon spawns on-demand, does specific work,
+dies when idle. No permanent background process. No systemd integration.
+
+What if `vana daemon` is just:
+
+- `vana daemon start` forks a process that runs scheduled collections
+- It writes to a log file and updates state
+- `vana status` reads the state file (no socket needed)
+- `vana daemon stop` kills via PID file
+- If it dies, it dies. Next `vana connect` or `vana collect` spawns
+ it again if needed
+
+No Unix socket. No IPC. Just a forked process that runs collection
+tasks and writes results to files. The CLI reads those files.
+
+### Connection health (simplified)
+
+Drop "degraded" — we can't predict expiry. Just track:
+
+```
+healthy → last collection succeeded
+needs_reauth → last collection failed due to auth
+error → last collection failed (non-auth)
+stale → no collection in configured interval
+```
+
+These are all detectable from actual collection results. No heuristics.
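
Because every state is derived from an observed result, the classifier is a few lines. A sketch, assuming field names like `authFailure` and `finishedAt` that the real connector would set:

```typescript
type Health = "healthy" | "needs_reauth" | "error" | "stale";

interface LastCollection {
  ok: boolean;
  authFailure?: boolean; // set by the connector when the saved session is rejected
  finishedAt: number;    // epoch ms
}

// Derive health purely from the last collection attempt -- no expiry guessing.
function classify(last: LastCollection, now: number, staleAfterMs: number): Health {
  if (!last.ok) return last.authFailure ? "needs_reauth" : "error";
  return now - last.finishedAt > staleAfterMs ? "stale" : "healthy";
}
```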
+
+### Round 2 Design
+
+**`--detach` for one-off background:**
+
+```bash
+$ vana connect chatgpt --detach
+ Connecting ChatGPT in the background.
+ Check progress: vana status
+
+$ vana status
+ Vana Connect
+
+ Runtime installed
+ Personal Server http://localhost:8080
+ Sources 1 connected, 1 collecting
+
+ ChatGPT collecting (12m elapsed)
+ GitHub healthy (collected 2h ago)
+
+ Next: `vana data show github`
+```
+
+When complete:
+
+```
+$ vana status
+ ...
+ ChatGPT healthy (collected just now)
+ GitHub healthy (collected 2h ago)
+```
+
+**`vana daemon` for scheduled:**
+
+```bash
+$ vana daemon start
+ Daemon started. Collections will run on schedule.
+ Check: vana daemon status
+
+$ vana daemon status
+ Daemon running (PID 12345, uptime 2h)
+
+ GitHub every 24h next: 8h healthy
+ ChatGPT every 24h next: 12h healthy
+
+$ vana daemon stop
+ Daemon stopped.
+```
+
+**Auth failure handling:**
+
+When a scheduled collection hits auth failure:
+
+1. Source status → `needs_reauth`
+2. Write to `~/.vana/notifications.json` (append-only log)
+3. Next `vana status` shows it prominently
+4. `vana connect <source>` re-authenticates
+
+```
+$ vana status
+ ...
+ ChatGPT needs re-login (session expired 6h ago)
+
+ Next: `vana connect chatgpt`
+```
+
+No push notifications in v1. The status command is the notification
+surface. The next-prompt skill checks status and alerts the agent.
+
+### Round 2 Plan
+
+Phase 1: `--detach`
+
+- `child_process.spawn()` with `detached: true`, `stdio: ['ignore', logFd, logFd]`
+- Use `process.execPath` + `process.argv[1]` for binary path
+- Write `~/.vana/sessions/{source}.json` with PID, start time, status
+- `unref()` the child so parent can exit
+- `vana status` reads session files, checks if PID is alive
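
The steps above can be sketched as follows. The session-file shape and the `--quiet` flag follow the plan; `detachConnect` is an illustrative name, and the `process.execPath` + `argv[1]` re-invocation is exactly the part that needs SEA testing:

```typescript
import { spawn } from "node:child_process";
import { mkdirSync, openSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { homedir } from "node:os";

// Spawn the collection as a detached child that outlives the parent.
function detachConnect(source: string): number | undefined {
  const vanaHome = join(homedir(), ".vana");
  mkdirSync(join(vanaHome, "sessions"), { recursive: true });
  const logFd = openSync(join(vanaHome, `${source}.log`), "a");

  const child = spawn(process.execPath, [process.argv[1], "connect", source, "--quiet"], {
    detached: true,
    stdio: ["ignore", logFd, logFd],
  });
  child.unref(); // parent may now exit

  writeFileSync(
    join(vanaHome, "sessions", `${source}.json`),
    JSON.stringify({ pid: child.pid, startedAt: Date.now(), status: "running" }),
  );
  return child.pid;
}

// `vana status` liveness probe: signal 0 checks existence without killing.
function isAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch {
    return false;
  }
}
```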
+
+Phase 2: Connection health
+
+- Add to state: `connectionHealth: 'healthy' | 'needs_reauth' | 'error' | 'stale'`
+- `runConnect`/`runCollect`: on auth failure → needs_reauth, on success → healthy
+- `vana status`: show health prominently, surface re-auth needs
+
+Phase 3: `vana daemon start/stop/status`
+
+- `start`: fork process, write PID to `~/.vana/daemon.pid`
+- Daemon loop: read schedule from state, run `vana collect --quiet`
+ for due sources, sleep until next
+- `stop`: read PID, send SIGTERM
+- `status`: read PID, check alive, show schedule and health
+- Schedule stored in state file: `sources.github.schedule: "24h"`
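
The due-source selection inside that loop is small enough to show. A sketch: the state shape and the `"24h"` interval format follow the plan, while the helper names are illustrative.

```typescript
interface SourceState {
  schedule?: string;        // e.g. "24h"; absent means not scheduled
  lastCollectedAt?: number; // epoch ms
}

// Parse intervals like "24h" into milliseconds.
function parseInterval(interval: string): number {
  const m = /^(\d+)h$/.exec(interval);
  if (!m) throw new Error(`unsupported interval: ${interval}`);
  return Number(m[1]) * 60 * 60 * 1000;
}

// One tick of the daemon loop: which sources are due for collection?
function dueSources(state: Record<string, SourceState>, now: number): string[] {
  return Object.entries(state)
    .filter(([, s]) => s.schedule !== undefined)
    .filter(([, s]) => now - (s.lastCollectedAt ?? 0) >= parseInterval(s.schedule!))
    .map(([name]) => name);
}
```

The daemon would run `vana collect --quiet` for each due source, then sleep until the earliest next due time.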
+
+Phase 4: `vana schedule` commands
+
+- `vana schedule github every 24h` — set schedule
+- `vana schedule list` — show schedules
+- `vana schedule remove github` — remove
+
+### Round 2 Critical Assessment
+
+**What's right:**
+
+- Much simpler than Round 1. No Unix sockets, no IPC, no node-notifier.
+- Connection health is based on actual results, not heuristics.
+- Status command is the notification surface. No new dependencies.
+- The daemon is a dumb loop. Easy to implement, easy to debug.
+- Agent integration: the next-prompt skill reads status, surfaces issues.
+
+**What's wrong:**
+
+1. **No push notifications.** If the user doesn't run `vana status` for
+ a week, they won't know ChatGPT needs re-auth. The agent covers
+ this IF the agent is running, but the user may not use an agent.
+2. **PID file approach is fragile.** Research showed socket-based liveness
+ is more robust. But sockets add complexity.
+3. **The daemon is a single point of failure.** If it crashes and no one
+ runs `vana status`, scheduled collections stop silently.
+4. **Schedule in state file mixes concerns.** State file is for collection
+ results, not configuration.
+
+---
+
+## Round 3: Brainstorm (final)
+
+What if the daemon and --detach are the SAME thing?
+
+`vana connect chatgpt --detach` spawns a background process for that
+one collection. Done. No permanent daemon.
+
+`vana collect --schedule 24h` adds a crontab entry (or launchd plist)
+that runs `vana collect --all --quiet` daily. OS handles scheduling.
+No custom daemon.
+
+`vana status` reads state files. Shows health. Shows what's scheduled
+(by reading crontab or launchd).
+
+This is the Whenever gem pattern from the research. The OS is the
+daemon. The CLI just configures it and reads results.
+
+### Round 3 Design
+
+**`--detach` for background connects:**
+
+```bash
+$ vana connect chatgpt --detach
+ Connecting ChatGPT in the background.
+ Check progress: vana status
+```
+
+Implementation: `child_process.spawn()` with detached + unref.
+Writes progress to `~/.vana/sessions/{source}.json`.
+
+**`vana schedule` for recurring collection:**
+
+```bash
+$ vana schedule add --every 24h
+ Added daily collection schedule.
+ Runs: vana collect --all --quiet
+ Managed by: launchd (macOS) / cron (Linux)
+
+$ vana schedule list
+ Daily collection every 24h next: 14h
+ Managed by: ~/Library/LaunchAgents/com.vana.collect.plist
+
+$ vana schedule remove
+ Removed daily collection schedule.
+```
+
+macOS: generates a launchd plist with `StartInterval: 86400`.
+Linux: adds a crontab entry.
+No custom daemon process. The OS runs `vana collect --all --quiet`.
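
A minimal sketch of the generated plist. The label matches the `schedule list` output above; the binary path is an assumption and would in practice come from `process.execPath`:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.vana.collect</string>
  <key>ProgramArguments</key>
  <array>
    <string>/usr/local/bin/vana</string>
    <string>collect</string>
    <string>--all</string>
    <string>--quiet</string>
  </array>
  <key>StartInterval</key>
  <integer>86400</integer>
</dict>
</plist>
```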
+
+**Connection health in `vana status`:**
+
+```
+$ vana status
+ Vana Connect
+
+ Runtime installed
+ Personal Server http://localhost:8080
+
+ GitHub healthy collected 2h ago
+ ChatGPT needs login collected 3d ago
+ LinkedIn healthy collected 1d ago
+
+ Next: `vana connect chatgpt`
+```
+
+When a source needs re-auth, `vana status` shows it. The scheduled
+collection writes `needs_reauth` to state on auth failure and skips
+that source on subsequent runs until re-authenticated.
+
+### Round 3 Plan
+
+Phase 1: `--detach` flag on connect and collect
+
+- Spawn detached child process
+- Write session progress to `~/.vana/sessions/{source}.json`
+- Status reads session files + checks PID liveness
+- Bell on complete (if terminal still open)
+
+Phase 2: Connection health
+
+- `connectionHealth` field in source state
+- Set on every collection: healthy, needs_reauth, error
+- `vana status` shows health per source
+- `vana status --json` includes health for agent consumption
+
+Phase 3: `vana schedule add/list/remove`
+
+- macOS: generate launchd plist at ~/Library/LaunchAgents/com.vana.collect.plist
+- Linux: add crontab entry via `crontab -l | ... | crontab -`
+- Both run: `vana collect --all --quiet`
+- `schedule list` reads from launchd/crontab
+- `schedule remove` deletes plist/crontab entry
+
+Phase 4: Status as notification surface
+
+- `vana status` prominently shows sources needing re-auth
+- next-prompt skill checks health and alerts
+- No push notification dependencies in v1
+
+### Round 3 Critical Assessment
+
+**Is this the right design?**
+
+Yes. It follows the research closely:
+
+- `--detach` matches Docker/Vercel/Railway patterns (block by default, opt-out)
+- OS-level scheduling matches Whenever/Homebrew services (let the OS handle lifecycle)
+- Connection health matches Plaid's model (simplified to observable states)
+- No custom daemon (the OS IS the daemon)
+- No push notifications (status command + agent skill is the notification layer)
+
+**Risks:**
+
+1. **launchd/crontab generation is platform-specific.** Need to handle macOS,
+ Linux, and "neither" (Windows, unusual systems) gracefully.
+2. **`child_process.spawn()` with detached in SEA binaries** — needs testing.
+ The binary path detection must work for installed CLI, dev checkout, and
+ pnpm dlx.
+3. **Crontab manipulation can fail** — permission issues, no crontab installed,
+ user has existing entries.
+
+**Mitigations:**
+
+1. Detect platform, generate appropriate config, fail gracefully with
+ manual instructions if neither launchd nor crontab is available.
+2. Use `process.execPath` for SEA, detect `pnpm dlx` and warn that
+ scheduling requires an installed CLI.
+3. Use `crontab -l | grep -v vana | cat - new_entry | crontab -` pattern
+ to preserve existing entries.
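
Sketched with a variable standing in for `crontab -l` output, so the filtering is visible without touching a real crontab. The sample entries are illustrative; a real install would pipe the result to `crontab -`:

```shell
# Existing crontab, as `crontab -l` would print it.
existing='0 9 * * * /usr/local/bin/backup.sh
0 3 * * * vana collect --all --quiet'

# The replacement vana entry.
new_entry='0 0 * * * vana collect --all --quiet'

# Drop any old vana lines, keep everything else, append the new entry.
updated="$(printf '%s\n' "$existing" | grep -v vana; printf '%s\n' "$new_entry")"

printf '%s\n' "$updated"
# A real install would finish with: printf '%s\n' "$updated" | crontab -
```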
+
+**Final answer: implement --detach + connection health + vana schedule
+using OS-native scheduling. No custom daemon.**
diff --git a/research/async-cli/design-rounds.md b/research/async-cli/design-rounds.md
new file mode 100644
index 00000000..3456de2d
--- /dev/null
+++ b/research/async-cli/design-rounds.md
@@ -0,0 +1,225 @@
+# Async CLI Design: Three Rounds
+
+## Research Summary (68 findings)
+
+**Sync vs async default:** 9/11 CLIs block by default. `--detach` is the standard opt-out flag. Vercel uses `--no-wait`. Docker uses `-d`. AWS is the exception (async-by-default with `wait` subcommands).
+
+**Long-running ops:** Vercel, Railway, Fly all stream progress during sync mode. Heroku's ctrl+c detaches without canceling (server-side continues). Docker has 5 progress output modes (auto/plain/tty/quiet/rawjson).
+
+**Notifications:** Terminal bell is universal fallback. macOS: terminal-notifier. Linux: notify-send. ntfy.sh for push. `gh run watch --exit-status` exits on completion for scripting. No CLI was found that does Slack/email by default.
+
+**Scheduled tasks:** Three tiers: crontab generation (Whenever), daemon process (PM2), cloud service (Vercel Cron, GH Actions). K8s CronJobs have the richest model. The simplest CLI pattern is generating crontab entries.
+
+**Process lifecycle:** PM2's God Daemon over Unix sockets is the gold standard for Node.js. PID files are fragile (stale PIDs, recycling). Socket-based liveness checks are more robust. Turborepo embeds a daemon that spawns on-demand.
+
+---
+
+## Round 1: Brainstorm
+
+What are ALL the ways `vana` could handle async/background/scheduled operations?
+
+1. **Status quo + longer timeout.** Don't change anything. The connect flow blocks. Users wait 1-10 minutes. Simple.
+
+2. **`--detach` flag.** `vana connect github --detach` starts collection, prints a session ID, returns immediately. `vana connect --status` or `vana status` shows progress. Terminal bell on completion.
+
+3. **Daemon mode.** `vana daemon start` runs a persistent background process (like PM2/Turborepo). Handles scheduled collections, notifications, IPC for credentials. `vana daemon stop` shuts it down.
+
+4. **Crontab generation.** `vana schedule github --every 24h` writes a crontab entry that runs `vana collect github` daily. No daemon, pure OS cron. Like Whenever gem.
+
+5. **OS service.** `vana service install` creates a launchd plist (macOS) or systemd unit (Linux) that runs vana as a service. Like Homebrew services.
+
+6. **Cloud-managed schedule.** Schedule runs on the Personal Server side. The CLI just configures it. Like Vercel Cron / GitHub Actions.
+
+7. **Agent-managed schedule.** The agent (Claude Code) runs `vana collect` on a schedule using its own mechanisms (hooks, loops). No CLI scheduling needed.
+
+8. **Fire-and-forget with webhook.** `vana connect github --notify webhook:https://...` starts collection and POSTs to a URL on completion.
+
+## Round 1: Design
+
+Given the research, the right approach for v1:
+
+**Block by default, offer `--detach`.** This matches 9/11 best-in-class CLIs. The connect flow already blocks. Add `--detach` that:
+
+- Forks the collection into a background process
+- Prints a session ID
+- Returns immediately
+- Writes progress to `~/.vana/sessions/{id}.json`
+- Terminal bell when done (if terminal is still open)
+
+**`vana connect --status [id]`** shows progress of detached sessions.
+
+**For scheduling: crontab generation (v1), daemon (v2).**
+
+- v1: `vana schedule add github --every 24h` generates a crontab entry
+- v2: `vana daemon` for richer scheduling, notifications, IPC
+
+## Round 1: Plan
+
+Phase 1: `--detach` flag
+
+- Fork child process with `child_process.fork()`
+- Write session state to `~/.vana/sessions/{id}.json`
+- Print session ID on detach
+- `vana status` shows active sessions
+- Terminal bell via `\a` on completion
+
+Phase 2: `vana schedule`
+
+- `vana schedule add <source> --every <interval>`
+- Generates crontab entry: `0 */24 * * * vana collect --quiet`
+- `vana schedule list` shows scheduled tasks
+- `vana schedule remove <source>`
+
+Phase 3: Notifications
+
+- Terminal bell by default (already done)
+- `--notify` flag: `vana connect github --notify bell` (default), `--notify desktop`, `--notify none`
+- Desktop notifications via node-notifier (lazy import)
+
+## Round 1: Critical Assessment
+
+**What's wrong with this design:**
+
+1. **`child_process.fork()` in a SEA binary is problematic.** The vana CLI is packaged as a single executable. Forking creates a new Node.js process that needs the same binary. `process.execPath` might not point to the right thing in all packaging scenarios.
+
+2. **Session state files add complexity.** Now we have `~/.vana/sessions/`, `~/.vana/results/`, `~/.vana/logs/`, state file, config. The data home is getting crowded.
+
+3. **Crontab generation is fragile.** Users who don't understand cron will struggle. `crontab -e` can destroy entries. No Windows support.
+
+4. **The daemon approach (v2) might be needed sooner.** If the agent wants to run `vana next` on a schedule, it needs something more than cron. PM2-style daemon with IPC is the right long-term answer.
+
+5. **`--detach` UX for agents is unclear.** If an agent detaches a connect, how does it know when it's done? Polling `vana connect --status`? That's the same polling problem as IPC.
+
+---
+
+## Round 2: Brainstorm (informed by Round 1 assessment)
+
+The fork/cron approach is too infrastructure-heavy for v1. What's the MINIMUM that solves the real problem?
+
+Real problems:
+
+- Users wait 1-10 minutes during connect (annoying but functional)
+- Agents can't run connect in the foreground (IPC solves this)
+- No way to re-collect on a schedule (users must remember to run commands)
+
+What if v1 is just:
+
+1. **Block by default** (already works)
+2. **Terminal bell on completion** (already done)
+3. **`vana collect --all`** re-collects all connected sources sequentially
+4. **The agent handles scheduling** via the next-prompt skill ("your GitHub data is 3 days old, recollect?")
+
+No `--detach`, no daemon, no cron. The agent IS the scheduler.
+
+## Round 2: Design
+
+**The minimal design:**
+
+`vana connect` blocks by default (no change). For agents using `--ipc`, the process runs in the background naturally (agent backgrounds it).
+
+`vana collect --all` re-collects all connected sources. The agent can run this periodically.
+
+`vana status --json` already shows `lastCollectedAt` per source. The next-prompt skill can check freshness and suggest recollection.
+
+No new infrastructure. The scheduling intelligence lives in the skill, not the CLI.
+
+**One addition: `--notify desktop`** for human users who run long connects.
+
+## Round 2: Plan
+
+Phase 1: `vana collect --all` (if not already implemented)
+
+- Iterate connected sources, run collect for each
+- Show per-source progress
+- Skip sources that need interactive auth (can't prompt mid-batch)
+
+Phase 2: Desktop notifications
+
+- `--notify desktop` flag on connect and collect
+- Lazy-import node-notifier
+- Default: terminal bell only (already done)
+
+Phase 3: Freshness in next-prompt skill
+
+- Skill checks `lastCollectedAt` for each source
+- If >24h old: "Your GitHub data is 3 days old. Recollect with `vana collect github`"
+- Agent can run the collect autonomously for interactive sources via IPC
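
The freshness check the skill would run over `vana status --json` output can be sketched as below. The `lastCollectedAt` field exists per the phases above; the 24h threshold and the message wording are assumptions.

```typescript
// Suggest recollection when a source's data is older than 24 hours.
function staleSuggestion(source: string, lastCollectedAt: string, now = Date.now()): string | null {
  const dayMs = 24 * 60 * 60 * 1000;
  const ageMs = now - Date.parse(lastCollectedAt);
  if (ageMs <= dayMs) return null; // fresh enough, say nothing
  const days = Math.floor(ageMs / dayMs);
  return `Your ${source} data is ${days} day${days === 1 ? "" : "s"} old. Recollect with \`vana collect ${source}\`.`;
}
```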
+
+## Round 2: Critical Assessment
+
+**What's right:**
+
+- Minimal complexity. No new infrastructure.
+- The agent-as-scheduler is actually elegant. The skill already exists.
+- `collect --all` is a natural command users expect.
+
+**What's wrong:**
+
+1. **No solution for human users who want background connect.** They still wait 5 minutes watching a terminal. Every reference CLI offers detach for this.
+2. **Desktop notifications require a new dependency.** node-notifier is 12MB. Heavy for a CLI that prides itself on zero runtime deps.
+3. **Agent-as-scheduler requires the agent to be running.** If Claude Code isn't open, nothing collects. Not a background service.
+
+**The honest assessment:** The minimal design is probably right for RIGHT NOW. The connect flow is 1-10 minutes — not hours. Users can wait. Agents handle it via IPC. Scheduling via the skill works when the agent is running. When users demand background operation, that's the signal to build `--detach`.
+
+---
+
+## Round 3: Brainstorm (final)
+
+What if the answer is even simpler? Looking at the research again:
+
+- Heroku: `git push` blocks. No async. Nobody complains.
+- Stripe: `stripe listen` blocks. No async. Works fine.
+- Vercel: blocks by default. `--no-wait` exists but rarely used.
+
+The common thread: **if the operation takes <5 minutes, blocking is fine.** Async is for operations that take 10+ minutes or run indefinitely.
+
+Our connect flow: 1-5 minutes typically. That's blocking territory.
+
+The ONLY change needed: make the blocking experience BETTER.
+
+- The heartbeat bloom spinner already shows progress
+- Scope lines appear as data arrives
+- Terminal bell on completion
+- The experience IS designed
+
+For agents: IPC mode handles it (already built).
+For scheduling: the skill handles it (already built).
+
+## Round 3: Design
+
+**No new async infrastructure for v1.**
+
+Instead, improve what we have:
+
+1. **`vana collect --all`** — batch re-collection for convenience
+2. **Freshness awareness in next-prompt skill** — "your data is stale, recollect?"
+3. **Document the decision** — blocking is intentional, not missing
+
+Add to open issues for v2:
+
+- `--detach` when users report connect taking >10 minutes
+- `vana daemon` when scheduling demand emerges from real usage
+
+## Round 3: Plan
+
+Implement:
+
+1. Verify `vana collect --all` works (may already exist via `runCollectAll`)
+2. Add freshness check to next-prompt skill
+3. Document the async decision in CLI-OPEN-ISSUES.md
+
+## Round 3: Critical Assessment
+
+**Is this lazy or wise?**
+
+Wise. The research shows that blocking is the default for best-in-class CLIs when operations take <5 minutes. Adding async infrastructure for a <5 minute operation is over-engineering. The connect flow already has the best possible blocking UX (heartbeat spinner, scope manifest, terminal bell). IPC handles the agent case. The skill handles scheduling.
+
+**When would this become wrong?** When:
+
+- A connector regularly takes >10 minutes (large ChatGPT histories)
+- Users request background operation explicitly
+- The agent scheduling pattern proves unreliable
+
+Those are signals to build `--detach`, not predictions to engineer for now.
+
+**Final answer: implement `collect --all` + freshness in skill. File `--detach` for v2.**
diff --git a/research/async-cli/detach-ipc-design.md b/research/async-cli/detach-ipc-design.md
new file mode 100644
index 00000000..d06dc67c
--- /dev/null
+++ b/research/async-cli/detach-ipc-design.md
@@ -0,0 +1,105 @@
+# --detach + requestInput Design
+
+## The problem
+
+`vana connect chatgpt --detach` spawns a background process. The
+connector may need credentials (first time or expired session).
+The background process has no stdin. Who answers?
+
+## Scenarios
+
+### 1. First-time connect (no saved session)
+
+The connector will definitely need credentials. Detaching is pointless
+because the process will immediately pause waiting for auth.
+
+**What Plaid does:** You must complete Link (interactive auth) before
+any background data access works. There is no "detach the first auth."
+
+**Design:** Refuse to detach if no session exists. Tell the caller
+(human or agent) to authenticate first.
+
+### 2. Returning connect (saved session valid)
+
+The connector uses the saved browser cookies. No auth needed.
+Collection runs to completion in the background.
+
+**Design:** This is the happy path. `--detach` works perfectly.
+
+### 3. Returning connect (saved session expired)
+
+The connector starts, tries the saved session, it fails.
+Now what?
+
+**Option A: --detach implies --ipc**
+The connector writes a pending-input file and waits (up to 30 min).
+Problem: who polls for the file?
+
+- Human: unlikely to check within 30 minutes
+- Agent: possible if it's watching, but the agent may not be running
+- Result: likely timeout, wasted 30 minutes
+
+**Option B: --detach fails fast on auth**
+The connector hits auth failure, writes `needs_reauth` to state,
+exits with non-zero. No waiting.
+Problem: the user wanted background collection, got nothing.
+But: `vana status` shows the failure. Next interactive `vana connect`
+re-authenticates. Then `--detach` works again.
+
+**Option C: --detach + --ipc + notification**
+Like Option A but also writes a notification file. `vana status`
+shows "ChatGPT needs re-login (background collection paused)."
+Problem: still waits 30 minutes for nothing most of the time.
+
+**Option D: --detach fails fast + notifies**
+Like Option B but also emits a desktop notification or terminal bell
+on the original terminal (if still open).
+
+## What Plaid actually does
+
+When a background fetch hits auth failure:
+
+1. The fetch fails
+2. Item status changes to LOGIN_REQUIRED
+3. Plaid sends a webhook to the developer
+4. The developer shows a prompt to the user to re-authenticate
+5. User goes through Link update mode (abbreviated, not full re-setup)
+6. Background fetches resume
+
+Key insight: **Plaid does NOT wait for re-auth during the failed fetch.**
+It fails, records the state, notifies, and moves on. The re-auth
+happens separately, triggered by the user, at their convenience.
+
+## Design decision
+
+**Option B is correct.** It matches Plaid's pattern exactly:
+
+1. `--detach` spawns background process
+2. Connector tries saved session
+3. If auth fails: set `connectionHealth: 'needs_reauth'`, exit
+4. `vana status` shows it prominently
+5. User runs `vana connect chatgpt` to re-auth (interactive)
+6. Next `--detach` works again
+
+No IPC in detach mode. No 30-minute wait. Fail fast, record state,
+let the user re-auth at their convenience.
+
+`--ipc` remains available for agents that ARE actively watching
+(like Claude Code's background task flow).
+
+## Implementation
+
+`--detach` should:
+
+1. Check if source has been previously connected (has state entry)
+ - If not: refuse. "Run `vana connect chatgpt` first."
+2. Spawn child with `--json --quiet --no-input`
+ - NOT `--ipc`. If auth is needed, fail fast.
+3. Child runs, either succeeds or fails on auth
+4. On success: update state to healthy
+5. On auth failure: update state to needs_reauth
+6. `vana status` surfaces the result either way
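
The guard in step 1 plus the flag choice in step 2, sketched. The session-file location follows the earlier phases; the function names and error wording are illustrative:

```typescript
import { existsSync } from "node:fs";
import { join } from "node:path";
import { homedir } from "node:os";

// Has this source ever completed an interactive connect?
function hasSession(source: string, vanaHome = join(homedir(), ".vana")): boolean {
  return existsSync(join(vanaHome, "sessions", `${source}.json`));
}

// Build the child argv for --detach, refusing first-time connects.
function detachArgs(source: string, vanaHome?: string): string[] {
  if (!hasSession(source, vanaHome)) {
    throw new Error(`No saved session for ${source}. Run \`vana connect ${source}\` first.`);
  }
  // Deliberately NOT --ipc: on auth failure the child fails fast (Option B).
  return ["connect", source, "--json", "--quiet", "--no-input"];
}
```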
+
+This means `--detach` is for RE-COLLECTION with existing sessions,
+not for first-time auth. Which is the correct mental model:
+detach = "do the thing I already set up, in the background."
diff --git a/research/async-cli/findings/merged.json b/research/async-cli/findings/merged.json
new file mode 100644
index 00000000..dbb3a11a
--- /dev/null
+++ b/research/async-cli/findings/merged.json
@@ -0,0 +1,1077 @@
+{
+ "files_merged": 8,
+ "total_findings": 105,
+ "deduplicated_from": 175,
+ "agent_questions": [
+ "Background process lifecycle management in CLIs",
+ "How best-in-class CLIs handle long-running operations",
+ "CLI task completion notification patterns",
+ "CLI scheduled and recurring task patterns",
+ "Sync vs async mode patterns in CLIs",
+ "Auth expiry and re-authentication patterns in data aggregation services",
+ "Daemon and background service patterns for Node.js CLIs"
+ ],
+ "findings": [
+ {
+ "name": "Apple HealthKit — persistent permissions, no expiry",
+ "what": "HealthKit uses a persistent authorization model: once a user grants read/write access to health data types, the permission persists until explicitly revoked in Settings > Health > Data Access & Devices. There is no token expiry or re-consent cycle. Apps can become 'inactive' data sources after iOS upgrades (a known bug/behavior), requiring users to toggle permissions in Health settings. Apps should check authorization status on foreground entry (willEnterForeground notification) to detect user-initiated revocations. No push notification system for permission changes.",
+ "evidence": "Apple Developer docs on HealthKit authorization. Apple Community forums document inactive data source behavior after iOS upgrades.",
+ "status": "live",
+ "source_url": "https://developer.apple.com/documentation/healthkit/authorizing-access-to-health-data",
+ "sub_theme": "persistent_permission_model"
+ },
+ {
+ "name": "AWS CLI: S3 transfers show progress by default with opt-out",
+ "what": "aws s3 cp shows file transfer progress by default. For large files (>100MB), it uses multipart upload and shows 'Completed X of Y part(s) with Z file(s) remaining'. --no-progress disables the progress display. --quiet suppresses all output. --only-show-errors shows only errors. --progress-frequency controls update interval. --progress-multiline shows progress on multiple lines.",
+ "evidence": "AWS docs: 'File transfer progress is displayed by default.' '--no-progress: File transfer progress is not displayed.' For large uploads: 'Completed 9896 of 9896 part(s) with 1 file(s) remaining.'",
+ "status": "live",
+ "source_urls": [
+ "https://docs.aws.amazon.com/cli/latest/reference/s3/cp.html"
+ ],
+ "sub_theme": "progress display",
+ "merged_from": 2
+ },
+ {
+ "name": "AWS CLI: wait subcommands poll silently until condition is met",
+ "what": "AWS CLI provides 'wait' subcommands for services with long-running operations (e.g., aws cloudformation wait stack-create-complete). These poll the API at fixed intervals (30 seconds for CloudFormation) until the condition is met or max attempts exceeded (120 for CloudFormation, producing exit code 255). The wait command produces NO terminal output -- it blocks silently. Operations themselves (e.g., aws cloudformation create-stack) return immediately with an ID; waiting is always opt-in via a separate command.",
+ "evidence": "AWS docs for stack-create-complete: 'It will poll every 30 seconds until a successful state has been reached. This will exit with a return code of 255 after 120 failed checks.' 'This command produces no output.'",
+ "status": "live",
+ "source_urls": [
+ "https://docs.aws.amazon.com/cli/latest/reference/cloudformation/wait/stack-create-complete.html"
+ ],
+ "sub_theme": "silent polling wait",
+ "merged_from": 2
+ },
+ {
+ "name": "AWS CloudFormation: deploy blocks, no official --no-wait",
+ "what": "aws cloudformation deploy blocks by default, waiting for the stack operation to complete. There is NO official --no-wait flag despite long-standing feature requests (since 2014). Workaround: use create-stack/update-stack (which return immediately) + custom polling with describe-stacks. The --no-execute-changeset flag is related but different (previews without executing).",
+ "evidence": "GitHub issue #895 (2014) requests synchronous create/update. Issue #12037 on serverless framework requests async deploys. AWS has not shipped --no-wait for deploy as of 2026.",
+ "status": "live",
+ "source_urls": ["https://github.com/aws/aws-cli/issues/895"],
+ "sub_theme": "foreground-default-no-async-escape",
+ "merged_from": 2
+ },
+ {
+ "name": "Banking app session expiry UX patterns",
+ "what": "Common patterns for session expiry in banking/financial apps: (1) Pre-timeout warning — gentle nudge before auto-logout with option to extend, (2) Biometric re-authentication on mobile — Face ID/Touch ID for frictionless re-auth without full login, (3) Silent token renewal in background — refresh tokens used transparently, (4) Context-aware security — adjust auth requirements based on user behavior, location, device trust level, (5) Push notification for suspicious login attempts (approve/deny model). Data-sensitive apps (banking) tend to auto-logout after idle periods rather than silently renewing, prioritizing security over convenience.",
+ "evidence": "Smashing Magazine 'Rethinking Authentication UX', OneSignal banking push notification patterns, HID Global blog on UX sweet spot for online banking.",
+ "status": "live",
+ "source_url": "https://www.smashingmagazine.com/2022/08/authentication-ux-design-guidelines/",
+ "sub_theme": "session_expiry_ux"
+ },
+ {
+ "name": "Bree — worker-thread-based Node.js job scheduler with graceful shutdown",
+ "what": "Bree spawns each job in a separate worker thread (sandboxed), supports cron expressions, human-friendly intervals, async/await, retries, throttling, and concurrency control. Graceful shutdown stops all workers cleanly. No daemon required — runs in-process. Croner is lighter (used by PM2 itself, Uptime Kuma) and works in Node, Deno, and browser. node-cron is the simplest (66KB) for basic cron scheduling. For a CLI daemon, Bree or Croner provides the scheduling layer while the daemon process itself handles lifecycle.",
+ "evidence": "Bree v9.1.3: worker threads, cron/dates/ms/human-friendly, retries, throttling, concurrency, graceful shutdown. Croner: used by PM2, Uptime Kuma, cross-runtime. node-cron v4.2.1: 66KB, basic cron.",
+ "status": "live",
+ "source_url": "https://jobscheduler.net/",
+ "sub_theme": "job_schedulers"
+ },
+ {
+ "name": "Claude Code hooks system (reference implementation)",
+ "what": "Claude Code implements a multi-level notification system: (1) terminal bell as default, (2) OSC 9/777 for desktop notifications, (3) configurable hooks that run shell commands on 'Stop' event. Users configure via `claude config set --global preferredNotifChannel terminal_bell`. Hook scripts can call terminal-notifier, notify-send, or ntfy.sh. This is the closest prior art to what Vana Connect CLI needs.",
+ "evidence": "Multiple blog posts documenting setup. GitHub gist by michael-swann-rp shows complete multi-level notification config. Claude Code docs at code.claude.com/docs/en/hooks-guide.",
+ "status": "live",
+ "source_urls": ["https://code.claude.com/docs/en/hooks-guide"],
+ "sub_theme": "reference-implementation",
+ "merged_from": 2
+ },
+ {
+ "name": "CLI convention: flag naming consensus",
+ "what": "Three dominant flag names emerge: --detach/-d (Docker, Fly.io, Railway), --no-wait (Vercel, AWS feature requests), and --background (less common, mostly OS-level). --detach implies the process continues server-side. --no-wait implies you don't care about the outcome. --background implies a local background process. For deploy/connect operations that run server-side, --detach is the most conventional name.",
+ "evidence": "clig.dev: 'Use standard names for flags, if there is a standard. If another commonly used command uses a flag name, it's best to follow that existing pattern.'",
+ "status": "live",
+ "source_urls": ["https://clig.dev/"],
+ "sub_theme": "naming-conventions",
+ "merged_from": 2
+ },
+ {
+ "name": "Cronie (modern cron daemon)",
+ "what": "Modern implementation of the standard UNIX cron daemon (evolved from vixie-cron). Provides `crontab` CLI for editing/viewing jobs with syntax validation and automatic backup, plus `cronnext` utility for querying upcoming execution times. Adds PAM/SELinux integration, tilde (~) operator for randomization within ranges, support for up to 10,000 entries per user, and NO_MAIL_OUTPUT env var. Not a scheduling CLI itself but the underlying daemon that crontab-wrapper CLIs (like Whenever) write to.",
+ "evidence": "cronnext utility for querying upcoming job execution times. Tilde operator (~) for randomization within ranges. Up to 10,000 crontab entries per user. PAM and SELinux integration. 56 contributors, 611 commits.",
+ "status": "live",
+ "source_urls": ["https://github.com/cronie-crond/cronie"],
+ "sub_theme": "cron daemon",
+ "merged_from": 2
+ },
+ {
+ "name": "Cross-cutting pattern: notification channel preferences",
+ "what": "Services use different notification channels for auth expiry: (1) Webhooks to developer servers — Plaid (PENDING_EXPIRATION, ITEM:ERROR), Stripe (account.application.deauthorized), MX (connection status change callbacks). This is the primary channel for B2B2C services. (2) In-app prompts — shown when user next opens the app, common in consumer apps (banking, fitness). (3) Push notifications — used by banking apps for session expiry and login approval. (4) Email — used by some services for consent renewal reminders. (5) No notification (silent failure) — Google silently invalidates tokens, Strava returns 401 on next API call. For a CLI tool doing background collection, the closest analog is the webhook model: the system detects failure and queues a notification for the user's next interactive session.",
+ "evidence": "Plaid webhooks, Stripe webhooks, MX callbacks, banking app push notifications, Google silent revocation patterns.",
+ "status": "live",
+ "source_url": "",
+ "sub_theme": "notification_channels"
+ },
+ {
+ "name": "Cross-cutting pattern: re-auth UX spectrum from minimal to full",
+ "what": "Re-authentication complexity varies: (1) Silent/automatic — refresh token exchange, no user involvement (Strava, Google, Salesforce when tokens valid), (2) Minimal/abbreviated — single MFA prompt or consent checkbox without full re-login (Plaid UK update mode, biometric re-auth in banking apps), (3) Redirect to provider — user sent to institution's OAuth page to re-authorize (Plaid EU, Garmin-Strava reconnect), (4) Full re-setup — complete teardown and reconnect from scratch (MX PREVENTED status after 3 failures, Strava disconnect+reconnect). For browser-automation-based collection: the equivalent of (1) is session cookies still valid, (2) is re-entering MFA/OTP in automated browser, (3) is launching a full interactive browser session for login, (4) is clearing all cookies and starting over.",
+ "evidence": "Plaid abbreviated update mode, MX PREVENTED requiring new aggregation, Strava disconnect/reconnect pattern, banking biometric re-auth.",
+ "status": "live",
+ "source_url": "",
+ "sub_theme": "reauth_complexity_spectrum"
+ },
+ {
+ "name": "Cross-cutting pattern: three-tier connection health model",
+ "what": "Across all services studied, connection health follows a three-tier model: (1) HEALTHY — data flows normally, background updates succeed; (2) DEGRADED/PENDING — connection will expire soon or is experiencing intermittent failures, advance warning sent (Plaid: 7 days, Google: none/silent); (3) BROKEN — requires user interaction to repair (re-login, re-consent, MFA). The key differentiator is whether the transition from DEGRADED to BROKEN is predictable (consent expiry with known date) or unpredictable (user changes password, admin revokes, platform security flag). Services with predictable expiry (Plaid, Salesforce) can send advance warnings. Services with unpredictable revocation (Google, Stripe) can only react after the fact.",
+ "evidence": "Synthesized from Plaid (PENDING_EXPIRATION -> ITEM_LOGIN_REQUIRED), MX (CONNECTED -> CHALLENGED/DENIED/EXPIRED), Google (valid -> invalid_grant), Strava (valid -> 401).",
+ "status": "live",
+ "source_url": "",
+ "sub_theme": "universal_state_model"
+ },
+ {
+ "name": "Cross-platform startup — PM2 startup for *nix, launchd plist, systemd unit, Windows Task Scheduler",
+ "what": "For a Node.js CLI daemon, the cross-platform startup matrix: macOS uses launchd (~/Library/LaunchAgents for user-level, /Library/LaunchDaemons for system), Linux uses systemd user units (~/.config/systemd/user/), Windows uses Task Scheduler or node-windows for native service registration. The `pm2 startup` command abstracts this by detecting the init system and generating the right script. For a standalone CLI, generating platform-specific service files (plist XML on macOS, .service INI on Linux) gives maximum control. The brew services pattern of thin wrappers around OS service managers is the most maintainable approach.",
+ "evidence": "PM2 auto-detects init system: systemd, launchd, upstart, systemv. brew services generates plists for macOS, systemd units for Linux. Windows requires Task Scheduler or node-windows/pm2-windows-service.",
+ "status": "live",
+ "source_url": "https://blog.appsignal.com/2022/03/09/a-complete-guide-to-nodejs-process-management-with-pm2.html",
+ "sub_theme": "cross_platform_startup"
+ },
+ {
+ "name": "Dkron (distributed cron)",
+ "what": "Distributed, fault-tolerant job scheduling system built in Go using Raft consensus and Serf gossip protocol. Provides web UI, REST API, and CLI for job management. Eliminates single points of failure by distributing scheduling across multiple nodes. Scales to thousands of nodes. Deployed via Docker Compose with `docker compose up -d --scale dkron-server=4 --scale dkron-agent=10`. Web UI at port 8080. Primarily designed for infrastructure teams, not end-user CLIs. 4.7k GitHub stars.",
+ "evidence": "docker compose up -d --scale dkron-server=4 --scale dkron-agent=10. Web UI: http://localhost:8080/ui. Built on Raft consensus + Serf gossip protocol. 'Able to handle high volumes of scheduled jobs and thousands of nodes.'",
+ "status": "live",
+ "source_urls": ["https://github.com/distribworks/dkron"],
+ "sub_theme": "distributed scheduler",
+ "merged_from": 2
+ },
+ {
+ "name": "docker compose up: foreground default, -d for detach",
+ "what": "docker compose up blocks in the foreground by default, streaming logs from all services. The -d/--detach flag starts containers in the background. This mirrors docker run's pattern and is the most widely-known example of the --detach convention.",
+ "evidence": "Docker docs: 'Running docker compose up --detach starts the containers in the background and leaves them running.'",
+ "status": "live",
+ "source_urls": [
+ "https://docs.docker.com/reference/cli/docker/compose/up/"
+ ],
+ "sub_theme": "foreground-default-detach-opt-in",
+ "merged_from": 2
+ },
+ {
+ "name": "Docker daemon (dockerd)",
+ "what": "Runs as a persistent daemon process managed by systemd. Uses --pidfile flag (default /var/run/docker.pid) to store process ID. Communicates via Unix domain socket at /var/run/docker.sock, with optional TCP binding for remote access. Supports systemd socket activation via fd:// syntax. SIGHUP reloads configuration without restart; systemd handles crash recovery and auto-restart. Multiple daemon instances require unique PID file and socket paths. The docker CLI is a thin client that sends commands over the socket; the daemon does all container management.",
+ "evidence": "dockerd docs: '--pidfile=/var/run/docker.pid' flag, '-H unix:///var/run/docker.sock' default socket, SIGHUP for config reload. PR #41465 added PIDFile to docker.service unit so systemd reliably cleans up stale PID files on crash.",
+ "status": "live",
+ "source_urls": ["https://docs.docker.com/reference/cli/dockerd/"],
+ "sub_theme": "daemon_with_systemd",
+ "merged_from": 2
+ },
+ {
+ "name": "Docker daemon — systemd socket activation for on-demand startup",
+ "what": "Docker supports systemd socket activation: docker.socket creates /var/run/docker.sock, and dockerd only starts when something connects. The daemon can listen on Unix, TCP, or fd:// (systemd file descriptors). PID file at /var/run/docker.pid prevents duplicate instances. The REST API is exposed over the Unix socket. Client-server model means CLI (docker) and daemon (dockerd) are fully decoupled — the daemon persists state independently of any CLI session.",
+ "evidence": "dockerd -H fd:// for systemd socket activation. Default socket at /var/run/docker.sock. PIDFile in docker.service unit. REST API over Unix socket. Supports unix, tcp, fd socket types.",
+ "status": "live",
+ "source_url": "https://docs.docker.com/reference/cli/dockerd/",
+ "sub_theme": "docker_daemon"
+ },
+ {
+ "name": "Docker restart policies (not true scheduling)",
+ "what": "Docker provides --restart flags (no, on-failure[:max-retries], always, unless-stopped) for automatic container restart, but these are recovery mechanisms, not schedulers. An increasing delay (doubling from 100ms, max 1 min) is added before each restart, resetting after 10 seconds of successful runtime. For actual scheduled/recurring container execution, Docker relies on external tools (host cron, Kubernetes CronJobs, Supercronic inside containers). Health checks (--health-cmd, --health-interval, --health-retries) complement restart policies but don't provide scheduling.",
+ "evidence": "docker run --restart=on-failure:10 redis. 'An increasing delay (double the previous delay, starting at 100 milliseconds) is added before each restart.' Health checks: --health-cmd, --health-interval, --health-timeout, --health-retries.",
+ "status": "live",
+ "source_urls": [
+ "https://docs.docker.com/reference/cli/docker/container/run/"
+ ],
+ "sub_theme": "container restart (not scheduling)",
+ "merged_from": 2
+ },
+ {
+ "name": "Docker: docker build --progress controls output format (auto/plain/tty/quiet/rawjson)",
+ "what": "docker build supports five progress output modes. 'auto' (default) selects tty for terminals, plain otherwise. 'tty' uses color and dynamic redrawing with progress bars. 'plain' prints raw build progress as plaintext with step numbers and timing. 'quiet' suppresses all output except final image ID. 'rawjson' outputs JSON lines for programmatic consumption. This demonstrates adaptive output based on terminal capabilities.",
+ "evidence": "Docker docs: '--progress: Set type of progress output (auto, plain, tty, quiet, rawjson). Use plain to show container output (default \"auto\").' 'auto: Uses tty mode if the client is a TTY, otherwise uses plain.'",
+ "status": "live",
+ "source_urls": [
+ "https://docs.docker.com/reference/cli/docker/buildx/build/"
+ ],
+ "sub_theme": "adaptive output format",
+ "merged_from": 2
+ },
+ {
+ "name": "Docker: docker run -d (detached) vs foreground is the canonical pattern",
+ "what": "docker run defaults to foreground (attached) mode with stdin/stdout/stderr connected. The -d/--detach flag runs the container in the background, printing only the container ID. In detached mode, closing the terminal does not stop the container. Users check status with 'docker ps', view output with 'docker logs', and re-attach with 'docker attach'. This is the most well-known sync/async dual-mode pattern in CLIs.",
+ "evidence": "Docker docs: 'If you want to run the container in the background, you can use the --detach (or -d) flag.' 'Detached mode means that a Docker container runs in the background of your terminal. It does not receive input or display output.'",
+ "status": "live",
+ "source_urls": ["https://docs.docker.com/engine/containers/run/"],
+ "sub_theme": "canonical detach pattern",
+ "merged_from": 2
+ },
+ {
+ "name": "Docker: foreground by default, -d to detach",
+ "what": "docker run blocks in foreground by default, attaching stdin/stdout/stderr to the terminal. The -d (--detach) flag runs the container in the background and prints only the container ID. Rationale: foreground is the safe, observable default — you see errors immediately. Detached mode is for production/daemon use cases.",
+ "evidence": "Docker docs: 'By default, Docker runs the container in attached mode.' The -d flag 'means that a Docker container runs in the background of your terminal. It does not receive input or display output.' After -d, the user sees only the container ID hash.",
+ "status": "live",
+ "source_urls": ["https://docs.docker.com/engine/containers/run/"],
+ "sub_theme": "foreground-default-detach-opt-in",
+ "merged_from": 2
+ },
+ {
+ "name": "dschep/ntfy (automatic shell notifications)",
+ "what": "Python utility that auto-detects long-running commands (>10s default) and sends desktop notifications when they finish. Shell integration for bash/zsh hooks into PROMPT_COMMAND / precmd. Supports backends: desktop (dbus), Pushover, Pushbullet, Slack, Telegram. Only notifies when terminal is NOT focused. The most 'magic' solution — requires no per-command opt-in.",
+ "evidence": "GitHub repo dschep/ntfy. auto-ntfy-done.sh provides automatic shell integration. Configurable via LONG_RUNNING_COMMAND_TIMEOUT.",
+ "status": "live",
+ "source_urls": ["https://github.com/dschep/ntfy"],
+ "sub_theme": "automatic-shell-integration",
+ "merged_from": 2
+ },
+ {
+ "name": "Electron tray pattern — minimize to system tray for persistent background operation",
+ "what": "Electron apps persist in the background by intercepting window-all-closed (don't quit) and window close events (hide instead of destroy). A Tray icon in the OS system tray (macOS menu bar, Windows taskbar) provides a menu for re-opening windows or quitting. The app process stays alive with no visible windows. This pattern is relevant for a companion desktop app but not directly for a headless CLI daemon. The key insight: users expect background processes to be visible/controllable via system tray, not invisible.",
+ "evidence": "Listen to window-all-closed, don't call app.quit(). Hide windows on close instead of destroying. Create Tray with menu for show/quit. Process persists with no visible windows.",
+ "status": "live",
+ "source_url": "https://moinism.medium.com/how-to-keep-an-electron-app-running-in-the-background-f6a7c0e1ee4f",
+ "sub_theme": "electron_background"
+ },
+ {
+ "name": "Finicity (Mastercard Open Banking) — aggregation status codes",
+ "what": "Finicity uses numeric aggregation status codes to represent connection health. MFA challenges are handled inline during aggregation — the API returns MFA questions and an mfa_session token, and the app must relay answers back. Finicity's test bank continually returns MFA challenges for integration testing. OAuth connections are handled via institution-managed tokens. The aggregation status code system provides granular feedback on why a connection succeeded or failed, similar to HTTP status codes for data aggregation.",
+ "evidence": "Finicity docs (now Mastercard Open Banking): authentication-and-integration docs describe MFA challenge flow with mfa_session headers. GitHub client libraries show MFA handling patterns.",
+ "status": "live",
+ "source_url": "https://developer.mastercard.com/open-banking-us/documentation/",
+ "sub_theme": "mfa_challenge_handling"
+ },
+ {
+ "name": "Fly.io: --detach flag may not work as documented",
+ "what": "Community testing suggests the --detach flag may not function correctly. One user reported: 'it ignored the --detach flag completely -- and that matches what I see in the flyctl source code.' There is no documented re-attach mechanism after detaching.",
+ "evidence": "Fly.io community forum post: user reports --detach flag is non-functional based on testing and source code review.",
+ "status": "live",
+ "source_urls": [
+ "https://community.fly.io/t/how-fly-deploy-detach-works/25607"
+ ],
+ "sub_theme": "detach reliability",
+ "merged_from": 2
+ },
+ {
+ "name": "Fly.io: deploy blocks with live monitoring, --detach to return immediately",
+ "what": "fly deploy blocks by default, showing live deployment progress as machines transition through states until healthy. The --detach flag causes the command to 'return immediately instead of monitoring deployment progress.' Important: --detach also disables automatic rollback on failed health checks, which is poorly documented.",
+ "evidence": "Fly.io docs: '--detach: Return immediately instead of monitoring deployment progress.' Community discussion notes that --detach disabling rollbacks should be documented in CLI help.",
+ "status": "live",
+ "source_urls": ["https://fly.io/docs/flyctl/deploy/"],
+ "sub_theme": "foreground-default-detach-opt-in",
+ "merged_from": 2
+ },
+ {
+ "name": "Fly.io: fly deploy blocks with monitoring, supports --detach flag",
+ "what": "fly deploy builds and deploys, then enters a 'Monitoring Deployment' phase showing machine status counts (e.g., '1 desired, 1 placed, 1 healthy, 0 unhealthy') and health check results. The terminal displays 'You can detach the terminal anytime without stopping the deployment.' The --detach flag returns immediately instead of monitoring. --wait-timeout (default 5m0s) controls how long to wait for machines to become healthy. Smoke checks monitor machines for ~10 seconds after start.",
+ "evidence": "Fly docs: '--detach: Return immediately instead of monitoring deployment progress.' '--wait-timeout: Time duration to wait for individual machines to transition states and become healthy. (default 5m0s).' Community posts show output: 'You can detach the terminal anytime without stopping the deployment' followed by 'Monitoring deployment' with machine status counts.",
+ "status": "live",
+ "source_urls": ["https://fly.io/docs/flyctl/deploy/"],
+ "sub_theme": "blocking with detach",
+ "merged_from": 2
+ },
+ {
+ "name": "forever (Node.js)",
+ "what": "Legacy Node.js daemon manager (now recommends pm2/nodemon for new projects). Runs a Flatiron server as daemon to manage child processes. PID files stored in ~/.forever/ directory. Supports --pidFile for custom PID location. Has built-in cleanup for extraneous/stale PID files (forever.cleanUp() emits 'cleanUp' event). Auto-restarts crashed processes. 'forever start' daemonizes; 'forever stop/stopall' terminates; 'forever list' shows running processes.",
+ "evidence": "npm docs: 'A simple CLI tool for ensuring that a given script runs continuously (i.e. forever)', default files in $HOME/.forever, cleanUp functionality for stale PIDs.",
+ "status": "live",
+ "source_urls": ["https://github.com/foreversd/forever"],
+ "sub_theme": "node_daemon_manager",
+ "merged_from": 2
+ },
+ {
+ "name": "Gatsby develop (foreground long-running CLI)",
+ "what": "Runs as a foreground process with file watching via webpack. No daemon mode -- the process lives in the terminal. Uses port-based conflict detection: checks whether port 8000 is in use and warns 'Looks like develop for this site is already running'. Known issues with Ctrl+C not properly killing the process (zombie process problem). Relies on the user's terminal for lifecycle management. No PID file, no daemon mode, no auto-restart.",
+ "evidence": "GitHub issue #5810: 'Ctrl+C not quitting CLI'. Discussion #26869: infinite loop with stale port detection. Gatsby develop is purely foreground with webpack-dev-server.",
+ "status": "live",
+ "source_urls": ["https://www.gatsbyjs.com/docs/reference/gatsby-cli/"],
+ "sub_theme": "foreground_dev_server",
+ "merged_from": 2
+ },
+ {
+ "name": "gh run watch (GitHub CLI polling pattern)",
+ "what": "GitHub CLI's `gh run watch` polls a CI run and streams progress with a live-updating table, then exits 0/1 on completion. It does NOT send any notification itself — the user chains it: `gh run watch && notify-send 'done'`. This is the canonical 'poll-then-exit' pattern: the CLI's job is to block until done, and the user composes notification on top.",
+ "evidence": "gh run watch docs at cli.github.com. Stuart Leeks blog post shows chaining with Windows toast notifications. GitHub blog post from 2021 introducing the feature.",
+ "status": "live",
+ "source_urls": ["https://cli.github.com/manual/gh_run_watch"],
+ "sub_theme": "poll-then-exit",
+ "merged_from": 2
+ },
+ {
+ "name": "GitHub Actions scheduled workflows (gh CLI)",
+ "what": "GitHub Actions supports `schedule` triggers using POSIX cron syntax in workflow YAML files. The `gh` CLI does not create schedules directly—users write `on: schedule: - cron: '...'` in .github/workflows/*.yml. The CLI manages workflows after creation: `gh workflow list/view/run/enable/disable` and `gh run list -e schedule` to filter scheduled runs. State lives in GitHub's cloud. Minimum interval: every 5 minutes. UTC only. Scheduled workflows on public repos auto-disable after 60 days of inactivity. Runs can be delayed during high load. Failures show as failed runs in `gh run list -s failure`.",
+ "evidence": "on: schedule: - cron: '30 5 * * 1,3'. gh workflow list; gh workflow run ; gh run list -e schedule -s completed. 'The shortest interval you can run scheduled workflows is once every 5 minutes.' 'In a public repository, scheduled workflows are automatically disabled when no repository activity has occurred in 60 days.'",
+ "status": "live",
+ "source_urls": [
+ "https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#schedule"
+ ],
+ "sub_theme": "cloud-managed scheduler",
+ "merged_from": 2
+ },
+ {
+ "name": "GitHub CLI: gh run watch polls and displays workflow progress",
+ "what": "gh run watch monitors a GitHub Actions workflow run until it completes. It polls at a configurable interval (--interval, default 3 seconds) and displays step-by-step progress. --compact shows only relevant/failed steps. --exit-status exits with non-zero on failure. Without arguments, interactively selects from active runs. Can be chained: 'gh run watch && notify-send \"done\"'. Closing the terminal does not cancel the workflow -- the operation is server-side.",
+ "evidence": "GitHub CLI docs: 'Watch a run until it completes, showing its progress.' '--interval: Refresh interval in seconds (default 3).' '--exit-status: Exit with non-zero status if run fails.'",
+ "status": "live",
+ "source_urls": ["https://cli.github.com/manual/gh_run_watch"],
+ "sub_theme": "separate watch command",
+ "merged_from": 2
+ },
+ {
+ "name": "GitHub CLI: gh workflow run fires and forgets, separate from watching",
+ "what": "gh workflow run triggers a workflow dispatch event and returns immediately. It does not wait for or monitor execution. Users must separately use 'gh run watch' or 'gh run view' to monitor. This is a clean separation: trigger is async, monitoring is opt-in via a separate command.",
+ "evidence": "GitHub CLI manual separates 'gh workflow run' (trigger) from 'gh run watch' (monitor) as distinct commands.",
+ "status": "live",
+ "source_urls": ["https://cli.github.com/manual/gh_workflow_run"],
+ "sub_theme": "fire-and-forget with separate monitor",
+ "merged_from": 2
+ },
+ {
+ "name": "GitHub CLI: trigger is async, watch is opt-in sync",
+ "what": "gh workflow run triggers a workflow and returns immediately (async). gh run watch blocks and polls every 3 seconds showing live progress until the run completes. gh run list shows a snapshot of recent runs. This is the two-command pattern: fire-and-forget trigger + optional blocking monitor.",
+ "evidence": "GitHub CLI docs: 'gh run watch watches a run until it completes, showing its progress.' The interface refreshes every 3 seconds. --compact shows only relevant/failed steps.",
+ "status": "live",
+ "source_urls": ["https://cli.github.com/manual/gh_run_watch"],
+ "sub_theme": "async-default-opt-in-watch",
+ "merged_from": 2
+ },
+ {
+ "name": "Google OAuth — multiple silent revocation triggers",
+ "what": "Google refresh tokens can be revoked/expired by: (1) 6-month inactivity — unused tokens auto-invalidate, (2) Password change — revokes tokens with Gmail scopes, (3) User-initiated removal from Google Account settings, (4) Testing-mode apps — 7-day automatic expiry, (5) 100-token-per-client ceiling — oldest tokens silently invalidated when exceeded, (6) Workspace admin policy changes restricting scopes or setting time-limited access, (7) Undocumented security heuristics. Access tokens expire after 1 hour (extendable to 12 hours). Mitigation: touch tokens every few days, persist rotated refresh tokens, monitor invalid_grant spikes, move to production status.",
+ "evidence": "Nango blog 'Google OAuth invalid_grant': documents all revocation triggers including 6-month inactivity, 100-token ceiling, testing-mode 7-day expiry. Google OAuth docs confirm 1-hour access token, 6-month inactivity rule.",
+ "status": "live",
+ "source_url": "https://nango.dev/blog/google-oauth-invalid-grant-token-has-been-expired-or-revoked",
+ "sub_theme": "silent_revocation_triggers"
+ },
+ {
+ "name": "Heroku Scheduler",
+ "what": "Free add-on that runs one-off dyno commands at fixed intervals (every 10 min, hourly, or daily). Configuration is done via a web dashboard opened with `heroku addons:open scheduler`—there is no declarative config file or CLI flag for defining schedules. State lives entirely in Heroku's cloud. Monitoring is via `heroku logs --ps scheduler.1`. No built-in retry, no failure notifications, and execution is best-effort ('known to occasionally miss'). For reliable scheduling, Heroku recommends a custom clock process instead. Key limitation: only 3 interval choices, no cron expressions, dashboard-only config.",
+ "evidence": "heroku addons:create scheduler:standard; heroku addons:open scheduler; heroku logs --ps scheduler.1. Task definition: Task: rake update_feed, Frequency: Hourly, Time: :30. Jobs execute as one-off dynos named scheduler.X visible via heroku ps.",
+ "status": "live",
+ "source_urls": ["https://devcenter.heroku.com/articles/scheduler"],
+ "sub_theme": "cloud-managed scheduler",
+ "merged_from": 2
+ },
+ {
+ "name": "Heroku: git push blocks, but Ctrl+C detaches without canceling",
+ "what": "git push heroku main blocks by default, streaming build output in real-time (compiling, installing dependencies, etc.). Pressing Ctrl+C detaches from the build output but does NOT cancel the build or deploy -- it continues server-side and creates a new release when complete. Status can be checked afterward with 'heroku releases' (release history), 'heroku ps' (running processes), and 'heroku releases:output' (build output for a specific release).",
+ "evidence": "Heroku docs: 'After you initiate a Heroku deploy with git push, you can detach from the resulting build process by pressing Ctrl + C. Detaching doesn't cancel the build or the deploy. The build continues in the background and creates a new release as soon as it completes.'",
+ "status": "live",
+ "source_urls": ["https://devcenter.heroku.com/articles/git"],
+ "sub_theme": "Ctrl+C detach pattern",
+ "merged_from": 3
+ },
+ {
+ "name": "Homebrew Services (brew services)",
+ "what": "CLI that manages OS-native services (launchd on macOS, systemd on Linux). Commands: `brew services start/stop/restart/run/list/info/kill/cleanup`. `start` both launches and registers for auto-start at login/boot. `run` launches without auto-start registration. Automatically generates plist (macOS) or systemd unit files. Service files live in ~/Library/LaunchAgents or ~/.config/systemd/user (user-level) or system directories (root). Monitoring via `brew services list --json` and OS-level tools (journalctl, launchctl). No built-in cron/scheduling syntax—it delegates to the formula's service definition. Mainly manages long-running daemons, not one-shot scheduled tasks.",
+ "evidence": "Plist includes KeepAlive and RunAtLoad. Files at ~/Library/LaunchAgents/homebrew.mxcl..plist. Linux uses systemd user units. | brew services start ; brew services list --json; brew services run (no auto-start). macOS: ~/Library/LaunchAgents. Linux: ~/.config/systemd/user. --file for custom service files. brew services cleanup removes unused definitions.",
+ "status": "live",
+ "source_urls": [
+ "https://www.dorokhovich.com/blog/homebrew-services",
+ "https://docs.brew.sh/Manpage#services-subcommand"
+ ],
+ "sub_theme": "OS service integration",
+ "merged_from": 3
+ },
+ {
+ "name": "Homebrew services — thin wrapper generating launchd plists and systemd units",
+ "what": "brew services start generates a .plist XML file and symlinks it to ~/Library/LaunchAgents (user-level) or /Library/LaunchDaemons (root/boot-level). The plist contains: Label, ProgramArguments, RunAtLoad, KeepAlive, WorkingDirectory, StandardOutPath, StandardErrorPath. On Linux, brew services generates systemd user units instead. The formula's #startup_plist method defines the service configuration. It's a thin mapping layer — all actual supervision is delegated to launchd/systemd.",
+ "evidence": "brew services creates .plist symlinks to ~/Library/LaunchAgents or /Library/LaunchDaemons. On Linux, uses systemd user units. Formula implements #startup_plist for config.",
+ "status": "live",
+ "source_url": "https://www.dorokhovich.com/blog/homebrew-services",
+ "sub_theme": "homebrew_services"
+ },
+ {
+ "name": "Inngest — serverless event-driven workflow alternative",
+ "what": "Inngest takes a serverless, event-driven approach vs Temporal's dedicated-server model. Uses native language primitives (no runtime proxying), making debugging simpler. The Inngest platform manages event log, state storage, and scheduling. Better for cloud-deployed workflows. Less relevant for a local CLI daemon, but the event-driven step function model (each step is independently retryable) is a useful design pattern for scheduled collection tasks.",
+ "evidence": "Inngest uses native language primitives for direct execution. Platform handles event log, state storage, scheduling. Serverless model vs Temporal's worker model.",
+ "status": "live",
+ "source_url": "https://www.inngest.com/compare-to-temporal",
+ "sub_theme": "workflow_orchestration"
+ },
+ {
+ "name": "IPC mechanism comparison — Unix sockets best for same-machine Node.js",
+ "what": "Benchmarks show Unix sockets and named pipes have virtually identical performance, while TCP is 20-40% slower. The bottleneck is serialization/deserialization, not the transport. For small messages (<100 bytes), named pipes edge out; for larger messages (10KB+), sockets are faster. The node-ipc library provides a unified API that auto-converts Unix socket paths to Windows named pipes. For a Node.js CLI daemon, Unix domain sockets are the recommended IPC mechanism: fast, simple, no port conflicts, and the socket file doubles as a liveness indicator.",
+ "evidence": "Practically no performance difference between native pipe and unix sockets; TCP is 20-40% slower. node-ipc auto-converts Unix socket paths to Windows named pipes. Bottleneck is serialization, not transport.",
+ "status": "live",
+ "source_url": "https://60devs.com/performance-of-inter-process-communications-in-nodejs.html",
+ "sub_theme": "ipc_mechanisms"
+ },
+ {
+ "name": "Jobber (cron alternative with error handling)",
+ "what": "Go-based cron alternative with built-in job execution history, exponential backoff on failure, and configurable failure notifications. Jobs defined in a YAML-like jobfile. Three error handling modes: Stop (disable job after failure), Backoff (exponential delay before retry), Continue (keep running on schedule). Users can configure notifications on every failure or only when a job gets disabled. CLI for listing job status and history. Currently unmaintained (seeking new maintainer). Last release v1.4.4 in June 2020.",
+ "evidence": "Error handling: 'after an initial failure of a job, Jobber can schedule future runs using an exponential backoff algorithm.' Three onError modes: Stop, Backoff, Continue. Notification on failure or on disable. 444 commits, 1.4k stars.",
+ "status": "unmaintained",
+ "source_urls": ["https://github.com/dshearer/jobber"],
+ "sub_theme": "cron alternative",
+ "merged_from": 2
+ },
+ {
+ "name": "Key design decision — embedded daemon vs. external process manager",
+ "what": "Two viable approaches: (A) Embedded daemon: CLI itself can fork a background process (like Turborepo). Simpler for users (no PM2 dependency), but you own all lifecycle management, crash recovery, and log rotation. (B) External PM2: Users install PM2 and run 'pm2 start vana -- daemon'. PM2 handles crash recovery, log rotation, startup scripts. More dependencies, but battle-tested. Recommendation: Start with embedded daemon (approach A) for v1 — it's simpler for users and avoids the PM2 dependency. Add 'vana daemon install' to generate OS service files (approach inspired by Homebrew services) for boot persistence.",
+ "evidence": "Turborepo and Docker both use embedded daemons. PM2 is itself an embedded daemon that manages other processes. The trend in modern CLI tools is toward self-contained daemon management rather than requiring external process managers.",
+ "status": "recommendation",
+ "source_url": "",
+ "sub_theme": "design_decision"
+ },
+ {
+ "name": "kubectl watch pattern",
+ "what": "kubectl uses --watch/-w flag for long-running observation of resource changes. This is NOT a background process -- it's a foreground streaming connection using Kubernetes watch API (HTTP long-poll with chunked transfer). 'kubectl get pods -w' streams state changes in real-time. The CLI maintains a persistent connection to the API server. No PID management needed since it runs in the foreground. For background monitoring, users rely on external tools (Kubernetes controllers, operators) that run as pods themselves.",
+ "evidence": "kubectl docs: '--watch' flag starts watching updates. 'kubectl get events --watch --field-selector involvedObject.name=myapp-pod' for targeted watching. Kubernetes controllers handle actual background supervision.",
+ "status": "live",
+ "source_urls": ["https://kubernetes.io/docs/reference/kubectl/"],
+ "sub_theme": "foreground_streaming",
+ "merged_from": 2
+ },
+ {
+ "name": "Kubernetes CronJobs (kubectl)",
+ "what": "First-class Kubernetes resource for recurring container execution using standard cron syntax. Created via `kubectl create cronjob <name> --image=<image> --schedule='...'` or declarative YAML. Rich configuration surface: concurrencyPolicy (Allow/Forbid/Replace), startingDeadlineSeconds, successfulJobsHistoryLimit, failedJobsHistoryLimit, backoffLimit, timeZone, suspend. State lives in etcd (cluster state). Monitoring via `kubectl get cronjobs`, `kubectl describe cronjob`, `kubectl logs`. Suspend/resume via `kubectl patch cronjob <name> -p '{\"spec\":{\"suspend\":true}}'`. The most complete scheduling model of any CLI tool surveyed—covers concurrency, deadlines, history retention, timezone, and suspension.",
+ "evidence": "kubectl create cronjob hello --image=busybox:1.28 --schedule='* * * * *'. YAML spec fields: schedule, concurrencyPolicy (Allow|Forbid|Replace), startingDeadlineSeconds, successfulJobsHistoryLimit (default 3), failedJobsHistoryLimit (default 1), suspend, timeZone. Macros: @yearly, @monthly, @weekly, @daily, @hourly.",
+ "status": "live",
+ "source_urls": [
+ "https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/"
+ ],
+ "sub_theme": "infrastructure scheduler",
+ "merged_from": 2
+ },
+ {
+ "name": "Kubernetes: kubectl apply returns immediately (async default), rollout status blocks",
+ "what": "kubectl apply is async by default — it submits the manifest and returns immediately without waiting for pods to be ready. To block, you compose a second command: kubectl rollout status deployment/NAME, which watches until the rollout completes. kubectl wait provides generic condition-based waiting. This is the opposite pattern from most deploy CLIs.",
+ "evidence": "Kubernetes docs: 'kubectl apply doesn't have a --wait option, so the success of a deploy doesn't actually indicate the deploy succeeded, as the rollout happens asynchronously.' kubectl rollout status 'watches the status of the latest rollout until it's done.'",
+ "status": "live",
+ "source_urls": [
+ "https://kubernetes.io/docs/reference/kubectl/generated/kubectl_rollout/kubectl_rollout_status/"
+ ],
+ "sub_theme": "async-default-opt-in-wait",
+ "merged_from": 2
+ },
+ {
+ "name": "launchd plist integration pattern",
+ "what": "macOS CLIs register as LaunchAgents (user-level) or LaunchDaemons (system-level). Plist XML defines: Label (unique ID), ProgramArguments (command), RunAtLoad (start on login/boot), KeepAlive (auto-restart on crash), StandardOutPath/StandardErrorPath (log files), WorkingDirectory. 'launchctl load/unload <plist>' manages lifecycle. Files at ~/Library/LaunchAgents/ (user) or /Library/LaunchDaemons/ (system). No PID file needed -- launchd tracks child PIDs internally.",
+ "evidence": "Homebrew services generates plists with KeepAlive=true and RunAtLoad=true. PostgreSQL example shows full plist structure with ProgramArguments, log paths, and working directory.",
+ "status": "live",
+ "source_urls": ["https://www.dorokhovich.com/blog/homebrew-services"],
+ "sub_theme": "os_service_integration",
+ "merged_from": 2
+ },
+ {
+ "name": "MX Platform — 13+ member connection statuses",
+ "what": "MX uses a granular connection_status enum for members (user-institution connections): CONNECTED (healthy), CHALLENGED (MFA required — security question or access code), DENIED (invalid credentials), DISCONNECTED (institution not updating data), PREVENTED (3 consecutive failed attempts, locked out), EXPIRED, LOCKED, IMPEDED, IMPAIRED, REJECTED, IMPORTED, DISABLED, DISCONTINUED, CLOSED. Terminal states requiring new aggregation and possible user input: PREVENTED, DENIED, IMPEDED, IMPAIRED, REJECTED, EXPIRED, LOCKED, IMPORTED, DISABLED, DISCONTINUED, CLOSED. CHALLENGED is an interactive state requiring MFA response.",
+ "evidence": "MX Academy 'Member Connection Statuses Overview': lists all statuses. CHALLENGED requires user to answer MFA. PREVENTED occurs after 3 consecutive failures.",
+ "status": "live",
+ "source_url": "https://academy.mx.com/hc/en-us/articles/4708854368525-Member-Connection-Statuses-Overview",
+ "sub_theme": "connection_health_state_model"
+ },
+ {
+ "name": "node-notifier (cross-platform OS notifications from Node.js)",
+ "what": "Node.js library that sends native desktop notifications on macOS (Notification Center), Windows (Toaster/Balloons), and Linux (notify-send). Supports title, message, icon, sound, click-to-open URL. Has a separate CLI package (node-notifier-cli). 7M+ weekly npm downloads. This is the go-to for Node.js CLIs that want desktop notifications.",
+ "evidence": "npm page shows massive adoption. CLI variant allows `notify -t 'Title' -m 'Message' -s`. Used by webpack, Jest, and many build tools for completion notifications.",
+ "status": "live",
+ "source_urls": ["https://www.npmjs.com/package/node-notifier"],
+ "sub_theme": "os-notification-library",
+ "merged_from": 2
+ },
+ {
+ "name": "Node.js graceful shutdown patterns",
+ "what": "Best practice for Node.js daemons: listen for SIGINT and SIGTERM (cross-platform safe). On signal: (1) stop accepting new connections, (2) drain existing connections with timeout, (3) close database pools, (4) remove PID file, (5) call process.exit(). Use a forced-exit timeout (e.g., setTimeout(() => process.exit(1), 10000)) as safety net. PID file cleanup via process.on('exit') or atexit-style handlers. SIGUSR1/SIGUSR2 not safe on Windows. For containers (PID 1 trap): Node.js doesn't forward signals to children when running as PID 1 -- use tini or dumb-init as entrypoint.",
+ "evidence": "Medium article on PID 1 trap: 'Node.js doesn't forward signals to children when running as PID 1'. Node.js docs recommend only SIGINT and SIGTERM for cross-platform compatibility.",
+ "status": "live",
+ "source_urls": [
+ "https://medium.com/@etienne.rossignon/stop-killing-your-node-js-containers-the-pid-1-trap-5c18abffd72c"
+ ],
+ "sub_theme": "graceful_shutdown",
+ "merged_from": 2
+ },
+ {
+ "name": "nohup and terminal multiplexer patterns",
+ "what": "Lightweight background process approaches without daemon infrastructure: (1) nohup -- detaches process from terminal, redirects output to nohup.out, survives terminal close. No monitoring, no auto-restart, no PID management. (2) tmux/screen -- session persistence, user can reattach to see output. Some CLIs document 'run in tmux' as official guidance for long-running tasks. (3) disown -- bash builtin to remove job from shell's job table. These are user-managed approaches with no supervision; commonly used for ad-hoc tasks rather than production services.",
+ "evidence": "Multiple Node.js deployment guides suggest nohup for simple cases, pm2/systemd for production. tmux/screen commonly recommended in CLI docs for long-running tasks that need output inspection.",
+ "status": "live",
+ "source_urls": [
+ "https://pixeljets.com/blog/using-supervisorctl-for-node-processes-common-gotchas/"
+ ],
+ "sub_theme": "lightweight_background",
+ "merged_from": 2
+ },
+ {
+ "name": "Notification channels for auth failure — desktop notifications + terminal bell + optional webhook",
+ "what": "When the daemon detects an auth failure mid-collection, it should: (1) Pause the scheduler (don't retry with bad creds). (2) Write state to disk (which connector failed, when, why). (3) Send desktop notification via node-notifier (cross-platform: macOS Notification Center, Linux libnotify, Windows toast). (4) On next CLI invocation, show a prominent banner about the auth failure. (5) Optionally, call a user-configured webhook URL for remote notification (Slack, email, etc.). The 'laptop may be off' constraint means notification must be stored durably and shown on next interaction, not just fire-and-forget.",
+ "evidence": "PM2 Plus sends email/Slack notifications on process events. Desktop notifications via node-notifier. Durable state means the daemon writes failure state to disk and the CLI reads it on next run.",
+ "status": "recommendation",
+ "source_url": "",
+ "sub_theme": "notification_design"
+ },
+ {
+ "name": "notify-send (Linux)",
+ "what": "Linux-native CLI for desktop notifications via D-Bus/libnotify. Pre-installed on most desktop Linux distros. Usage: `notify-send 'Title' 'Message'`. The canonical Linux approach — gh run watch docs suggest chaining with it.",
+ "evidence": "Used in gh run watch examples. opensource.com article on Linux desktop notifications from terminal. undistract-me uses it internally.",
+ "status": "live",
+ "source_urls": [
+ "https://opensource.com/article/22/1/linux-desktop-notifications"
+ ],
+ "sub_theme": "os-notification-library",
+ "merged_from": 2
+ },
+ {
+ "name": "npm/pnpm install: always blocks, no async mode",
+ "what": "npm install and pnpm install always block the terminal. There is no --background or --async flag. This makes sense because subsequent commands (build, test, run) depend on install completing. The user sees a progress bar/spinner and package resolution output.",
+ "evidence": "No --background, --async, or --no-wait flags exist in npm or pnpm CLI documentation. Install is a prerequisite step that must complete before anything else can run.",
+ "status": "live",
+ "source_urls": ["https://docs.npmjs.com/cli/commands/npm-install"],
+ "sub_theme": "foreground-only-no-async",
+ "merged_from": 2
+ },
+ {
+ "name": "ntfy.sh (HTTP push notifications)",
+ "what": "Self-hostable push notification service. Send notifications via simple HTTP PUT/POST: `curl -d 'Build done' ntfy.sh/mytopic`. Phone app subscribes to topics. CLI supports `ntfy publish --wait-cmd <topic> <command>` to auto-notify on completion, and `--wait-pid` to watch an already-running process. Free tier available. Great for remote/headless servers.",
+ "evidence": "ntfy.sh docs show CLI integration patterns. Integrations page lists Ansible, cron, systemd, and many CI tools. Self-hosting is well-documented.",
+ "status": "live",
+ "source_urls": ["https://ntfy.sh/"],
+ "sub_theme": "push-notification-service",
+ "merged_from": 2
+ },
+ {
+ "name": "OSC 9 / OSC 777 escape sequences",
+ "what": "Terminal escape sequences that trigger native OS desktop notifications without any external dependency. OSC 9 (iTerm2/Windows Terminal format): `\\033]9;Message\\007`. OSC 777 (rxvt-unicode/VSCode format): `\\033]777;notify;Title;Message\\007`. Works over SSH when terminal forwards escape sequences. VSCode integrated terminal forwards these from remote hosts.",
+ "evidence": "Claude Code hooks use both OSC 9 and OSC 777 for desktop notifications. Ghostty, iTerm2, Windows Terminal, VSCode integrated terminal, and foot terminal all support one or both. A VSCode extension (terminal-osc-notifier) exists specifically for this.",
+ "status": "live",
+ "source_urls": [
+ "https://github.com/ghostty-org/ghostty/discussions/3555"
+ ],
+ "sub_theme": "zero-dependency",
+ "merged_from": 2
+ },
+ {
+ "name": "Password managers — expiry tracking is nascent",
+ "what": "1Password added item expiry dates (Q1 2025) viewable in Watchtower, but does not send push/email notifications when passwords approach expiry — users must check Watchtower manually. Bitwarden has had long-standing community requests for password expiration dates and rotation reminders (GitHub issue #227, 2017) but the feature remains limited. Neither product offers automated credential rotation or proactive expiry alerts. This is relevant because it shows that even credential management tools lack mature expiry notification systems.",
+ "evidence": "1Password blog 'Q1 2025 usability updates': expiry dates visible in Watchtower only. Bitwarden community forums: feature requests for expiration dates span years without full implementation.",
+ "status": "live",
+ "source_url": "https://1password.com/blog/1password-q1-2025-usability-updates",
+ "sub_theme": "credential_rotation_alerts"
+ },
+ {
+ "name": "Pattern: CLIs that offer both sync and async for the same operation",
+ "what": "Multiple CLIs implement dual-mode patterns for the same operation: (1) Vercel: vercel deploy blocks by default, --no-wait returns immediately. (2) Railway: railway up attaches by default, --detach returns immediately. (3) Fly.io: fly deploy monitors by default, --detach returns immediately (though may be buggy). (4) Docker: docker run attaches by default, -d detaches. (5) AWS CloudFormation: create-stack returns immediately, 'wait stack-create-complete' blocks. The dominant pattern is block-by-default with opt-out, except AWS which is async-by-default with opt-in waiting.",
+ "evidence": "Compiled from official documentation for each CLI. Vercel: --no-wait flag. Railway: --detach flag. Fly.io: --detach flag. Docker: -d flag. AWS: separate wait subcommand.",
+ "status": "live",
+ "source_urls": ["https://vercel.com/docs/cli/deploy"],
+ "sub_theme": "cross-CLI pattern",
+ "merged_from": 2
+ },
+ {
+ "name": "Pattern: What happens when the terminal is closed mid-operation",
+ "what": "Server-side operations continue regardless of terminal state: Heroku builds continue after Ctrl+C, GitHub Actions workflows run independently of gh run watch, AWS CloudFormation stacks deploy independently of the CLI, and Fly.io deployments continue if detached. Client-side operations stop: Stripe listen and logs tail stop when the terminal closes, because they are local processes maintaining connections. The key distinction is whether the operation runs server-side (survives terminal close) or client-side (dies with the terminal).",
+ "evidence": "Heroku docs explicitly state: 'Detaching doesn't cancel the build or the deploy.' GitHub Actions workflows are server-side by nature. Stripe listen requires an active terminal connection.",
+ "status": "live",
+ "source_urls": ["https://devcenter.heroku.com/articles/git"],
+ "sub_theme": "terminal close behavior",
+ "merged_from": 2
+ },
+ {
+ "name": "PID file stale detection patterns",
+ "what": "Three main approaches: (1) kill(pid, 0) / kill -0 <pid> -- sends signal 0, which checks process existence without actually delivering a signal. Returns success if the process exists, ESRCH error if not. Caveat: non-root users may get EPERM for others' processes, which still means the process exists. (2) pgrep -F <pidfile> -- reads the PID from file and checks if the process exists, returning exit status 1 if stale. (3) File locking (flock/fcntl) -- the process holds an advisory lock on the PID file; if the lock is obtainable, the previous process is dead. File locking is the most robust approach as it handles PID recycling. All approaches should be combined with atexit/signal-handler cleanup to remove the PID file on normal exit.",
+ "evidence": "Python pid library uses flock. Node.js daemon-pid provides 'start-time verification to ensure the recorded process-id was not recycled by the OS'. Common pattern: check kill(pid, 0), if ESRCH then stale, if EPERM then process exists but owned by another user.",
+ "status": "live",
+ "source_urls": ["https://github.com/trbs/pid"],
+ "sub_theme": "pid_management",
+ "merged_from": 2
+ },
+ {
+ "name": "PID file stale detection — check /proc/$pid/cmdline, not just process existence",
+ "what": "Simply checking if a PID exists is insufficient — the OS may have recycled the PID to a different process. Robust stale detection: (1) read PID from lock file, (2) check if /proc/$PID/cmdline matches expected command, (3) if not matching or process dead, overwrite lock file with new PID. The pidlock library uses an atomic directory-rename strategy: create temp dir named after PID, atomically rename to lock path. On ungraceful kill (SIGKILL), the lock file remains but stale detection handles recovery. Never delete PID files in signal handlers — use atexit handlers instead, but note SIGKILL bypasses all handlers.",
+ "evidence": "pidlock creates directory with PID name, atomically renames to lock file. Validates lock authenticity by checking /proc/$old_pid/cmdline. SIGKILL bypasses all signal handlers and atexit, so stale detection on next startup is essential.",
+ "status": "live",
+ "source_url": "https://www.guido-flohr.net/never-delete-your-pid-file/",
+ "sub_theme": "pid_management"
+ },
+ {
+ "name": "Plaid consent expiry timelines vary by region and institution",
+ "what": "UK: 90-day consent cycle, 7-day advance notice, re-consent handled by TPP (Plaid) without bank redirect since FCA 2022 changes. EU: Extended from 90 to 180 days (July 2023 EBA change), but re-consent still requires bank authentication via TPP redirect. US: Some institutions (e.g. Bank of America) enforce 12-month consent expiry, but most US Items do not expire. OAuth-based connections are more likely to have consent expiry than credential-based ones.",
+ "evidence": "Plaid blog 'A more seamless way to reauthenticate in the UK': 7-day advance notification, TPP-managed re-consent. Plaid blog 'Now you can enjoy 180 days': EU 90->180 day extension, still requires bank auth. Plaid docs: consent_expiration_time is null for most US institutions.",
+ "status": "live",
+ "source_url": "https://plaid.com/blog/90-day-are-you-ready/",
+ "sub_theme": "consent_expiry_timelines"
+ },
+ {
+ "name": "Plaid Item health state model",
+ "what": "Plaid tracks Item health via a status object on /item/get containing last_successful_update and last_failed_update timestamps per product (transactions, investments). The update_type field indicates 'background' (automatic) vs 'user_present_required'. There is a consent_expiration_time field (ISO 8601) that is non-null for institutions enforcing consent expiry (common in EU, rare in US). Error states are surfaced via error objects with error_type, error_code, and error_code_reason. Key error codes: ITEM_LOGIN_REQUIRED (password changed, MFA expired, OAuth consent invalid), INVALID_CREDENTIALS, INVALID_MFA, ITEM_LOCKED (3-5 failed attempts), ACCESS_NOT_GRANTED, NO_ACCOUNTS, INSUFFICIENT_CREDENTIALS.",
+ "evidence": "Plaid /item/get API docs: status.transactions.last_successful_update, status.investments.last_failed_update, consent_expiration_time fields. Error codes documented at plaid.com/docs/errors/item/.",
+ "status": "live",
+ "source_url": "https://plaid.com/docs/api/items/",
+ "sub_theme": "connection_health_state_model"
+ },
+ {
+ "name": "Plaid Link update mode — abbreviated re-auth UX",
+ "what": "When an Item needs re-authentication, Plaid presents an abbreviated flow requesting only the minimum input needed to repair it. Example: if an OTP token expired, the user provides a new OTP without full re-login. Update mode can be triggered by access_token (single Item) or user_token (multi-Item). Successfully completing update mode resets consent_expiration_time as if the Item were newly created. Update mode also supports adding/removing accounts, granting new product permissions, and consent renewal.",
+ "evidence": "Plaid Link update mode docs: 'For most institutions, Plaid will present an abbreviated re-authentication flow requesting only the minimum user input required to repair the Item.'",
+ "status": "live",
+ "source_url": "https://plaid.com/docs/link/update-mode/",
+ "sub_theme": "reauth_ux_patterns"
+ },
+ {
+ "name": "Plaid webhook-driven expiry notification system",
+ "what": "Plaid sends proactive webhooks for connection health changes: (1) PENDING_DISCONNECT (US/CA) or PENDING_EXPIRATION (UK/EU) sent 7 days before consent expires, (2) ITEM:ERROR webhook when ITEM_LOGIN_REQUIRED occurs, (3) LOGIN_REPAIRED when an Item exits error state (even if repaired via another app), (4) NEW_ACCOUNTS_AVAILABLE when new shareable accounts detected, (5) USER_PERMISSION_REVOKED when user removes access. Developers must handle duplicate and out-of-order webhooks with idempotency.",
+ "evidence": "Plaid webhooks docs and Link update mode docs describe 7-day advance notice via PENDING_DISCONNECT/PENDING_EXPIRATION.",
+ "status": "live",
+ "source_url": "https://plaid.com/docs/api/webhooks/",
+ "sub_theme": "advance_warning_notifications"
+ },
+ {
+ "name": "PM2 (cron_restart + startup)",
+ "what": "Node.js process manager with daemon mode, cron-based restart scheduling, and OS init system integration. Scheduling via `pm2 start app.js --cron-restart='0 0 * * *'` or ecosystem.config.js `cron_restart` field. PM2 runs as a persistent daemon—not pure cron. `pm2 startup` generates init scripts for systemd/launchd/upstart/openrc, and `pm2 save` persists the process list for resurrection on reboot. Monitoring via `pm2 monit`, `pm2 logs`, `pm2 list`. Failure handling via exponential backoff restart delay, max_memory_restart, and stop_exit_codes. Key distinction: PM2's cron_restart restarts an already-running daemon on a schedule, rather than running a one-shot task.",
+ "evidence": "pm2 start app.js --cron-restart='0 0 * * *'. ecosystem.config.js: { cron_restart: '0 0 * * *' }. pm2 startup (detects systemd/launchd/upstart). pm2 save (persists process list). --exp-backoff-restart-delay=100 for exponential backoff.",
+ "status": "live",
+ "source_urls": [
+ "https://pm2.keymetrics.io/docs/usage/restart-strategies/"
+ ],
+ "sub_theme": "daemon process manager",
+ "merged_from": 2
+ },
+ {
+ "name": "PM2 (Node.js process manager)",
+ "what": "Two-process architecture: a persistent 'God Daemon' and a thin CLI client. Daemon communicates via two Unix sockets: ~/.pm2/rpc.sock (commands) and ~/.pm2/pub.sock (events/logs). PID files stored in ~/.pm2/pids/<name>-<pm_id>.pid. On 'pm2 start', the CLI connects to the daemon (spawning it if needed), and the daemon forks the target process. Graceful shutdown sends SIGINT first, then SIGKILL after a configurable kill_timeout (default 1.6s). Auto-restart on crash with exponential backoff option. 'pm2 startup' generates platform-specific init scripts (systemd/launchd/upstart). 'pm2 save' + 'pm2 resurrect' persists the process list across reboots via the dump.pm2 file.",
+ "evidence": "PM2 docs: 'PM2 is a daemon process manager', PID files at ~/.pm2/pids/, socket files at ~/.pm2/rpc.sock and ~/.pm2/pub.sock, 'pm2 startup' generates systemd/launchd/upstart scripts, exponential backoff via --exp-backoff-restart-delay, kill_timeout configurable.",
+ "status": "live",
+ "source_urls": [
+ "https://pm2.keymetrics.io/docs/usage/pm2-doc-single-page/"
+ ],
+ "sub_theme": "node_daemon_manager",
+ "merged_from": 2
+ },
+ {
+ "name": "PM2 crash recovery — exponential backoff with min_uptime reset",
+ "what": "PM2's handleExit method tracks prev_restart_delay and doubles it on each crash (e.g., 100ms -> 200ms -> 400ms). The backoff resets when a process stays alive longer than min_uptime. A Worker loop runs every 30 seconds to monitor health and reset delays for stable processes. max_restarts caps total attempts to prevent infinite loops. max_memory_restart triggers reload if a process exceeds a memory threshold.",
+ "evidence": "handleExit tracks prev_restart_delay for exponential backoff. Worker system runs every 30s monitoring health. Configurable via ecosystem.config.js: min_uptime, max_restarts, max_memory_restart.",
+ "status": "live",
+ "source_url": "https://deepwiki.com/Unitech/pm2/2-core-components",
+ "sub_theme": "pm2_crash_recovery"
+ },
+ {
+ "name": "PM2 ecosystem.config.js — declarative multi-process config",
+ "what": "ecosystem.config.js exports { apps: [...], deploy: {...} }. Each app entry specifies: name, script, exec_mode (fork|cluster), instances, max_memory_restart, cron_restart (cron expression for scheduled restarts), error_file, out_file, log_date_format, env variables, and watch options. The God daemon parses this via Configuration.getSync() to populate pm2_env for each process. Supports multiple apps in a single file with different configurations.",
+ "evidence": "module.exports = { apps: [{ name, script, exec_mode, instances, max_memory_restart, cron_restart, ... }], deploy: { production: { user, host, ref, repo, path } } }",
+ "status": "live",
+ "source_url": "https://deepwiki.com/Unitech/pm2/2-core-components",
+ "sub_theme": "pm2_config"
+ },
+ {
+ "name": "PM2 God Daemon — dual Unix socket architecture",
+ "what": "PM2 runs a persistent background process (the 'God Daemon') that communicates via two separate Unix sockets in ~/.pm2/: rpc.sock for synchronous request-reply commands (start, stop, restart, list) and pub.sock for event broadcasting (log output, process state changes). The CLI spawns the daemon on first use if not running. The axon library handles socket transport; axon-rpc layers an RPC protocol on top. God exposes methods like prepare(), getMonitorData(), startProcessId(), stopProcessId() over RPC. The daemon maintains a clusters_db hash map as its in-memory process registry.",
+ "evidence": "PM2 source: Daemon class initializes PUB socket at cst.DAEMON_PUB_PORT and RPC socket at cst.DAEMON_RPC_PORT. Client class auto-spawns daemon if connection fails. Socket files at $HOME/.pm2/pub.sock and $HOME/.pm2/rpc.sock.",
+ "status": "live",
+ "source_url": "https://deepwiki.com/Unitech/pm2/2-core-components",
+ "sub_theme": "pm2_architecture"
+ },
+ {
+ "name": "PM2 graceful shutdown — signal cascade and state dump",
+ "what": "Daemon's gracefullExit() method: (1) dumps process list to disk, (2) sends SIGINT to each managed process with a configurable kill_timeout, (3) falls back to SIGKILL after timeout, (4) cleans up socket files (rpc.sock, pub.sock), (5) removes PID file. Signal handlers: SIGTERM/SIGINT/SIGQUIT trigger graceful exit; SIGUSR2 triggers log rotation/reload.",
+ "evidence": "Daemon signal handlers: SIGTERM, SIGINT, SIGQUIT -> gracefullExit(). SIGUSR2 -> log reload. gracefullExit dumps process lists, stops managed processes, cleans socket files, removes PID files.",
+ "status": "live",
+ "source_url": "https://deepwiki.com/Unitech/pm2/2-core-components",
+ "sub_theme": "pm2_shutdown"
+ },
+ {
+ "name": "PM2 startup — cross-platform init script generation",
+ "what": "pm2 startup auto-detects the host OS init system (systemd, launchd, upstart, init.d, rcd, systemv) and generates appropriate service files. On systemd, it creates a unit file targeting multi-user.target. pm2 save serializes current process list to ~/.pm2/dump.pm2. On boot, the init script starts PM2 which reads dump.pm2 and resurrects all saved processes with their exact configurations. pm2-windows-service extends this to Windows.",
+ "evidence": "pm2 startup [systemd|launchd|upstart|systemv] generates platform-specific init scripts. pm2 save writes dump.pm2. pm2 resurrect restores from dump.",
+ "status": "live",
+ "source_url": "https://pm2.keymetrics.io/docs/usage/pm2-doc-single-page/",
+ "sub_theme": "pm2_startup"
+ },
+ {
+ "name": "PM2 web dashboard — RPC socket + Socket.io bridge",
+ "what": "PM2 Plus is the official SaaS dashboard for remote monitoring with email/Slack notifications, custom metrics, and exception tracking (free tier: 4 servers). For self-hosted, pm2-gui connects directly to PM2's RPC socket, reads process metrics, and serves a web UI over Socket.io for real-time CPU/memory monitoring and process control (restart/stop/delete). The dashboard bridges PM2's Unix socket IPC to WebSocket for browser consumption.",
+ "evidence": "pm2-gui communicates with PM2 through RPC socket directly, uses Socket.io between client and server. PM2 Plus provides issues tracking, deployment reporting, real-time logs, email/Slack notifications.",
+ "status": "live",
+ "source_url": "https://pm2.keymetrics.io/docs/usage/monitoring/",
+ "sub_theme": "pm2_dashboard"
+ },
+ {
+ "name": "Railway CLI: railway logs for monitoring after detach",
+ "what": "railway logs streams logs by default via WebSocket, or fetches historical logs when using --lines, --since, or --until flags. The -d flag shows deployment logs, -b shows build logs. Can target a specific deployment ID.",
+ "evidence": "Railway docs: 'Stream mode (the default) connects via WebSocket and streams logs in real-time.' 'Use railway logs with the deployment ID to stream the failed build/deploy logs.'",
+ "status": "live",
+ "source_urls": ["https://docs.railway.com/cli/logs"],
+ "sub_theme": "log streaming",
+ "merged_from": 2
+ },
+ {
+ "name": "Railway CLI: railway up has three modes: attached (default), --detach, --ci",
+ "what": "railway up uploads and deploys, defaulting to attached mode that streams build and deployment logs in real-time. --detach (-d) returns immediately after uploading; the deployment continues in the background and can be monitored via dashboard or 'railway logs'. --ci (-c) streams only build logs and exits when the build completes, designed for CI/CD pipelines.",
+ "evidence": "Railway docs: 'By default, railway up operates in attached mode and streams build and deployment logs to your terminal.' '--detach: return immediately after uploading.' '--ci: stream only build logs and exit when the build completes.'",
+ "status": "live",
+ "source_urls": ["https://docs.railway.com/cli/deploying"],
+ "sub_theme": "three-mode pattern",
+ "merged_from": 2
+ },
+ {
+ "name": "Railway: deploy blocks with log streaming, -d/--detach to skip",
+ "what": "railway up blocks by default, streaming build logs to the terminal. The -d/--detach flag 'returns immediately after uploading, with the deployment continuing in the background.' Also offers -c/--ci which streams only build logs and exits when build (not deploy) completes — a middle ground between full sync and full detach.",
+ "evidence": "Railway docs: '--detach prevents attaching to the log stream.' The -c/--ci flag 'streams only build logs and exits when the build completes' — useful for CI/CD.",
+ "status": "live",
+ "source_urls": ["https://docs.railway.com/cli/deploying"],
+ "sub_theme": "foreground-default-detach-opt-in",
+ "merged_from": 2
+ },
+ {
+ "name": "Recommended architecture for vana daemon — forked process + Unix socket + JSON-RPC + OS service registration",
+ "what": "Synthesis of research: (1) Daemon spawned on-demand by CLI via child_process.fork() (Turborepo pattern). (2) IPC via Unix domain socket with JSON-RPC protocol (VS Code pattern) — socket file also serves as liveness check. (3) PID file with /proc cmdline validation for stale detection (pidlock pattern). (4) Scheduler using Croner or Bree for cron-based collection triggers. (5) State persisted to JSON/SQLite file, restored on startup (PM2 dump.pm2 pattern). (6) Graceful degradation: CLI works without daemon, daemon crash doesn't break manual CLI usage (Turborepo pattern). (7) OS service registration via generated launchd plist / systemd user unit for boot persistence (Homebrew services pattern). (8) Auth failure detection pauses scheduler, sends notification (desktop notification API + optional webhook), resumes after re-auth.",
+ "evidence": "Composite pattern from PM2 (dual socket, state dump, startup scripts), Turborepo (on-demand spawn, graceful degradation, CI detection), Docker (socket activation), VS Code (JSON-RPC, lazy activation, persistent state), Homebrew (thin OS service wrappers).",
+ "status": "recommendation",
+ "source_url": "",
+ "sub_theme": "synthesis"
+ },
+ {
+ "name": "runit",
+ "what": "Lightweight cross-platform init scheme with service supervision. Each service is a directory containing a 'run' script. The runsv process supervises each service, automatically restarting it on exit. Optional './check' script for readiness polling. Optional './log/run' for dedicated log service per process. Can replace PID 1 or run alongside other init systems. Uses a simple directory-based state model: service is 'up' if its run script is running, 'down' if not.",
+ "evidence": "Arch Wiki: 'runit runs on GNU/Linux, *BSD, MacOSX, Solaris'. Gentoo wiki: 'Process supervision is a type of operating system service management in which some master process remains the parent of the service processes.'",
+ "status": "live",
+ "source_urls": ["https://smarden.org/runit/"],
+ "sub_theme": "process_supervisor",
+ "merged_from": 2
+ },
+ {
+ "name": "Salesforce — configurable refresh token expiry policies",
+ "what": "Salesforce connected apps offer 4 refresh token policies: (1) Valid until revoked (default — indefinite), (2) Immediately expire refresh token, (3) Expire if not used for N time (inactivity-based, resets on use), (4) Expire after N time (absolute). Admins can change policies at any time, retroactively affecting existing tokens. Access token lifetime is determined by session timeout value. This represents a 'platform-configurable' model where the service provider (not the aggregator) controls token lifetime.",
+ "evidence": "Salesforce help docs: 'Manage OAuth Access Policies for a Connected App' describes all 4 policies. Default is 'valid until revoked'.",
+ "status": "live",
+ "source_url": "https://help.salesforce.com/s/articleView?id=sf.connected_app_manage_oauth.htm&language=en_US&type=5",
+ "sub_theme": "platform_controlled_expiry"
+ },
+ {
+ "name": "Slack/Discord webhook notifications",
+ "what": "CLIs that POST to Slack/Discord webhooks on completion. cli-notify-slack wraps any command and sends a Slack notification with user, hostname, command, output, and exit status. notify_slack pipes stdout to Slack at configurable intervals. Pattern: set SLACK_TOKEN + SLACK_CHANNEL env vars, then `cli-notify-slack <command>`.",
+ "evidence": "GitHub repos EntilZha/cli-notify-slack and catatsuy/notify_slack. Kubernetes deployment pipelines commonly post to Slack on completion.",
+ "status": "live",
+ "source_urls": ["https://github.com/EntilZha/cli-notify-slack"],
+ "sub_theme": "team-notification",
+ "merged_from": 2
+ },
+ {
+ "name": "Strava OAuth — 6-hour access tokens with rotating refresh tokens",
+ "what": "Strava access tokens expire after 6 hours. Refresh tokens are rotated on every exchange — each refresh response includes a new refresh token, and the old one becomes invalid. Apps must persist the latest refresh token. During the transition window, both old and new access tokens work until the old one naturally expires. On 401 Unauthorized, apps should use the refresh token to obtain a new access token. No push notification for token expiry — apps must proactively check expiry before API calls. If a user's connection breaks (e.g., Garmin-Strava sync), the fix is typically disconnect and reconnect.",
+ "evidence": "Strava developer docs: 'Access tokens expire six hours after they are created'. 'Once a new refresh token code has been returned, the older code will no longer work.'",
+ "status": "live",
+ "source_url": "https://developers.strava.com/docs/authentication/",
+ "sub_theme": "short_lived_token_rotation"
+ },
+ {
+ "name": "Stripe CLI: stripe listen runs as a foreground-only long-lived process",
+ "what": "stripe listen maintains a persistent WebSocket connection to Stripe, forwarding webhook events to a local endpoint. It displays 'Ready! Your webhook signing secret is ... (^C to quit)' and then streams events as they arrive. There is no background/detach mode; the process runs until Ctrl+C. If the terminal is closed, the listener stops.",
+ "evidence": "Stripe docs state: 'Ready! Your webhook signing secret is {{WEBHOOK_SIGNING_SECRET}} (^C to quit)'. The listen command establishes a direct connection with Stripe, delivering webhook events to your computer directly.",
+ "status": "live",
+ "source_urls": ["https://docs.stripe.com/stripe-cli/use-cli"],
+ "sub_theme": "foreground-only long-lived process",
+ "merged_from": 2
+ },
+ {
+ "name": "Stripe CLI: stripe logs tail streams API logs in real-time",
+ "what": "stripe logs tail establishes a direct connection with Stripe and streams test mode API request logs in real-time. It supports filters (--filter-account, --filter-http-method, --filter-request-path, --filter-request-status). It runs in the foreground with no detach mode.",
+ "evidence": "Stripe docs: 'The logs tail command establishes a direct connection with Stripe and enables you to tail your test mode Stripe API request logs in real-time from your terminal.'",
+ "status": "live",
+ "source_urls": ["https://docs.stripe.com/cli/logs/tail"],
+ "sub_theme": "foreground-only long-lived process",
+ "merged_from": 2
+ },
+ {
+ "name": "Stripe Connect OAuth — tokens don't expire but can be revoked",
+ "what": "Stripe Connect Standard account access tokens do not expire. They remain valid until the user explicitly deauthorizes the connected account, which triggers an account.application.deauthorized webhook. Refresh tokens are rolled on every exchange, so the expiry is always 1 year from last use — effectively never expiring if used regularly. This is a fundamentally different model from credential-based connections: once authorized, connections persist indefinitely unless user-revoked. Stripe Apps (different context) use 1-hour access tokens with 1-year refresh tokens.",
+ "evidence": "Stripe Connect OAuth docs: 'the access token does not expire but may be revoked by the user at any time'. Refresh tokens roll on exchange, 1-year expiry from last use.",
+ "status": "live",
+ "source_url": "https://docs.stripe.com/connect/oauth-reference",
+ "sub_theme": "never_expire_model"
+ },
+ {
+ "name": "Supercronic (container cron)",
+ "what": "Crontab-compatible job runner designed for containers. Solves cron-in-Docker problems: preserves environment variables (traditional cron purges them), logs to stdout/stderr (not syslog), handles SIGTERM gracefully, prevents concurrent execution by default. Reads a standard crontab file. Supports second-level precision. CLI flags: -debug, -split-logs, -test (validate without running), -overlapping (allow concurrent), -inotify (auto-reload on file change), -sentry-dsn (error tracking). No daemon management—runs as PID 1 in a container. Key differentiator: drop-in cron replacement that works correctly in containerized environments.",
+ "evidence": "supercronic /path/to/crontab. Flags: -test (validate syntax), -overlapping (allow parallel), -inotify (reload on file change), -sentry-dsn (Sentry integration), -split-logs (stdout vs stderr). 'When a job exceeds its scheduled interval, the system logs a warning but delays the next execution until the current one completes.'",
+ "status": "live",
+ "source_urls": ["https://github.com/aptible/supercronic"],
+ "sub_theme": "container-native cron",
+ "merged_from": 2
+ },
+ {
+ "name": "Supervisor (supervisord)",
+ "what": "Client/server architecture: supervisord daemon manages child processes, supervisorctl CLI sends commands. Expects to be the direct parent of managed processes (children must NOT daemonize themselves). Provides pidproxy tool for processes that fork. Configuration via INI-style [program:x] sections. Supports process groups, event listeners for notifications, and XML-RPC API for programmatic control. Separate activity log (supervisord operations) and child process logs. Designed to start at boot but NOT as PID 1.",
+ "evidence": "Supervisor docs: 'client/server system that allows its users to monitor and control a number of processes on UNIX-like operating systems', supports [program:x] config, event listeners, XML-RPC API.",
+ "status": "live",
+ "source_urls": ["https://supervisord.org/"],
+ "sub_theme": "process_supervisor",
+ "merged_from": 2
+ },
+ {
+ "name": "systemd service integration pattern",
+ "what": "CLIs that install themselves as systemd services write a .service unit file to ~/.config/systemd/user/ (user) or /etc/systemd/system/ (system). Key directives: ExecStart (binary path), Restart=on-failure or always, RestartSec (delay between restarts), PIDFile (for forking daemons), Type=simple|forking|notify. Socket activation via .socket units allows on-demand daemon startup. 'systemctl --user enable <unit>' persists across reboots. Journal logging (journalctl -u <unit>) replaces custom log files. Health checks via systemd-notify for Type=notify services.",
+ "evidence": "Docker's docker.service uses PIDFile directive. Homebrew generates systemd user units on Linux. PM2's 'pm2 startup' generates systemd unit files. Turborepo and other modern CLIs prefer to manage their own daemon rather than relying on systemd.",
+ "status": "live",
+ "source_urls": ["https://docs.docker.com/reference/cli/dockerd/"],
+ "sub_theme": "os_service_integration",
+ "merged_from": 2
+ },
+ {
+ "name": "Temporal.io — durable execution with event-sourced workflow recovery",
+ "what": "Temporal's architecture: a Temporal Cluster tracks all workflow state, and long-lived Worker processes execute Workflows (orchestration logic) and Activities (individual steps like API calls). Every state change is recorded in an append-only Event History, enabling exact-point-of-failure recovery. Workers can be restarted and will resume workflows from their last recorded state. TypeScript SDK available. This is a server-side pattern requiring infrastructure (Temporal Cluster), not suitable for a local CLI daemon, but the event-sourcing pattern for state recovery is highly relevant.",
+ "evidence": "Temporal records every state change in Event History, a complete append-only log. Workers execute Workflows and Activities. TypeScript SDK for Node.js. Requires running Temporal Cluster.",
+ "status": "live",
+ "source_url": "https://docs.temporal.io/workflows",
+ "sub_theme": "workflow_orchestration"
+ },
+ {
+ "name": "Terminal bell (\\a / BEL)",
+ "what": "The simplest, most universal notification: print \\a (ASCII 7) to stdout. Modern terminals map this to OS-native behavior — visual flash, taskbar urgency hint, or native notification depending on config. Zero dependencies. Works over SSH. Warp disables audible bell by default but supports it. iTerm2, Ghostty, Alacritty, Windows Terminal, and foot all support it. Claude Code uses `preferredNotifChannel: terminal_bell` as its primary mechanism.",
+ "evidence": "Claude Code ships terminal_bell as a configurable notification channel. Warp docs show audible bell as a separate setting from desktop notifications. The muxup.com article demonstrates mapping BEL to urgency hints in tiling WMs. Ghostty maps BEL to macOS native notifications when unfocused.",
+ "status": "live",
+ "source_urls": [
+ "https://muxup.com/2023q4/let-the-terminal-bells-ring-out"
+ ],
+ "sub_theme": "zero-dependency",
+ "merged_from": 2
+ },
+ {
+ "name": "Terminal-native command completion notifications (iTerm2, Warp, Ghostty)",
+ "what": "Modern terminals have built-in 'notify when command finishes' features that require shell integration (OSC 133 marks) but no app-side code. iTerm2: Cmd+Alt+A toggles 'Alert on next mark'. Warp: configurable threshold (e.g. 5s), only fires when app unfocused. Ghostty (1.3+): `notify-on-command-finish = unfocused | always | never` with `notify-on-command-finish-after = 5s`. These are the gold standard UX — the terminal handles everything.",
+ "evidence": "Ghostty issue #8991 and discussion #3555 document the feature. Warp docs at docs.warp.dev/terminal/more-features/notifications. iTerm2 shell integration docs.",
+ "status": "live",
+ "source_urls": [
+ "https://docs.warp.dev/terminal/more-features/notifications"
+ ],
+ "sub_theme": "terminal-native",
+ "merged_from": 2
+ },
+ {
+ "name": "terminal-notifier (macOS-specific)",
+ "what": "macOS-only CLI tool for sending native Notification Center notifications. Simpler and more reliable than node-notifier on macOS since it's a native binary. Commonly used: `terminal-notifier -title 'Build' -message 'Done'`. Claude Code users frequently configure it as a hook.",
+ "evidence": "GitHub repo julienXX/terminal-notifier. Multiple blog posts show Claude Code hook integration. Andrea Grandi's blog post on using it with Claude Code.",
+ "status": "live",
+ "source_urls": ["https://github.com/julienXX/terminal-notifier"],
+ "sub_theme": "os-notification-library",
+ "merged_from": 2
+ },
+ {
+ "name": "Terraform: apply blocks, no CLI-level async flag",
+ "what": "terraform apply blocks by default, waiting for all resources to reach their expected lifecycle state. There is no --async or --background CLI flag. Some providers (e.g. OCI) support an async=true resource-level attribute that skips waiting per-resource, but this is provider-specific, not a terraform core feature. After async resource creation, state has nulls; you must run terraform refresh.",
+ "evidence": "Terraform docs: 'By default, when Terraform creates, updates, or deletes a resource it waits for that resource to reach its expected lifecycle state before proceeding.' OCI provider supports async=true at resource level only.",
+ "status": "live",
+ "source_urls": [
+ "https://developer.hashicorp.com/terraform/cli/commands/apply"
+ ],
+ "sub_theme": "foreground-default-no-async-escape",
+ "merged_from": 2
+ },
+ {
+ "name": "Turborepo daemon",
+ "what": "Background Rust daemon for file watching and package discovery. Communicates via gRPC over Unix domain sockets at .turbo/daemon.sock. Uses PID lock file at .turbo/daemon.pid to prevent multiple instances. CLI spawns daemon on first connect if not running (DaemonConnector.connect()). Daemon provides FileSystemWatcher, PackageWatcher, HashWatcher, and GlobWatcher services. Uses cookie files in .turbo/cookies/ for synchronization between watchers. --daemon/--no-daemon flags control usage; automatically disabled in CI. On crash, CLI detects disconnection and can continue without daemon or spawn a replacement.",
+ "evidence": "DeepWiki: socket at '.turbo/daemon.sock', PID lock at '.turbo/daemon.pid', gRPC IPC via TurboGrpcService, DaemonConnector manages CLI connections, disabled in CI automatically.",
+ "status": "live",
+ "source_urls": ["https://deepwiki.com/vercel/turborepo/2-architecture"],
+ "sub_theme": "cli_embedded_daemon",
+ "merged_from": 2
+ },
+ {
+ "name": "Turborepo daemon — on-demand gRPC over Unix socket with graceful degradation",
+ "what": "Turborepo's Rust daemon spawns transparently via DaemonConnector: if no daemon exists, the CLI forks one. Communication uses gRPC (Protocol Buffers) over a Unix domain socket at .turbo/daemon.sock (named pipe on Windows). A .turbo/daemon.pid lock file prevents multiple instances; stale PID detection allows recovery. The daemon auto-disables in CI (checks CI=true env var). If the daemon crashes or socket fails, the CLI falls back to synchronous operation — builds complete without caching/watching benefits but never fail due to daemon unavailability.",
+ "evidence": "DaemonConnector spawns daemon on-demand. TurboGrpcService listens on .turbo/daemon.sock. .turbo/daemon.pid for lock. CI detection disables daemon. Graceful degradation: CLI falls back to synchronous ops if daemon unavailable.",
+ "status": "live",
+ "source_url": "https://deepwiki.com/vercel/turborepo/2-architecture",
+ "sub_theme": "turborepo_daemon"
+ },
+ {
+ "name": "Turborepo file watching — layered watchers with cookie synchronization",
+ "what": "The daemon's FileSystemWatcher uses specialized sub-watchers: PackageWatcher (package.json changes), HashWatcher (source file content hashes for cache invalidation), GlobWatcher (turbo.json task input/output patterns). A CookieWriter/CookieWatcher system uses marker files in .turbo/cookies/ to synchronize file events back to CLI clients through the daemon socket, ensuring consistent state between file system events and build decisions.",
+ "evidence": "PackageWatcher, HashWatcher, GlobWatcher, CookieWriter/CookieWatcher all under FileSystemWatcher. Cookie files in .turbo/cookies/ synchronize events to gRPC clients.",
+ "status": "live",
+ "source_url": "https://deepwiki.com/vercel/turborepo/2-architecture",
+ "sub_theme": "turborepo_file_watching"
+ },
+ {
+ "name": "undistract-me (Ubuntu/Debian automatic notifications)",
+ "what": "Shell functions that hook into bash/zsh execution cycle. Auto-notifies for any command >10s when the terminal window is not focused. Configurable threshold via LONG_RUNNING_COMMAND_TIMEOUT. Can play sound via UDM_PLAY_SOUND. Ignore list for commands like `vim`. Pre-packaged in Ubuntu/Debian repos.",
+ "evidence": "GitHub repo jml/undistract-me. Available in Ubuntu repos (`apt install undistract-me`). Configurable ignore list and timeout.",
+ "status": "live",
+ "source_urls": ["https://github.com/jml/undistract-me"],
+ "sub_theme": "automatic-shell-integration",
+ "merged_from": 2
+ },
+ {
+ "name": "Unix socket IPC pattern (daemon communication)",
+ "what": "Production CLIs overwhelmingly prefer Unix domain sockets over PID files for daemon communication. Pattern: daemon creates socket file (e.g., /var/run/docker.sock, ~/.pm2/rpc.sock, .turbo/daemon.sock), CLI connects as client. Socket existence + successful connect = daemon is alive (no need for PID file to check liveness). Socket files are automatically cleaned up by the OS on process termination if using abstract sockets. For file-based sockets, daemon removes on clean shutdown; stale sockets detected by failed connect attempt. Advantages over PID files: bidirectional communication, no PID recycling problem, atomic liveness check.",
+ "evidence": "Docker uses /var/run/docker.sock, PM2 uses ~/.pm2/rpc.sock + ~/.pm2/pub.sock, Turborepo uses .turbo/daemon.sock. All three use socket connectivity as the primary liveness check rather than PID files.",
+ "status": "live",
+ "source_urls": ["https://docs.docker.com/reference/cli/dockerd/"],
+ "sub_theme": "ipc_pattern",
+ "merged_from": 2
+ },
+ {
+ "name": "Vercel CLI: deploy blocks by default, has --no-wait flag",
+ "what": "vercel deploy blocks by default, showing progress phases: uploading, building, completing. Terminal output shows an Inspect URL and Preview/Production URL with timing. Example output: 'Inspect: https://vercel.com/myorg/project/2x6dq3wmp [3s]' then 'Preview: https://my-project.vercel.app [6s]'. The --no-wait flag exits immediately after initiating the deploy without waiting for completion. The --logs flag additionally prints build logs. stdout is always the deployment URL, enabling piping (vercel > deployment-url.txt).",
+ "evidence": "Vercel docs: '--no-wait: does not wait for a deployment to finish before exiting from the deploy command.' Also: 'When deploying, stdout is always the Deployment URL.' Example output shows Inspect and Preview URLs with timing brackets.",
+ "status": "live",
+ "source_urls": ["https://vercel.com/docs/cli/deploy"],
+ "sub_theme": "blocking with opt-out",
+ "merged_from": 2
+ },
+ {
+ "name": "Vercel CLI: status check via vercel inspect after --no-wait",
+ "what": "After using --no-wait, users can check deployment status with vercel inspect, which shows deployment details and state. There is no built-in 'watch' or re-attach command for monitoring a deployment initiated with --no-wait.",
+ "evidence": "Vercel CLI docs list vercel inspect as a separate command for examining deployments. The --no-wait docs do not describe a re-attach mechanism.",
+ "status": "live",
+ "source_urls": ["https://vercel.com/docs/cli/deploy"],
+ "sub_theme": "status checking",
+ "merged_from": 2
+ },
+ {
+ "name": "Vercel Cron Jobs",
+ "what": "Cron jobs defined in vercel.json that trigger HTTP GET requests to Vercel Functions at the production URL. Configuration is declarative in vercel.json `crons` array with `path` and `schedule` fields. No CLI commands for managing crons—changes require editing vercel.json and redeploying. State lives in Vercel's cloud. Monitoring via dashboard Settings > Cron Jobs and runtime logs. No automatic retry on failure. Hobby tier: once per day only with hourly precision (±59 min). Pro/Enterprise: per-minute precision. 100 cron jobs per project on all plans. Security via CRON_SECRET env var. Duplicate invocations possible—idempotency recommended. No concurrency control built in (suggests Redis locks).",
+ "evidence": "vercel.json: {\"crons\": [{\"path\": \"/api/cron\", \"schedule\": \"0 5 * * *\"}]}. Hobby: once/day, ±59min precision. Pro: once/min, per-minute precision. 100 cron jobs per project. 'Vercel will not retry an invocation if a cron job fails.' User-Agent: vercel-cron/1.0.",
+ "status": "live",
+ "source_urls": ["https://vercel.com/docs/cron-jobs"],
+ "sub_theme": "cloud-managed scheduler",
+ "merged_from": 2
+ },
+ {
+ "name": "Vercel: blocks by default, --no-wait to skip",
+ "what": "vercel deploy blocks by default, waiting for the deployment to finish. stdout is always the deployment URL. The --no-wait flag 'does not wait for a deployment to finish before exiting.' The --logs flag additionally streams build logs. Default shows a progress indicator and prints the URL on completion.",
+ "evidence": "Vercel CLI docs for deploy: '--no-wait: Does not wait for a deployment to finish before exiting from the deploy command.' Separate --logs flag prints build logs. CI/CD scripts capture the deployment URL from stdout.",
+ "status": "live",
+ "source_urls": ["https://vercel.com/docs/cli/deploy"],
+ "sub_theme": "foreground-default-no-wait-opt-in",
+ "merged_from": 2
+ },
+ {
+ "name": "VS Code extension host — process isolation with lazy activation and JSON-RPC IPC",
+ "what": "VS Code runs extensions in a separate Extension Host process (Node.js), isolating them from the UI. Communication uses JSON-RPC over IPC (extension API calls serialized to JSON requests). Extensions activate lazily based on activation events (onLanguage, onCommand, etc.) — unused extensions consume zero memory. Each window gets its own extension host. Extensions can spawn child processes. ExtensionMemento provides persistent state storage per extension. This pattern of lazy activation + process isolation + persistent state is directly applicable to a CLI daemon.",
+ "evidence": "Extension Host runs in separate Node.js process. Communication via JSON-RPC over IPC. Lazy activation via activation events. ExtensionMemento for persistent state. One extension host per window.",
+ "status": "live",
+ "source_url": "https://code.visualstudio.com/api/advanced-topics/extension-host",
+ "sub_theme": "vscode_extension_host"
+ },
+ {
+ "name": "Webhook/callback URL pattern",
+ "what": "Some async APIs accept a callback URL in the initial request and POST results there on completion. Not common in CLI tools directly, but the pattern is well-established in API design (Stripe, Twilio, etc.). A CLI could accept --webhook-url and POST on completion. This is heavyweight for most CLI use cases but valuable for CI/CD integration.",
+ "evidence": "Hookdeck guide on webhooks vs callbacks. Standard in cloud APIs but rarely seen in CLI tools themselves.",
+ "status": "pattern",
+ "source_urls": [
+ "https://hookdeck.com/webhooks/guides/webhooks-callbacks"
+ ],
+ "sub_theme": "machine-to-machine",
+ "merged_from": 2
+ },
+ {
+ "name": "Whenever (Ruby crontab DSL)",
+ "what": "Ruby gem that provides a human-readable DSL for defining cron jobs in config/schedule.rb, then writes them directly to the system crontab. CLI: `whenever` (preview cron output), `whenever --update-crontab` (write to crontab), `whenever --clear-crontab` (remove entries). DSL supports `every 3.hours do`, `every :sunday, at: '12pm'`, `every 1.day, at: '4:30 am'`. Three job types: runner (Rails method), rake (Rake task), command (shell). Uses identifier-based namespacing so multiple apps can share a crontab without conflicts. Integrates with Capistrano for deploy-time crontab updates. State is the system crontab itself. No daemon—pure cron. No built-in failure handling or monitoring.",
+ "evidence": "schedule.rb: every 3.hours do; runner 'MyModel.some_process'; end. CLI: whenever --update-crontab --user app. Custom job types: job_type :awesome, '/usr/local/bin/awesome :task :fun_level'. Capistrano: require 'whenever/capistrano'.",
+ "status": "live",
+ "source_urls": ["https://github.com/javan/whenever"],
+ "sub_theme": "crontab wrapper/generator",
+ "merged_from": 2
+ },
+ {
+ "name": "zsh-notify (zsh plugin)",
+ "what": "Zsh-specific plugin that sends desktop notifications for long-running commands. Uses macOS Notification Center or notify-send on Linux. Distinguishes success/failure with different notification styles. Threshold-based (default 30s).",
+ "evidence": "GitHub repo marzocchi/zsh-notify. Active maintenance.",
+ "status": "live",
+ "source_urls": ["https://github.com/marzocchi/zsh-notify"],
+ "sub_theme": "automatic-shell-integration",
+ "merged_from": 2
+ }
+ ],
+ "gaps": [
+ {
+ "topic": "s6 supervision suite",
+ "note": "s6 is a modern alternative to runit/daemontools popular in Alpine/container images. Did not deep-dive into its readiness notification (s6-notifyoncheck) and dependency management which are more sophisticated than runit."
+ },
+ {
+ "topic": "Windows service integration",
+ "note": "Focused on Unix patterns. Windows has different mechanisms: Windows Service Control Manager, NSSM (Non-Sucking Service Manager), and named pipes instead of Unix sockets. Relevant if CLI targets Windows."
+ },
+ {
+ "topic": "Electron/desktop app daemon patterns",
+ "note": "Apps like VS Code, Slack, and Docker Desktop manage background processes differently (tray icon, auto-update daemons). Not explored."
+ },
+ {
+ "topic": "Health check endpoints",
+ "note": "Many daemons expose HTTP health check endpoints (e.g., Docker's /_ping, Kubernetes liveness/readiness probes). Patterns for CLI-started daemons with health check URLs not deeply explored."
+ },
+ {
+ "topic": "Log rotation and management",
+ "note": "PM2's pm2-logrotate module and systemd's journald handle log rotation, but specific patterns for CLI-managed daemon logs (file size limits, rotation policies) not deeply covered."
+ },
+ {
+ "description": "Exact terminal output of Vercel CLI during the build phase (between Inspect and Preview URLs) is not documented in detail -- only the final Inspect/Preview URLs with timing are shown in docs and blog posts.",
+ "attempted_sources": [
+ "https://vercel.com/docs/cli/deploy",
+ "https://www.alexchantastic.com/deploying-with-vercel-cli"
+ ]
+ },
+ {
+ "description": "Fly.io's exact monitoring output format (what the step-by-step progress looks like) is described only in community forum posts, not in official documentation.",
+ "attempted_sources": [
+ "https://fly.io/docs/flyctl/deploy/",
+ "https://community.fly.io/t/how-fly-deploy-detach-works/25607"
+ ]
+ },
+ {
+ "description": "Railway CLI's exact terminal output format during railway up (what the progress looks like) is not shown in the docs -- docs say 'real-time deployment logs' but don't show example output.",
+ "attempted_sources": ["https://docs.railway.com/cli/deploying"]
+ },
+ {
+ "description": "Whether Stripe CLI has any reconnection/retry logic when the listen or logs tail connection drops is not documented.",
+ "attempted_sources": ["https://docs.stripe.com/stripe-cli/use-cli"]
+ },
+ {
+ "description": "How Heroku users check the status of a build that was detached via Ctrl+C -- heroku releases and heroku ps are documented separately but not in the context of post-detach monitoring.",
+ "attempted_sources": [
+ "https://devcenter.heroku.com/articles/git",
+ "https://devcenter.heroku.com/articles/releases"
+ ]
+ },
+ "No quantitative data on what percentage of users have terminal bell mapped to notifications vs. disabled",
+ "No survey data on CLI notification preferences (Slack vs desktop vs push vs email)",
+ "No data on how many terminal users have shell integration (OSC 133) enabled, which is required for terminal-native auto-notifications",
+ "Email notification for CLI completion: found no CLI tools that implement this — it appears to be a dead pattern for CLI tools (exists only in web-based CI/CD like GitHub Actions)",
+ "No data on notification fatigue — at what frequency do notifications become annoying rather than helpful",
+ "Windows Terminal notification support for OSC 9 is documented but real-world adoption/reliability is unclear",
+ "Linear CLI: could not access Linear docs on cycles/automation (404 errors). Linear has cycles and auto-close/auto-archive workflows but unclear if these are exposed via any CLI tool.",
+ "Jobber jobfile format: detailed YAML configuration docs were inaccessible (readthedocs 404). The error handling model (Stop/Backoff/Continue with notifications) is confirmed from the GitHub README but specific YAML syntax examples were not retrieved.",
+ "OpenClaw/NanoClaw: no relevant results found. The agent framework landscape for cron-scheduled tasks was not explored in this wave.",
+ "systemd timers as a scheduling mechanism: not researched as a standalone pattern. systemd timers are an alternative to cron with richer features (OnCalendar, Persistent=true for catching up missed runs, RandomizedDelaySec). Would be worth a dedicated investigation.",
+ "Temporal / Inngest / Trigger.dev: cloud-based workflow engines with scheduled triggers that could be relevant as 'schedule from CLI, runs in cloud' patterns. Not covered in this wave.",
+ "macOS launchd scheduled agents: launchd plist files with StartCalendarInterval provide native macOS scheduling without cron. Homebrew Services generates these but the raw launchd scheduling model was not deeply explored.",
+ "Could not find detailed Vercel --no-wait user experience (what exactly prints after --no-wait — just the URL, or also a message?)",
+ "Heroku has no explicit async flag — only the Ctrl+C convention. Could not verify if newer Heroku CLI versions added one.",
+ "Did not research Render, Netlify, or Cloudflare Workers deploy CLIs which may have additional patterns.",
+ "Did not research long-running data pipeline CLIs (Spark, dbt) which may have different async conventions relevant to data connector use cases.",
+ {
+ "topic": "MX webhook payload structure and connection status change callbacks",
+ "reason": "MX Academy and docs.mx.com pages were not fully accessible. Could not extract webhook payload schemas or callback mechanisms for connection status transitions.",
+ "suggested_source": "https://docs.mx.com/resources/webhooks/"
+ },
+ {
+ "topic": "Finicity aggregation status codes — full numeric code list",
+ "reason": "Finicity docs redirect to Mastercard Open Banking portal which did not expose the aggregation status codes page directly. The specific numeric codes (e.g., 103=MFA required, 185=credentials changed) were not extracted.",
+ "suggested_source": "https://developer.mastercard.com/open-banking-us/documentation/api-reference/"
+ },
+ {
+ "topic": "Quantitative data on re-auth conversion rates",
+ "reason": "Plaid and other services do not publish conversion/drop-off rates for re-authentication flows. Industry benchmarks for re-auth completion rates are not publicly available.",
+ "suggested_source": "Plaid sales team or industry reports"
+ },
+ {
+ "topic": "Browser automation session longevity by platform",
+ "reason": "No research was found on how long browser sessions/cookies typically last for specific platforms (GitHub, ChatGPT, LinkedIn) before requiring re-authentication. This is critical for the CLI's scheduling design.",
+ "suggested_source": "Empirical testing or platform-specific documentation"
+ },
+ {
+ "topic": "Yodlee connection health model details",
+ "reason": "Yodlee developer portal content was not fully extractable from search results. Their error code taxonomy and connection lifecycle management details remain unclear.",
+ "suggested_source": "https://developer.yodlee.com/resources/yodlee/error-codes/docs"
+ },
+ {
+ "name": "SQLite vs JSON file for daemon state persistence",
+ "what": "Need to evaluate whether daemon state (schedule config, last run times, auth status, collection history) warrants SQLite (better for concurrent access, querying) vs a simple JSON file (simpler, no native dependency). SQLite has the better-sqlite3 package which is synchronous and fast but requires native compilation. JSON is zero-dependency but risks corruption on ungraceful shutdown without write-ahead patterns."
+ },
+ {
+ "name": "Windows support complexity",
+ "what": "Windows lacks Unix domain sockets (uses named pipes instead) and has no launchd/systemd equivalent (Task Scheduler is cron-like, not a process supervisor). Need to decide if Windows is a v1 target or can be deferred. node-ipc handles socket/pipe abstraction but Windows service registration (via node-windows or nssm) adds significant complexity."
+ },
+ {
+ "name": "Daemon log management and rotation",
+ "what": "Long-running daemons accumulate logs. PM2 handles this with pm2-logrotate module. Need a strategy for the vana daemon: file-based logging with rotation, or structured logging to a bounded file. Also need to decide how 'vana daemon logs' surfaces these to the user."
+ },
+ {
+ "name": "Security of Unix socket — file permissions and access control",
+ "what": "The daemon's Unix socket file should be readable/writable only by the owning user (mode 0600). Need to verify Node.js net.createServer for Unix sockets respects umask or requires explicit chmod. PM2 has had issues with root-owned socket files after reboot (GitHub issue #4226)."
+ },
+ {
+ "name": "Daemon version mismatch handling",
+ "what": "If the user updates the CLI package, the running daemon may be an older version. Need a protocol version check on connection (like Docker API versioning) and a strategy for graceful daemon restart on version mismatch. Turborepo handles this but details of their approach were not found."
+ }
+ ]
+}
diff --git a/research/async-cli/findings/wave1-lifecycle.json b/research/async-cli/findings/wave1-lifecycle.json
new file mode 100644
index 00000000..3b88e439
--- /dev/null
+++ b/research/async-cli/findings/wave1-lifecycle.json
@@ -0,0 +1,147 @@
+{
+ "agent_question": "Background process lifecycle management in CLIs",
+ "findings": [
+ {
+ "name": "Docker daemon (dockerd)",
+ "what": "Runs as a persistent daemon process managed by systemd. Uses --pidfile flag (default /var/run/docker.pid) to store process ID. Communicates via Unix domain socket at /var/run/docker.sock, with optional TCP binding for remote access. Supports systemd socket activation via fd:// syntax. SIGHUP reloads configuration without restart; systemd handles crash recovery and auto-restart. Multiple daemon instances require unique PID file and socket paths. The docker CLI is a thin client that sends commands over the socket; the daemon does all container management.",
+ "evidence": "dockerd docs: '--pidfile=/var/run/docker.pid' flag, '-H unix:///var/run/docker.sock' default socket, SIGHUP for config reload. PR #41465 added PIDFile to docker.service unit so systemd reliably cleans up stale PID files on crash.",
+ "status": "live",
+ "source_url": "https://docs.docker.com/reference/cli/dockerd/",
+ "sub_theme": "daemon_with_systemd"
+ },
+ {
+ "name": "PM2 (Node.js process manager)",
+ "what": "Two-process architecture: a persistent 'God Daemon' and a thin CLI client. Daemon communicates via two Unix sockets: ~/.pm2/rpc.sock (commands) and ~/.pm2/pub.sock (events/logs). PID files stored in ~/.pm2/pids/-.pid. On 'pm2 start', CLI connects to daemon (spawning it if needed), daemon forks the target process. Graceful shutdown sends SIGINT first, then SIGKILL after configurable kill_timeout (default 1.6s). Auto-restart on crash with exponential backoff option. 'pm2 startup' generates platform-specific init scripts (systemd/launchd/upstart). 'pm2 save' + 'pm2 resurrect' persists process list across reboots via dump.pm2 file.",
+ "evidence": "PM2 docs: 'PM2 is a daemon process manager', PID files at ~/.pm2/pids/, socket files at ~/.pm2/rpc.sock and ~/.pm2/pub.sock, 'pm2 startup' generates systemd/launchd/upstart scripts, exponential backoff via --exp-backoff-restart-delay, kill_timeout configurable.",
+ "status": "live",
+ "source_url": "https://pm2.keymetrics.io/docs/usage/pm2-doc-single-page/",
+ "sub_theme": "node_daemon_manager"
+ },
+ {
+ "name": "Homebrew services (brew services)",
+ "what": "Thin wrapper that generates OS-native service definitions. On macOS: writes LaunchAgent plist files to ~/Library/LaunchAgents/homebrew.mxcl..plist (user) or /Library/LaunchDaemons/ (root). On Linux: writes systemd unit files to ~/.config/systemd/user/ (user) or /usr/lib/systemd/system/ (root). The plist/unit includes KeepAlive/Restart=always for auto-restart on crash, RunAtLoad for boot start, and log paths. 'brew services start' registers with launchctl/systemctl; 'brew services stop' unregisters. No custom PID management -- delegates entirely to OS service manager.",
+ "evidence": "Plist includes KeepAlive and RunAtLoad. Files at ~/Library/LaunchAgents/homebrew.mxcl..plist. Linux uses systemd user units.",
+ "status": "live",
+ "source_url": "https://www.dorokhovich.com/blog/homebrew-services",
+ "sub_theme": "os_service_wrapper"
+ },
+ {
+ "name": "Supervisor (supervisord)",
+ "what": "Client/server architecture: supervisord daemon manages child processes, supervisorctl CLI sends commands. Expects to be the direct parent of managed processes (children must NOT daemonize themselves). Provides pidproxy tool for processes that fork. Configuration via INI-style [program:x] sections. Supports process groups, event listeners for notifications, and XML-RPC API for programmatic control. Separate activity log (supervisord operations) and child process logs. Designed to start at boot but NOT as PID 1.",
+ "evidence": "Supervisor docs: 'client/server system that allows its users to monitor and control a number of processes on UNIX-like operating systems', supports [program:x] config, event listeners, XML-RPC API.",
+ "status": "live",
+ "source_url": "https://supervisord.org/",
+ "sub_theme": "process_supervisor"
+ },
+ {
+ "name": "runit",
+ "what": "Lightweight cross-platform init scheme with service supervision. Each service is a directory containing a 'run' script. The runsv process supervises each service, automatically restarting it on exit. Optional './check' script for readiness polling. Optional './log/run' for dedicated log service per process. Can replace PID 1 or run alongside other init systems. Uses a simple directory-based state model: service is 'up' if its run script is running, 'down' if not.",
+ "evidence": "Arch Wiki: 'runit runs on GNU/Linux, *BSD, MacOSX, Solaris'. Gentoo wiki: 'Process supervision is a type of operating system service management in which some master process remains the parent of the service processes.'",
+ "status": "live",
+ "source_url": "https://smarden.org/runit/",
+ "sub_theme": "process_supervisor"
+ },
+ {
+ "name": "Turborepo daemon",
+ "what": "Background Rust daemon for file watching and package discovery. Communicates via gRPC over Unix domain sockets at .turbo/daemon.sock. Uses PID lock file at .turbo/daemon.pid to prevent multiple instances. CLI spawns daemon on first connect if not running (DaemonConnector.connect()). Daemon provides FileSystemWatcher, PackageWatcher, HashWatcher, and GlobWatcher services. Uses cookie files in .turbo/cookies/ for synchronization between watchers. --daemon/--no-daemon flags control usage; automatically disabled in CI. On crash, CLI detects disconnection and can continue without daemon or spawn a replacement.",
+ "evidence": "DeepWiki: socket at '.turbo/daemon.sock', PID lock at '.turbo/daemon.pid', gRPC IPC via TurboGrpcService, DaemonConnector manages CLI connections, disabled in CI automatically.",
+ "status": "live",
+ "source_url": "https://deepwiki.com/vercel/turborepo/2-architecture",
+ "sub_theme": "cli_embedded_daemon"
+ },
+ {
+ "name": "forever (Node.js)",
+ "what": "Legacy Node.js daemon manager (now recommends pm2/nodemon for new projects). Runs a Flatiron server as daemon to manage child processes. PID files stored in ~/.forever/ directory. Supports --pidFile for custom PID location. Has built-in cleanup for extraneous/stale PID files (forever.cleanUp() emits 'cleanUp' event). Auto-restarts crashed processes. 'forever start' daemonizes; 'forever stop/stopall' terminates; 'forever list' shows running processes.",
+ "evidence": "npm docs: 'A simple CLI tool for ensuring that a given script runs continuously (i.e. forever)', default files in $HOME/.forever, cleanUp functionality for stale PIDs.",
+ "status": "live",
+ "source_url": "https://github.com/foreversd/forever",
+ "sub_theme": "node_daemon_manager"
+ },
+ {
+ "name": "PID file stale detection patterns",
+ "what": "Three main approaches: (1) kill(pid, 0) / kill -0 -- sends signal 0 which checks process existence without actually sending a signal. Returns success if process exists, ESRCH error if not. Caveat: non-root users may get EPERM for others' processes, which still means the process exists. (2) pgrep -F -- reads PID from file and checks if process exists, returning exit status 1 if stale. (3) File locking (flock/fcntl) -- process holds advisory lock on PID file; if lock is obtainable, previous process is dead. File locking is the most robust approach as it handles PID recycling. All approaches should be combined with atexit/signal-handler cleanup to remove PID file on normal exit.",
+ "evidence": "Python pid library uses flock. Node.js daemon-pid provides 'start-time verification to ensure the recorded process-id was not recycled by the OS'. Common pattern: check kill(pid, 0), if ESRCH then stale, if EPERM then process exists but owned by another user.",
+ "status": "live",
+ "source_url": "https://github.com/trbs/pid",
+ "sub_theme": "pid_management"
+ },
+ {
+ "name": "systemd service integration pattern",
+ "what": "CLIs that install themselves as systemd services write a .service unit file to ~/.config/systemd/user/ (user) or /etc/systemd/system/ (system). Key directives: ExecStart (binary path), Restart=on-failure or always, RestartSec (delay between restarts), PIDFile (for forking daemons), Type=simple|forking|notify. Socket activation via .socket units allows on-demand daemon startup. 'systemctl --user enable ' persists across reboots. Journal logging (journalctl -u ) replaces custom log files. Health checks via systemd-notify for Type=notify services.",
+ "evidence": "Docker's docker.service uses PIDFile directive. Homebrew generates systemd user units on Linux. PM2's 'pm2 startup' generates systemd unit files. Turborepo and other modern CLIs prefer to manage their own daemon rather than relying on systemd.",
+ "status": "live",
+ "source_url": "https://docs.docker.com/reference/cli/dockerd/",
+ "sub_theme": "os_service_integration"
+ },
+ {
+ "name": "launchd plist integration pattern",
+ "what": "macOS CLIs register as LaunchAgents (user-level) or LaunchDaemons (system-level). Plist XML defines: Label (unique ID), ProgramArguments (command), RunAtLoad (start on login/boot), KeepAlive (auto-restart on crash), StandardOutPath/StandardErrorPath (log files), WorkingDirectory. 'launchctl load/unload ' manages lifecycle. Files at ~/Library/LaunchAgents/ (user) or /Library/LaunchDaemons/ (system). No PID file needed -- launchd tracks child PIDs internally.",
+ "evidence": "Homebrew services generates plists with KeepAlive=true and RunAtLoad=true. PostgreSQL example shows full plist structure with ProgramArguments, log paths, and working directory.",
+ "status": "live",
+ "source_url": "https://www.dorokhovich.com/blog/homebrew-services",
+ "sub_theme": "os_service_integration"
+ },
+ {
+ "name": "kubectl watch pattern",
+ "what": "kubectl uses --watch/-w flag for long-running observation of resource changes. This is NOT a background process -- it's a foreground streaming connection using Kubernetes watch API (HTTP long-poll with chunked transfer). 'kubectl get pods -w' streams state changes in real-time. The CLI maintains a persistent connection to the API server. No PID management needed since it runs in the foreground. For background monitoring, users rely on external tools (Kubernetes controllers, operators) that run as pods themselves.",
+ "evidence": "kubectl docs: '--watch' flag starts watching updates. 'kubectl get events --watch --field-selector involvedObject.name=myapp-pod' for targeted watching. Kubernetes controllers handle actual background supervision.",
+ "status": "live",
+ "source_url": "https://kubernetes.io/docs/reference/kubectl/",
+ "sub_theme": "foreground_streaming"
+ },
+ {
+ "name": "Gatsby develop (foreground long-running CLI)",
+ "what": "Runs as a foreground process with file watching via webpack. No daemon mode -- process lives in the terminal. Uses port-based conflict detection: checks if port 8000 is in use and warns 'Looks like develop for this site is already running'. Known issues with Ctrl+C not properly killing the process (zombie process problem). Relies on the user's terminal for lifecycle management. No PID file, no daemon mode, no auto-restart.",
+ "evidence": "GitHub issue #5810: 'Ctrl+C not quitting CLI'. Discussion #26869: infinite loop with stale port detection. Gatsby develop is purely foreground with webpack-dev-server.",
+ "status": "live",
+ "source_url": "https://www.gatsbyjs.com/docs/reference/gatsby-cli/",
+ "sub_theme": "foreground_dev_server"
+ },
+ {
+ "name": "Node.js graceful shutdown patterns",
+ "what": "Best practice for Node.js daemons: listen for SIGINT and SIGTERM (cross-platform safe). On signal: (1) stop accepting new connections, (2) drain existing connections with timeout, (3) close database pools, (4) remove PID file, (5) call process.exit(). Use a forced-exit timeout (e.g., setTimeout(() => process.exit(1), 10000)) as safety net. PID file cleanup via process.on('exit') or atexit-style handlers. SIGUSR1/SIGUSR2 not safe on Windows. For containers (PID 1 trap): Node.js doesn't forward signals to children when running as PID 1 -- use tini or dumb-init as entrypoint.",
+ "evidence": "Medium article on PID 1 trap: 'Node.js doesn't forward signals to children when running as PID 1'. Node.js docs recommend only SIGINT and SIGTERM for cross-platform compatibility.",
+ "status": "live",
+ "source_url": "https://medium.com/@etienne.rossignon/stop-killing-your-node-js-containers-the-pid-1-trap-5c18abffd72c",
+ "sub_theme": "graceful_shutdown"
+ },
+ {
+ "name": "Unix socket IPC pattern (daemon communication)",
+ "what": "Production CLIs overwhelmingly prefer Unix domain sockets over PID files for daemon communication. Pattern: daemon creates socket file (e.g., /var/run/docker.sock, ~/.pm2/rpc.sock, .turbo/daemon.sock), CLI connects as client. Socket existence + successful connect = daemon is alive (no need for PID file to check liveness). Socket files are automatically cleaned up by the OS on process termination if using abstract sockets. For file-based sockets, daemon removes on clean shutdown; stale sockets detected by failed connect attempt. Advantages over PID files: bidirectional communication, no PID recycling problem, atomic liveness check.",
+ "evidence": "Docker uses /var/run/docker.sock, PM2 uses ~/.pm2/rpc.sock + ~/.pm2/pub.sock, Turborepo uses .turbo/daemon.sock. All three use socket connectivity as the primary liveness check rather than PID files.",
+ "status": "live",
+ "source_url": "https://docs.docker.com/reference/cli/dockerd/",
+ "sub_theme": "ipc_pattern"
+ },
+ {
+ "name": "nohup and terminal multiplexer patterns",
+ "what": "Lightweight background process approaches without daemon infrastructure: (1) nohup -- detaches process from terminal, redirects output to nohup.out, survives terminal close. No monitoring, no auto-restart, no PID management. (2) tmux/screen -- session persistence, user can reattach to see output. Some CLIs document 'run in tmux' as official guidance for long-running tasks. (3) disown -- bash builtin to remove job from shell's job table. These are user-managed approaches with no supervision; commonly used for ad-hoc tasks rather than production services.",
+ "evidence": "Multiple Node.js deployment guides suggest nohup for simple cases, pm2/systemd for production. tmux/screen commonly recommended in CLI docs for long-running tasks that need output inspection.",
+ "status": "live",
+ "source_url": "https://pixeljets.com/blog/using-supervisorctl-for-node-processes-common-gotchas/",
+ "sub_theme": "lightweight_background"
+ }
+ ],
+ "gaps": [
+ {
+ "topic": "s6 supervision suite",
+ "note": "s6 is a modern alternative to runit/daemontools popular in Alpine/container images. Did not deep-dive into its readiness notification (s6-notifyoncheck) and dependency management which are more sophisticated than runit."
+ },
+ {
+ "topic": "Windows service integration",
+ "note": "Focused on Unix patterns. Windows has different mechanisms: Windows Service Control Manager, NSSM (Non-Sucking Service Manager), and named pipes instead of Unix sockets. Relevant if CLI targets Windows."
+ },
+ {
+ "topic": "Electron/desktop app daemon patterns",
+ "note": "Apps like VS Code, Slack, and Docker Desktop manage background processes differently (tray icon, auto-update daemons). Not explored."
+ },
+ {
+ "topic": "Health check endpoints",
+ "note": "Many daemons expose HTTP health check endpoints (e.g., Docker's /_ping, Kubernetes liveness/readiness probes). Patterns for CLI-started daemons with health check URLs not deeply explored."
+ },
+ {
+ "topic": "Log rotation and management",
+ "note": "PM2's pm2-logrotate module and systemd's journald handle log rotation, but specific patterns for CLI-managed daemon logs (file size limits, rotation policies) not deeply covered."
+ }
+ ]
+}
diff --git a/research/async-cli/findings/wave1-long-running.json b/research/async-cli/findings/wave1-long-running.json
new file mode 100644
index 00000000..9951f94a
--- /dev/null
+++ b/research/async-cli/findings/wave1-long-running.json
@@ -0,0 +1,172 @@
+{
+ "agent_question": "How best-in-class CLIs handle long-running operations",
+ "findings": [
+ {
+ "name": "Stripe CLI: stripe listen runs as a foreground-only long-lived process",
+ "what": "stripe listen maintains a persistent WebSocket connection to Stripe, forwarding webhook events to a local endpoint. It displays 'Ready! Your webhook signing secret is ... (^C to quit)' and then streams events as they arrive. There is no background/detach mode; the process runs until Ctrl+C. If the terminal is closed, the listener stops.",
+ "evidence": "Stripe docs state: 'Ready! Your webhook signing secret is {{WEBHOOK_SIGNING_SECRET}} (^C to quit)'. The listen command establishes a direct connection with Stripe, delivering webhook events to your computer directly.",
+ "status": "live",
+ "source_url": "https://docs.stripe.com/stripe-cli/use-cli",
+ "sub_theme": "foreground-only long-lived process"
+ },
+ {
+ "name": "Stripe CLI: stripe logs tail streams API logs in real-time",
+ "what": "stripe logs tail establishes a direct connection with Stripe and streams test mode API request logs in real-time. It supports filters (--filter-account, --filter-http-method, --filter-request-path, --filter-request-status). It runs in the foreground with no detach mode.",
+ "evidence": "Stripe docs: 'The logs tail command establishes a direct connection with Stripe and enables you to tail your test mode Stripe API request logs in real-time from your terminal.'",
+ "status": "live",
+ "source_url": "https://docs.stripe.com/cli/logs/tail",
+ "sub_theme": "foreground-only long-lived process"
+ },
+ {
+ "name": "Vercel CLI: deploy blocks by default, has --no-wait flag",
+ "what": "vercel deploy blocks by default, showing progress phases: uploading, building, completing. Terminal output shows an Inspect URL and Preview/Production URL with timing. Example output: 'Inspect: https://vercel.com/myorg/project/2x6dq3wmp [3s]' then 'Preview: https://my-project.vercel.app [6s]'. The --no-wait flag exits immediately after initiating the deploy without waiting for completion. The --logs flag additionally prints build logs. stdout is always the deployment URL, enabling piping (vercel > deployment-url.txt).",
+ "evidence": "Vercel docs: '--no-wait: does not wait for a deployment to finish before exiting from the deploy command.' Also: 'When deploying, stdout is always the Deployment URL.' Example output shows Inspect and Preview URLs with timing brackets.",
+ "status": "live",
+ "source_url": "https://vercel.com/docs/cli/deploy",
+ "sub_theme": "blocking with opt-out"
+ },
+ {
+ "name": "Vercel CLI: status check via vercel inspect after --no-wait",
+ "what": "After using --no-wait, users can check deployment status with vercel inspect, which shows deployment details and state. There is no built-in 'watch' or re-attach command for monitoring a deployment initiated with --no-wait.",
+ "evidence": "Vercel CLI docs list vercel inspect as a separate command for examining deployments. The --no-wait docs do not describe a re-attach mechanism.",
+ "status": "live",
+ "source_url": "https://vercel.com/docs/cli/deploy",
+ "sub_theme": "status checking"
+ },
+ {
+ "name": "Railway CLI: railway up has three modes: attached (default), --detach, --ci",
+ "what": "railway up uploads and deploys, defaulting to attached mode that streams build and deployment logs in real-time. --detach (-d) returns immediately after uploading; the deployment continues in the background and can be monitored via dashboard or 'railway logs'. --ci (-c) streams only build logs and exits when the build completes, designed for CI/CD pipelines.",
+ "evidence": "Railway docs: 'By default, railway up operates in attached mode and streams build and deployment logs to your terminal.' '--detach: return immediately after uploading.' '--ci: stream only build logs and exit when the build completes.'",
+ "status": "live",
+ "source_url": "https://docs.railway.com/cli/deploying",
+ "sub_theme": "three-mode pattern"
+ },
+ {
+ "name": "Railway CLI: railway logs for monitoring after detach",
+ "what": "railway logs streams logs by default via WebSocket, or fetches historical logs when using --lines, --since, or --until flags. The -d flag shows deployment logs, -b shows build logs. Can target a specific deployment ID.",
+ "evidence": "Railway docs: 'Stream mode (the default) connects via WebSocket and streams logs in real-time.' 'Use railway logs with the deployment ID to stream the failed build/deploy logs.'",
+ "status": "live",
+ "source_url": "https://docs.railway.com/cli/logs",
+ "sub_theme": "log streaming"
+ },
+ {
+ "name": "Fly.io: fly deploy blocks with monitoring, supports --detach flag",
+ "what": "fly deploy builds and deploys, then enters a 'Monitoring Deployment' phase showing machine status counts (e.g., '1 desired, 1 placed, 1 healthy, 0 unhealthy') and health check results. The terminal displays 'You can detach the terminal anytime without stopping the deployment.' The --detach flag returns immediately instead of monitoring. --wait-timeout (default 5m0s) controls how long to wait for machines to become healthy. Smoke checks monitor machines for ~10 seconds after start.",
+ "evidence": "Fly docs: '--detach: Return immediately instead of monitoring deployment progress.' '--wait-timeout: Time duration to wait for individual machines to transition states and become healthy. (default 5m0s).' Community posts show output: 'You can detach the terminal anytime without stopping the deployment' followed by 'Monitoring deployment' with machine status counts.",
+ "status": "live",
+ "source_url": "https://fly.io/docs/flyctl/deploy/",
+ "sub_theme": "blocking with detach"
+ },
+ {
+ "name": "Fly.io: --detach flag may not work as documented",
+ "what": "Community testing suggests the --detach flag may not function correctly. One user reported: 'it ignored the --detach flag completely -- and that matches what I see in the flyctl source code.' There is no documented re-attach mechanism after detaching.",
+ "evidence": "Fly.io community forum post: user reports --detach flag is non-functional based on testing and source code review.",
+ "status": "live",
+ "source_url": "https://community.fly.io/t/how-fly-deploy-detach-works/25607",
+ "sub_theme": "detach reliability"
+ },
+ {
+ "name": "Heroku: git push blocks, but Ctrl+C detaches without canceling",
+ "what": "git push heroku main blocks by default, streaming build output in real-time (compiling, installing dependencies, etc.). Pressing Ctrl+C detaches from the build output but does NOT cancel the build or deploy -- it continues server-side and creates a new release when complete. Status can be checked afterward with 'heroku releases' (release history), 'heroku ps' (running processes), and 'heroku releases:output' (build output for a specific release).",
+ "evidence": "Heroku docs: 'After you initiate a Heroku deploy with git push, you can detach from the resulting build process by pressing Ctrl + C. Detaching doesn't cancel the build or the deploy. The build continues in the background and creates a new release as soon as it completes.'",
+ "status": "live",
+ "source_url": "https://devcenter.heroku.com/articles/git",
+ "sub_theme": "Ctrl+C detach pattern"
+ },
+ {
+ "name": "GitHub CLI: gh run watch polls and displays workflow progress",
+ "what": "gh run watch monitors a GitHub Actions workflow run until it completes. It polls at a configurable interval (--interval, default 3 seconds) and displays step-by-step progress. --compact shows only relevant/failed steps. --exit-status exits with non-zero on failure. Without arguments, interactively selects from active runs. Can be chained: 'gh run watch && notify-send \"done\"'. Closing the terminal does not cancel the workflow -- the operation is server-side.",
+ "evidence": "GitHub CLI docs: 'Watch a run until it completes, showing its progress.' '--interval: Refresh interval in seconds (default 3).' '--exit-status: Exit with non-zero status if run fails.'",
+ "status": "live",
+ "source_url": "https://cli.github.com/manual/gh_run_watch",
+ "sub_theme": "separate watch command"
+ },
+ {
+ "name": "GitHub CLI: gh workflow run fires and forgets, separate from watching",
+ "what": "gh workflow run triggers a workflow dispatch event and returns immediately. It does not wait for or monitor execution. Users must separately use 'gh run watch' or 'gh run view' to monitor. This is a clean separation: trigger is async, monitoring is opt-in via a separate command.",
+ "evidence": "GitHub CLI manual separates 'gh workflow run' (trigger) from 'gh run watch' (monitor) as distinct commands.",
+ "status": "live",
+ "source_url": "https://cli.github.com/manual/gh_workflow_run",
+ "sub_theme": "fire-and-forget with separate monitor"
+ },
+ {
+ "name": "AWS CLI: wait subcommands poll silently until condition is met",
+ "what": "AWS CLI provides 'wait' subcommands for services with long-running operations (e.g., aws cloudformation wait stack-create-complete). These poll the API at fixed intervals (30 seconds for CloudFormation) until the condition is met or max attempts exceeded (120 for CloudFormation, producing exit code 255). The wait command produces NO terminal output -- it blocks silently. Operations themselves (e.g., aws cloudformation create-stack) return immediately with an ID; waiting is always opt-in via a separate command.",
+ "evidence": "AWS docs for stack-create-complete: 'It will poll every 30 seconds until a successful state has been reached. This will exit with a return code of 255 after 120 failed checks.' 'This command produces no output.'",
+ "status": "live",
+ "source_url": "https://docs.aws.amazon.com/cli/latest/reference/cloudformation/wait/stack-create-complete.html",
+ "sub_theme": "silent polling wait"
+ },
+ {
+ "name": "AWS CLI: S3 transfers show progress by default with opt-out",
+ "what": "aws s3 cp shows file transfer progress by default. For large files (>100MB), it uses multipart upload and shows 'Completed X of Y part(s) with Z file(s) remaining'. --no-progress disables the progress display. --quiet suppresses all output. --only-show-errors shows only errors. --progress-frequency controls update interval. --progress-multiline shows progress on multiple lines.",
+ "evidence": "AWS docs: 'File transfer progress is displayed by default.' '--no-progress: File transfer progress is not displayed.' For large uploads: 'Completed 9896 of 9896 part(s) with 1 file(s) remaining.'",
+ "status": "live",
+ "source_url": "https://docs.aws.amazon.com/cli/latest/reference/s3/cp.html",
+ "sub_theme": "progress display"
+ },
+ {
+ "name": "Docker: docker run -d (detached) vs foreground is the canonical pattern",
+ "what": "docker run defaults to foreground (attached) mode with stdin/stdout/stderr connected. The -d/--detach flag runs the container in the background, printing only the container ID. In detached mode, closing the terminal does not stop the container. Users check status with 'docker ps', view output with 'docker logs', and re-attach with 'docker attach'. This is the most well-known sync/async dual-mode pattern in CLIs.",
+ "evidence": "Docker docs: 'If you want to run the container in the background, you can use the --detach (or -d) flag.' 'Detached mode means that a Docker container runs in the background of your terminal. It does not receive input or display output.'",
+ "status": "live",
+ "source_url": "https://docs.docker.com/engine/containers/run/",
+ "sub_theme": "canonical detach pattern"
+ },
+ {
+ "name": "Docker: docker build --progress controls output format (auto/plain/tty/quiet/rawjson)",
+ "what": "docker build supports five progress output modes. 'auto' (default) selects tty for terminals, plain otherwise. 'tty' uses color and dynamic redrawing with progress bars. 'plain' prints raw build progress as plaintext with step numbers and timing. 'quiet' suppresses all output except final image ID. 'rawjson' outputs JSON lines for programmatic consumption. This demonstrates adaptive output based on terminal capabilities.",
+ "evidence": "Docker docs: '--progress: Set type of progress output (auto, plain, tty, quiet, rawjson). Use plain to show container output (default \"auto\").' 'auto: Uses tty mode if the client is a TTY, otherwise uses plain.'",
+ "status": "live",
+ "source_url": "https://docs.docker.com/reference/cli/docker/buildx/build/",
+ "sub_theme": "adaptive output format"
+ },
+ {
+ "name": "Pattern: CLIs that offer both sync and async for the same operation",
+ "what": "Multiple CLIs implement dual-mode patterns for the same operation: (1) Vercel: vercel deploy blocks by default, --no-wait returns immediately. (2) Railway: railway up attaches by default, --detach returns immediately. (3) Fly.io: fly deploy monitors by default, --detach returns immediately (though may be buggy). (4) Docker: docker run attaches by default, -d detaches. (5) AWS CloudFormation: create-stack returns immediately, 'wait stack-create-complete' blocks. The dominant pattern is block-by-default with opt-out, except AWS which is async-by-default with opt-in waiting.",
+ "evidence": "Compiled from official documentation for each CLI. Vercel: --no-wait flag. Railway: --detach flag. Fly.io: --detach flag. Docker: -d flag. AWS: separate wait subcommand.",
+ "status": "live",
+ "source_url": "https://vercel.com/docs/cli/deploy",
+ "sub_theme": "cross-CLI pattern"
+ },
+ {
+ "name": "Pattern: What happens when the terminal is closed mid-operation",
+ "what": "Server-side operations continue regardless of terminal state: Heroku builds continue after Ctrl+C. GitHub Actions workflows run independently of gh run watch. AWS CloudFormation stacks deploy independently of the CLI. Fly.io deployments continue if detached. Client-side operations stop: Stripe listen/logs tail stop when the terminal closes (they are local processes maintaining connections). The key distinction is whether the operation is server-side (survives terminal close) or client-side (dies with terminal).",
+ "evidence": "Heroku docs explicitly state: 'Detaching doesn't cancel the build or the deploy.' GitHub Actions workflows are server-side by nature. Stripe listen requires an active terminal connection.",
+ "status": "live",
+ "source_url": "https://devcenter.heroku.com/articles/git",
+ "sub_theme": "terminal close behavior"
+ }
+ ],
+ "gaps": [
+ {
+ "description": "Exact terminal output of Vercel CLI during the build phase (between Inspect and Preview URLs) is not documented in detail -- only the final Inspect/Preview URLs with timing are shown in docs and blog posts.",
+ "attempted_sources": [
+ "https://vercel.com/docs/cli/deploy",
+ "https://www.alexchantastic.com/deploying-with-vercel-cli"
+ ]
+ },
+ {
+ "description": "Fly.io's exact monitoring output format (what the step-by-step progress looks like) is described only in community forum posts, not in official documentation.",
+ "attempted_sources": [
+ "https://fly.io/docs/flyctl/deploy/",
+ "https://community.fly.io/t/how-fly-deploy-detach-works/25607"
+ ]
+ },
+ {
+ "description": "Railway CLI's exact terminal output format during railway up (what the progress looks like) is not shown in the docs -- docs say 'real-time deployment logs' but don't show example output.",
+ "attempted_sources": ["https://docs.railway.com/cli/deploying"]
+ },
+ {
+ "description": "Whether Stripe CLI has any reconnection/retry logic when the listen or logs tail connection drops is not documented.",
+ "attempted_sources": ["https://docs.stripe.com/stripe-cli/use-cli"]
+ },
+ {
+ "description": "How Heroku users check the status of a build that was detached via Ctrl+C -- heroku releases and heroku ps are documented separately but not in the context of post-detach monitoring.",
+ "attempted_sources": [
+ "https://devcenter.heroku.com/articles/git",
+ "https://devcenter.heroku.com/articles/releases"
+ ]
+ }
+ ]
+}
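The trigger-then-wait split documented in this file (AWS waiters, `gh run watch`) reduces to a generic poll loop. A minimal POSIX-shell sketch follows; `poll_until_done` is a hypothetical helper, not any vendor's implementation, and the interval and attempt-limit defaults mirror the gh (3s) and AWS CloudFormation (120 checks, exit 255) values quoted above:

```shell
# Block until a check command succeeds, AWS-waiter style.
# Exits 255 after max_attempts failed checks, like CloudFormation waiters.
poll_until_done() {
  check_cmd=$1
  interval=${2:-3}        # seconds between checks (gh run watch default: 3)
  max_attempts=${3:-120}  # AWS CloudFormation waiter default
  attempt=0
  until eval "$check_cmd"; do
    attempt=$((attempt + 1))
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 255
    fi
    sleep "$interval"
  done
}
```

Composing a notification on top works exactly as in the chained example above, e.g. `poll_until_done my_check && notify-send "done"`.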
diff --git a/research/async-cli/findings/wave1-notifications.json b/research/async-cli/findings/wave1-notifications.json
new file mode 100644
index 00000000..62217a8b
--- /dev/null
+++ b/research/async-cli/findings/wave1-notifications.json
@@ -0,0 +1,136 @@
+{
+ "agent_question": "CLI task completion notification patterns",
+ "findings": [
+ {
+ "name": "Terminal bell (\\a / BEL)",
+ "what": "The simplest, most universal notification: print \\a (ASCII 7) to stdout. Modern terminals map this to OS-native behavior — visual flash, taskbar urgency hint, or native notification depending on config. Zero dependencies. Works over SSH. Warp disables audible bell by default but supports it. iTerm2, Ghostty, Alacritty, Windows Terminal, and foot all support it. Claude Code uses `preferredNotifChannel: terminal_bell` as its primary mechanism.",
+ "evidence": "Claude Code ships terminal_bell as a configurable notification channel. Warp docs show audible bell as a separate setting from desktop notifications. The muxup.com article demonstrates mapping BEL to urgency hints in tiling WMs. Ghostty maps BEL to macOS native notifications when unfocused.",
+ "status": "live",
+ "source_url": "https://muxup.com/2023q4/let-the-terminal-bells-ring-out",
+ "sub_theme": "zero-dependency"
+ },
+ {
+ "name": "OSC 9 / OSC 777 escape sequences",
+ "what": "Terminal escape sequences that trigger native OS desktop notifications without any external dependency. OSC 9 (iTerm2/Windows Terminal format): `\\033]9;Message\\007`. OSC 777 (rxvt-unicode/VSCode format): `\\033]777;notify;Title;Message\\007`. Works over SSH when terminal forwards escape sequences. VSCode integrated terminal forwards these from remote hosts.",
+ "evidence": "Claude Code hooks use both OSC 9 and OSC 777 for desktop notifications. Ghostty, iTerm2, Windows Terminal, VSCode integrated terminal, and foot terminal all support one or both. A VSCode extension (terminal-osc-notifier) exists specifically for this.",
+ "status": "live",
+ "source_url": "https://github.com/ghostty-org/ghostty/discussions/3555",
+ "sub_theme": "zero-dependency"
+ },
+ {
+ "name": "Terminal-native command completion notifications (iTerm2, Warp, Ghostty)",
+ "what": "Modern terminals have built-in 'notify when command finishes' features that require shell integration (OSC 133 marks) but no app-side code. iTerm2: Cmd+Alt+A toggles 'Alert on next mark'. Warp: configurable threshold (e.g. 5s), only fires when app unfocused. Ghostty (1.3+): `notify-on-command-finish = unfocused | always | never` with `notify-on-command-finish-after = 5s`. This is the gold-standard UX: the terminal handles everything.",
+ "evidence": "Ghostty issue #8991 and discussion #3555 document the feature. Warp docs at docs.warp.dev/terminal/more-features/notifications. iTerm2 shell integration docs.",
+ "status": "live",
+ "source_url": "https://docs.warp.dev/terminal/more-features/notifications",
+ "sub_theme": "terminal-native"
+ },
+ {
+ "name": "gh run watch (GitHub CLI polling pattern)",
+ "what": "GitHub CLI's `gh run watch` polls a CI run and streams progress with a live-updating table, then exits 0/1 on completion. It does NOT send any notification itself — the user chains it: `gh run watch && notify-send 'done'`. This is the canonical 'poll-then-exit' pattern: the CLI's job is to block until done, and the user composes notification on top.",
+ "evidence": "gh run watch docs at cli.github.com. Stuart Leeks blog post shows chaining with Windows toast notifications. GitHub blog post from 2021 introducing the feature.",
+ "status": "live",
+ "source_url": "https://cli.github.com/manual/gh_run_watch",
+ "sub_theme": "poll-then-exit"
+ },
+ {
+ "name": "node-notifier (cross-platform OS notifications from Node.js)",
+ "what": "Node.js library that sends native desktop notifications on macOS (Notification Center), Windows (Toaster/Balloons), and Linux (notify-send). Supports title, message, icon, sound, click-to-open URL. Has a separate CLI package (node-notifier-cli). 7M+ weekly npm downloads. This is the go-to for Node.js CLIs that want desktop notifications.",
+ "evidence": "npm page shows massive adoption. CLI variant allows `notify -t 'Title' -m 'Message' -s`. Used by webpack, Jest, and many build tools for completion notifications.",
+ "status": "live",
+ "source_url": "https://www.npmjs.com/package/node-notifier",
+ "sub_theme": "os-notification-library"
+ },
+ {
+ "name": "terminal-notifier (macOS-specific)",
+ "what": "macOS-only CLI tool for sending native Notification Center notifications. Simpler and more reliable than node-notifier on macOS since it's a native binary. Commonly used: `terminal-notifier -title 'Build' -message 'Done'`. Claude Code users frequently configure it as a hook.",
+ "evidence": "GitHub repo julienXX/terminal-notifier. Multiple blog posts show Claude Code hook integration. Andrea Grandi's blog post on using it with Claude Code.",
+ "status": "live",
+ "source_url": "https://github.com/julienXX/terminal-notifier",
+ "sub_theme": "os-notification-library"
+ },
+ {
+ "name": "notify-send (Linux)",
+ "what": "Linux-native CLI for desktop notifications via D-Bus/libnotify. Pre-installed on most desktop Linux distros. Usage: `notify-send 'Title' 'Message'`. The canonical Linux approach — gh run watch docs suggest chaining with it.",
+ "evidence": "Used in gh run watch examples. opensource.com article on Linux desktop notifications from terminal. undistract-me uses it internally.",
+ "status": "live",
+ "source_url": "https://opensource.com/article/22/1/linux-desktop-notifications",
+ "sub_theme": "os-notification-library"
+ },
+ {
+ "name": "ntfy.sh (HTTP push notifications)",
+ "what": "Self-hostable push notification service. Send notifications via simple HTTP PUT/POST: `curl -d 'Build done' ntfy.sh/mytopic`. Phone app subscribes to topics. CLI supports `ntfy publish --wait-cmd <topic> <command>` to auto-notify on completion, and `--wait-pid` to watch an already-running process. Free tier available. Great for remote/headless servers.",
+ "evidence": "ntfy.sh docs show CLI integration patterns. Integrations page lists Ansible, cron, systemd, and many CI tools. Self-hosting is well-documented.",
+ "status": "live",
+ "source_url": "https://ntfy.sh/",
+ "sub_theme": "push-notification-service"
+ },
+ {
+ "name": "dschep/ntfy (automatic shell notifications)",
+ "what": "Python utility that auto-detects long-running commands (>10s default) and sends desktop notifications when they finish. Shell integration for bash/zsh hooks into PROMPT_COMMAND / precmd. Supports backends: desktop (dbus), Pushover, Pushbullet, Slack, Telegram. Only notifies when terminal is NOT focused. The most 'magic' solution — requires no per-command opt-in.",
+ "evidence": "GitHub repo dschep/ntfy. auto-ntfy-done.sh provides automatic shell integration. Configurable via LONG_RUNNING_COMMAND_TIMEOUT.",
+ "status": "live",
+ "source_url": "https://github.com/dschep/ntfy",
+ "sub_theme": "automatic-shell-integration"
+ },
+ {
+ "name": "undistract-me (Ubuntu/Debian automatic notifications)",
+ "what": "Shell functions that hook into bash/zsh execution cycle. Auto-notifies for any command >10s when the terminal window is not focused. Configurable threshold via LONG_RUNNING_COMMAND_TIMEOUT. Can play sound via UDM_PLAY_SOUND. Ignore list for commands like `vim`. Pre-packaged in Ubuntu/Debian repos.",
+ "evidence": "GitHub repo jml/undistract-me. Available in Ubuntu repos (`apt install undistract-me`). Configurable ignore list and timeout.",
+ "status": "live",
+ "source_url": "https://github.com/jml/undistract-me",
+ "sub_theme": "automatic-shell-integration"
+ },
+ {
+ "name": "zsh-notify (zsh plugin)",
+ "what": "Zsh-specific plugin that sends desktop notifications for long-running commands. Uses macOS Notification Center or notify-send on Linux. Distinguishes success/failure with different notification styles. Threshold-based (default 30s).",
+ "evidence": "GitHub repo marzocchi/zsh-notify. Active maintenance.",
+ "status": "live",
+ "source_url": "https://github.com/marzocchi/zsh-notify",
+ "sub_theme": "automatic-shell-integration"
+ },
+ {
+ "name": "Slack/Discord webhook notifications",
+ "what": "CLIs that POST to Slack/Discord webhooks on completion. cli-notify-slack wraps any command and sends Slack notification with user, hostname, command, output, and exit status. notify_slack pipes stdout to Slack at configurable intervals. Pattern: set SLACK_TOKEN + SLACK_CHANNEL env vars, then `cli-notify-slack <command>`.",
+ "evidence": "GitHub repos EntilZha/cli-notify-slack and catatsuy/notify_slack. Kubernetes deployment pipelines commonly post to Slack on completion.",
+ "status": "live",
+ "source_url": "https://github.com/EntilZha/cli-notify-slack",
+ "sub_theme": "team-notification"
+ },
+ {
+ "name": "Claude Code hooks system (reference implementation)",
+ "what": "Claude Code implements a multi-level notification system: (1) terminal bell as default, (2) OSC 9/777 for desktop notifications, (3) configurable hooks that run shell commands on 'Stop' event. Users configure via `claude config set --global preferredNotifChannel terminal_bell`. Hook scripts can call terminal-notifier, notify-send, or ntfy.sh. This is the closest prior art to what Vana Connect CLI needs.",
+ "evidence": "Multiple blog posts documenting setup. GitHub gist by michael-swann-rp shows complete multi-level notification config. Claude Code docs at code.claude.com/docs/en/hooks-guide.",
+ "status": "live",
+ "source_url": "https://code.claude.com/docs/en/hooks-guide",
+ "sub_theme": "reference-implementation"
+ },
+ {
+ "name": "Webhook/callback URL pattern",
+ "what": "Some async APIs accept a callback URL in the initial request and POST results there on completion. Not common in CLI tools directly, but the pattern is well-established in API design (Stripe, Twilio, etc.). A CLI could accept --webhook-url and POST on completion. This is heavyweight for most CLI use cases but valuable for CI/CD integration.",
+ "evidence": "Hookdeck guide on webhooks vs callbacks. Standard in cloud APIs but rarely seen in CLI tools themselves.",
+ "status": "pattern",
+ "source_url": "https://hookdeck.com/webhooks/guides/webhooks-callbacks",
+ "sub_theme": "machine-to-machine"
+ }
+ ],
+ "synthesis": {
+ "recommendation_for_1_10_min_tasks": [
+ "Tier 1 (ship immediately): Terminal bell (\\a) — zero deps, works everywhere, users already configure their terminals to handle it. Print \\a when task completes.",
+ "Tier 2 (ship immediately): OSC 9/777 escape sequences — zero deps, native desktop notifications in modern terminals (Ghostty, iTerm2, VSCode, Windows Terminal). Detect terminal and emit the right one.",
+ "Tier 3 (consider): node-notifier — cross-platform desktop notifications for users whose terminals don't support OSC. Adds a dependency but is battle-tested (7M+ weekly downloads).",
+ "Tier 4 (future): ntfy.sh integration via --notify-url flag — for headless/remote/CI scenarios. Simple HTTP POST, phone notifications.",
+ "Tier 5 (future): --webhook-url flag — for CI/CD pipelines and programmatic consumers."
+ ],
+ "what_users_actually_do": "Based on the evidence: (1) Power users configure their terminals (Ghostty/Warp/iTerm2) to auto-notify on long commands — they need nothing from the CLI. (2) Most users want desktop notifications — terminal bell mapped to OS notification is the path of least resistance. (3) The 'poll-then-exit' pattern (like gh run watch) lets users compose their own notification via shell chaining (`vana collect && notify-send done`). (4) Very few users set up Slack/webhook notifications for CLI tools — this is mainly for CI/CD.",
+ "key_insight": "The terminal bell (\\a) is massively underrated. Modern terminals (Ghostty, Warp, iTerm2, Windows Terminal) all map BEL to native OS notifications when the window is unfocused. By just printing \\a, you get desktop notifications for free in most modern terminal setups with zero dependencies. OSC 9/777 is the upgrade path for richer notifications (title + body)."
+ },
+ "gaps": [
+ "No quantitative data on what percentage of users have terminal bell mapped to notifications vs. disabled",
+ "No survey data on CLI notification preferences (Slack vs desktop vs push vs email)",
+ "No data on how many terminal users have shell integration (OSC 133) enabled, which is required for terminal-native auto-notifications",
+ "Email notification for CLI completion: found no CLI tools that implement this — it appears to be a dead pattern for CLI tools (exists only in web-based CI/CD like GitHub Actions)",
+ "No data on notification fatigue — at what frequency do notifications become annoying rather than helpful",
+ "Windows Terminal notification support for OSC 9 is documented but real-world adoption/reliability is unclear"
+ ]
+}
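The two zero-dependency channels in the findings above (BEL and OSC 9) can be combined in a few lines of shell. This is a sketch, not any tool's actual implementation; `notify_done` is a hypothetical helper, and terminals that do not recognize OSC 9 simply ignore the sequence while still honoring the bell:

```shell
# Emit a completion notification with zero dependencies.
notify_done() {
  msg=${1:-Task complete}
  printf '\033]9;%s\007' "$msg"  # OSC 9: native desktop notification (iTerm2, Windows Terminal, Ghostty)
  printf '\a'                    # BEL: flash, urgency hint, or notification per terminal config
}
```

Typical use mirrors the chaining pattern above: `long_task; notify_done "long_task: exit $?"`.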
diff --git a/research/async-cli/findings/wave1-scheduled.json b/research/async-cli/findings/wave1-scheduled.json
new file mode 100644
index 00000000..f58b7d74
--- /dev/null
+++ b/research/async-cli/findings/wave1-scheduled.json
@@ -0,0 +1,109 @@
+{
+ "agent_question": "CLI scheduled and recurring task patterns",
+ "findings": [
+ {
+ "name": "Heroku Scheduler",
+ "what": "Free add-on that runs one-off dyno commands at fixed intervals (every 10 min, hourly, or daily). Configuration is done via a web dashboard opened with `heroku addons:open scheduler`—there is no declarative config file or CLI flag for defining schedules. State lives entirely in Heroku's cloud. Monitoring is via `heroku logs --ps scheduler.1`. No built-in retry, no failure notifications, and execution is best-effort ('known to occasionally miss'). For reliable scheduling, Heroku recommends a custom clock process instead. Key limitation: only 3 interval choices, no cron expressions, dashboard-only config.",
+ "evidence": "heroku addons:create scheduler:standard; heroku addons:open scheduler; heroku logs --ps scheduler.1. Task definition: Task: rake update_feed, Frequency: Hourly, Time: :30. Jobs execute as one-off dynos named scheduler.X visible via heroku ps.",
+ "status": "live",
+ "source_url": "https://devcenter.heroku.com/articles/scheduler",
+ "sub_theme": "cloud-managed scheduler"
+ },
+ {
+ "name": "GitHub Actions scheduled workflows (gh CLI)",
+ "what": "GitHub Actions supports `schedule` triggers using POSIX cron syntax in workflow YAML files. The `gh` CLI does not create schedules directly—users write `on: schedule: - cron: '...'` in .github/workflows/*.yml. The CLI manages workflows after creation: `gh workflow list/view/run/enable/disable` and `gh run list -e schedule` to filter scheduled runs. State lives in GitHub's cloud. Minimum interval: every 5 minutes. UTC only. Scheduled workflows on public repos auto-disable after 60 days of inactivity. Runs can be delayed during high load. Failures show as failed runs in `gh run list -s failure`.",
+ "evidence": "on: schedule: - cron: '30 5 * * 1,3'. gh workflow list; gh workflow run <workflow>; gh run list -e schedule -s completed. 'The shortest interval you can run scheduled workflows is once every 5 minutes.' 'In a public repository, scheduled workflows are automatically disabled when no repository activity has occurred in 60 days.'",
+ "status": "live",
+ "source_url": "https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#schedule",
+ "sub_theme": "cloud-managed scheduler"
+ },
+ {
+ "name": "Kubernetes CronJobs (kubectl)",
+ "what": "First-class Kubernetes resource for recurring container execution using standard cron syntax. Created via `kubectl create cronjob <name> --image=<image> --schedule='...'` or declarative YAML. Rich configuration surface: concurrencyPolicy (Allow/Forbid/Replace), startingDeadlineSeconds, successfulJobsHistoryLimit, failedJobsHistoryLimit, backoffLimit, timeZone, suspend. State lives in etcd (cluster state). Monitoring via `kubectl get cronjobs`, `kubectl describe cronjob`, `kubectl logs`. Suspend/resume via `kubectl patch cronjob <name> -p '{\"spec\":{\"suspend\":true}}'`. The most complete scheduling model of any CLI tool surveyed—covers concurrency, deadlines, history retention, timezone, and suspension.",
+ "evidence": "kubectl create cronjob hello --image=busybox:1.28 --schedule='* * * * *'. YAML spec fields: schedule, concurrencyPolicy (Allow|Forbid|Replace), startingDeadlineSeconds, successfulJobsHistoryLimit (default 3), failedJobsHistoryLimit (default 1), suspend, timeZone. Macros: @yearly, @monthly, @weekly, @daily, @hourly.",
+ "status": "live",
+ "source_url": "https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/",
+ "sub_theme": "infrastructure scheduler"
+ },
+ {
+ "name": "Vercel Cron Jobs",
+ "what": "Cron jobs defined in vercel.json that trigger HTTP GET requests to Vercel Functions at the production URL. Configuration is declarative in vercel.json `crons` array with `path` and `schedule` fields. No CLI commands for managing crons—changes require editing vercel.json and redeploying. State lives in Vercel's cloud. Monitoring via dashboard Settings > Cron Jobs and runtime logs. No automatic retry on failure. Hobby tier: once per day only with hourly precision (±59 min). Pro/Enterprise: per-minute precision. 100 cron jobs per project on all plans. Security via CRON_SECRET env var. Duplicate invocations possible—idempotency recommended. No concurrency control built in (suggests Redis locks).",
+ "evidence": "vercel.json: {\"crons\": [{\"path\": \"/api/cron\", \"schedule\": \"0 5 * * *\"}]}. Hobby: once/day, ±59min precision. Pro: once/min, per-minute precision. 100 cron jobs per project. 'Vercel will not retry an invocation if a cron job fails.' User-Agent: vercel-cron/1.0.",
+ "status": "live",
+ "source_url": "https://vercel.com/docs/cron-jobs",
+ "sub_theme": "cloud-managed scheduler"
+ },
+ {
+ "name": "PM2 (cron_restart + startup)",
+ "what": "Node.js process manager with daemon mode, cron-based restart scheduling, and OS init system integration. Scheduling via `pm2 start app.js --cron-restart='0 0 * * *'` or ecosystem.config.js `cron_restart` field. PM2 runs as a persistent daemon—not pure cron. `pm2 startup` generates init scripts for systemd/launchd/upstart/openrc, and `pm2 save` persists the process list for resurrection on reboot. Monitoring via `pm2 monit`, `pm2 logs`, `pm2 list`. Failure handling via exponential backoff restart delay, max_memory_restart, and stop_exit_codes. Key distinction: PM2's cron_restart restarts an already-running daemon on a schedule, rather than running a one-shot task.",
+ "evidence": "pm2 start app.js --cron-restart='0 0 * * *'. ecosystem.config.js: { cron_restart: '0 0 * * *' }. pm2 startup (detects systemd/launchd/upstart). pm2 save (persists process list). --exp-backoff-restart-delay=100 for exponential backoff.",
+ "status": "live",
+ "source_url": "https://pm2.keymetrics.io/docs/usage/restart-strategies/",
+ "sub_theme": "daemon process manager"
+ },
+ {
+ "name": "Homebrew Services (brew services)",
+ "what": "CLI that manages OS-native services (launchd on macOS, systemd on Linux). Commands: `brew services start/stop/restart/run/list/info/kill/cleanup`. `start` both launches and registers for auto-start at login/boot. `run` launches without auto-start registration. Automatically generates plist (macOS) or systemd unit files. Service files live in ~/Library/LaunchAgents or ~/.config/systemd/user (user-level) or system directories (root). Monitoring via `brew services list --json` and OS-level tools (journalctl, launchctl). No built-in cron/scheduling syntax—it delegates to the formula's service definition. Mainly manages long-running daemons, not one-shot scheduled tasks.",
+ "evidence": "brew services start <formula>; brew services list --json; brew services run <formula> (no auto-start). macOS: ~/Library/LaunchAgents. Linux: ~/.config/systemd/user. --file for custom service files. brew services cleanup removes unused definitions.",
+ "status": "live",
+ "source_url": "https://docs.brew.sh/Manpage#services-subcommand",
+ "sub_theme": "OS service integration"
+ },
+ {
+ "name": "Whenever (Ruby crontab DSL)",
+ "what": "Ruby gem that provides a human-readable DSL for defining cron jobs in config/schedule.rb, then writes them directly to the system crontab. CLI: `whenever` (preview cron output), `whenever --update-crontab` (write to crontab), `whenever --clear-crontab` (remove entries). DSL supports `every 3.hours do`, `every :sunday, at: '12pm'`, `every 1.day, at: '4:30 am'`. Three job types: runner (Rails method), rake (Rake task), command (shell). Uses identifier-based namespacing so multiple apps can share a crontab without conflicts. Integrates with Capistrano for deploy-time crontab updates. State is the system crontab itself. No daemon—pure cron. No built-in failure handling or monitoring.",
+ "evidence": "schedule.rb: every 3.hours do; runner 'MyModel.some_process'; end. CLI: whenever --update-crontab --user app. Custom job types: job_type :awesome, '/usr/local/bin/awesome :task :fun_level'. Capistrano: require 'whenever/capistrano'.",
+ "status": "live",
+ "source_url": "https://github.com/javan/whenever",
+ "sub_theme": "crontab wrapper/generator"
+ },
+ {
+ "name": "Supercronic (container cron)",
+ "what": "Crontab-compatible job runner designed for containers. Solves cron-in-Docker problems: preserves environment variables (traditional cron purges them), logs to stdout/stderr (not syslog), handles SIGTERM gracefully, prevents concurrent execution by default. Reads a standard crontab file. Supports second-level precision. CLI flags: -debug, -split-logs, -test (validate without running), -overlapping (allow concurrent), -inotify (auto-reload on file change), -sentry-dsn (error tracking). No daemon management—runs as PID 1 in a container. Key differentiator: drop-in cron replacement that works correctly in containerized environments.",
+ "evidence": "supercronic /path/to/crontab. Flags: -test (validate syntax), -overlapping (allow parallel), -inotify (reload on file change), -sentry-dsn (Sentry integration), -split-logs (stdout vs stderr). 'When a job exceeds its scheduled interval, the system logs a warning but delays the next execution until the current one completes.'",
+ "status": "live",
+ "source_url": "https://github.com/aptible/supercronic",
+ "sub_theme": "container-native cron"
+ },
+ {
+ "name": "Jobber (cron alternative with error handling)",
+ "what": "Go-based cron alternative with built-in job execution history, exponential backoff on failure, and configurable failure notifications. Jobs defined in a YAML-like jobfile. Three error handling modes: Stop (disable job after failure), Backoff (exponential delay before retry), Continue (keep running on schedule). Users can configure notifications on every failure or only when a job gets disabled. CLI for listing job status and history. Currently unmaintained (seeking new maintainer). Last release v1.4.4 in June 2020.",
+ "evidence": "Error handling: 'after an initial failure of a job, Jobber can schedule future runs using an exponential backoff algorithm.' Three onError modes: Stop, Backoff, Continue. Notification on failure or on disable. 444 commits, 1.4k stars.",
+ "status": "unmaintained",
+ "source_url": "https://github.com/dshearer/jobber",
+ "sub_theme": "cron alternative"
+ },
+ {
+ "name": "Dkron (distributed cron)",
+ "what": "Distributed, fault-tolerant job scheduling system built in Go using Raft consensus and Serf gossip protocol. Provides web UI, REST API, and CLI for job management. Eliminates single points of failure by distributing scheduling across multiple nodes. Scales to thousands of nodes. Deployed via Docker Compose with `docker compose up -d --scale dkron-server=4 --scale dkron-agent=10`. Web UI at port 8080. Primarily designed for infrastructure teams, not end-user CLIs. 4.7k GitHub stars.",
+ "evidence": "docker compose up -d --scale dkron-server=4 --scale dkron-agent=10. Web UI: http://localhost:8080/ui. Built on Raft consensus + Serf gossip protocol. 'Able to handle high volumes of scheduled jobs and thousands of nodes.'",
+ "status": "live",
+ "source_url": "https://github.com/distribworks/dkron",
+ "sub_theme": "distributed scheduler"
+ },
+ {
+ "name": "Docker restart policies (not true scheduling)",
+ "what": "Docker provides --restart flags (no, on-failure[:max-retries], always, unless-stopped) for automatic container restart, but these are recovery mechanisms, not schedulers. An increasing delay (doubling from 100ms, max 1 min) is added before each restart, resetting after 10 seconds of successful runtime. For actual scheduled/recurring container execution, Docker relies on external tools (host cron, Kubernetes CronJobs, Supercronic inside containers). Health checks (--health-cmd, --health-interval, --health-retries) complement restart policies but don't provide scheduling.",
+ "evidence": "docker run --restart=on-failure:10 redis. 'An increasing delay (double the previous delay, starting at 100 milliseconds) is added before each restart.' Health checks: --health-cmd, --health-interval, --health-timeout, --health-retries.",
+ "status": "live",
+ "source_url": "https://docs.docker.com/reference/cli/docker/container/run/",
+ "sub_theme": "container restart (not scheduling)"
+ },
+ {
+ "name": "Cronie (modern cron daemon)",
+ "what": "Modern implementation of the standard UNIX cron daemon (evolved from vixie-cron). Provides `crontab` CLI for editing/viewing jobs with syntax validation and automatic backup, plus `cronnext` utility for querying upcoming execution times. Adds PAM/SELinux integration, tilde (~) operator for randomization within ranges, support for up to 10,000 entries per user, and NO_MAIL_OUTPUT env var. Not a scheduling CLI itself but the underlying daemon that crontab-wrapper CLIs (like Whenever) write to.",
+ "evidence": "cronnext utility for querying upcoming job execution times. Tilde operator (~) for randomization within ranges. Up to 10,000 crontab entries per user. PAM and SELinux integration. 56 contributors, 611 commits.",
+ "status": "live",
+ "source_url": "https://github.com/cronie-crond/cronie",
+ "sub_theme": "cron daemon"
+ }
+ ],
+ "gaps": [
+ "Linear CLI: could not access Linear docs on cycles/automation (404 errors). Linear has cycles and auto-close/auto-archive workflows but unclear if these are exposed via any CLI tool.",
+ "Jobber jobfile format: detailed YAML configuration docs were inaccessible (readthedocs 404). The error handling model (Stop/Backoff/Continue with notifications) is confirmed from the GitHub README but specific YAML syntax examples were not retrieved.",
+ "OpenClaw/NanoClaw: no relevant results found. The agent framework landscape for cron-scheduled tasks was not explored in this wave.",
+ "systemd timers as a scheduling mechanism: not researched as a standalone pattern. systemd timers are an alternative to cron with richer features (OnCalendar, Persistent=true for catching up missed runs, RandomizedDelaySec). Would be worth a dedicated investigation.",
+ "Temporal / Inngest / Trigger.dev: cloud-based workflow engines with scheduled triggers that could be relevant as 'schedule from CLI, runs in cloud' patterns. Not covered in this wave.",
+ "macOS launchd scheduled agents: launchd plist files with StartCalendarInterval provide native macOS scheduling without cron. Homebrew Services generates these but the raw launchd scheduling model was not deeply explored."
+ ]
+}
diff --git a/research/async-cli/findings/wave1-sync-async.json b/research/async-cli/findings/wave1-sync-async.json
new file mode 100644
index 00000000..e59e42a8
--- /dev/null
+++ b/research/async-cli/findings/wave1-sync-async.json
@@ -0,0 +1,116 @@
+{
+ "agent_question": "Sync vs async mode patterns in CLIs",
+ "findings": [
+ {
+ "name": "Docker: foreground by default, -d to detach",
+ "what": "docker run blocks in foreground by default, attaching stdin/stdout/stderr to the terminal. The -d (--detach) flag runs the container in the background and prints only the container ID. Rationale: foreground is the safe, observable default — you see errors immediately. Detached mode is for production/daemon use cases.",
+ "evidence": "Docker docs: 'By default, Docker runs the container in attached mode.' The -d flag 'means that a Docker container runs in the background of your terminal. It does not receive input or display output.' After -d, the user sees only the container ID hash.",
+ "status": "live",
+ "source_url": "https://docs.docker.com/engine/containers/run/",
+ "sub_theme": "foreground-default-detach-opt-in"
+ },
+ {
+ "name": "Vercel: blocks by default, --no-wait to skip",
+ "what": "vercel deploy blocks by default, waiting for the deployment to finish. stdout is always the deployment URL. The --no-wait flag 'does not wait for a deployment to finish before exiting.' The --logs flag additionally streams build logs. Default shows a progress indicator and prints the URL on completion.",
+ "evidence": "Vercel CLI docs for deploy: '--no-wait: Does not wait for a deployment to finish before exiting from the deploy command.' Separate --logs flag prints build logs. CI/CD scripts capture the deployment URL from stdout.",
+ "status": "live",
+ "source_url": "https://vercel.com/docs/cli/deploy",
+ "sub_theme": "foreground-default-no-wait-opt-in"
+ },
+ {
+ "name": "AWS CloudFormation: deploy blocks, no official --no-wait",
+ "what": "aws cloudformation deploy blocks by default, waiting for the stack operation to complete. There is NO official --no-wait flag despite long-standing feature requests (since 2014). Workaround: use create-stack/update-stack (which return immediately) + custom polling with describe-stacks. The --no-execute-changeset flag is related but different (previews without executing).",
+ "evidence": "GitHub issue #895 (2014) requests synchronous create/update. Issue #12037 on serverless framework requests async deploys. AWS has not shipped --no-wait for deploy as of 2026.",
+ "status": "live",
+ "source_url": "https://github.com/aws/aws-cli/issues/895",
+ "sub_theme": "foreground-default-no-async-escape"
+ },
+ {
+ "name": "Terraform: apply blocks, no CLI-level async flag",
+ "what": "terraform apply blocks by default, waiting for all resources to reach their expected lifecycle state. There is no --async or --background CLI flag. Some providers (e.g. OCI) support an async=true resource-level attribute that skips waiting per-resource, but this is provider-specific, not a terraform core feature. After async resource creation, state has nulls; you must run terraform refresh.",
+ "evidence": "Terraform docs: 'By default, when Terraform creates, updates, or deletes a resource it waits for that resource to reach its expected lifecycle state before proceeding.' OCI provider supports async=true at resource level only.",
+ "status": "live",
+ "source_url": "https://developer.hashicorp.com/terraform/cli/commands/apply",
+ "sub_theme": "foreground-default-no-async-escape"
+ },
+ {
+ "name": "Fly.io: deploy blocks with live monitoring, --detach to return immediately",
+ "what": "fly deploy blocks by default, showing live deployment progress as machines transition through states until healthy. The --detach flag causes the command to 'return immediately instead of monitoring deployment progress.' Important: --detach also disables automatic rollback on failed health checks, which is poorly documented.",
+ "evidence": "Fly.io docs: '--detach: Return immediately instead of monitoring deployment progress.' Community discussion notes that --detach disabling rollbacks should be documented in CLI help.",
+ "status": "live",
+ "source_url": "https://fly.io/docs/flyctl/deploy/",
+ "sub_theme": "foreground-default-detach-opt-in"
+ },
+ {
+ "name": "Railway: deploy blocks with log streaming, -d/--detach to skip",
+ "what": "railway up blocks by default, streaming build logs to the terminal. The -d/--detach flag 'returns immediately after uploading, with the deployment continuing in the background.' Also offers -c/--ci which streams only build logs and exits when build (not deploy) completes — a middle ground between full sync and full detach.",
+ "evidence": "Railway docs: '--detach prevents attaching to the log stream.' The -c/--ci flag 'streams only build logs and exits when the build completes' — useful for CI/CD.",
+ "status": "live",
+ "source_url": "https://docs.railway.com/cli/deploying",
+ "sub_theme": "foreground-default-detach-opt-in"
+ },
+ {
+ "name": "Kubernetes: kubectl apply returns immediately (async default), rollout status blocks",
+ "what": "kubectl apply is async by default — it submits the manifest and returns immediately without waiting for pods to be ready. To block, you compose a second command: kubectl rollout status deployment/NAME, which watches until the rollout completes. kubectl wait provides generic condition-based waiting. This is the opposite pattern from most deploy CLIs.",
+ "evidence": "Kubernetes docs: 'kubectl apply doesn't have a --wait option, so the success of a deploy doesn't actually indicate the deploy succeeded, as the rollout happens asynchronously.' kubectl rollout status 'watches the status of the latest rollout until it's done.'",
+ "status": "live",
+ "source_url": "https://kubernetes.io/docs/reference/kubectl/generated/kubectl_rollout/kubectl_rollout_status/",
+ "sub_theme": "async-default-opt-in-wait"
+ },
+ {
+ "name": "Heroku: git push blocks but Ctrl+C detaches without canceling",
+ "what": "git push heroku main blocks by default, streaming build output. Pressing Ctrl+C detaches from the build stream but does NOT cancel the deploy — the build continues in the background. There is no explicit --detach or --no-wait flag; the escape hatch is just Ctrl+C.",
+ "evidence": "Heroku docs: 'After you initiate a Heroku deploy with git push, you can detach from the resulting build process by pressing Ctrl + C. Detaching doesn't cancel the build or the deploy.'",
+ "status": "live",
+ "source_url": "https://devcenter.heroku.com/articles/git",
+ "sub_theme": "foreground-default-implicit-detach"
+ },
+ {
+ "name": "npm/pnpm install: always blocks, no async mode",
+ "what": "npm install and pnpm install always block the terminal. There is no --background or --async flag. This makes sense because subsequent commands (build, test, run) depend on install completing. The user sees a progress bar/spinner and package resolution output.",
+ "evidence": "No --background, --async, or --no-wait flags exist in npm or pnpm CLI documentation. Install is a prerequisite step that must complete before anything else can run.",
+ "status": "live",
+ "source_url": "https://docs.npmjs.com/cli/commands/npm-install",
+ "sub_theme": "foreground-only-no-async"
+ },
+ {
+ "name": "GitHub CLI: trigger is async, watch is opt-in sync",
+ "what": "gh workflow run triggers a workflow and returns immediately (async). gh run watch blocks and polls every 3 seconds showing live progress until the run completes. gh run list shows a snapshot of recent runs. This is the two-command pattern: fire-and-forget trigger + optional blocking monitor.",
+ "evidence": "GitHub CLI docs: 'gh run watch watches a run until it completes, showing its progress.' The interface refreshes every 3 seconds. --compact shows only relevant/failed steps.",
+ "status": "live",
+ "source_url": "https://cli.github.com/manual/gh_run_watch",
+ "sub_theme": "async-default-opt-in-watch"
+ },
+ {
+ "name": "docker compose up: foreground default, -d for detach",
+ "what": "docker compose up blocks in the foreground by default, streaming logs from all services. The -d/--detach flag starts containers in the background. This mirrors docker run's pattern and is the most widely-known example of the --detach convention.",
+ "evidence": "Docker docs: 'Running docker compose up --detach starts the containers in the background and leaves them running.'",
+ "status": "live",
+ "source_url": "https://docs.docker.com/reference/cli/docker/compose/up/",
+ "sub_theme": "foreground-default-detach-opt-in"
+ },
+ {
+ "name": "CLI convention: flag naming consensus",
+ "what": "Three dominant flag names emerge: --detach/-d (Docker, Fly.io, Railway), --no-wait (Vercel, AWS requests), and --background (less common, mostly OS-level). --detach implies the process continues server-side. --no-wait implies you don't care about the outcome. --background implies a local background process. For deploy/connect operations that run server-side, --detach is the most conventional name.",
+ "evidence": "clig.dev: 'Use standard names for flags, if there is a standard. If another commonly used command uses a flag name, it's best to follow that existing pattern.'",
+ "status": "live",
+ "source_url": "https://clig.dev/",
+ "sub_theme": "naming-conventions"
+ }
+ ],
+ "synthesis": {
+ "dominant_pattern": "Foreground/blocking is the overwhelming default for deploy-like operations (Docker, Vercel, Fly.io, Railway, Heroku, Terraform, AWS CF). Only Kubernetes kubectl apply and GitHub CLI workflow triggers default to async.",
+ "why_sync_default": "Three reasons: (1) Users need immediate feedback on success/failure, (2) CLI commands often chain in scripts where order matters, (3) Errors surfaced synchronously are caught faster than errors discovered later.",
+ "why_async_exists": "Two reasons: (1) CI/CD pipelines where blocking wastes compute, (2) Long-running operations where the user wants their terminal back.",
+ "recommended_flag_name": "--detach or -d is the most conventional for server-side operations. --no-wait is the runner-up (Vercel). Avoid --background (implies local process) and --async (uncommon).",
+ "what_sync_shows": "Progress indicators, log streaming, final status. Vercel shows deployment URL on stdout. Docker streams container output. Fly.io shows machine health transitions.",
+ "what_async_returns": "A job/resource ID (Docker prints container ID), a URL (Vercel prints deployment URL), or nothing (Fly.io just returns). Best practice: always print an identifier the user can use to check status later.",
+ "recommendation_for_vana_connect": "Default to blocking/foreground for `vana connect github`. Users expect to see progress and know when it's done. Offer --detach to return immediately, printing a session ID or status URL. Consider Railway's middle-ground --ci flag pattern if there are distinct build vs. runtime phases."
+ },
+ "gaps": [
+ "Could not find detailed Vercel --no-wait user experience (what exactly prints after --no-wait — just the URL, or also a message?)",
+ "Heroku has no explicit async flag — only the Ctrl+C convention. Could not verify if newer Heroku CLI versions added one.",
+ "Did not research Render, Netlify, or Cloudflare Workers deploy CLIs which may have additional patterns.",
+ "Did not research long-running data pipeline CLIs (Spark, dbt) which may have different async conventions relevant to data connector use cases."
+ ]
+}
diff --git a/research/async-cli/findings/wave2-auth-expiry.json b/research/async-cli/findings/wave2-auth-expiry.json
new file mode 100644
index 00000000..99405813
--- /dev/null
+++ b/research/async-cli/findings/wave2-auth-expiry.json
@@ -0,0 +1,160 @@
+{
+ "agent_question": "Auth expiry and re-authentication patterns in data aggregation services",
+ "findings": [
+ {
+ "name": "Plaid Item health state model",
+ "what": "Plaid tracks Item health via a status object on /item/get containing last_successful_update and last_failed_update timestamps per product (transactions, investments). The update_type field indicates 'background' (automatic) vs 'user_present_required'. There is a consent_expiration_time field (ISO 8601) that is non-null for institutions enforcing consent expiry (common in EU, rare in US). Error states are surfaced via error objects with error_type, error_code, and error_code_reason. Key error codes: ITEM_LOGIN_REQUIRED (password changed, MFA expired, OAuth consent invalid), INVALID_CREDENTIALS, INVALID_MFA, ITEM_LOCKED (3-5 failed attempts), ACCESS_NOT_GRANTED, NO_ACCOUNTS, INSUFFICIENT_CREDENTIALS.",
+ "evidence": "Plaid /item/get API docs: status.transactions.last_successful_update, status.investments.last_failed_update, consent_expiration_time fields. Error codes documented at plaid.com/docs/errors/item/.",
+ "status": "live",
+ "source_url": "https://plaid.com/docs/api/items/",
+ "sub_theme": "connection_health_state_model"
+ },
+ {
+ "name": "Plaid webhook-driven expiry notification system",
+ "what": "Plaid sends proactive webhooks for connection health changes: (1) PENDING_DISCONNECT (US/CA) or PENDING_EXPIRATION (UK/EU) sent 7 days before consent expires, (2) ITEM:ERROR webhook when ITEM_LOGIN_REQUIRED occurs, (3) LOGIN_REPAIRED when an Item exits error state (even if repaired via another app), (4) NEW_ACCOUNTS_AVAILABLE when new shareable accounts detected, (5) USER_PERMISSION_REVOKED when user removes access. Developers must handle duplicate and out-of-order webhooks with idempotency.",
+ "evidence": "Plaid webhooks docs and Link update mode docs describe 7-day advance notice via PENDING_DISCONNECT/PENDING_EXPIRATION.",
+ "status": "live",
+ "source_url": "https://plaid.com/docs/api/webhooks/",
+ "sub_theme": "advance_warning_notifications"
+ },
+ {
+ "name": "Plaid Link update mode — abbreviated re-auth UX",
+ "what": "When an Item needs re-authentication, Plaid presents an abbreviated flow requesting only the minimum input needed to repair it. Example: if an OTP token expired, the user provides a new OTP without full re-login. Update mode can be triggered by access_token (single Item) or user_token (multi-Item). Successfully completing update mode resets consent_expiration_time as if the Item were newly created. Update mode also supports adding/removing accounts, granting new product permissions, and consent renewal.",
+ "evidence": "Plaid Link update mode docs: 'For most institutions, Plaid will present an abbreviated re-authentication flow requesting only the minimum user input required to repair the Item.'",
+ "status": "live",
+ "source_url": "https://plaid.com/docs/link/update-mode/",
+ "sub_theme": "reauth_ux_patterns"
+ },
+ {
+ "name": "Plaid consent expiry timelines vary by region and institution",
+ "what": "UK: 90-day consent cycle, 7-day advance notice, re-consent handled by TPP (Plaid) without bank redirect since FCA 2022 changes. EU: Extended from 90 to 180 days (July 2023 EBA change), but re-consent still requires bank authentication via TPP redirect. US: Some institutions (e.g. Bank of America) enforce 12-month consent expiry, but most US Items do not expire. OAuth-based connections are more likely to have consent expiry than credential-based ones.",
+ "evidence": "Plaid blog 'A more seamless way to reauthenticate in the UK': 7-day advance notification, TPP-managed re-consent. Plaid blog 'Now you can enjoy 180 days': EU 90->180 day extension, still requires bank auth. Plaid docs: consent_expiration_time is null for most US institutions.",
+ "status": "live",
+ "source_url": "https://plaid.com/blog/90-day-are-you-ready/",
+ "sub_theme": "consent_expiry_timelines"
+ },
+ {
+ "name": "Stripe Connect OAuth — tokens don't expire but can be revoked",
+ "what": "Stripe Connect Standard account access tokens do not expire. They remain valid until the user explicitly deauthorizes the connected account, which triggers an account.application.deauthorized webhook. Refresh tokens are rolled on every exchange, so the expiry is always 1 year from last use — effectively never expiring if used regularly. This is a fundamentally different model from credential-based connections: once authorized, connections persist indefinitely unless user-revoked. Stripe Apps (different context) use 1-hour access tokens with 1-year refresh tokens.",
+ "evidence": "Stripe Connect OAuth docs: 'the access token does not expire but may be revoked by the user at any time'. Refresh tokens roll on exchange, 1-year expiry from last use.",
+ "status": "live",
+ "source_url": "https://docs.stripe.com/connect/oauth-reference",
+ "sub_theme": "never_expire_model"
+ },
+ {
+ "name": "MX Platform — 13+ member connection statuses",
+ "what": "MX uses a granular connection_status enum for members (user-institution connections): CONNECTED (healthy), CHALLENGED (MFA required — security question or access code), DENIED (invalid credentials), DISCONNECTED (institution not updating data), PREVENTED (3 consecutive failed attempts, locked out), EXPIRED, LOCKED, IMPEDED, IMPAIRED, REJECTED, IMPORTED, DISABLED, DISCONTINUED, CLOSED. Terminal states requiring new aggregation and possible user input: PREVENTED, DENIED, IMPEDED, IMPAIRED, REJECTED, EXPIRED, LOCKED, IMPORTED, DISABLED, DISCONTINUED, CLOSED. CHALLENGED is an interactive state requiring MFA response.",
+ "evidence": "MX Academy 'Member Connection Statuses Overview': lists all statuses. CHALLENGED requires user to answer MFA. PREVENTED occurs after 3 consecutive failures.",
+ "status": "live",
+ "source_url": "https://academy.mx.com/hc/en-us/articles/4708854368525-Member-Connection-Statuses-Overview",
+ "sub_theme": "connection_health_state_model"
+ },
+ {
+ "name": "Google OAuth — multiple silent revocation triggers",
+ "what": "Google refresh tokens can be revoked/expired by: (1) 6-month inactivity — unused tokens auto-invalidate, (2) Password change — revokes tokens with Gmail scopes, (3) User-initiated removal from Google Account settings, (4) Testing-mode apps — 7-day automatic expiry, (5) 100-token-per-client ceiling — oldest tokens silently invalidated when exceeded, (6) Workspace admin policy changes restricting scopes or setting time-limited access, (7) Undocumented security heuristics. Access tokens expire after 1 hour (extendable to 12 hours). Mitigation: touch tokens every few days, persist rotated refresh tokens, monitor invalid_grant spikes, move to production status.",
+ "evidence": "Nango blog 'Google OAuth invalid_grant': documents all revocation triggers including 6-month inactivity, 100-token ceiling, testing-mode 7-day expiry. Google OAuth docs confirm 1-hour access token, 6-month inactivity rule.",
+ "status": "live",
+ "source_url": "https://nango.dev/blog/google-oauth-invalid-grant-token-has-been-expired-or-revoked",
+ "sub_theme": "silent_revocation_triggers"
+ },
+ {
+ "name": "Salesforce — configurable refresh token expiry policies",
+ "what": "Salesforce connected apps offer 4 refresh token policies: (1) Valid until revoked (default — indefinite), (2) Immediately expire refresh token, (3) Expire if not used for N time (inactivity-based, resets on use), (4) Expire after N time (absolute). Admins can change policies at any time, retroactively affecting existing tokens. Access token lifetime is determined by session timeout value. This represents a 'platform-configurable' model where the service provider (not the aggregator) controls token lifetime.",
+ "evidence": "Salesforce help docs: 'Manage OAuth Access Policies for a Connected App' describes all 4 policies. Default is 'valid until revoked'.",
+ "status": "live",
+ "source_url": "https://help.salesforce.com/s/articleView?id=sf.connected_app_manage_oauth.htm&language=en_US&type=5",
+ "sub_theme": "platform_controlled_expiry"
+ },
+ {
+ "name": "Strava OAuth — 6-hour access tokens with rotating refresh tokens",
+ "what": "Strava access tokens expire after 6 hours. Refresh tokens are rotated on every exchange — each refresh response includes a new refresh token, and the old one becomes invalid. Apps must persist the latest refresh token. During the transition window, both old and new access tokens work until the old one naturally expires. On 401 Unauthorized, apps should use the refresh token to obtain a new access token. No push notification for token expiry — apps must proactively check expiry before API calls. If a user's connection breaks (e.g., Garmin-Strava sync), the fix is typically disconnect and reconnect.",
+ "evidence": "Strava developer docs: 'Access tokens expire six hours after they are created'. 'Once a new refresh token code has been returned, the older code will no longer work.'",
+ "status": "live",
+ "source_url": "https://developers.strava.com/docs/authentication/",
+ "sub_theme": "short_lived_token_rotation"
+ },
+ {
+ "name": "Banking app session expiry UX patterns",
+ "what": "Common patterns for session expiry in banking/financial apps: (1) Pre-timeout warning — gentle nudge before auto-logout with option to extend, (2) Biometric re-authentication on mobile — Face ID/Touch ID for frictionless re-auth without full login, (3) Silent token renewal in background — refresh tokens used transparently, (4) Context-aware security — adjust auth requirements based on user behavior, location, device trust level, (5) Push notification for suspicious login attempts (approve/deny model). Data-sensitive apps (banking) tend to auto-logout after idle periods rather than silently renewing, prioritizing security over convenience.",
+ "evidence": "Smashing Magazine 'Rethinking Authentication UX', OneSignal banking push notification patterns, HID Global blog on UX sweet spot for online banking.",
+ "status": "live",
+ "source_url": "https://www.smashingmagazine.com/2022/08/authentication-ux-design-guidelines/",
+ "sub_theme": "session_expiry_ux"
+ },
+ {
+ "name": "Apple HealthKit — persistent permissions, no expiry",
+ "what": "HealthKit uses a persistent authorization model: once a user grants read/write access to health data types, the permission persists until explicitly revoked in Settings > Health > Data Access & Devices. There is no token expiry or re-consent cycle. Apps can become 'inactive' data sources after iOS upgrades (a known bug/behavior), requiring users to toggle permissions in Health settings. Apps should check authorization status on foreground entry (willEnterForeground notification) to detect user-initiated revocations. No push notification system for permission changes.",
+ "evidence": "Apple Developer docs on HealthKit authorization. Apple Community forums document inactive data source behavior after iOS upgrades.",
+ "status": "live",
+ "source_url": "https://developer.apple.com/documentation/healthkit/authorizing-access-to-health-data",
+ "sub_theme": "persistent_permission_model"
+ },
+ {
+ "name": "Password managers — expiry tracking is nascent",
+ "what": "1Password added item expiry dates (Q1 2025) viewable in Watchtower, but does not send push/email notifications when passwords approach expiry — users must check Watchtower manually. Bitwarden has had long-standing community requests for password expiration dates and rotation reminders (GitHub issue #227, 2017) but the feature remains limited. Neither product offers automated credential rotation or proactive expiry alerts. This is relevant because it shows that even credential management tools lack mature expiry notification systems.",
+ "evidence": "1Password blog 'Q1 2025 usability updates': expiry dates visible in Watchtower only. Bitwarden community forums: feature requests for expiration dates span years without full implementation.",
+ "status": "live",
+ "source_url": "https://1password.com/blog/1password-q1-2025-usability-updates",
+ "sub_theme": "credential_rotation_alerts"
+ },
+ {
+ "name": "Finicity (Mastercard Open Banking) — aggregation status codes",
+ "what": "Finicity uses numeric aggregation status codes to represent connection health. MFA challenges are handled inline during aggregation — the API returns MFA questions and an mfa_session token, and the app must relay answers back. Finicity's test bank continually returns MFA challenges for integration testing. OAuth connections are handled via institution-managed tokens. The aggregation status code system provides granular feedback on why a connection succeeded or failed, similar to HTTP status codes for data aggregation.",
+ "evidence": "Finicity docs (now Mastercard Open Banking): authentication-and-integration docs describe MFA challenge flow with mfa_session headers. GitHub client libraries show MFA handling patterns.",
+ "status": "live",
+ "source_url": "https://developer.mastercard.com/open-banking-us/documentation/",
+ "sub_theme": "mfa_challenge_handling"
+ },
+ {
+ "name": "Cross-cutting pattern: three-tier connection health model",
+ "what": "Across all services studied, connection health follows a three-tier model: (1) HEALTHY — data flows normally, background updates succeed; (2) DEGRADED/PENDING — connection will expire soon or is experiencing intermittent failures, advance warning sent (Plaid: 7 days, Google: none/silent); (3) BROKEN — requires user interaction to repair (re-login, re-consent, MFA). The key differentiator is whether the transition from DEGRADED to BROKEN is predictable (consent expiry with known date) or unpredictable (user changes password, admin revokes, platform security flag). Services with predictable expiry (Plaid, Salesforce) can send advance warnings. Services with unpredictable revocation (Google, Stripe) can only react after the fact.",
+ "evidence": "Synthesized from Plaid (PENDING_EXPIRATION -> ITEM_LOGIN_REQUIRED), MX (CONNECTED -> CHALLENGED/DENIED/EXPIRED), Google (valid -> invalid_grant), Strava (valid -> 401).",
+ "status": "live",
+ "source_url": "",
+ "sub_theme": "universal_state_model"
+ },
+ {
+ "name": "Cross-cutting pattern: notification channel preferences",
+ "what": "Services use different notification channels for auth expiry: (1) Webhooks to developer servers — Plaid (PENDING_EXPIRATION, ITEM:ERROR), Stripe (account.application.deauthorized), MX (connection status change callbacks). This is the primary channel for B2B2C services. (2) In-app prompts — shown when user next opens the app, common in consumer apps (banking, fitness). (3) Push notifications — used by banking apps for session expiry and login approval. (4) Email — used by some services for consent renewal reminders. (5) No notification (silent failure) — Google silently invalidates tokens, Strava returns 401 on next API call. For a CLI tool doing background collection, the closest analog is the webhook model: the system detects failure and queues a notification for the user's next interactive session.",
+ "evidence": "Plaid webhooks, Stripe webhooks, MX callbacks, banking app push notifications, Google silent revocation patterns.",
+ "status": "live",
+ "source_url": "",
+ "sub_theme": "notification_channels"
+ },
+ {
+ "name": "Cross-cutting pattern: re-auth UX spectrum from minimal to full",
+ "what": "Re-authentication complexity varies: (1) Silent/automatic — refresh token exchange, no user involvement (Strava, Google, Salesforce when tokens valid), (2) Minimal/abbreviated — single MFA prompt or consent checkbox without full re-login (Plaid UK update mode, biometric re-auth in banking apps), (3) Redirect to provider — user sent to institution's OAuth page to re-authorize (Plaid EU, Garmin-Strava reconnect), (4) Full re-setup — complete teardown and reconnect from scratch (MX PREVENTED status after 3 failures, Strava disconnect+reconnect). For browser-automation-based collection: the equivalent of (1) is session cookies still valid, (2) is re-entering MFA/OTP in automated browser, (3) is launching a full interactive browser session for login, (4) is clearing all cookies and starting over.",
+ "evidence": "Plaid abbreviated update mode, MX PREVENTED requiring new aggregation, Strava disconnect/reconnect pattern, banking biometric re-auth.",
+ "status": "live",
+ "source_url": "",
+ "sub_theme": "reauth_complexity_spectrum"
+ }
+ ],
+ "gaps": [
+ {
+ "topic": "MX webhook payload structure and connection status change callbacks",
+ "reason": "MX Academy and docs.mx.com pages were not fully accessible. Could not extract webhook payload schemas or callback mechanisms for connection status transitions.",
+ "suggested_source": "https://docs.mx.com/resources/webhooks/"
+ },
+ {
+ "topic": "Finicity aggregation status codes — full numeric code list",
+ "reason": "Finicity docs redirect to Mastercard Open Banking portal which did not expose the aggregation status codes page directly. The specific numeric codes (e.g., 103=MFA required, 185=credentials changed) were not extracted.",
+ "suggested_source": "https://developer.mastercard.com/open-banking-us/documentation/api-reference/"
+ },
+ {
+ "topic": "Quantitative data on re-auth conversion rates",
+ "reason": "Plaid and other services do not publish conversion/drop-off rates for re-authentication flows. Industry benchmarks for re-auth completion rates are not publicly available.",
+ "suggested_source": "Plaid sales team or industry reports"
+ },
+ {
+ "topic": "Browser automation session longevity by platform",
+ "reason": "No research was found on how long browser sessions/cookies typically last for specific platforms (GitHub, ChatGPT, LinkedIn) before requiring re-authentication. This is critical for the CLI's scheduling design.",
+ "suggested_source": "Empirical testing or platform-specific documentation"
+ },
+ {
+ "topic": "Yodlee connection health model details",
+ "reason": "Yodlee developer portal content was not fully extractable from search results. Their error code taxonomy and connection lifecycle management details remain unclear.",
+ "suggested_source": "https://developer.yodlee.com/resources/yodlee/error-codes/docs"
+ }
+ ]
+}
diff --git a/research/async-cli/findings/wave2-daemon-patterns.json b/research/async-cli/findings/wave2-daemon-patterns.json
new file mode 100644
index 00000000..a4b0678c
--- /dev/null
+++ b/research/async-cli/findings/wave2-daemon-patterns.json
@@ -0,0 +1,195 @@
+{
+ "agent_question": "Daemon and background service patterns for Node.js CLIs",
+ "findings": [
+ {
+ "name": "PM2 God Daemon — dual Unix socket architecture",
+ "what": "PM2 runs a persistent background process (the 'God Daemon') that communicates via two separate Unix sockets in ~/.pm2/: rpc.sock for synchronous request-reply commands (start, stop, restart, list) and pub.sock for event broadcasting (log output, process state changes). The CLI spawns the daemon on first use if not running. The axon library handles socket transport; axon-rpc layers an RPC protocol on top. God exposes methods like prepare(), getMonitorData(), startProcessId(), stopProcessId() over RPC. The daemon maintains a clusters_db hash map as its in-memory process registry.",
+ "evidence": "PM2 source: Daemon class initializes PUB socket at cst.DAEMON_PUB_PORT and RPC socket at cst.DAEMON_RPC_PORT. Client class auto-spawns daemon if connection fails. Socket files at $HOME/.pm2/pub.sock and $HOME/.pm2/rpc.sock.",
+ "status": "live",
+ "source_url": "https://deepwiki.com/Unitech/pm2/2-core-components",
+ "sub_theme": "pm2_architecture"
+ },
+ {
+ "name": "PM2 crash recovery — exponential backoff with min_uptime reset",
+ "what": "PM2's handleExit method tracks prev_restart_delay and doubles it on each crash (e.g., 100ms -> 200ms -> 400ms). The backoff resets when a process stays alive longer than min_uptime. A Worker loop runs every 30 seconds to monitor health and reset delays for stable processes. max_restarts caps total attempts to prevent infinite loops. max_memory_restart triggers reload if a process exceeds a memory threshold.",
+ "evidence": "handleExit tracks prev_restart_delay for exponential backoff. Worker system runs every 30s monitoring health. Configurable via ecosystem.config.js: min_uptime, max_restarts, max_memory_restart.",
+ "status": "live",
+ "source_url": "https://deepwiki.com/Unitech/pm2/2-core-components",
+ "sub_theme": "pm2_crash_recovery"
+ },
+ {
+ "name": "PM2 startup — cross-platform init script generation",
+ "what": "pm2 startup auto-detects the host OS init system (systemd, launchd, upstart, init.d, rcd, systemv) and generates appropriate service files. On systemd, it creates a unit file targeting multi-user.target. pm2 save serializes current process list to ~/.pm2/dump.pm2. On boot, the init script starts PM2 which reads dump.pm2 and resurrects all saved processes with their exact configurations. pm2-windows-service extends this to Windows.",
+ "evidence": "pm2 startup [systemd|launchd|upstart|systemv] generates platform-specific init scripts. pm2 save writes dump.pm2. pm2 resurrect restores from dump.",
+ "status": "live",
+ "source_url": "https://pm2.keymetrics.io/docs/usage/pm2-doc-single-page/",
+ "sub_theme": "pm2_startup"
+ },
+ {
+ "name": "PM2 ecosystem.config.js — declarative multi-process config",
+ "what": "ecosystem.config.js exports { apps: [...], deploy: {...} }. Each app entry specifies: name, script, exec_mode (fork|cluster), instances, max_memory_restart, cron_restart (cron expression for scheduled restarts), error_file, out_file, log_date_format, env variables, and watch options. The God daemon parses this via Configuration.getSync() to populate pm2_env for each process. Supports multiple apps in a single file with different configurations.",
+ "evidence": "module.exports = { apps: [{ name, script, exec_mode, instances, max_memory_restart, cron_restart, ... }], deploy: { production: { user, host, ref, repo, path } } }",
+ "status": "live",
+ "source_url": "https://deepwiki.com/Unitech/pm2/2-core-components",
+ "sub_theme": "pm2_config"
+ },
+ {
+ "name": "PM2 web dashboard — RPC socket + Socket.io bridge",
+ "what": "PM2 Plus is the official SaaS dashboard for remote monitoring with email/Slack notifications, custom metrics, and exception tracking (free tier: 4 servers). For self-hosted, pm2-gui connects directly to PM2's RPC socket, reads process metrics, and serves a web UI over Socket.io for real-time CPU/memory monitoring and process control (restart/stop/delete). The dashboard bridges PM2's Unix socket IPC to WebSocket for browser consumption.",
+ "evidence": "pm2-gui communicates with PM2 through RPC socket directly, uses Socket.io between client and server. PM2 Plus provides issues tracking, deployment reporting, real-time logs, email/Slack notifications.",
+ "status": "live",
+ "source_url": "https://pm2.keymetrics.io/docs/usage/monitoring/",
+ "sub_theme": "pm2_dashboard"
+ },
+ {
+ "name": "PM2 graceful shutdown — signal cascade and state dump",
+ "what": "Daemon's gracefullExit() method: (1) dumps process list to disk, (2) sends SIGINT to each managed process with a configurable kill_timeout, (3) falls back to SIGKILL after timeout, (4) cleans up socket files (rpc.sock, pub.sock), (5) removes PID file. Signal handlers: SIGTERM/SIGINT/SIGQUIT trigger graceful exit; SIGUSR2 triggers log rotation/reload.",
+ "evidence": "Daemon signal handlers: SIGTERM, SIGINT, SIGQUIT -> gracefullExit(). SIGUSR2 -> log reload. gracefullExit dumps process lists, stops managed processes, cleans socket files, removes PID files.",
+ "status": "live",
+ "source_url": "https://deepwiki.com/Unitech/pm2/2-core-components",
+ "sub_theme": "pm2_shutdown"
+ },
+ {
+ "name": "Turborepo daemon — on-demand gRPC over Unix socket with graceful degradation",
+ "what": "Turborepo's Rust daemon spawns transparently via DaemonConnector: if no daemon exists, the CLI forks one. Communication uses gRPC (Protocol Buffers) over a Unix domain socket at .turbo/daemon.sock (named pipe on Windows). A .turbo/daemon.pid lock file prevents multiple instances; stale PID detection allows recovery. The daemon auto-disables in CI (checks CI=true env var). If the daemon crashes or socket fails, the CLI falls back to synchronous operation — builds complete without caching/watching benefits but never fail due to daemon unavailability.",
+ "evidence": "DaemonConnector spawns daemon on-demand. TurboGrpcService listens on .turbo/daemon.sock. .turbo/daemon.pid for lock. CI detection disables daemon. Graceful degradation: CLI falls back to synchronous ops if daemon unavailable.",
+ "status": "live",
+ "source_url": "https://deepwiki.com/vercel/turborepo/2-architecture",
+ "sub_theme": "turborepo_daemon"
+ },
+ {
+ "name": "Turborepo file watching — layered watchers with cookie synchronization",
+ "what": "The daemon's FileSystemWatcher uses specialized sub-watchers: PackageWatcher (package.json changes), HashWatcher (source file content hashes for cache invalidation), GlobWatcher (turbo.json task input/output patterns). A CookieWriter/CookieWatcher system uses marker files in .turbo/cookies/ to synchronize file events back to CLI clients through the daemon socket, ensuring consistent state between file system events and build decisions.",
+ "evidence": "PackageWatcher, HashWatcher, GlobWatcher, CookieWriter/CookieWatcher all under FileSystemWatcher. Cookie files in .turbo/cookies/ synchronize events to gRPC clients.",
+ "status": "live",
+ "source_url": "https://deepwiki.com/vercel/turborepo/2-architecture",
+ "sub_theme": "turborepo_file_watching"
+ },
+ {
+ "name": "Docker daemon — systemd socket activation for on-demand startup",
+ "what": "Docker supports systemd socket activation: docker.socket creates /var/run/docker.sock, and dockerd only starts when something connects. The daemon can listen on Unix, TCP, or fd:// (systemd file descriptors). PID file at /var/run/docker.pid prevents duplicate instances. The REST API is exposed over the Unix socket. Client-server model means CLI (docker) and daemon (dockerd) are fully decoupled — the daemon persists state independently of any CLI session.",
+ "evidence": "dockerd -H fd:// for systemd socket activation. Default socket at /var/run/docker.sock. PIDFile in docker.service unit. REST API over Unix socket. Supports unix, tcp, fd socket types.",
+ "status": "live",
+ "source_url": "https://docs.docker.com/reference/cli/dockerd/",
+ "sub_theme": "docker_daemon"
+ },
+ {
+ "name": "Homebrew services — thin wrapper generating launchd plists and systemd units",
+ "what": "brew services start generates a .plist XML file and symlinks it to ~/Library/LaunchAgents (user-level) or /Library/LaunchDaemons (root/boot-level). The plist contains: Label, ProgramArguments, RunAtLoad, KeepAlive, WorkingDirectory, StandardOutPath, StandardErrorPath. On Linux, brew services generates systemd user units instead. The formula's #startup_plist method defines the service configuration. It's a thin mapping layer — all actual supervision is delegated to launchd/systemd.",
+ "evidence": "brew services creates .plist symlinks to ~/Library/LaunchAgents or /Library/LaunchDaemons. On Linux, uses systemd user units. Formula implements #startup_plist for config.",
+ "status": "live",
+ "source_url": "https://www.dorokhovich.com/blog/homebrew-services",
+ "sub_theme": "homebrew_services"
+ },
+ {
+ "name": "Temporal.io — durable execution with event-sourced workflow recovery",
+ "what": "Temporal's architecture: a Temporal Cluster tracks all workflow state, and long-lived Worker processes execute Workflows (orchestration logic) and Activities (individual steps like API calls). Every state change is recorded in an append-only Event History, enabling exact-point-of-failure recovery. Workers can be restarted and will resume workflows from their last recorded state. TypeScript SDK available. This is a server-side pattern requiring infrastructure (Temporal Cluster), not suitable for a local CLI daemon, but the event-sourcing pattern for state recovery is highly relevant.",
+ "evidence": "Temporal records every state change in Event History, a complete append-only log. Workers execute Workflows and Activities. TypeScript SDK for Node.js. Requires running Temporal Cluster.",
+ "status": "live",
+ "source_url": "https://docs.temporal.io/workflows",
+ "sub_theme": "workflow_orchestration"
+ },
+ {
+ "name": "Inngest — serverless event-driven workflow alternative",
+ "what": "Inngest takes a serverless, event-driven approach vs Temporal's dedicated-server model. Uses native language primitives (no runtime proxying), making debugging simpler. The Inngest platform manages event log, state storage, and scheduling. Better for cloud-deployed workflows. Less relevant for a local CLI daemon, but the event-driven step function model (each step is independently retryable) is a useful design pattern for scheduled collection tasks.",
+ "evidence": "Inngest uses native language primitives for direct execution. Platform handles event log, state storage, scheduling. Serverless model vs Temporal's worker model.",
+ "status": "live",
+ "source_url": "https://www.inngest.com/compare-to-temporal",
+ "sub_theme": "workflow_orchestration"
+ },
+ {
+ "name": "VS Code extension host — process isolation with lazy activation and JSON-RPC IPC",
+ "what": "VS Code runs extensions in a separate Extension Host process (Node.js), isolating them from the UI. Communication uses JSON-RPC over IPC (extension API calls serialized to JSON requests). Extensions activate lazily based on activation events (onLanguage, onCommand, etc.) — unused extensions consume zero memory. Each window gets its own extension host. Extensions can spawn child processes. ExtensionMemento provides persistent state storage per extension. This pattern of lazy activation + process isolation + persistent state is directly applicable to a CLI daemon.",
+ "evidence": "Extension Host runs in separate Node.js process. Communication via JSON-RPC over IPC. Lazy activation via activation events. ExtensionMemento for persistent state. One extension host per window.",
+ "status": "live",
+ "source_url": "https://code.visualstudio.com/api/advanced-topics/extension-host",
+ "sub_theme": "vscode_extension_host"
+ },
+ {
+ "name": "IPC mechanism comparison — Unix sockets best for same-machine Node.js",
+ "what": "Benchmarks show Unix sockets and named pipes have virtually identical performance, while TCP is 20-40% slower. The bottleneck is serialization/deserialization, not the transport. For small messages (<100 bytes), named pipes edge out; for larger messages (10KB+), sockets are faster. The node-ipc library provides a unified API that auto-converts Unix socket paths to Windows named pipes. For a Node.js CLI daemon, Unix domain sockets are the recommended IPC mechanism: fast, simple, no port conflicts, and the socket file doubles as a liveness indicator.",
+ "evidence": "Practically no performance difference between native pipe and unix sockets; TCP is 20-40% slower. node-ipc auto-converts Unix socket paths to Windows named pipes. Bottleneck is serialization, not transport.",
+ "status": "live",
+ "source_url": "https://60devs.com/performance-of-inter-process-communications-in-nodejs.html",
+ "sub_theme": "ipc_mechanisms"
+ },
+ {
+ "name": "Electron tray pattern — minimize to system tray for persistent background operation",
+ "what": "Electron apps persist in the background by intercepting window-all-closed (don't quit) and window close events (hide instead of destroy). A Tray icon in the OS system tray (macOS menu bar, Windows taskbar) provides a menu for re-opening windows or quitting. The app process stays alive with no visible windows. This pattern is relevant for a companion desktop app but not directly for a headless CLI daemon. The key insight: users expect background processes to be visible/controllable via system tray, not invisible.",
+ "evidence": "Listen to window-all-closed, don't call app.quit(). Hide windows on close instead of destroying. Create Tray with menu for show/quit. Process persists with no visible windows.",
+ "status": "live",
+ "source_url": "https://moinism.medium.com/how-to-keep-an-electron-app-running-in-the-background-f6a7c0e1ee4f",
+ "sub_theme": "electron_background"
+ },
+ {
+ "name": "PID file stale detection — check /proc/$pid/cmdline, not just process existence",
+ "what": "Simply checking if a PID exists is insufficient — the OS may have recycled the PID to a different process. Robust stale detection: (1) read PID from lock file, (2) check if /proc/$PID/cmdline matches expected command, (3) if not matching or process dead, overwrite lock file with new PID. The pidlock library uses an atomic directory-rename strategy: create temp dir named after PID, atomically rename to lock path. On ungraceful kill (SIGKILL), the lock file remains but stale detection handles recovery. Never delete PID files in signal handlers — use atexit handlers instead, but note SIGKILL bypasses all handlers.",
+ "evidence": "pidlock creates directory with PID name, atomically renames to lock file. Validates lock authenticity by checking /proc/$old_pid/cmdline. SIGKILL bypasses all signal handlers and atexit, so stale detection on next startup is essential.",
+ "status": "live",
+ "source_url": "https://www.guido-flohr.net/never-delete-your-pid-file/",
+ "sub_theme": "pid_management"
+ },
+ {
+ "name": "Cross-platform startup — PM2 startup for *nix, launchd plist, systemd unit, Windows Task Scheduler",
+ "what": "For a Node.js CLI daemon, the cross-platform startup matrix: macOS uses launchd (~/Library/LaunchAgents for user-level, /Library/LaunchDaemons for system), Linux uses systemd user units (~/.config/systemd/user/), Windows uses Task Scheduler or node-windows for native service registration. PM2's pm2 startup command abstracts this by detecting the init system and generating the right script. For a standalone CLI, generating platform-specific service files (plist XML on macOS, .service INI on Linux) gives maximum control. The brew services pattern of thin wrappers around OS service managers is the most maintainable approach.",
+ "evidence": "PM2 auto-detects init system: systemd, launchd, upstart, systemv. brew services generates plists for macOS, systemd units for Linux. Windows requires Task Scheduler or node-windows/pm2-windows-service.",
+ "status": "live",
+ "source_url": "https://blog.appsignal.com/2022/03/09/a-complete-guide-to-nodejs-process-management-with-pm2.html",
+ "sub_theme": "cross_platform_startup"
+ },
+ {
+ "name": "Bree — worker-thread-based Node.js job scheduler with graceful shutdown",
+ "what": "Bree spawns each job in a separate worker thread (sandboxed), supports cron expressions, human-friendly intervals, async/await, retries, throttling, and concurrency control. Graceful shutdown stops all workers cleanly. No daemon required — runs in-process. Croner is lighter (used by PM2 itself, Uptime Kuma) and works in Node, Deno, and browser. node-cron is the simplest (66KB) for basic cron scheduling. For a CLI daemon, Bree or Croner provides the scheduling layer while the daemon process itself handles lifecycle.",
+ "evidence": "Bree v9.1.3: worker threads, cron/dates/ms/human-friendly, retries, throttling, concurrency, graceful shutdown. Croner: used by PM2, Uptime Kuma, cross-runtime. node-cron v4.2.1: 66KB, basic cron.",
+ "status": "live",
+ "source_url": "https://jobscheduler.net/",
+ "sub_theme": "job_schedulers"
+ },
+ {
+ "name": "Recommended architecture for vana daemon — forked process + Unix socket + JSON-RPC + OS service registration",
+ "what": "Synthesis of research: (1) Daemon spawned on-demand by CLI via child_process.fork() (Turborepo pattern). (2) IPC via Unix domain socket with JSON-RPC protocol (VS Code pattern) — socket file also serves as liveness check. (3) PID file with /proc cmdline validation for stale detection (pidlock pattern). (4) Scheduler using Croner or Bree for cron-based collection triggers. (5) State persisted to JSON/SQLite file, restored on startup (PM2 dump.pm2 pattern). (6) Graceful degradation: CLI works without daemon, daemon crash doesn't break manual CLI usage (Turborepo pattern). (7) OS service registration via generated launchd plist / systemd user unit for boot persistence (Homebrew services pattern). (8) Auth failure detection pauses scheduler, sends notification (desktop notification API + optional webhook), resumes after re-auth.",
+ "evidence": "Composite pattern from PM2 (dual socket, state dump, startup scripts), Turborepo (on-demand spawn, graceful degradation, CI detection), Docker (socket activation), VS Code (JSON-RPC, lazy activation, persistent state), Homebrew (thin OS service wrappers).",
+ "status": "recommendation",
+ "source_url": "",
+ "sub_theme": "synthesis"
+ },
+ {
+ "name": "Key design decision — embedded daemon vs. external process manager",
+ "what": "Two viable approaches: (A) Embedded daemon: CLI itself can fork a background process (like Turborepo). Simpler for users (no PM2 dependency), but you own all lifecycle management, crash recovery, and log rotation. (B) External PM2: Users install PM2 and run 'pm2 start vana -- daemon'. PM2 handles crash recovery, log rotation, startup scripts. More dependencies, but battle-tested. Recommendation: Start with embedded daemon (approach A) for v1 — it's simpler for users and avoids the PM2 dependency. Add 'vana daemon install' to generate OS service files (approach inspired by Homebrew services) for boot persistence.",
+ "evidence": "Turborepo and Docker both use embedded daemons. PM2 is itself an embedded daemon that manages other processes. The trend in modern CLI tools is toward self-contained daemon management rather than requiring external process managers.",
+ "status": "recommendation",
+ "source_url": "",
+ "sub_theme": "design_decision"
+ },
+ {
+ "name": "Notification channels for auth failure — desktop notifications + terminal bell + optional webhook",
+ "what": "When the daemon detects an auth failure mid-collection, it should: (1) Pause the scheduler (don't retry with bad creds). (2) Write state to disk (which connector failed, when, why). (3) Send desktop notification via node-notifier (cross-platform: macOS Notification Center, Linux libnotify, Windows toast). (4) On next CLI invocation, show a prominent banner about the auth failure. (5) Optionally, call a user-configured webhook URL for remote notification (Slack, email, etc.). The 'laptop may be off' constraint means notification must be stored durably and shown on next interaction, not just fire-and-forget.",
+ "evidence": "PM2 Plus sends email/Slack notifications on process events. Desktop notifications via node-notifier. Durable state means the daemon writes failure state to disk and the CLI reads it on next run.",
+ "status": "recommendation",
+ "source_url": "",
+ "sub_theme": "notification_design"
+ }
+ ],
+ "gaps": [
+ {
+ "name": "SQLite vs JSON file for daemon state persistence",
+ "what": "Need to evaluate whether daemon state (schedule config, last run times, auth status, collection history) warrants SQLite (better for concurrent access, querying) vs a simple JSON file (simpler, no native dependency). SQLite has the better-sqlite3 package which is synchronous and fast but requires native compilation. JSON is zero-dependency but risks corruption on ungraceful shutdown without write-ahead patterns."
+ },
+ {
+ "name": "Windows support complexity",
+ "what": "Windows lacks Unix domain sockets (uses named pipes instead) and has no launchd/systemd equivalent (Task Scheduler is cron-like, not a process supervisor). Need to decide if Windows is a v1 target or can be deferred. node-ipc handles socket/pipe abstraction but Windows service registration (via node-windows or nssm) adds significant complexity."
+ },
+ {
+ "name": "Daemon log management and rotation",
+ "what": "Long-running daemons accumulate logs. PM2 handles this with pm2-logrotate module. Need a strategy for the vana daemon: file-based logging with rotation, or structured logging to a bounded file. Also need to decide how 'vana daemon logs' surfaces these to the user."
+ },
+ {
+ "name": "Security of Unix socket — file permissions and access control",
+ "what": "The daemon's Unix socket file should be readable/writable only by the owning user (mode 0600). Need to verify Node.js net.createServer for Unix sockets respects umask or requires explicit chmod. PM2 has had issues with root-owned socket files after reboot (GitHub issue #4226)."
+ },
+ {
+ "name": "Daemon version mismatch handling",
+ "what": "If the user updates the CLI package, the running daemon may be an older version. Need a protocol version check on connection (like Docker API versioning) and a strategy for graceful daemon restart on version mismatch. Turborepo handles this but details of their approach were not found."
+ }
+ ]
+}
diff --git a/research/cli-design/freshness-ux.md b/research/cli-design/freshness-ux.md
new file mode 100644
index 00000000..8a6cde90
--- /dev/null
+++ b/research/cli-design/freshness-ux.md
@@ -0,0 +1,132 @@
+# Temporal & Freshness UX in CLIs
+
+_As of March 16, 2026_
+
+## Which CLIs Show Time by Default?
+
+| Tool | What | Format | Default? |
+| -------------------- | ----------------- | ----------------------------------- | -------- |
+| **kubectl get pods** | AGE column | Relative compact: `2d`, `3h`, `5m` | Yes |
+| **docker ps** | CREATED column | Relative natural: `2 days ago` | Yes |
+| **gh run list** | AGE column | Relative compact: `2h` | Yes |
+| **ls -l** | Modification time | Adaptive (see below) | Yes |
+| **restic snapshots** | Timestamp | Absolute ISO: `2018-02-22 12:59:30` | Yes |
+| **git log** | Date | Requires `--date=relative` flag | No |
+| **npm list** | None | -- | No |
+| **brew list** | None | -- | No |
+
+## Time Format Patterns
+
+### Relative Compact (kubectl, gh)
+
+```
+NAME READY STATUS AGE
+nginx-deployment-66b6 1/1 Running 2d
+nginx-deployment-abc1 0/1 Pending 5m
+```
+
+Best for: status dashboards, monitoring. Instantly scannable.
+
+### Relative Natural (docker)
+
+```
+CONTAINER ID IMAGE STATUS CREATED
+a1b2c3d4e5 nginx:latest Up 2 hours 2 days ago
+```
+
+Best for: occasional use. More readable but takes more space.
+
+### Adaptive (ls -l)
+
+```
+-rw-r--r-- 1 user group 1024 Mar 30 23:45 recent-file.txt
+-rw-r--r-- 1 user group 2048 Mar 30 2024 old-file.txt
+```
+
+Recent files show time, old files show year. Avoids wasting precision on ancient items.
+
+### Absolute ISO (restic, rclone)
+
+```
+ID Date Host Directory
+9ba42540 2018-02-22 12:59:30 hwkb /
+```
+
+Best for: logs, audit trails. Unambiguous but requires mental math for recency.
+
+## Visual Hierarchy
+
+Consistent patterns across CLIs:
+
+- Time is **rightmost column** (supporting metadata, not the headline)
+- **Right-aligned** for scannability, even for variable-width text
+- **No color or dimming** in most CLI tools (unlike web UIs)
+- Never the primary identifier -- always secondary to name/status
+
+## Periodic Tasks & Frequency Metadata
+
+**Healthchecks.io** is the best model for "last run + expected frequency":
+
+```
+Check Status Last Ping Period
+backup-db Up 3 minutes ago every 1 hour
+sync-files Late 2 hours ago every 30 min
+cleanup Down 3 days ago every 1 hour
+```
+
+Shows: status derived from (last run vs expected period), not just the timestamp.
+
+No other CLI tool studied combines "last occurrence" with "expected frequency" in default output. This is an opportunity for the vana CLI.
+
+## Anti-Patterns
+
+| Problem | Example |
+| -------------------- | ------------------------------------------------------------------------- |
+| Excessive precision | rclone showing nanoseconds: `18:55:41.062626927` |
+| Format inconsistency | gh uses relative in tables, absolute in JSON |
+| Hidden by default | git log requiring `--date=relative` flag |
+| Missing entirely | npm/brew showing no temporal data in list output |
+| No frequency context | Showing "last run: 5 hours ago" without indicating whether that's healthy |
+
+## Recommendations for Vana CLI
+
+### Status Output (Human)
+
+```
+Sources:
+ github synced 3 days ago weekly recommended
+ spotify synced 12 hours ago daily recommended
+ chatgpt synced 2 weeks ago weekly recommended (overdue)
+```
+
+Use relative time (kubectl-style). Show frequency only when defined. Derive "overdue" from (elapsed > frequency).
+
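+The compact relative format recommended here can be sketched in a few lines — a minimal sketch; the function name and thresholds are illustrative, not part of the CLI:
+
+```javascript
+// Compact relative age, kubectl-style: 45s, 5m, 3h, 2d.
+function compactAge(thenMs, nowMs = Date.now()) {
+  const s = Math.max(0, Math.floor((nowMs - thenMs) / 1000));
+  if (s < 60) return `${s}s`;
+  if (s < 3600) return `${Math.floor(s / 60)}m`;
+  if (s < 86400) return `${Math.floor(s / 3600)}h`;
+  return `${Math.floor(s / 86400)}d`;
+}
+```
+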
+### Status Output (JSON, for agents)
+
+```json
+{
+ "source": "github",
+ "lastCollectedAt": "2026-03-13T14:30:00Z",
+ "exportFrequency": "weekly",
+ "suggestedNextCollectionAt": "2026-03-20T14:30:00Z",
+ "isOverdue": false
+}
+```
+
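+The derived fields fall out of the last-collected timestamp plus the frequency. A minimal sketch — the `FREQ_MS` mapping and function name are assumptions, not from the CLI:
+
+```javascript
+// Map a frequency label to milliseconds (assumed values).
+const FREQ_MS = { daily: 86_400_000, weekly: 7 * 86_400_000 };
+
+// Build the machine-readable status record shown above.
+function statusRecord(source, lastCollectedAt, exportFrequency, now = new Date()) {
+  const last = new Date(lastCollectedAt);
+  const next = new Date(last.getTime() + FREQ_MS[exportFrequency]);
+  return {
+    source,
+    lastCollectedAt: last.toISOString(),
+    exportFrequency,
+    suggestedNextCollectionAt: next.toISOString(),
+    isOverdue: now.getTime() > next.getTime(),
+  };
+}
+```
+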
+### Where Freshness Should Surface
+
+| Context | Show? | What |
+| ----------------------- | ------- | ------------------------------------------------ |
+| `vana status` | Yes | Relative time + frequency + overdue flag |
+| `vana sources list` | Minimal | Last collected relative time only |
+| `vana sources info <source>` | Full | Absolute + relative + frequency + next suggested |
+| `vana connect <source>` | No | Command is action-focused, not informational |
+| `--json` everywhere | Yes | ISO timestamps + frequency + computed fields |
+
+### Design Principles
+
+1. Default to relative time for humans, absolute ISO for machines
+2. Compact format (`3d`) over natural language (`3 days ago`) in tables
+3. Show frequency context alongside timestamps -- bare timestamps are ambiguous
+4. Right-align temporal columns, keep them rightmost
+5. Never show sub-second precision in human output
diff --git a/research/cli-design/install-path.md b/research/cli-design/install-path.md
new file mode 100644
index 00000000..709405fb
--- /dev/null
+++ b/research/cli-design/install-path.md
@@ -0,0 +1,103 @@
+# CLI Install Path Research
+
+_As of March 13, 2026_
+
+## Question
+
+What do best-in-class CLI install paths look like, and should `vana` require something like `pnpm` or `npx`?
+
+## Short answer
+
+- Requiring `pnpm` is **not** best-in-class for a general-purpose CLI.
+- Requiring `npx` is **better as a temporary bridge**, but still not ideal as the final public install path.
+- The best install stories usually avoid tying the user to a language-specific package manager unless the CLI is clearly aimed at that ecosystem.
+
+## What the strongest references suggest
+
+### `uv`
+
+`uv` is a strong reference because it offers direct installation methods rather than assuming a Python package manager. The installation story is product-native and low-friction.
+
+Source:
+
+- https://docs.astral.sh/uv/getting-started/installation/
+
+### GitHub CLI (`gh`)
+
+`gh` is a strong reference because it is installable through standard OS package channels like Homebrew, apt, yum/dnf, winget, and Scoop. It does not expect users to already have a language-specific toolchain.
+
+Source:
+
+- https://cli.github.com/manual/installation
+
+### Doppler CLI
+
+Doppler follows a similar pattern: install through standard package manager channels and shell-friendly instructions.
+
+Source:
+
+- https://docs.doppler.com/docs/install-cli
+
+### Vercel CLI
+
+Vercel is the main counterexample. Its audience is heavily JavaScript-centric, so an npm-based install path is acceptable. Even there, that works because the product is already aimed at developers who almost certainly have Node.
+
+Source:
+
+- https://vercel.com/docs/cli
+
+## Implication for `vana`
+
+`vana` is not purely a JavaScript-developer tool.
+
+It is meant to work for:
+
+- coding agents
+- vibe coders
+- developers outside the JS ecosystem
+- eventually users who may not think in terms of Node/npm/pnpm at all
+
+That means:
+
+- `pnpm` should **not** be a required prerequisite for the public install path
+- `npx` is acceptable for canary/internal/early-adopter usage
+- the final public install story should likely be one or more of:
+ - standalone shell installer
+ - Homebrew
+ - winget
+ - maybe Scoop
+ - direct binary or installer downloads
+
+## Current recommendation
+
+### Short-term
+
+Use:
+
+```bash
+npx -y @opendatalabs/connect@canary
+```
+
+Why:
+
+- no `pnpm` prerequisite
+- works today
+- good enough for internal and agent-driven testing
+
+### Long-term
+
+Aim for a primary install story that does **not** require Node package-manager literacy.
+
+The likely quality bar is:
+
+- primary: OS-native / shell-native install
+- secondary: npm/npx for JS-heavy workflows
+- tertiary: canary/nightly channel for pre-release testing
+
+## Conclusion
+
+`npx` is a good bridge.
+
+It is **better than `pnpm`** for the current stage because it removes one unnecessary prerequisite.
+
+But if `vana` is meant to be truly best-in-class, the final install story should not depend on users already identifying as Node developers.
diff --git a/research/cli-design/pre-auth-patterns.md b/research/cli-design/pre-auth-patterns.md
new file mode 100644
index 00000000..854486b8
--- /dev/null
+++ b/research/cli-design/pre-auth-patterns.md
@@ -0,0 +1,115 @@
+# CLI Browser Auth & Pre-Auth Verification Patterns
+
+_As of March 16, 2026_
+
+## How Production CLIs Handle Browser Auth
+
+| CLI | Flow | Browser Open | Polling | Headless Fallback |
+| ------------------ | -------------------------- | ---------------------------- | ------------------------- | --------------------------------------- |
+| **gh auth login** | OAuth Device Code | Auto + clipboard code | Every 5s | `--with-token` / `GH_TOKEN` env |
+| **stripe login** | Device Code variant | Enter to open | Every 1s, 60s timeout | `--api-key` / `STRIPE_API_KEY` env |
+| **vercel login** | Device Code | Auto | Server-specified interval | `--token` on other commands |
+| **firebase login** | OAuth + localhost redirect | Auto | Waits for callback | `login:ci` for CI tokens |
+| **railway login** | Pairing code | Auto | Browser verification | `--browserless` + `RAILWAY_TOKEN` env |
+| **az login** | Authorization code | Auto (fallback: device code) | Interval-based | `--use-device-code` / service principal |
+| **netlify login** | Device Code | Auto | HTTP polling | Token in config |
+
+### Universal Patterns
+
+1. **Auto-open browser** with fallback to printing URL
+2. **Short-lived verification codes** displayed in terminal (MITM protection)
+3. **Polling with timeout** (not callbacks) for completion detection
+4. **Clear progress messages**: "Waiting for confirmation..." with spinner
+5. **Token storage** in `~/.config/<tool>/` with restricted file permissions
+
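+Pattern 5 maps directly onto Node's `fs` API, which accepts a `mode` at file-creation time. A minimal sketch — the path layout and function name are illustrative:
+
+```javascript
+const fs = require('fs');
+const os = require('os');
+const path = require('path');
+
+// Persist a credential under the user config dir with owner-only permissions.
+function saveToken(tool, token, baseDir = path.join(os.homedir(), '.config')) {
+  const dir = path.join(baseDir, tool);
+  fs.mkdirSync(dir, { recursive: true, mode: 0o700 }); // drwx------
+  const file = path.join(dir, 'token');
+  fs.writeFileSync(file, token, { mode: 0o600 }); // -rw-------
+  return file;
+}
+```
+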
+## Pre-Auth Verification
+
+### How CLIs Check Existing Auth
+
+- **File existence**: Does `~/.config/gh/hosts.yml` exist?
+- **Token format**: Parse stored token, check expiry claim
+- **Lazy validation**: Trust stored token until it actually fails on an API call
+
+Key insight: **CLIs avoid throwaway API calls to verify auth.** They trust the stored token structure and only validate on actual use. `gh auth status` is the exception -- it actively calls GitHub's API.
+
+### Pre-existing Auth Check (Netlify pattern)
+
+```
+$ netlify login
+You are already logged in via netlify config
+Run netlify status for account details
+To login with a new account, run netlify login --new
+```
+
+Netlify checks `getToken()` before attempting login and skips the browser flow entirely if already authenticated.
+
+## Vana's Case Is Different
+
+The CLIs above authenticate **to their own service** via OAuth flows they control. Vana connectors authenticate **to third-party services** (GitHub, ChatGPT, Google) where Vana has no OAuth integration. This means:
+
+- No device code flow (Vana doesn't control the auth server)
+- No token exchange (Vana doesn't get API tokens)
+- Auth state lives in a **browser profile** (cookies/sessions), not a token file
+- Session validity can only be checked by **visiting the site and inspecting the DOM**
+
+## Recommended Patterns for Vana
+
+### Pre-flight Session Check
+
+Use `connectURL` + `connectSelector` metadata per connector:
+
+```json
+{
+ "name": "GitHub",
+ "connectURL": "https://github.com",
+ "connectSelector": "header [aria-label*='profile']"
+}
+```
+
+Before running the full connector:
+
+```
+1. Launch headless browser with saved profile
+2. Navigate to connectURL
+3. Check if connectSelector is visible (2-5 second timeout)
+4. Visible -> "Session active, proceeding..."
+5. Not visible -> "You need to log into GitHub first"
+```
+
+This is analogous to `gh auth status` but using DOM inspection instead of API calls.
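A minimal sketch of the pre-flight check, with the browser probe injected as a callable so the automation layer (Playwright or similar) stays out of the example. The probe, function name, and state strings are assumptions, not Vana's implementation:

```python
from typing import Callable, Tuple

def preflight_check(probe: Callable[[str, str], bool],
                    connect_url: str, connect_selector: str) -> Tuple[str, str]:
    """Steps 1-3 live inside `probe`: launch a headless browser with the
    saved profile, navigate to connect_url, and report whether
    connect_selector became visible within the short timeout."""
    try:
        visible = probe(connect_url, connect_selector)
    except Exception:
        # A browser or profile failure is not a verdict on the session.
        return "unknown", "Could not check session state"
    if visible:
        return "valid", "Session active, proceeding..."
    return "none", "You need to log into GitHub first"
```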
+
+### Guided Login Flow (when pre-flight fails)
+
+```
+$ vana connect github
+
+ Checking authentication...
+ Not logged in to GitHub.
+ Open browser to log in? (y/n): y
+ Browser opened. Waiting for login...
+ Login detected. Starting data collection...
+```
+
+Pattern: open `connectURL` in headed browser, poll `connectSelector` every 2-3s, close when detected. Adapts the Stripe/Railway "open browser + poll" pattern for third-party sites.
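The poll-until-login loop can be sketched as below. The browser-side visibility check is injected, and the clock/sleep parameters exist only to make the loop testable — both are assumptions for illustration:

```python
import time

def wait_for_login(selector_visible, timeout: float = 300.0,
                   interval: float = 2.5,
                   clock=time.monotonic, sleep=time.sleep) -> bool:
    """Poll until the logged-in selector appears or the timeout expires.
    `selector_visible` is the browser-side check for connectSelector."""
    deadline = clock() + timeout
    while clock() < deadline:
        if selector_visible():
            return True
        sleep(interval)
    return False
```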
+
+### Headless Fallback
+
+```
+$ vana connect github --no-browser
+Not logged in to GitHub.
+Please log in at: https://github.com/login
+Then re-run this command.
+```
+
+Always support non-interactive mode. Print the URL, exit with clear instructions.
+
+### Session State Tracking
+
+Upgrade from boolean `sessionPresent` (directory exists) to `sessionState: "valid" | "expired" | "unknown" | "none"` to capture pre-flight check results.
+
+## Key Takeaways
+
+1. Pre-flight auth checks should be fast (2-5s timeout) and non-interactive
+2. Offer browser-based login when pre-flight fails, with a `--no-browser` escape hatch
+3. `connectURL` + `connectSelector` per connector enables all of the above
+4. Track session validity state, not just session existence
diff --git a/research/cli-design/scope-display.md b/research/cli-design/scope-display.md
new file mode 100644
index 00000000..4ae0ebaa
--- /dev/null
+++ b/research/cli-design/scope-display.md
@@ -0,0 +1,94 @@
+# Scope Display & Source Discovery UX
+
+_As of March 16, 2026_
+
+## Progressive Disclosure: List -> Info -> JSON
+
+Production CLIs universally follow a three-tier pattern for presenting installable items.
+
+### Tier 1: List (scannable, minimal)
+
+```bash
+$ brew list
+git node python@3.11 vim wget
+
+$ gh extension list
+gh-copilot v1.0.0 Installed
+gh-dash v3.7.0 Installed
+```
+
+Pattern: **name + version/tag + one status field**. No descriptions, no capabilities, no URLs.
+
+### Tier 2: Info (one item, expanded detail)
+
+```bash
+$ brew info node
+==> node: stable 21.7.1
+Fast, unobstructed, and flexible JavaScript runtime
+https://nodejs.org/
+Installed: /opt/homebrew/Cellar/node/21.7.1 (2,345 files, 78.3MB)
+Dependencies: brotli, c-ares, icu4c, libnghttp2, libuv, openssl@3
+
+$ npm view react
+react@18.3.1 | MIT | deps: 1 | versions: 1987
+React is a JavaScript library for building user interfaces.
+https://react.dev/
+```
+
+Pattern: **name + version + one-line description + metadata block** (deps, size, URL, tags).
+
+### Tier 3: JSON (machine-readable, everything)
+
+`brew info --json`, `npm view --json`, `docker inspect`, `terraform providers -json`. Complete structured data, no formatting.
+
+| Tier | Fields | Audience |
+| -------- | ----------------------------------------------- | ----------------- |
+| **List** | Name, version, status indicator | Humans scanning |
+| **Info** | Name, version, description, category, deps, URL | Humans evaluating |
+| **JSON** | All fields including IDs, checksums, timestamps | Scripts, agents |
+
+## Capabilities/Scopes Display
+
+**terraform plan** shows capabilities as actions with symbols (`+` add, `~` change, `-` destroy). **OAuth consent screens** show scopes as human-readable permissions ("Read your profile", "Access your repositories").
+
+Tools that include descriptions in list output truncate to fit terminal width. No wrapping.
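The truncate-not-wrap rule is a one-liner in practice. A sketch (function name and ellipsis choice are assumptions):

```python
import shutil
from typing import Optional

def fit(text: str, width: Optional[int] = None) -> str:
    """Truncate a list-output line to the terminal width; never wrap."""
    if width is None:
        width = shutil.get_terminal_size(fallback=(80, 24)).columns
    return text if len(text) <= width else text[: max(width - 1, 0)] + "…"
```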
+
+## Recommendations for Vana CLI
+
+### `vana sources list` (Tier 1)
+
+```
+NAME STATUS LAST COLLECTED COLLECTS
+github connected 3d ago repos, commits, stars
+spotify connected 12h ago listening, playlists
+chatgpt available -- conversations
+twitter available -- posts, bookmarks
+```
+
+Name, connection status, recency, one-line scope summary. No descriptions.
+
+### `vana sources info <source>` (Tier 2)
+
+```
+GitHub
+ Status: connected
+ Collected: 3 days ago (weekly recommended)
+ Version: 1.2.0
+ Category: developer
+ Collects: repositories, commits, stars, profile
+ Auth: browser session at github.com
+ Description: Exports your GitHub activity data.
+```
+
+### `vana sources list --json` (Tier 3)
+
+Full structured data including checksums, URLs, selectors, timestamps, frequency metadata.
+
+### Design Principles
+
+1. **List is for scanning** -- 4-5 columns max, truncate descriptions
+2. **Info is for deciding** -- everything needed to evaluate
+3. **JSON is for machines** -- all fields, no formatting
+4. **Scopes as plain language** -- "repos, commits, stars" not "repo:read, commit:list"
+5. **Status indicators in list** -- connected/available/error at a glance
+6. **Descriptions in info only** -- never in list output
diff --git a/research/cli-design/sdk-ux.md b/research/cli-design/sdk-ux.md
new file mode 100644
index 00000000..02254e35
--- /dev/null
+++ b/research/cli-design/sdk-ux.md
@@ -0,0 +1,345 @@
+# CLI and SDK UX Research for `vana-connect`
+
+_As of March 12, 2026_
+
+## Goal
+
+This note captures current reference points for designing a strong `vana-connect` CLI around an SDK. The question is not "what can we wrap from `scripts/`?" but "what shape should a best-in-class connector CLI take in 2026?"
+
+There is no objective industry ranking for "greatest UX of all time," but a small set of tools are repeatedly treated as gold standards for developer experience. The most useful references for `vana-connect` cluster into three groups:
+
+- General-purpose developer CLIs
+- SDK/API design leaders
+- Agentic / interactive terminal tools
+
+## Shortlist
+
+### Best CLI references
+
+#### 1. GitHub CLI (`gh`)
+
+Why it matters:
+
+- Clear command grammar
+- Good balance between interactive use and scripting
+- Consistent flags and help output
+- Strong machine-readable output via `--json`
+- Extensible without becoming confusing
+
+What to learn:
+
+- Top-level nouns and verbs are easy to predict
+- Commands work both interactively and non-interactively
+- The CLI never feels trapped in one mode
+
+Sources:
+
+- https://cli.github.com/manual/
+- https://cli.github.com/manual/gh_pr_status
+- https://docs.github.com/en/github-cli/github-cli/using-github-cli-extensions
+
+#### 2. `uv`
+
+Why it matters:
+
+- Probably the clearest recent benchmark for fast, low-friction CLI design
+- Single-binary feel
+- Extremely good defaults
+- Minimal conceptual overhead
+- Strong "just run the thing" ergonomics
+
+What to learn:
+
+- Speed is UX
+- Reduce ceremony
+- Prefer commands users can guess before reading docs
+- Compress common workflows into short, memorable commands
+
+Sources:
+
+- https://docs.astral.sh/uv/
+- https://docs.astral.sh/uv/guides/tools/
+- https://docs.astral.sh/uv/concepts/tools/
+
+#### 3. Vercel CLI
+
+Why it matters:
+
+- Excellent onboarding flow
+- Strong local-to-cloud workflow
+- Good project linking model
+- Very polished auth, env, deploy, and logs UX
+
+What to learn:
+
+- Make first run feel guided
+- Model explicit project/account linkage
+- Make it obvious what context the user is operating in
+
+Sources:
+
+- https://vercel.com/docs/cli
+- https://vercel.com/docs/cli/link
+- https://vercel.com/docs/projects/deploy-from-cli
+
+#### 4. Fly CLI (`flyctl`)
+
+Why it matters:
+
+- Strong app lifecycle UX
+- Good examples of scaffolding, provisioning, deploy, inspect, and operate loops
+
+What to learn:
+
+- Provide a coherent workflow end-to-end, not just isolated commands
+- Treat diagnostics as a first-class feature
+
+Sources:
+
+- https://fly.io/docs/flyctl/
+- https://fly.io/docs/flyctl/launch/
+- https://fly.io/docs/launch/deploy/
+
+#### 5. Supabase CLI
+
+Why it matters:
+
+- Good bridge between local development and hosted state
+- Strong `init`, `start`, `link`, migration, and project context patterns
+
+What to learn:
+
+- Make local workflows explicit
+- Make remote state linkage inspectable and reversible
+
+Sources:
+
+- https://supabase.com/docs/guides/cli/getting-started
+- https://supabase.com/docs/reference/cli/supabase-init
+
+#### 6. Doppler CLI
+
+Why it matters:
+
+- Not flashy, but very polished in a sensitive category: secrets and environment configuration
+
+What to learn:
+
+- Secret handling UX should feel deliberate, safe, and unsurprising
+- Local environment commands should be trustworthy and inspectable
+
+Sources:
+
+- https://docs.doppler.com/docs/cli
+- https://docs.doppler.com/docs/install-cli
+
+## Best SDK / API DX references
+
+#### 1. Stripe
+
+Why it matters:
+
+- Still the canonical API DX reference
+- Great docs, examples, and SDK consistency
+- Excellent operational ergonomics: test mode, request IDs, idempotency, webhook tooling
+
+What to learn:
+
+- Design the SDK first, not the CLI first
+- Make failures diagnosable
+- Build for both quickstarts and production reliability
+- Support great local testing loops
+
+Sources:
+
+- https://docs.stripe.com/api
+- https://docs.stripe.com/sdks/server-side
+- https://docs.stripe.com/stripe-cli/use-cli
+
+#### 2. viem
+
+Why it matters:
+
+- One of the strongest modern crypto SDK references
+- Strong type safety
+- Clean composable primitives
+- Good separation between transport, client, action, and utility layers
+
+What to learn:
+
+- Keep the SDK modular and typed
+- Expose small composable primitives, not only giant convenience methods
+- Let advanced users build their own workflows from lower-level pieces
+
+Source:
+
+- https://viem.sh/docs/getting-started
+
+#### 3. Bun
+
+Why it matters:
+
+- A useful study in reducing conceptual surface area and compressing common tasks into sharp commands
+
+What to learn:
+
+- Short commands matter
+- Clear defaults matter more than large option surfaces
+
+Sources:
+
+- https://bun.sh/docs
+- https://bun.sh/docs/pm/bunx
+
+## Agentic / interactive CLI references
+
+#### Claude Code and Codex
+
+These are important references, but this category is still too young and fast-moving to treat as settled "all-time" CLI design.
+
+Why they still matter:
+
+- They show how a terminal tool can combine REPL, agent, SDK, and automation surface
+- They make state, streaming output, and intervention loops central to the experience
+
+What to learn:
+
+- Interactive mode should feel alive and stateful
+- Non-interactive mode still needs to exist for automation
+- Tool output must remain legible under streaming conditions
+
+Sources:
+
+- https://docs.anthropic.com/en/docs/claude-code/cli-reference
+- https://docs.anthropic.com/s/claude-code-sdk
+- https://platform.openai.com/docs/guides/code-generation
+- https://platform.openai.com/docs/docs-mcp
+
+## Practical ranking for `vana-connect`
+
+If the goal is to design a best-in-class connector CLI + SDK, the most relevant references are:
+
+- `gh` for command architecture
+- `uv` for speed, defaults, and low-friction execution
+- Vercel for onboarding and project/account context
+- Stripe for SDK-first design and operational ergonomics
+- viem for typed composable SDK structure
+- Supabase and Doppler for environment, context, and local/remote workflow patterns
+
+## Core lessons to carry into `vana-connect`
+
+### 1. Build the SDK first
+
+The CLI should be a thin, excellent interface over a stable SDK. This is the Stripe / viem lesson.
+
+The SDK should likely own:
+
+- connector discovery
+- registry access
+- auth/session management
+- execution lifecycle
+- progress events
+- result validation
+- machine-readable errors
+
+The CLI should own:
+
+- command grammar
+- interactive prompts
+- formatting
+- shell ergonomics
+- user guidance
+
+### 2. Support both human mode and automation mode
+
+The best modern CLIs do not force a choice between "pretty" and "scriptable." They support both.
+
+For `vana-connect`, that likely means:
+
+- human-friendly default output
+- `--json` or line-delimited JSON for automation
+- explicit exit codes
+- predictable stderr/stdout behavior
+
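The dual-mode output rule can be sketched as a single render function that branches on the flag — JSON unformatted for piping, a human line otherwise. This is an illustrative shape, not `vana-connect` code:

```python
import json
import sys

def emit(result: dict, as_json: bool, out=sys.stdout) -> None:
    """Render one result for humans or machines. JSON goes to stdout
    unformatted so it can be piped; progress chatter belongs on stderr."""
    if as_json:
        out.write(json.dumps(result) + "\n")
    else:
        out.write(f"✓ {result['name']}: {result['status']}\n")
```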
+### 3. Treat onboarding as a product surface
+
+The first-run path matters disproportionately.
+
+Good references here are Vercel, Supabase, and Doppler:
+
+- authenticate cleanly
+- detect missing prerequisites
+- explain local state
+- avoid surprising writes
+- make recovery obvious
+
+### 4. Favor a small number of excellent commands
+
+`uv` is the strongest reminder here. Fewer commands, better defaults, less ceremony.
+
+Bad direction:
+
+- turning every script into a top-level command
+
+Better direction:
+
+- identify the core user journeys
+- design commands around those journeys
+- keep lower-level escape hatches for advanced users
+
+### 5. Make diagnostics first-class
+
+Connector tooling lives in a failure-heavy environment:
+
+- auth breaks
+- websites change
+- sessions expire
+- anti-bot systems interfere
+- schemas drift
+
+The CLI should therefore make it easy to inspect:
+
+- current session state
+- connector metadata
+- last run status
+- validation failures
+- captured logs and artifacts
+
+### 6. Interactive UX should be optional, not mandatory
+
+Agentic CLIs are useful references, but `vana-connect` should not default to a TUI unless the workflow genuinely benefits from it.
+
+The baseline should probably remain:
+
+- standard subcommands
+- clear progress output
+- prompts only when needed
+
+Then optionally add:
+
+- richer interactive mode
+- watch mode
+- guided setup / doctor flows
+
+## Provisional conclusion
+
+If we want `vana-connect` to feel elite rather than enterprise-heavy, the strongest design blend is:
+
+- Stripe for system design and reliability
+- viem for SDK shape
+- `gh` for command language
+- `uv` for speed and simplicity
+- Vercel / Supabase for onboarding and context management
+
+This suggests that `vana-connect` should be:
+
+- SDK-first
+- scriptable by default
+- interactive when useful
+- fast to first success
+- explicit about local state, auth, and artifacts
+- designed around a few great workflows rather than a mirror of internal scripts
+
+## Notes on confidence
+
+This document is an informed synthesis, not a formal benchmark study. The ranking is partly based on current official docs and partly on broad developer reputation and observable product behavior as of March 12, 2026.
diff --git a/research/cli-design/version-tracking.md b/research/cli-design/version-tracking.md
new file mode 100644
index 00000000..fcf79d94
--- /dev/null
+++ b/research/cli-design/version-tracking.md
@@ -0,0 +1,102 @@
+# Connector Version Tracking & Update Patterns
+
+_As of March 16, 2026_
+
+## How Production CLIs Show "Update Available"
+
+| CLI | Format | Example |
+| ------------------------- | ----------------------------------------- | ----------------------------------------------------------- |
+| **npm outdated** | `Package Current Wanted Latest` table | `vue 2.6.10 2.7.16 3.5.28` |
+| **brew outdated** | `package (installed) < available` | `node (20.5.0) < 20.6.0` |
+| **apt list --upgradable** | `pkg/repo new-ver [upgradable from: old]` | `brave-browser/stable 1.40.113 [upgradable from: 1.40.107]` |
+| **pip list --outdated** | `Package Version Latest Type` table | `setuptools 39.2.0 40.4.3 wheel` |
+| **rustup update** | `toolchain status - compiler` | `stable updated - rustc 1.79.0` |
+
+Key distinction: npm differentiates "Wanted" (safe auto-update within semver range) from "Latest" (absolute latest, may be breaking). This is the most informative format.
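The patch/minor/major label is cheap to compute from two version strings. A sketch that ignores prerelease tags for simplicity:

```python
def update_type(current: str, latest: str) -> str:
    """Classify an available update by which semver component changed.
    Prerelease/build suffixes are ignored in this sketch."""
    cur = [int(x) for x in current.lstrip("v").split(".")]
    new = [int(x) for x in latest.lstrip("v").split(".")]
    if new[0] != cur[0]:
        return "major"
    if new[1] != cur[1]:
        return "minor"
    if new[2] != cur[2]:
        return "patch"
    return "none"
```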
+
+## Update Mechanics
+
+| CLI | Strategy | Version Check |
+| ---------- | -------------------------------- | ------------------------------------ |
+| **npm** | Registry lookup against lockfile | Only downloads if version changed |
+| **brew** | Downloads fresh, caches bottles | SHA256 verified before extraction |
+| **apt** | Repository metadata check first | Built-in GPG + checksum verification |
+| **rustup** | Channel metadata check | Downloads only changed components |
+
+All re-download on update (none maintain old+new side-by-side). The difference is whether they check before downloading.
+
+## Checksum Verification
+
+**Blocking (synchronous) is universal.** Every major package manager verifies checksums before installing, not after. The pattern:
+
+```
+1. Download artifact
+2. Verify checksum immediately
+3. Mismatch -> error + exit(1)
+4. Match -> extract and use
+```
+
+Async verification is only used for bulk operations (e.g., verifying hundreds of files in parallel). For single-file CLIs, blocking is correct.
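The blocking pattern above is a few lines with `hashlib`. A sketch — the `sha256:` prefix convention is an assumption carried over from the lock-file examples in this note:

```python
import hashlib
from pathlib import Path

def verify_checksum(path: Path, expected: str) -> None:
    """Blocking verification: hash the downloaded artifact and fail hard
    on mismatch, before anything is extracted or executed."""
    digest = "sha256:" + hashlib.sha256(path.read_bytes()).hexdigest()
    if digest != expected:
        raise SystemExit(f"checksum mismatch for {path.name}: "
                         f"expected {expected}, got {digest}")
```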
+
+## Schema Evolution Patterns
+
+Production tools follow a consistent playbook:
+
+1. **Add new fields as optional with defaults** -- old consumers keep working
+2. **Never remove fields without deprecation** -- warn for 2+ versions first
+3. **Semantic versioning for format changes** -- major bump = breaking change
+4. **Provide migration paths explicitly** -- npm prints "Run `npm update` to use new lock file format"
+
+Real example: pip 22.3 removed `--format=freeze` with `--outdated` because the freeze format couldn't show "Latest" version. Migration path: use `--format=json | jq` instead.
+
+## Recommendations for Vana CLI
+
+### Version Display
+
+```
+$ vana connector list --check-updates
+
+ stripe-data v1.2.0 -> v1.2.3 available (patch)
+ facebook-graph v2.1.0 -> v2.2.0 available (minor)
+ google-workspace v3.0.0 -> v4.0.0 available (MAJOR)
+```
+
+### Local Lock File
+
+Store in `~/.vana/connectors.lock`:
+
+```json
+{
+ "connectors": {
+ "stripe-data": {
+ "version": "1.2.0",
+ "checksum": "sha256:abc123...",
+ "downloaded_at": "2026-03-15T10:30:00Z"
+ }
+ }
+}
+```
+
+- Download once, verify checksum, cache locally
+- Re-download only if registry version changes or checksum mismatch
+- Show `[cached] stripe-data@1.2.0` when using cached version
+- Show `[updated] stripe-data: 1.2.0 -> 1.2.3` when downloading new version
+
+### Update Commands
+
+```bash
+vana connector list --check-updates # like brew outdated / npm outdated
+vana connector update stripe-data # update specific
+vana connector update # update all
+```
+
+### Checksum Verification
+
+Synchronous, blocking. Download connector -> verify checksum -> use connector. Fail immediately on mismatch. No async/background verification needed for single connectors.
+
+## Key Takeaways
+
+1. Show current + available versions side-by-side with update type (patch/minor/major)
+2. Lock files with checksums eliminate redundant downloads and enable integrity verification
+3. Blocking checksum verification is the correct default
+4. Schema changes should be additive; use semver for breaking changes
diff --git a/research/personal-data-agents/findings/merged.json b/research/personal-data-agents/findings/merged.json
new file mode 100644
index 00000000..11832845
--- /dev/null
+++ b/research/personal-data-agents/findings/merged.json
@@ -0,0 +1,1719 @@
+{
+ "files_merged": 10,
+ "total_findings": 158,
+ "deduplicated_from": 322,
+ "agent_questions": [
+ "Current autonomous agent duration and reliability benchmarks",
+ "Competitive landscape for reducing human-in-the-loop in coding agents",
+ "Direct research on agent autonomy metrics, memory infrastructure, and personal data context",
+ "Failed personal AI and data-driven assistant products",
+ "Taxonomy of human-in-the-loop interventions in coding agent workflows",
+ "Products combining personal data aggregation with AI agent capabilities",
+ "Missing angles in personal data as agent context research",
+ "Product shapes for generating agent tasks from personal data",
+ "Verification of key numerical claims from wave 1 and wave 2"
+ ],
+ "findings": [
+ {
+ "name": "2025 AI Startup Shutdown Trends",
+ "what": "The market is aggressively filtering for companies with proprietary data advantage, real unit economics, and deep enterprise integration; thin GPT-wrappers are dying",
+ "evidence": "966 startups shut down in 2024 (25.6% increase from 769 in 2023); 2023-2024 cycle rewarded speed and UX leading to a long tail of thin GPT-wrapper products; 2025 data shows market shifted to require proprietary data advantage and real unit economics",
+ "status": "live",
+ "source_urls": [
+ "https://simpleclosure.com/blog/posts/state-of-startup-shutdowns-2025/"
+ ],
+ "sub_theme": "Failed products and lessons",
+ "merged_from": 2
+ },
+ {
+ "name": "Aider",
+ "what": "Open-source CLI AI pair programmer with repository map, automatic context management, and git-native workflow.",
+ "evidence": "39K+ GitHub stars, 4.1M+ installations, 15B tokens processed. 49.2% solve rate on SWE-bench Verified (with Claude 3.5 Sonnet, early 2026). Repository map creates condensed codebase overview. Automatically pulls context from related files. Every AI change gets its own git commit. Connects to 100+ models.",
+ "status": "live",
+ "source_urls": ["https://aider.chat/"],
+ "sub_theme": "CLI-based agent tools",
+ "merged_from": 2
+ },
+ {
+ "name": "Aider - SWE-bench performance (open source)",
+ "what": "Open-source CLI coding agent with published benchmark scores.",
+ "evidence": "Aider achieved 49.2% solve rate on SWE-bench Verified as of March 2026.",
+ "status": "live",
+ "source_urls": [
+ "https://is4.ai/blog/our-blog-1/cline-vs-aider-comparison-2026-313"
+ ],
+ "sub_theme": "Benchmark performance",
+ "merged_from": 2
+ },
+ {
+ "name": "Anthropic - 2026 Agentic Coding Trends Report",
+ "what": "Report based on Claude Code usage data showing how developer-agent collaboration patterns have changed, including session length, multi-file edits, and autonomy delegation.",
+ "evidence": "78% of Claude Code sessions in Q1 2026 involve multi-file edits, up from 34% in Q1 2025. Average session length increased from 4 minutes (autocomplete era) to 23 minutes (agentic era). Tool calls per session average 47. 99.9th percentile turn duration nearly doubled from <25 min to >45 min between Oct 2025 and Jan 2026. Developer acceptance rate of agent changes is 89% when the agent provides a diff summary vs 62% for raw output. Only 0-20% of tasks can be fully delegated.",
+ "status": "live",
+ "source_urls": [
+ "https://resources.anthropic.com/2026-agentic-coding-trends-report"
+ ],
+ "sub_theme": "Taxonomy: trust calibration and oversight strategy shift",
+ "merged_from": 2
+ },
+ {
+ "name": "Anthropic - Constitutional AI",
+ "what": "Research showing how human oversight can be compressed from thousands of individual preference labels to ~10 natural-language principles, with AI self-critique replacing per-instance human review.",
+ "evidence": "Standard RLHF requires tens of thousands of human preference labels. Constitutional AI reduces this to ~10 principles stated in natural language. The model self-critiques and revises its own outputs, then an AI preference model replaces human labelers in the RL phase. Constitutional RL produces Pareto improvements over RLHF (both more helpful and more harmless). Published December 2022.",
+ "status": "live",
+ "source_urls": ["https://arxiv.org/abs/2212.08073"],
+ "sub_theme": "Alignment: replacing human feedback with structured knowledge",
+ "merged_from": 2
+ },
+ {
+ "name": "Anthropic - Effective Harnesses for Long-Running Agents",
+ "what": "Anthropic engineering blog on the core challenge of multi-context-window agent sessions spanning hours or days.",
+ "evidence": "Long-running agents must work in discrete sessions, each starting with no memory of prior work. Context windows are limited, so agents need bridging mechanisms (e.g., claude-progress.txt + git history). Even frontier models like Opus 4.5 running in a loop across multiple context windows fail to build production-quality apps from high-level prompts alone, tending to attempt too much at once. The harness pattern uses different prompts for the first context window vs. subsequent ones.",
+ "status": "live",
+ "source_urls": [
+ "https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents"
+ ],
+ "sub_theme": "Autonomy duration and trust metrics",
+ "merged_from": 2
+ },
+ {
+ "name": "Anthropic - Measuring Agent Autonomy (Claude Code telemetry)",
+ "what": "Anthropic analyzed millions of Claude Code interactions to measure how long agents operate autonomously and how user trust evolves.",
+ "evidence": "Between Oct 2025 and Jan 2026, the 99.9th percentile turn duration nearly doubled from under 25 minutes to over 45 minutes. Median turn duration remained ~45 seconds (fluctuating 40-55s). Users with <50 sessions use full auto-approve ~20% of the time; by 750 sessions this rises to >40%. Experienced users interrupt Claude more often, not less, reflecting a shift from pre-approval to autonomous-with-intervention oversight. Software engineering accounts for ~50% of all agentic tool calls on the API. Only 0.8% of all actions are irreversible.",
+ "status": "live",
+ "source_urls": [
+ "https://www.anthropic.com/research/measuring-agent-autonomy"
+ ],
+ "sub_theme": "Autonomy duration and trust metrics",
+ "merged_from": 2
+ },
+ {
+ "name": "Anthropic - Measuring Agent Autonomy in Practice",
+ "what": "Anthropic analyzed millions of Claude Code interactions to categorize autonomy patterns, interruption types, and how human oversight behavior evolves with experience.",
+ "evidence": "Newer users (<50 sessions) auto-approve ~20% of sessions; by 750 sessions this rises to >40%. Interruption rate increases from ~5% of work steps for new users to ~9% for experienced ones. On complex tasks, Claude self-stops for clarification >2x as often as humans interrupt it. 80% of tool calls come from agents with at least one safeguard in place. 73% of actions appear to have a human in the loop. Only 0.8% of actions are irreversible. Published February 2026.",
+ "status": "live",
+ "source_urls": [
+ "https://www.anthropic.com/research/measuring-agent-autonomy"
+ ],
+ "sub_theme": "Taxonomy: trust calibration and oversight strategy shift",
+ "merged_from": 2
+ },
+ {
+ "name": "Anthropic agent autonomy metrics",
+ "what": "Verified: 99.9th percentile turn duration doubled (25 min to 45 min). Interventions decreased from 5.4 to 3.3.",
+ "evidence": "From Anthropic's own research publication. Auto-approve rates climb from 20% to 40%+.",
+ "status": "live",
+ "source_urls": [
+ "https://www.anthropic.com/research/measuring-agent-autonomy"
+ ],
+ "sub_theme": "Verification",
+ "merged_from": 2
+ },
+ {
+ "name": "Anthropic Agentic Coding Trends 2026",
+ "what": "Comprehensive report on how coding agents are reshaping software development",
+ "evidence": "Documents the shift from autocomplete to agentic coding. Average session 23 minutes. Background tasks, sub-agents, and hooks as autonomy mechanisms.",
+ "status": "live",
+ "source_urls": [
+ "https://resources.anthropic.com/hubfs/2026%20Agentic%20Coding%20Trends%20Report.pdf"
+ ],
+ "sub_theme": "Industry Reports",
+ "merged_from": 2
+ },
+ {
+ "name": "Anthropic Claude Code",
+ "what": "CLI-based autonomous coding agent with hooks, CLAUDE.md context files, memory system, and agent SDK.",
+ "evidence": "$2.5B ARR as of February 2026, reached in ~9 months from public launch (May 2025). Anthropic overall at $19B ARR in March 2026 ($380B valuation, $30B Series G). Uses 6-layer memory system loaded at session start. Hooks provide deterministic lifecycle control (session start, prompt submit). CLAUDE.md files provide static per-project context. Agent SDK powers the same execution loop externally.",
+ "status": "live",
+ "source_urls": [
+ "https://www.uncoveralpha.com/p/anthropics-claude-code-is-having"
+ ],
+ "sub_theme": "Coding agent products",
+ "merged_from": 2
+ },
+ {
+ "name": "Anthropic Claude Memory (Markdown-based)",
+ "what": "Anthropic chose transparent file-based memory (CLAUDE.md files) over vector databases, now available for Pro, Max, Team, and Enterprise users",
+ "evidence": "Launched September 2025 for Team/Enterprise; expanded to Pro and Max; 1M token context window generally available for Opus 4.6 and Sonnet 4.6; MCP protocol enables connections to Google Drive, Slack, GitHub, Postgres; Desktop Extensions for one-click MCP server installation",
+ "status": "live",
+ "source_urls": [
+ "https://platform.claude.com/docs/en/agents-and-tools/tool-use/memory-tool"
+ ],
+ "sub_theme": "Memory and context infrastructure",
+ "merged_from": 2
+ },
+ {
+ "name": "Anthropic Measuring Agent Autonomy",
+ "what": "Anthropic published real-world data on Claude Code autonomous operation duration and intervention frequency",
+ "evidence": "Between Oct 2025 and Jan 2026, 99.9th percentile turn duration nearly doubled from under 25 min to over 45 min. Average human interventions per session decreased from 5.4 to 3.3. Success rate on challenging tasks doubled.",
+ "status": "live",
+ "source_urls": [
+ "https://www.anthropic.com/research/measuring-agent-autonomy"
+ ],
+ "sub_theme": "Autonomous Agent Duration",
+ "merged_from": 2
+ },
+ {
+ "name": "Apple Intelligence (Personal Context)",
+ "what": "Apple's on-device AI system that was supposed to give Siri access to emails, messages, photos, and personal data for contextual assistance, but has been repeatedly delayed.",
+ "evidence": "Announced WWDC 2024. Personal context features (Siri accessing emails, photos, messages, form-filling from driver's license photos, etc.) delayed from 2025 to spring 2026. Craig Federighi stated: 'when it comes to automating capabilities on devices in a reliable way, no one's doing it really well right now.' On-device model is ~3B parameters on Apple Silicon. As of Feb 2026, 'Siri testing isn't going well' per AppleInsider; features may slip past iOS 26.4. Apple has ~2.2B active devices that would receive this.",
+ "status": "announced",
+ "source_urls": [
+ "https://www.cnbc.com/2025/03/07/apple-delays-siri-ai-improvements-to-2026.html"
+ ],
+ "sub_theme": "Platform-native personal AI",
+ "merged_from": 2
+ },
+ {
+ "name": "Apple Intelligence On-Device Foundation Models",
+ "what": "Apple deployed a ~3B parameter on-device model on Apple silicon, enabling AI features without sending user data to external servers, with zero API costs for developers",
+ "evidence": "Supports 16 languages (expanded from English-only); Foundation Models framework gives developers direct access at zero inference cost; 2.3 billion active Apple devices in 2025; Siri overhaul expected spring 2026 with multi-step task capability",
+ "status": "live",
+ "source_urls": [
+ "https://machinelearning.apple.com/research/apple-foundation-models-2025-updates"
+ ],
+ "sub_theme": "On-device personal AI",
+ "merged_from": 2
+ },
+ {
+ "name": "ARF Data Sharing Trust Study",
+ "what": "Trust in AI surged 16 points in 2025 but consumers take an increasingly transactional view of data sharing, demanding clear value exchange",
+ "evidence": "Nearly 60% of consumers willing to share data for personalized shopping recommendations; willingness varies sharply by context; only 19% trust AI in finance",
+ "status": "live",
+ "source_urls": [
+ "https://www.prnewswire.com/news-releases/trust-in-ai-surges-as-consumers-take-a-more-transactional-view-of-data-sharing-arf-study-finds-302667046.html"
+ ],
+ "sub_theme": "Trust and adoption curve",
+ "merged_from": 2
+ },
+ {
+ "name": "Atlassian HULA - Challenges and Future Directions",
+ "what": "Follow-up paper identifying two major HITL challenges: high computational costs of unit testing for validation and variability in LLM-based code quality evaluation.",
+ "evidence": "Presented at MSR 2025 Industry Track. Highlights that human review remains necessary because automated evaluation (unit tests + GPT-based similarity scoring) is inconsistent and expensive. Suggests the human-in-the-loop burden cannot be eliminated purely by adding automated checks.",
+ "status": "live",
+ "source_urls": ["https://arxiv.org/abs/2506.11009"],
+ "sub_theme": "Taxonomy: staged intervention points in development pipelines",
+ "merged_from": 2
+ },
+ {
+ "name": "Atlassian HULA Framework",
+ "what": "Human-in-the-Loop LLM-based Agents framework deployed at Atlassian for Jira issue resolution, with measured rates of human intervention at each stage of the coding pipeline.",
+ "evidence": "Deployed to 2,600 practitioners by Sept 2024 across 22,000+ eligible issues. Engineers used HULA on 663 work items in 2 months. Plan generation succeeded for 79% of items. Plan approval rate by humans: 82% (433/527). Code generated for 87% of approved plans. 25% reached pull request stage. 59% of generated PRs merged. ~900 total PRs merged. Unit test pass rate: 31% on SWE-bench. Code rated highly similar to human code in 45% of cases. Published November 2024.",
+ "status": "live",
+ "source_urls": ["https://arxiv.org/abs/2411.12924"],
+ "sub_theme": "Taxonomy: staged intervention points in development pipelines",
+ "merged_from": 2
+ },
+ {
+ "name": "Augment Code",
+ "what": "Enterprise AI coding platform focused on understanding large codebases, backed by Eric Schmidt.",
+ "evidence": "$252M raised at $977M valuation (Apr 2024). 188 employees as of Feb 2026. Backed by Eric Schmidt, Index Ventures, Sutter Hill, Lightspeed. Ranks 3rd among 144 active competitors by total funding. No new public funding rounds in 2025-2026.",
+ "status": "live",
+ "source_urls": [
+ "https://techcrunch.com/2024/04/24/eric-schmidt-backed-augment-a-github-copilot-rival-launches-out-of-stealth-with-252m/"
+ ],
+ "sub_theme": "Coding agent products",
+ "merged_from": 2
+ },
+ {
+ "name": "Autonomous Agent Reliability Gap",
+ "what": "Compound failure rates and trust deficits remain the core barrier to reducing human-in-the-loop.",
+ "evidence": "At 85% per-action accuracy, a 10-step workflow succeeds only 20% of the time. 46% of developers actively distrust AI code accuracy, only 3% highly trust it. 66% say top frustration is 'almost right, but not quite'. 99% of enterprise devs experimented with agents in 2025 but mass adoption did not materialize. Merged PRs correlate with small, localized changes; failed PRs are invasive and sprawling.",
+ "status": "live",
+ "source_urls": [
+ "https://medium.com/@vivek.babu/where-autonomous-coding-agents-fail-a-forensic-audit-of-real-world-prs-59d66e33efe9"
+ ],
+ "sub_theme": "Benchmarks and limitations",
+ "merged_from": 2
+ },
+ {
+ "name": "Builder.ai (Natasha)",
+ "what": "AI assistant 'Natasha' that promised anyone could build apps without code, backed by Microsoft and Qatar's sovereign wealth fund.",
+ "evidence": "Raised $445M total including $250M Series D. Valued at $1.2B. Filed for bankruptcy May 2025. Investigation revealed ~700 human engineers in India did most of the work, not AI. Claimed $220M in 2024 revenue; independent audit found ~$50M. Creditor Viola Credit seized $37M triggering insolvency.",
+ "status": "shut down",
+ "source_urls": [
+ "https://techstartups.com/2025/05/24/builder-ai-a-microsoft-backed-ai-startup-once-valued-at-1-2-billion-files-for-bankruptcy-is-ai-becoming-another-com-bubble/"
+ ],
+ "sub_theme": "Human-in-the-loop AI assistants",
+ "merged_from": 2
+ },
+ {
+ "name": "ByteBridge - From Human-in-the-Loop to Human-on-the-Loop",
+ "what": "Analysis of the transition from per-action approval (HITL) to exception-based oversight (HOTL), where agents run autonomously by default and humans intervene only on anomalies.",
+ "evidence": "HITL: human must review and approve before each action. HOTL: agents autonomous by default, human oversight available when it counts. Moving to HOTL doesn't mean abandoning HITL; it means using HITL strategically with optional interventions on demand. Published January 2026.",
+ "status": "live",
+ "source_urls": [
+ "https://bytebridge.medium.com/from-human-in-the-loop-to-human-on-the-loop-evolving-ai-agent-autonomy-c0ae62c3bf91"
+ ],
+ "sub_theme": "Taxonomy: trust calibration and oversight strategy shift",
+ "merged_from": 2
+ },
+ {
+ "name": "CCPA Automated Decision-Making Regulations",
+ "what": "California finalized CCPA regulations specifically addressing automated decision-making technology (ADMT), creating new compliance requirements for AI agents processing personal data",
+ "evidence": "Approved September 22, 2025; compliance required from January 1, 2026 for some provisions, January 1, 2027 for ADMT consumer rights; CPPA issued over $100 million in enforcement actions in 2024; ADMT broadly defined as any technology that processes personal information to replace or substantially replace human decision-making",
+ "status": "live",
+ "source_urls": [
+ "https://www.wiley.law/alert-California-Finalizes-Pivotal-CCPA-Regulations-on-AI-Cyber-Audits-and-Risk-Governance"
+ ],
+ "sub_theme": "Regulatory barriers",
+ "merged_from": 2
+ },
+ {
+ "name": "ChatGPT Agent Connectors",
+ "what": "ChatGPT's built-in connectors for email, calendar, and task management with autonomous scheduling",
+ "evidence": "Can summarize inbox, find meeting availability, schedule recurring tasks (e.g. weekly metrics reports every Monday). Connectors authenticated per user.",
+ "status": "live",
+ "source_urls": ["https://openai.com/index/introducing-chatgpt-agent/"],
+ "sub_theme": "Daily Briefing / Task Generation Products",
+ "merged_from": 2
+ },
+ {
+ "name": "ChatGPT Connectors (data integration layer)",
+ "what": "First-party integrations that let ChatGPT read from Gmail, Google Drive, SharePoint, Dropbox, Box, GitHub, Linear, HubSpot, and more, with synced indexing for some sources.",
+ "evidence": "Google Drive uses synced connector that indexes content in advance for fast retrieval. Most connectors are read-only; actions beyond reading require explicit user confirmation. Available to Team, Enterprise, and Edu plans, not free tier.",
+ "status": "live",
+ "source_urls": [
+ "https://help.openai.com/en/articles/11487775-connectors-in-chatgpt"
+ ],
+ "sub_theme": "LLM platforms with personal data ingestion",
+ "merged_from": 2
+ },
+ {
+ "name": "ChatGPT Full Chat History Memory",
+ "what": "OpenAI expanded ChatGPT memory to reference all past chats, not just explicit memories, creating de facto personal data dossier",
+ "evidence": "April 2025: memory can reference all past chats; June 2025: rolled out to free users; Incognito mode available for all; privacy team lead says 'memories are only visible to you'; context collapse problem demonstrated by location inference across unrelated conversations",
+ "status": "live",
+ "source_urls": ["https://www.axios.com/2025/07/11/chatgpt-memory-update"],
+ "sub_theme": "Memory and context infrastructure",
+ "merged_from": 2
+ },
+ {
+ "name": "ChatGPT Memory + Connectors",
+ "what": "OpenAI's ChatGPT retains persistent facts across conversations and now connects to Gmail, Google Drive, GitHub, Outlook, and other services for real-time personal context.",
+ "evidence": "800M+ weekly active users as of Sep 2025. $13B ARR as of Aug 2025. Memory launched Feb 2024; expanded Apr 2025 to reference all past conversations (not just saved memories). Free users get short-term continuity; Plus/Pro get long-term memory. Connectors (Gmail, Google Calendar, Google Drive, GitHub, Outlook, Teams, etc.) available to Team/Enterprise/Edu plans. MCP-based partner connectors (Stripe, Amplitude, Monday.com, etc.) launched 2025. 5M paying business users as of Aug 2025.",
+ "status": "live",
+ "source_urls": [
+ "https://openai.com/index/memory-and-new-controls-for-chatgpt/"
+ ],
+ "sub_theme": "LLM platforms with personal data ingestion",
+ "merged_from": 2
+ },
+ {
+ "name": "ChatGPT Pulse",
+ "what": "OpenAI's proactive daily briefing feature that researches overnight based on user memory and chat history, delivering personalized morning updates without a prompt.",
+ "evidence": "Launched September 25, 2025 as early preview for Pro users on iOS and Android. Synthesizes information from user memory, chat history, and direct feedback. Shelved in December 2025 when CEO Sam Altman issued 'Code Red' to refocus on core ChatGPT improvements amid competitive pressure.",
+ "status": "shut down",
+ "source_urls": [
+ "https://techcrunch.com/2025/09/25/openai-launches-chatgpt-pulse-to-proactively-write-you-morning-briefs/"
+ ],
+ "sub_theme": "Platform-native scheduled agents",
+ "merged_from": 2
+ },
+ {
+ "name": "ChatGPT Scheduled Tasks",
+ "what": "OpenAI's built-in feature for scheduling recurring prompts that execute at predetermined times (daily, weekly, monthly) and deliver results via push notification or email.",
+ "evidence": "Launched January 2025 in beta for Plus, Pro, and Teams plans. Limit of 10 active tasks at a time. Tasks run independently of whether the user is online. Available on web, Android, iOS, macOS. Still in beta as of 2026.",
+ "status": "live",
+ "source_urls": [
+ "https://help.openai.com/en/articles/10291617-scheduled-tasks-in-chatgpt"
+ ],
+ "sub_theme": "Platform-native scheduled agents",
+ "merged_from": 2
+ },
+ {
+ "name": "Claude Code $2.5B ARR",
+ "what": "Verified: Claude Code run-rate revenue above $2.5B, accounts for over half of Anthropic enterprise spending",
+ "evidence": "Confirmed by multiple sources. Anthropic at $14B total ARR. Claude Code reached $1B ARR within 6 months of May 2025 launch. $30B Series G at $380B valuation (Feb 2026).",
+ "status": "live",
+ "source_urls": [
+ "https://www.saastr.com/anthropic-just-hit-14-billion-in-arr-up-from-1-billion-just-14-months-ago/"
+ ],
+ "sub_theme": "Verification",
+ "merged_from": 2
+ },
+ {
+ "name": "Claude Code - Sub-agent and background task architecture",
+ "what": "Claude Code supports parallel sub-agents and background tasks but with significant isolation constraints.",
+ "evidence": "Sub-agents work in isolation with their own context windows and no direct awareness of other sub-agents. They cannot collaborate in real-time; the main agent must wait for all reports before synthesizing. Sub-agents have temporary context windows and cannot ask clarifying questions. Background tasks (Ctrl+B) keep long-running processes active without blocking the main session. Sustained multi-day autonomy on real codebases is described as 'not a solved problem' due to context limitations and brittleness.",
+ "status": "live",
+ "source_urls": [
+ "https://www.anthropic.com/news/enabling-claude-code-to-work-more-autonomously"
+ ],
+ "sub_theme": "Agent architecture and limitations",
+ "merged_from": 2
+ },
+ {
+ "name": "Claude Code Action (GitHub)",
+ "what": "Anthropic's GitHub Action that responds to @claude mentions, issue assignments, and PR comments to implement code changes and answer questions.",
+ "evidence": "Available on GitHub as of February 2026 via Agent HQ for Copilot Pro+ and Enterprise customers. Can commit code and comment on pull requests. Intelligently detects when to activate based on workflow context.",
+ "status": "live",
+ "source_urls": ["https://github.com/anthropics/claude-code-action"],
+ "sub_theme": "Autonomous coding agents",
+ "merged_from": 2
+ },
+ {
+ "name": "Claude Opus 4.6 - SWE-bench Verified",
+ "what": "Latest Claude model achieves top-tier autonomous coding benchmark scores.",
+ "evidence": "Opus 4.6 reached 81.42% on SWE-bench Verified (with prompt modification). Opus 4.5 scored 80.9%. Sonnet 4.5 scored 77.2%, up from Sonnet 4's 72.7%. Note: OpenAI has stopped reporting Verified scores after finding training data contamination across all frontier models, recommending SWE-Bench Pro instead.",
+ "status": "live",
+ "source_urls": ["https://www.anthropic.com/research/swe-bench-sonnet"],
+ "sub_theme": "Benchmark performance",
+ "merged_from": 2
+ },
+ {
+ "name": "Claude Opus 4.6 Task Horizon",
+ "what": "Claude Opus 4.6 crossed a full work-day task horizon at 14.5 hours",
+ "evidence": "50% time horizon of approximately 14.5 hours (doubling every 123 days). Claude Opus 4.5 was at 4 hours 49 minutes. Average session length increased from 4 minutes (autocomplete era) to 23 minutes (agentic era).",
+ "status": "live",
+ "source_urls": [
+ "https://getbeam.dev/blog/anthropic-agentic-coding-trends-2026.html"
+ ],
+ "sub_theme": "Autonomous Agent Duration",
+ "merged_from": 2
+ },
+ {
+ "name": "Cline",
+ "what": "Open-source autonomous coding agent VS Code extension with Plan/Act modes and MCP integration.",
+ "evidence": "5M+ developers. $32M Series A (Jul 2025) from Emergence Capital. Founded 2024 by Saoud Rizwan. Samsung beta-testing for Device eXperience division. Free and open-source, requires external API keys. Only fully open-source, local-first agent purpose-built for production development.",
+ "status": "live",
+ "source_urls": ["https://cline.bot/"],
+ "sub_theme": "Coding agent products",
+ "merged_from": 2
+ },
+ {
+ "name": "Cognition (Devin + Windsurf)",
+ "what": "Autonomous coding agent that handles full project lifecycle in a secure sandbox, plus IDE-based coding via acquired Windsurf.",
+ "evidence": "$10.2B valuation, $400M+ raised (Founders Fund led). ARR grew from $1M (Sep 2024) to $73M (Jun 2025). Windsurf acquisition (Jul 2025) more than doubled ARR. Total net burn under $20M across company history. Enterprise customers: Goldman Sachs, Citi, Dell, Cisco, Palantir. Devin Wiki and Devin Search added for codebase understanding. Multi-agent dispatch capability added early 2025.",
+ "status": "live",
+ "source_urls": [
+ "https://techcrunch.com/2025/09/08/cognition-ai-defies-turbulence-with-a-400m-raise-at-10-2b-valuation/"
+ ],
+ "sub_theme": "Coding agent products",
+ "merged_from": 2
+ },
+ {
+ "name": "Cognition (Devin) - Funding and Scale",
+ "what": "Cognition's financial trajectory provides context for the market size of human-in-the-loop coding agent products.",
+ "evidence": "$10.2B valuation as of September 2025. $400M raise led by Founders Fund. ARR grew from $1M (Sept 2024) to $73M (June 2025). Combined ARR ~$150M after Windsurf acquisition. Net burn under $20M since founding. Usage-based pricing via Agent Compute Units (ACUs).",
+ "status": "live",
+ "source_urls": [
+ "https://techcrunch.com/2025/09/08/cognition-ai-defies-turbulence-with-a-400m-raise-at-10-2b-valuation/"
+ ],
+ "sub_theme": "Market context",
+ "merged_from": 2
+ },
+ {
+ "name": "Cognition Devin - 2025 Performance Review",
+ "what": "Cognition published Devin's annual performance data showing PR merge rates as a proxy for how often human reviewers accept or reject autonomous agent work.",
+ "evidence": "PR merge rate doubled from 34% to 67% year-over-year. Devin merged hundreds of thousands of PRs total. 4x faster at problem-solving, 2x more efficient in resource consumption vs prior year. Described as 'senior-level at codebase understanding but junior at execution.' Customers include Goldman Sachs, Citi, Santander, Nubank. 33% of PRs are still rejected by human reviewers.",
+ "status": "live",
+ "source_urls": [
+ "https://cognition.ai/blog/devin-annual-performance-review-2025"
+ ],
+ "sub_theme": "Error correction: human as quality gate",
+ "merged_from": 2
+ },
+ {
+ "name": "Cognition Labs (Devin)",
+ "what": "Autonomous coding agent with $10.2B valuation and $73M ARR",
+ "evidence": "Raised $400M in Sept 2025 at $10.2B valuation. ARR grew from $1M (Sept 2024) to $73M (June 2025). Acquired Windsurf, combined ARR ~$155M. Customers include Goldman Sachs, Cisco, Palantir.",
+ "status": "live",
+ "source_urls": [
+ "https://techcrunch.com/2025/09/08/cognition-ai-defies-turbulence-with-a-400m-raise-at-10-2b-valuation/"
+ ],
+ "sub_theme": "Autonomous Agent Companies",
+ "merged_from": 2
+ },
+ {
+ "name": "Cognition Labs (Devin) - Real-world task completion",
+ "what": "First marketed 'AI software engineer' with widely scrutinized real-world completion rates.",
+ "evidence": "Original Devin resolved 13.86% of real GitHub issues on SWE-bench (7x improvement over prior 1.96% baseline). Independent testing showed 15-30% success rates in practice. Devin 2.0 (April 2025) claimed 83% more junior-level tasks per compute unit vs. 1.x. Pricing dropped from $500/month to $20/month. Cognition hit $155M ARR in July 2025 ($73M from Devin, $82M from acquired Windsurf). Raised ~$696M total, valued at $10.2B as of Sept 2025. Goldman Sachs piloted Devin alongside 12,000 developers, reporting 20% efficiency gains.",
+ "status": "live",
+ "source_urls": [
+ "https://cognition.ai/blog/funding-growth-and-the-next-frontier-of-ai-coding-agents"
+ ],
+ "sub_theme": "Autonomous coding products",
+ "merged_from": 2
+ },
+ {
+ "name": "Cognition/Devin $10.2B valuation",
+ "what": "Verified: $400M raise at $10.2B valuation (Sept 2025). ARR $73M growing to ~$155M with Windsurf.",
+ "evidence": "Confirmed by CNBC, TechCrunch. Net burn under $20M total. Triple-digit YoY growth.",
+ "status": "live",
+ "source_urls": [
+ "https://techcrunch.com/2025/09/08/cognition-ai-defies-turbulence-with-a-400m-raise-at-10-2b-valuation/"
+ ],
+ "sub_theme": "Verification",
+ "merged_from": 2
+ },
+ {
+ "name": "Cold Start Personalization Research",
+ "what": "A single AI task can involve 20-30 preference dimensions but individual users care about only 2-4, making cold start a navigation problem in high-dimensional space",
+ "evidence": "February 2025 arxiv paper; with a limited interaction budget, the assistant cannot ask about all dimensions and must find the sparse subset relevant to each user within a handful of questions; strategies include onboarding elicitation, leveraging existing data (e.g., loyalty card history), and real-time streaming updates that turn cold start into a short-term condition",
+ "status": "live",
+ "source_urls": ["https://arxiv.org/html/2602.15012"],
+ "sub_theme": "Cold start problem",
+ "merged_from": 2
+ },
+ {
+ "name": "Compounding error rate in multi-step agent workflows",
+ "what": "Research on why per-step accuracy is misleading for autonomous agent reliability.",
+ "evidence": "If an AI agent achieves 85% accuracy per action, a 10-step workflow only succeeds about 20% of the time (0.85^10 = 0.197). Graham Neubig found agents autonomously solve 30-40% of tasks without human intervention, with performance dropping sharply in domains requiring broader context: administrative work (0% success), financial analysis (8.3% success).",
+ "status": "live",
+ "source_urls": [
+ "https://firstpagesage.com/seo-blog/agentic-ai-statistics/"
+ ],
+ "sub_theme": "Academic research on agent effectiveness",
+ "merged_from": 2
+ },
+ {
+ "name": "Context7 (Upstash)",
+ "what": "MCP server that injects live, up-to-date library documentation into AI coding assistants' context windows.",
+ "evidence": "9,000+ libraries and frameworks indexed. Two core tools: resolve-library-id and query-docs. Integrates with Cursor, VS Code, Claude Code. Built by Upstash. CLI + Skills mode (no MCP required) or MCP mode. Solves the stale training data problem for library APIs.",
+ "status": "live",
+ "source_urls": ["https://upstash.com/blog/context7-llmtxt-cursor"],
+ "sub_theme": "Code intelligence platforms",
+ "merged_from": 2
+ },
+ {
+ "name": "CoPrompter (ACM IUI 2025)",
+ "what": "Framework that makes prompt-to-output alignment measurable by generating evaluation criteria from prompt requirements and letting users score outputs against a checklist.",
+ "evidence": "Generates evaluation criteria questions from prompt requirements. Users edit checklist to define alignment. Addresses the problem that prompt engineers cannot identify all points of misalignment through manual inspection alone. Published 2025.",
+ "status": "live",
+ "source_urls": ["https://dl.acm.org/doi/10.1145/3708359.3712102"],
+ "sub_theme": "Prompt engineering as human-in-the-loop",
+ "merged_from": 2
+ },
+ {
+ "name": "CrewAI",
+ "what": "Multi-agent automation platform for enterprises with structured role-based memory and RAG support.",
+ "evidence": "$18M total raised (Series A $12.5M, Oct 2024, led by Insight Partners). $3.2M revenue as of Jul 2025. Backed by Andrew Ng and Dharmesh Shah (HubSpot). Natively integrates Mem0 for memory. Fastest-growing multi-agent framework by adoption.",
+ "status": "live",
+ "source_urls": [
+ "https://siliconangle.com/2024/10/22/agentic-ai-startup-crewai-closes-18m-funding-round/"
+ ],
+ "sub_theme": "Multi-agent frameworks",
+ "merged_from": 2
+ },
+ {
+ "name": "Cursor $29.3B valuation",
+ "what": "Verified: Cursor raised $2.3B Series D at $29.3B valuation (Nov 2025). Now in talks for $50B round.",
+ "evidence": "Confirmed by CNBC, BusinessWire, TechCrunch. ARR crossed $2B as of March 2026. $1B ARR in 24 months. $1.2B ARR in 2025 per Sacra.",
+ "status": "live",
+ "source_urls": [
+ "https://www.cnbc.com/2025/11/13/cursor-ai-startup-funding-round-valuation.html"
+ ],
+ "sub_theme": "Verification",
+ "merged_from": 2
+ },
+ {
+ "name": "Cursor (Anysphere)",
+ "what": "AI-native code editor with codebase indexing, .cursorrules context, and cloud-based autonomous agents.",
+ "evidence": "$29.3B valuation, $3.4B total raised across 7 rounds. $1.2B ARR in 2025 (1,100% YoY growth from $100M). 1M+ daily active users, 360K paying customers. Indexes entire codebase with custom embedding model. .cursorrules file provides standing instructions for AI. Cloud Agents (new 2026) run autonomously on Cursor infrastructure, startable from browser/phone/Slack.",
+ "status": "live",
+ "source_urls": [
+ "https://techcrunch.com/2025/06/05/cursors-anysphere-nabs-9-9b-valuation-soars-past-500m-arr/"
+ ],
+ "sub_theme": "Coding agent products",
+ "merged_from": 2
+ },
+ {
+ "name": "Cursor (Anysphere) - RL-based Suggestion Filtering",
+ "what": "Cursor uses real-time reinforcement learning on user accept/reject signals to learn when to suggest and when to stay silent, directly encoding human intervention patterns into the model.",
+ "evidence": "Upgraded Tab model produces 21% fewer suggestions but 28% higher acceptance rate. Model receives reward on accept, penalty on reject, nothing on silence. Uses on-policy data from currently-deployed model, retraining multiple times per day. $1B+ ARR as of Nov 2025. $29.3B valuation. 1M+ daily active users (Dec 2025). 9,900% YoY ARR growth.",
+ "status": "live",
+ "source_urls": [
+ "https://analyticsindiamag.com/ai-news-updates/cursor-is-using-real-time-reinforcement-learning-to-improve-suggestions-for-developers/"
+ ],
+ "sub_theme": "Alignment: replacing human feedback with structured knowledge",
+ "merged_from": 2
+ },
+ {
+ "name": "Cursor (Anysphere) Codebase Context at Scale",
+ "what": "AI code editor demonstrating the economics of personal codebase context: $500M ARR, 1M+ daily active users, $20-200/month tiers",
+ "evidence": "Over 1 million daily active users as of December 2025; $500M ARR within 2.5 years; valued at $29.3B; raised $3.4B across 7 funding rounds; RAG-based system analyzes entire codebase for context; Pro tier $20/month, Ultra tier $200/month for 20x usage",
+ "status": "live",
+ "source_urls": ["https://research.contrary.com/company/cursor"],
+ "sub_theme": "Economics of personal data ingestion",
+ "merged_from": 2
+ },
+ {
+ "name": "Data Monetization Market",
+ "what": "Global data monetization market valued at $3.75B in 2024, projected to reach $28.16B by 2033",
+ "evidence": "CAGR of 25.1%; North America 41.21% market share in 2024; 30% of large organizations expected to directly monetize data externally by 2025; 60% of companies cite compliance concerns as primary barrier; organizations commercializing data via APIs report recurring revenue growth exceeding 20% annually",
+ "status": "live",
+ "source_urls": [
+ "https://www.grandviewresearch.com/industry-analysis/data-monetization-market"
+ ],
+ "sub_theme": "Economics of personal data ingestion",
+ "merged_from": 2
+ },
+ {
+ "name": "Data Transfer Initiative (DTI)",
+ "what": "Independent nonprofit (spun out of Google's Data Transfer Project in 2022) with Apple, Google, and Meta as founding partners, building open-source data portability infrastructure",
+ "evidence": "Apple and Google expanded direct photo/video transfer between Google Photos and iCloud Photos; EU DMA designates iOS and Android as services obligated to facilitate effective data portability; Apple and Google announced OS-level switching collaboration in late 2025",
+ "status": "live",
+ "source_urls": [
+ "https://dtinit.org/blog/2024/07/10/DTI-members-new-photo-video-tool"
+ ],
+ "sub_theme": "Data portability infrastructure",
+ "merged_from": 2
+ },
+ {
+ "name": "Dependabot",
+ "what": "GitHub's built-in automated dependency update tool that creates pull requests for outdated or vulnerable dependencies without human prompting.",
+ "evidence": "846,000+ repositories with Dependabot configured (2025 GitHub Octoverse report). 137% year-over-year adoption growth. Supports 30+ ecosystems (npm, pip, Maven, Docker, Go, Terraform, GitHub Actions, pnpm, Bun, Helm, Swift, etc.). Free for all GitHub repositories. More than 75% reduction in remediation time for code maintenance tasks.",
+ "status": "live",
+ "source_urls": ["https://docs.renovatebot.com/bot-comparison/"],
+ "sub_theme": "Automated maintenance (agent without prompting)",
+ "merged_from": 2
+ },
+ {
+ "name": "Devin (Cognition Labs)",
+ "what": "Autonomous AI software developer that completes development tasks end-to-end, from understanding codebases to writing and testing code.",
+ "evidence": "$155M+ ARR, growing from $1M in September 2024 to ~$73M by June 2025. $10.2B valuation after $400M Series C in late 2025. Acquired Windsurf (tens of millions ARR, hundreds of enterprise customers) in July 2025. Key customers include Goldman Sachs, Palantir, Cisco, Mercado Libre. Devin 2.0 (April 2025) dropped pricing from $500/month to $20/month Core plan.",
+ "status": "live",
+ "source_urls": [
+ "https://cognition.ai/blog/devin-annual-performance-review-2025"
+ ],
+ "sub_theme": "Autonomous coding agents",
+ "merged_from": 2
+ },
+ {
+ "name": "Devin - Task-Specific Performance (Independent Tests)",
+ "what": "Independent testing reveals that Devin's success rate varies dramatically by task type, with routine tasks succeeding far more often than complex ones.",
+ "evidence": "SWE-bench: solved 13.86% of issues end-to-end. Complex real-world tasks: ~15% completion without human assistance. Security vulnerability fixes: 1.5 min vs 30 min for humans (20x). File migrations: 3-4 hours vs 30-40 hours for humans (10x). The gap between routine and complex task completion rates explains why human oversight remains critical.",
+ "status": "live",
+ "source_urls": ["https://trickle.so/blog/devin-ai-review"],
+ "sub_theme": "Error correction: human as quality gate",
+ "merged_from": 2
+ },
+ {
+ "name": "Digi.me / Meeco / MyDex Personal Data Stores",
+ "what": "First-generation personal data vault companies offering encrypted user-controlled storage with selective sharing",
+ "evidence": "Digi.me: end-to-end encrypted vault for financial, fitness, medical data; Meeco: ISO 27001 accredited, Zero Knowledge Value architecture, pioneering since 2012; MyDex: CIC (Community Interest Company) focused on health conditions and identity; all remain small-scale with limited consumer adoption after years of operation",
+ "status": "live",
+ "source_urls": ["https://pmc.ncbi.nlm.nih.gov/articles/PMC9921726/"],
+ "sub_theme": "Data portability infrastructure",
+ "merged_from": 2
+ },
+ {
+ "name": "Dust.tt",
+ "what": "Enterprise AI agent platform using Claude and MCP protocol to create specialized agents connected to company knowledge and business applications.",
+ "evidence": "Raised $21.5M total ($16M Series A led by Sequoia Capital). Hit $6M ARR by Dec 2025. Uses Anthropic Claude models + MCP protocol. MCP integrations with Asana, Jira, GitHub, Google Drive, Gong, Gmail, Google Calendar, Salesforce, HubSpot, Notion. 2026 vision: multi-player agents where teams share context with AI teammates, infrastructure for managing thousands of agents.",
+ "status": "live",
+ "source_urls": [
+ "https://venturebeat.com/ai/dust-hits-6m-arr-helping-enterprises-build-ai-agents-that-actually-do-stuff-instead-of-just-talking"
+ ],
+ "sub_theme": "Enterprise knowledge + AI",
+ "merged_from": 2
+ },
+ {
+ "name": "Dust.tt Enterprise AI Agents",
+ "what": "Enterprise AI agent platform connecting to company data (Notion, Google Drive, Slack, Intercom) for contextual assistance",
+ "evidence": "Founded by ex-Stripe acquisition founders; $21.5M raised including $16M Series A led by Sequoia; $7.3M ARR as of mid-2025 with 66-person team; 80,000 agents created, 12 million conversations in 2025; customers like Clay, Qonto achieve 70%+ weekly AI adoption rates",
+ "status": "live",
+ "source_urls": ["https://dust.tt/blog/dust-wrapped-2025"],
+ "sub_theme": "Memory and context infrastructure",
+ "merged_from": 2
+ },
+ {
+ "name": "EDPB LLM Anonymization Report",
+ "what": "European Data Protection Board clarified that large language models rarely achieve anonymization standards",
+ "evidence": "April 2025 report; controllers deploying third-party LLMs must conduct comprehensive legitimate interests assessments; undermines the argument that LLM training anonymizes personal data",
+ "status": "live",
+ "source_urls": [
+ "https://www.dpocentre.com/data-protection-ai-governance-2025-2026/"
+ ],
+ "sub_theme": "Regulatory barriers",
+ "merged_from": 2
+ },
+ {
+ "name": "Enterprise agent adoption reality check (2025)",
+ "what": "Despite universal experimentation, mass autonomous agent adoption did not materialize in 2025.",
+ "evidence": "Nearly 99% of enterprise developers experimented with AI agents in 2025, but mass adoption never materialized. Bounded scope, human oversight, and specific workflows showed the most pragmatic results. The current practical sweet spot is Level 3-4: supervised autonomy where you provide goals and guardrails, the agent executes independently, and you approve/reject at decision points.",
+ "status": "live",
+ "source_urls": [
+ "https://firstpagesage.com/seo-blog/agentic-ai-statistics/"
+ ],
+ "sub_theme": "Industry adoption reality",
+ "merged_from": 2
+ },
+ {
+ "name": "EU AI Act High-Risk Deadline August 2026",
+ "what": "Full enforcement of high-risk AI system requirements under EU AI Act Annex III takes effect August 2, 2026, with penalties up to 35M EUR or 7% of global turnover",
+ "evidence": "Requires conformity assessments, technical documentation, CE marking, EU database registration; specific attention to data minimization in large-scale ingestion models, purpose limitation across agentic workflows, transparency in conversational interfaces, accuracy in real-time data synthesis",
+ "status": "announced",
+ "source_urls": [
+ "https://www.legalnodes.com/article/eu-ai-act-2026-updates-compliance-requirements-and-business-risks"
+ ],
+ "sub_theme": "Regulatory barriers",
+ "merged_from": 2
+ },
+ {
+ "name": "Facebook M",
+ "what": "Human-assisted AI assistant inside Facebook Messenger that could complete arbitrary tasks like restaurant reservations and shopping.",
+ "evidence": "Launched August 2015 to ~2,000 users in California. Never scaled beyond ~10,000 test users. Over 70% of requests required human operators ('M trainers'). Scaling to Messenger's 1.3B users would have required prohibitively large human workforce. Shut down January 2018 after ~2.5 years.",
+ "status": "shut down",
+ "source_urls": [
+ "https://techcrunch.com/2018/01/08/facebook-is-shutting-down-its-standalone-personal-assistant-m/"
+ ],
+ "sub_theme": "Human-in-the-loop AI assistants",
+ "merged_from": 2
+ },
+ {
+ "name": "Forward Health (CarePods)",
+ "what": "AI-powered autonomous medical kiosks ('CarePods') for self-service primary care.",
+ "evidence": "Raised ~$400M+ (some sources say up to $657M including debt). Planned 3,200 CarePods in 2024; launched only 3, one removed shortly after. Blood draws frequently failed. Patients got stuck inside pods. Revenue under $100M since founding (2016). Shut down all operations November 2024.",
+ "status": "shut down",
+ "source_urls": [
+ "https://www.fiercehealthcare.com/health-tech/primary-care-player-forward-shutters-after-raising-400m-rolling-out-carepods"
+ ],
+ "sub_theme": "Autonomous AI agents in physical world",
+ "merged_from": 2
+ },
+ {
+ "name": "From Prompt Engineering to Prompt Science (ACM / arXiv)",
+ "what": "Research framing prompt engineering as a human-in-the-loop alignment problem, where humans iteratively correct AI misalignment through prompt refinement rather than model retraining.",
+ "evidence": "Identifies four causes of misalignment: unpredictability, lack of transparency, value misalignment, and inherent complexity. Notes alignment is bi-directional (AI to human AND human to AI). A user's definition of alignment evolves over time as they discover new requirements. Proposes multi-phase verification as a systematic replacement for ad-hoc prompt tweaking. Published January 2024.",
+ "status": "live",
+ "source_urls": ["https://arxiv.org/abs/2401.04122"],
+ "sub_theme": "Prompt engineering as human-in-the-loop",
+ "merged_from": 2
+ },
+ {
+ "name": "Gartner Agentic AI Forecast",
+ "what": "Gartner predicts 80% of customer service issues resolved autonomously by 2029",
+ "evidence": "80% autonomous resolution rate, 30% operational cost reduction predicted. Current 2026 reality is more nuanced with HITL still required for edge cases.",
+ "status": "announced",
+ "source_urls": [
+ "https://acuvate.com/blog/2026-agentic-ai-expert-predictions/"
+ ],
+ "sub_theme": "Industry Reports",
+ "merged_from": 2
+ },
+ {
+ "name": "Gemini Scheduled Actions + Goal Scheduled Actions",
+ "what": "Google Gemini's built-in feature for scheduling recurring AI tasks, with a newer 'Goal' variant where the AI monitors objectives and adjusts actions over time.",
+ "evidence": "Scheduled Actions available to Google AI Pro and Ultra subscribers. Limit of 10 active scheduled actions. Goal Scheduled Actions rolled out February 2026, introducing proactive AI that reviews outputs from previous instructions and adjusts next actions. New Android UI launched March 2026 for naming, describing, and scheduling tasks.",
+ "status": "live",
+ "source_urls": [
+ "https://blog.google/products-and-platforms/products/gemini/scheduled-actions-gemini-app/"
+ ],
+ "sub_theme": "Platform-native scheduled agents",
+ "merged_from": 2
+ },
+ {
+ "name": "GitHub Copilot - Revenue Scale",
+ "what": "Copilot's financial scale contextualizes the market for tools where humans review every AI suggestion.",
+ "evidence": "4.7M paid subscribers as of Jan 2026. Paid subscriptions grew ~75% YoY. Nadella stated in 2024 that Copilot was a larger business than all of GitHub at the time of Microsoft's 2018 acquisition ($7.5B). Analyst estimates place ARR in mid-hundreds of millions to approaching $1B.",
+ "status": "live",
+ "source_urls": ["https://www.getpanto.ai/blog/github-copilot-statistics"],
+ "sub_theme": "Market context",
+ "merged_from": 2
+ },
+ {
+ "name": "GitHub Copilot - Suggestion Acceptance Rates",
+ "what": "GitHub's data on how frequently developers accept, modify, or reject AI code suggestions provides the largest-scale dataset on human intervention frequency.",
+ "evidence": "Average acceptance rate: ~30% of suggestions shown. Rises from 28.9% in first 3 months to 32.1% in next 3 months. Java developers: 61% acceptance rate. 88% code retention rate for accepted suggestions. ZoomInfo enterprise study: 33% suggestion acceptance, 20% line acceptance, 72% developer satisfaction. 4.7M paid subscribers (Jan 2026). 20M cumulative users (July 2025). Used by ~90% of Fortune 100.",
+ "status": "live",
+ "source_urls": [
+ "https://docs.github.com/en/copilot/concepts/copilot-usage-metrics/copilot-metrics"
+ ],
+ "sub_theme": "Error correction: human as quality gate",
+ "merged_from": 2
+ },
+ {
+ "name": "GitHub Copilot Coding Agent",
+ "what": "GitHub's autonomous agent that creates pull requests from issues, replacing the sunset Copilot Workspace.",
+ "evidence": "Copilot Workspace (launched as preview April 2024) was sunset May 30, 2025 and rebuilt into the Copilot Coding Agent, GA to all paid subscribers in Sept 2025. The coding agent contributes to ~1.2 million pull requests per month. Developers using Copilot complete tasks 55% faster (study of 4,800 developers). However, 29.1% of Python code generated contains potential security weaknesses. No public autonomous task completion rate benchmark was found. | GA for all paid Copilot subscribers as of Sep 25, 2025. Copilot CLI became GA Feb 25, 2026 with agentic capabilities: plans, builds, reviews, remembers across sessions. Agent mode expanding to JetBrains, Eclipse, Xcode. Excels at low-to-medium complexity tasks in well-tested codebases. Copilot Chat open-sourced in VS Code. | Copilot Workspace (launched as preview April 2024) was sunset May 30, 2025 and rebuilt into the Copilot Coding Agent, GA to all paid subscribers in Sept 2025. The coding agent contributes to ~1.2 million pull requests per month. Developers using Copilot complete tasks 55% faster (study of 4,800 developers). However, 29.1% of Python code generated contains potential security weaknesses. No public autonomous task completion rate benchmark was found. | GA for all paid Copilot subscribers as of Sep 25, 2025. Copilot CLI became GA Feb 25, 2026 with agentic capabilities: plans, builds, reviews, remembers across sessions. Agent mode expanding to JetBrains, Eclipse, Xcode. Excels at low-to-medium complexity tasks in well-tested codebases. Copilot Chat open-sourced in VS Code. | Copilot Workspace (launched as preview April 2024) was sunset May 30, 2025 and rebuilt into the Copilot Coding Agent, GA to all paid subscribers in Sept 2025. The coding agent contributes to ~1.2 million pull requests per month. Developers using Copilot complete tasks 55% faster (study of 4,800 developers). However, 29.1% of Python code generated contains potential security weaknesses. 
No public autonomous task completion rate benchmark was found. | GA for all paid Copilot subscribers as of Sep 25, 2025. Copilot CLI became GA Feb 25, 2026 with agentic capabilities: plans, builds, reviews, remembers across sessions. Agent mode expanding to JetBrains, Eclipse, Xcode. Excels at low-to-medium complexity tasks in well-tested codebases. Copilot Chat open-sourced in VS Code. | Copilot Workspace (launched as preview April 2024) was sunset May 30, 2025 and rebuilt into the Copilot Coding Agent, GA to all paid subscribers in Sept 2025. The coding agent contributes to ~1.2 million pull requests per month. Developers using Copilot complete tasks 55% faster (study of 4,800 developers). However, 29.1% of Python code generated contains potential security weaknesses. No public autonomous task completion rate benchmark was found. | GA for all paid Copilot subscribers as of Sep 25, 2025. Copilot CLI became GA Feb 25, 2026 with agentic capabilities: plans, builds, reviews, remembers across sessions. Agent mode expanding to JetBrains, Eclipse, Xcode. Excels at low-to-medium complexity tasks in well-tested codebases. Copilot Chat open-sourced in VS Code. | Available with Copilot Pro, Pro+, Business, and Enterprise plans. Runs in secure GitHub Actions-powered environment. Automates branch creation, commits, PR opening, and description writing. Targets low-to-medium complexity tasks. All PRs require independent human review; agent cannot approve or merge its own work. As of June 4, 2025, uses one premium request per model request. | Copilot Workspace (launched as preview April 2024) was sunset May 30, 2025 and rebuilt into the Copilot Coding Agent, GA to all paid subscribers in Sept 2025. The coding agent contributes to ~1.2 million pull requests per month. Developers using Copilot complete tasks 55% faster (study of 4,800 developers). However, 29.1% of Python code generated contains potential security weaknesses. 
No public autonomous task completion rate benchmark was found. | GA for all paid Copilot subscribers as of Sep 25, 2025. Copilot CLI became GA Feb 25, 2026 with agentic capabilities: plans, builds, reviews, remembers across sessions. Agent mode expanding to JetBrains, Eclipse, Xcode. Excels at low-to-medium complexity tasks in well-tested codebases. Copilot Chat open-sourced in VS Code. | Available with Copilot Pro, Pro+, Business, and Enterprise plans. Runs in secure GitHub Actions-powered environment. Automates branch creation, commits, PR opening, and description writing. Targets low-to-medium complexity tasks. All PRs require independent human review; agent cannot approve or merge its own work. As of June 4, 2025, uses one premium request per model request.",
+ "status": "live",
+ "source_urls": [
+ "https://github.com/newsroom/press-releases/agent-mode",
+ "https://github.blog/news-insights/product-news/github-copilot-meet-the-new-coding-agent/",
+ "https://docs.github.com/en/copilot/concepts/agents/coding-agent/about-coding-agent"
+ ],
+ "sub_theme": "Autonomous coding products",
+ "merged_from": 4
+ },
+ {
+ "name": "Glean",
+ "what": "Enterprise AI search platform that builds a personalized knowledge graph per employee from 100+ workplace data sources.",
+ "evidence": "$7.2B valuation after $150M Series F in Jun 2025 (up from $4.6B in Sep 2024). $100M+ ARR achieved in under 3 years (fiscal year ending Jan 2025). Integrates with Google Workspace, Microsoft 365, Slack, Salesforce, and more. Sep 2025: launched third-generation assistant with personal graph per employee (tracks projects, collaborators, work style). Personal graph enables agents that summarize weekly work or prepare performance reviews.",
+ "status": "live",
+ "source_urls": [
+ "https://www.glean.com/press/glean-achieves-100m-arr-in-three-years-delivering-true-ai-roi-to-the-enterprise"
+ ],
+ "sub_theme": "Enterprise knowledge + AI",
+ "merged_from": 2
+ },
+ {
+ "name": "Google Agent2Agent Protocol (A2A)",
+ "what": "Open protocol for agent-to-agent communication, task delegation, and capability discovery between AI agents.",
+ "evidence": "Launched April 9, 2025 at Cloud Next. 150+ supporting organizations including Atlassian, PayPal, Salesforce, SAP, ServiceNow. Version 0.3 added gRPC support. Agents advertise capabilities via JSON 'Agent Cards'. Designed as complement to MCP (MCP = tools/context, A2A = agent coordination).",
+ "status": "live",
+ "source_urls": [
+ "https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/"
+ ],
+ "sub_theme": "Agent-to-agent protocols",
+ "merged_from": 2
+ },
+ {
+ "name": "Google CC (Your Day Ahead)",
+ "what": "Google Labs experimental AI agent that sends a daily morning briefing email by connecting to Gmail, Google Calendar, and Google Drive without requiring a search or prompt.",
+ "evidence": "Launched December 16, 2025 in early access for U.S. and Canada users 18+. Built on Gemini models. Users steer CC by replying to the email. Includes thumbs up/down feedback. Can draft emails and create calendar links. Priority access given to Google AI Ultra and Gemini Advanced subscribers.",
+ "status": "live",
+ "source_urls": [
+ "https://blog.google/technology/google-labs/cc-ai-agent/"
+ ],
+ "sub_theme": "Platform-native scheduled agents",
+ "merged_from": 2
+ },
+ {
+ "name": "Google Intent Extraction (Small Models, Big Results)",
+ "what": "Google Research method using small multimodal LLMs to understand sequences of user interactions on web and mobile, decomposing intent extraction into two stages.",
+ "evidence": "Presented at EMNLP 2025. Separates user intent understanding into: (1) summarizing each screen separately, then (2) extracting intent from the sequence of summaries. Makes intent extraction tractable for small on-device models.",
+ "status": "live",
+ "source_urls": [
+ "https://research.google/blog/small-models-big-results-achieving-superior-intent-extraction-through-decomposition/"
+ ],
+ "sub_theme": "Intent inference and task prediction",
+ "merged_from": 2
+ },
+ {
+ "name": "Google Now / Google Assistant personal context features",
+ "what": "Google's proactive personal assistant that surfaced contextual cards from email, calendar, location, and personal data.",
+ "evidence": "Google Now launched 2012, gradually deprecated from 2016. Replaced by Google Assistant, which itself removed 17 'underutilized' features in January 2024 including: checking personal travel itineraries, asking about contacts, sending emails/payments/reservations by voice. Google Assistant itself deprecated March 2026, replaced by Gemini.",
+ "status": "shut down",
+ "source_urls": [
+ "https://blog.google/products/assistant/google-assistant-update-january-2024/"
+ ],
+ "sub_theme": "Big tech personal assistants",
+ "merged_from": 2
+ },
+ {
+ "name": "Google Personal Intelligence",
+ "what": "Google connected Gemini to Gmail, Photos, YouTube history, and Search history under the 'Personal Intelligence' brand, making it the largest-scale personal data AI integration",
+ "evidence": "Launched January 14, 2026 in Gemini app; expanded to AI Mode in Search on January 22, 2026; rolling out to free Gemini users in US as of March 2026; built on Gemini 3 Pro and Flash models; enables cross-referencing private emails with real-time market data, managing travel itineraries",
+ "status": "live",
+ "source_urls": [
+ "https://blog.google/innovation-and-ai/products/gemini-app/personal-intelligence/"
+ ],
+ "sub_theme": "On-device personal AI",
+ "merged_from": 2
+ },
+ {
+ "name": "Greptile",
+ "what": "AI code review and validation platform that indexes GitHub repositories for codebase-aware analysis.",
+ "evidence": "$25M Series A (Sep 2025) led by Benchmark at ~$180M valuation. $45.5M total raised. Founded by Daksh Gupta (Georgia Tech), YC W24. Analyzes repos and validates every code update. Independent code validation layer across repositories.",
+ "status": "live",
+ "source_urls": [
+ "https://siliconangle.com/2025/09/23/greptile-bags-25m-funding-take-coderabbit-graphite-ai-code-validation/"
+ ],
+ "sub_theme": "Code intelligence platforms",
+ "merged_from": 2
+ },
+ {
+ "name": "Humane AI Pin",
+ "what": "Wearable AI device with camera, projector, and LLM access intended to replace smartphones, which failed commercially and was shut down.",
+ "evidence": "Raised $230M in VC. Sold ~10,000 units against a 100,000-unit plan. Returns outpaced sales from May-August 2024. Revenue of ~$9M lifetime. Marques Brownlee review calling it out viewed 8.5M+ times. HP acquired assets for $116M in February 2025, less than half the capital raised. User data deleted after Feb 28, 2025. | Raised $230-242M from investors including Sam Altman and Marc Benioff. Valued at $884-984M in Mar 2023. Returns outpaced sales by summer 2024. Valuation fell to $25M by Oct 2024. Price cut from $699 to $499. Charging case recalled for battery fire risk. HP acquired assets for $116M in Feb 2025. All devices bricked Feb 28, 2025. Only customers who bought after Nov 15, 2024 got refunds. | Raised $230M in VC. Sold ~10,000 units against a 100,000-unit plan. Returns outpaced sales from May-August 2024. Revenue of ~$9M lifetime. Marques Brownlee review calling it out viewed 8.5M+ times. HP acquired assets for $116M in February 2025, less than half the capital raised. User data deleted after Feb 28, 2025. | Raised $230-242M from investors including Sam Altman and Marc Benioff. Valued at $884-984M in Mar 2023. Returns outpaced sales by summer 2024. Valuation fell to $25M by Oct 2024. Price cut from $699 to $499. Charging case recalled for battery fire risk. HP acquired assets for $116M in Feb 2025. All devices bricked Feb 28, 2025. Only customers who bought after Nov 15, 2024 got refunds. | Raised $230M in VC. Sold ~10,000 units against a 100,000-unit plan. Returns outpaced sales from May-August 2024. Revenue of ~$9M lifetime. Marques Brownlee review calling it out viewed 8.5M+ times. HP acquired assets for $116M in February 2025, less than half the capital raised. User data deleted after Feb 28, 2025. | Raised $230-242M from investors including Sam Altman and Marc Benioff. Valued at $884-984M in Mar 2023. Returns outpaced sales by summer 2024. Valuation fell to $25M by Oct 2024. Price cut from $699 to $499. 
Charging case recalled for battery fire risk. HP acquired assets for $116M in Feb 2025. All devices bricked Feb 28, 2025. Only customers who bought after Nov 15, 2024 got refunds. | Raised $230M in VC. Sold ~10,000 units against a 100,000-unit plan. Returns outpaced sales from May-August 2024. Revenue of ~$9M lifetime. Marques Brownlee review calling it out viewed 8.5M+ times. HP acquired assets for $116M in February 2025, less than half the capital raised. User data deleted after Feb 28, 2025. | Raised $230-242M from investors including Sam Altman and Marc Benioff. Valued at $884-984M in Mar 2023. Returns outpaced sales by summer 2024. Valuation fell to $25M by Oct 2024. Price cut from $699 to $499. Charging case recalled for battery fire risk. HP acquired assets for $116M in Feb 2025. All devices bricked Feb 28, 2025. Only customers who bought after Nov 15, 2024 got refunds. | Raised $230M in VC. Sold ~10,000 units against a 100,000-unit plan. Returns outpaced sales from May-August 2024. Revenue of ~$9M lifetime. Marques Brownlee review calling it out viewed 8.5M+ times. HP acquired assets for $116M in February 2025, less than half the capital raised. User data deleted after Feb 28, 2025. | Raised $230-242M from investors including Sam Altman and Marc Benioff. Valued at $884-984M in Mar 2023. Returns outpaced sales by summer 2024. Valuation fell to $25M by Oct 2024. Price cut from $699 to $499. Charging case recalled for battery fire risk. HP acquired assets for $116M in Feb 2025. All devices bricked Feb 28, 2025. Only customers who bought after Nov 15, 2024 got refunds.",
+ "status": "shut down",
+ "source_urls": [
+ "https://techcrunch.com/2025/02/18/humanes-ai-pin-is-dead-as-hp-buys-startups-assets-for-116m/"
+ ],
+ "sub_theme": "Personal AI hardware",
+ "merged_from": 3
+ },
+ {
+ "name": "Humane AI Pin Shutdown",
+ "what": "AI wearable that promised ambient personal AI context failed on reliability, speed, battery life, and heat; sold to HP after burning $230M",
+ "evidence": "Launched late 2024; discontinued February 2025; sold team and IP to HP for $116M; customers' pins remotely disabled after cloud services shut down; reviews cited unreliable, slow, confusing experience | Confirmed by Fortune, TechCrunch, Axios. $230M raised. Device scathingly reviewed. Customers given <10 days notice before brick. | Launched late 2024; discontinued February 2025; sold team and IP to HP for $116M; customers' pins remotely disabled after cloud services shut down; reviews cited unreliable, slow, confusing experience | Confirmed by Fortune, TechCrunch, Axios. $230M raised. Device scathingly reviewed. Customers given <10 days notice before brick.",
+ "status": "shut down",
+ "source_urls": [
+ "https://techcrunch.com/2025/12/09/top-ai-startups-that-shut-down-in-2025-what-founders-can-learn/",
+ "https://techcrunch.com/2025/02/18/humanes-ai-pin-is-dead-as-hp-buys-startups-assets-for-116m/"
+ ],
+ "sub_theme": "Failed products and lessons",
+ "merged_from": 3
+ },
+ {
+ "name": "Inflection AI (Pi)",
+ "what": "Personal AI chatbot focused on empathetic conversation rather than task completion, built by DeepMind co-founder Mustafa Suleyman.",
+ "evidence": "Raised $1.525B at $4B valuation. Had 1M daily active users and 6M monthly active users. In March 2024, Microsoft acqui-hired nearly all 70 employees and paid $650M to license its models. Pi still exists but with usage caps and a skeleton crew. Could not find a sustainable business model despite strong user engagement.",
+ "status": "acquired",
+ "source_urls": ["https://www.eesel.ai/blog/inflection-ai"],
+ "sub_theme": "Personal AI chatbots",
+ "merged_from": 2
+ },
+ {
+ "name": "Intercom Fin AI Agent",
+ "what": "AI customer support agent by Intercom, handling front-line customer service conversations.",
+ "evidence": "Handled 8M+ queries. First version (GPT-powered, 2023) achieved 23% resolution rate. Second version (Claude-powered, 2024) improved to 51% average resolution rate across thousands of customers. Still requires human escalation for ~49% of conversations. Pricing is $0.99 per resolved conversation.",
+ "status": "live",
+ "source_urls": [
+ "https://thelettertwo.com/2024/10/12/intercom-releases-fin-2-ai-agent-switching-anthropic-from-openai/"
+ ],
+ "sub_theme": "Enterprise AI agents",
+ "merged_from": 2
+ },
+ {
+ "name": "LangChain / LangSmith",
+ "what": "Most widely adopted AI agent framework (LangChain) with commercial observability and testing platform (LangSmith).",
+ "evidence": "$125M Series B (Oct 2025) at $1.25B valuation, led by IVP with Sequoia, Benchmark. 47M+ PyPI downloads (Jan 2026). ARR $12-16M+ as of mid-2025 (company says that figure is 'low for where we are today'). LangSmith monthly trace volume 12x'd YoY. Total raised: $160M across seed, Series A, Series B.",
+ "status": "live",
+ "source_urls": [
+ "https://fortune.com/2025/10/20/exclusive-early-ai-darling-langchain-is-now-a-unicorn-with-a-fresh-125-million-in-funding/"
+ ],
+ "sub_theme": "Multi-agent frameworks",
+ "merged_from": 2
+ },
+ {
+ "name": "Limitless (formerly Rewind AI)",
+ "what": "AI wearable pendant that continuously records conversations and meetings, providing AI-powered summaries and searchable personal data.",
+ "evidence": "Raised $33M+ total from Sam Altman, a16z, First Round Capital, NEA. $15M at $350M valuation in May 2023 (495x multiple on $707K ARR). Pendant sold for $99, 100-hour battery life. Free tier: unlimited audio storage + 10 hours AI features/month. Pro: $20/month unlimited AI. Acquired by Meta on December 5, 2025; hardware sales stopped, subscription fees waived for existing users.",
+ "status": "acquired",
+ "source_urls": [
+ "https://techcrunch.com/2025/12/05/meta-acquires-ai-device-startup-limitless/"
+ ],
+ "sub_theme": "Personal data capture and AI",
+ "merged_from": 2
+ },
+ {
+ "name": "Limitless (Rewind AI) Acquired by Meta",
+ "what": "Personal data recording pendant company acquired by Meta to build 'personal superintelligence' wearables, after pivoting from desktop screen recording",
+ "evidence": "Acquired December 2025; raised $33M+ from a16z, First Round, NEA; $15M at $350M valuation in May 2023 (495x multiple on $707K ARR); pendant sold for $99 with $20/month Pro plan; stopped selling hardware post-acquisition; customers moved to free Unlimited Plan",
+ "status": "acquired",
+ "source_urls": [
+ "https://techcrunch.com/2025/12/05/meta-acquires-ai-device-startup-limitless/"
+ ],
+ "sub_theme": "Failed products and lessons",
+ "merged_from": 2
+ },
+ {
+ "name": "Limitless AI (formerly Rewind AI)",
+ "what": "Screen recording and conversation capture AI that evolved from desktop app to wearable pendant, then was acquired by Meta.",
+ "evidence": "Founded 2022 as Rewind AI. Raised $15M at $350M valuation in May 2023 on $707K ARR (495x revenue multiple). Raised $27M total. Rebranded to Limitless Apr 2024 with pivot to wearable pendant. Acquired by Meta Dec 5, 2025 for estimated $200-400M (undisclosed official terms). Hardware sales ceased; existing customers moved to free Unlimited Plan for one year. Team joined Meta Reality Labs (Ray-Ban smart glasses). | Raised $15M at $350M valuation (May 2023) on $707K ARR (495x multiple). Total $33M+ raised from Altman, a16z, NEA. Acquired by Meta Dec 2025. Desktop app shut down Dec 19, 2025. | Raised $15M at $350M valuation (May 2023) on $707K ARR (495x multiple). Total $33M+ raised from Altman, a16z, NEA. Acquired by Meta Dec 2025. Desktop app shut down Dec 19, 2025. | Founded 2022 as Rewind AI. Raised $15M at $350M valuation in May 2023 on $707K ARR (495x revenue multiple). Raised $27M total. Rebranded to Limitless Apr 2024 with pivot to wearable pendant. Acquired by Meta Dec 5, 2025 for estimated $200-400M (undisclosed official terms). Hardware sales ceased; existing customers moved to free Unlimited Plan for one year. Team joined Meta Reality Labs (Ray-Ban smart glasses). | Raised $15M at $350M valuation (May 2023) on $707K ARR (495x multiple). Total $33M+ raised from Altman, a16z, NEA. Acquired by Meta Dec 2025. Desktop app shut down Dec 19, 2025. | Founded 2022 as Rewind AI. Raised $15M at $350M valuation in May 2023 on $707K ARR (495x revenue multiple). Raised $27M total. Rebranded to Limitless Apr 2024 with pivot to wearable pendant. Acquired by Meta Dec 5, 2025 for estimated $200-400M (undisclosed official terms). Hardware sales ceased; existing customers moved to free Unlimited Plan for one year. Team joined Meta Reality Labs (Ray-Ban smart glasses). | Raised $15M at $350M valuation (May 2023) on $707K ARR (495x multiple). Total $33M+ raised from Altman, a16z, NEA. Acquired by Meta Dec 2025. Desktop app shut down Dec 19, 2025. 
| Founded 2022 as Rewind AI. Raised $15M at $350M valuation in May 2023 on $707K ARR (495x revenue multiple). Raised $27M total. Rebranded to Limitless Apr 2024 with pivot to wearable pendant. Acquired by Meta Dec 5, 2025 for estimated $200-400M (undisclosed official terms). Hardware sales ceased; existing customers moved to free Unlimited Plan for one year. Team joined Meta Reality Labs (Ray-Ban smart glasses).",
+ "status": "acquired",
+ "source_urls": [
+ "https://techcrunch.com/2025/12/05/meta-acquires-ai-device-startup-limitless/"
+ ],
+ "sub_theme": "Screen/audio capture + AI recall",
+ "merged_from": 3
+ },
+ {
+ "name": "Limitless AI / Meta acquisition",
+ "what": "Verified: Acquired by Meta Dec 2025. $33M+ total raised. Desktop app sunset Dec 19, 2025.",
+ "evidence": "Confirmed by TechCrunch. $350M valuation on $707K ARR (2023). Pendant discontinued.",
+ "status": "acquired",
+ "source_urls": [
+ "https://techcrunch.com/2025/12/05/meta-acquires-ai-device-startup-limitless/"
+ ],
+ "sub_theme": "Verification",
+ "merged_from": 2
+ },
+ {
+ "name": "Linear Triage Intelligence / Product Intelligence",
+ "what": "Linear's AI that auto-triages incoming issues by suggesting teams, projects, assignees, labels, flagging duplicates, and linking related issues based on backlog patterns.",
+ "evidence": "Product Intelligence launched August 14, 2025 as Technology Preview on Business and Enterprise plans. Auto-apply triage suggestions shipped September 19, 2025. Uses search, ranking, and LLM-based reasoning, drawing on existing backlog as a dataset. Can be configured to auto-apply suggested properties without human approval.",
+ "status": "live",
+ "source_urls": [
+ "https://linear.app/changelog/2025-08-14-product-intelligence-technology-preview"
+ ],
+ "sub_theme": "Productivity tool AI agents",
+ "merged_from": 2
+ },
+ {
+ "name": "Magic Leap",
+ "what": "Augmented reality headset company that promised ambient computing and spatial interfaces.",
+ "evidence": "Raised ~$4.5B total from Google, Qualcomm, Alibaba, AT&T, and Saudi Arabia's PIF. Magic Leap One (2018) priced at $2,300, failed to meet sales targets. Cut ~1,000 workers (half the workforce) in 2020. Pivoted from consumer to enterprise. Cut entire sales and marketing teams (75 jobs) in July 2024. Now pivoting to license optics technology rather than sell headsets.",
+ "status": "live",
+ "source_urls": [
+ "https://www.roadtovr.com/magic-leap-layoff-2024-optics-pivot/"
+ ],
+ "sub_theme": "Ambient computing hardware",
+ "merged_from": 2
+ },
+ {
+ "name": "MCP (Model Context Protocol) Ecosystem",
+ "what": "Anthropic's open protocol for connecting AI models to external data sources and tools, now adopted across the industry as a standard for personal and enterprise data access.",
+ "evidence": "5,800+ MCP servers and 300+ MCP clients as of early 2026. 1,000+ community-built servers covering Google Drive, Slack, databases, and custom systems. OpenAI adopted MCP for ChatGPT partner connectors. Microsoft building Azure API Center integration for MCP server registries. Streamable HTTP transport enables remote MCP servers in production. 2026 roadmap: triggers/events, streamed results, security/authorization hardening, registry/discovery infrastructure.",
+ "status": "live",
+ "source_urls": ["https://en.wikipedia.org/wiki/Model_Context_Protocol"],
+ "sub_theme": "AI memory infrastructure",
+ "merged_from": 2
+ },
+ {
+ "name": "Mem0",
+ "what": "Memory infrastructure layer for AI agents providing persistent, structured memory via API.",
+ "evidence": "$24M Series A (Oct 2025) led by Basis Set Ventures, with Peak XV, GitHub Fund, YC. 41K+ GitHub stars, 13M+ PyPI downloads. 80K+ developers on cloud service. API calls grew from 35M (Q1 2025) to 186M (Q3 2025), ~30% MoM. Natively integrated into CrewAI, Flowise, Langflow. AWS selected Mem0 as exclusive memory provider for their Agent SDK.",
+ "status": "live",
+ "source_urls": [
+ "https://techcrunch.com/2025/10/28/mem0-raises-24m-from-yc-peak-xv-and-basis-set-to-build-the-memory-layer-for-ai-apps/"
+ ],
+ "sub_theme": "Agent memory infrastructure",
+ "merged_from": 2
+ },
+ {
+ "name": "Mem0 $24M raise",
+ "what": "Verified: $24M total (Seed + Series A). 186M API calls Q3 2025.",
+ "evidence": "Confirmed by TechCrunch, company blog. Backed by YC, Peak XV, GitHub Fund.",
+ "status": "live",
+ "source_urls": [
+ "https://techcrunch.com/2025/10/28/mem0-raises-24m-from-yc-peak-xv-and-basis-set-to-build-the-memory-layer-for-ai-apps/"
+ ],
+ "sub_theme": "Verification",
+ "merged_from": 2
+ },
+ {
+ "name": "Mem0 (not Mem.ai)",
+ "what": "Universal memory layer infrastructure for AI applications that stores, updates, and recalls information across conversations, used by AI agent frameworks.",
+ "evidence": "Raised $24M (Seed led by Kindred Ventures, Series A led by Basis Set Ventures, with Peak XV, GitHub Fund, Y Combinator). 41K+ GitHub stars, 13M+ Python package downloads. Processed 186M API calls in Q3 2025 (up from 35M in Q1, ~30% MoM growth). Claims 26% higher response accuracy vs OpenAI's memory, 91% lower p95 latency, 90%+ token cost savings. Exclusive memory provider for AWS Agent SDK. Natively integrated by CrewAI, Flowise, Langflow.",
+ "status": "live",
+ "source_urls": [
+ "https://techcrunch.com/2025/10/28/mem0-raises-24m-from-yc-peak-xv-and-basis-set-to-build-the-memory-layer-for-ai-apps/"
+ ],
+ "sub_theme": "AI memory infrastructure",
+ "merged_from": 2
+ },
+ {
+ "name": "Mem0 Memory Infrastructure",
+ "what": "AI agent memory infrastructure company providing persistent cross-session memory",
+ "evidence": "Raised $24M (Seed + Series A). 186M API calls in Q3 2025. 41K GitHub stars, 13M Python downloads. Graph-based memory with sub-second retrieval.",
+ "status": "live",
+ "source_urls": ["https://mem0.ai/series-a"],
+ "sub_theme": "Agent Memory Infrastructure",
+ "merged_from": 2
+ },
+ {
+ "name": "Mem0 Research: Memory Accuracy Boost",
+ "what": "Research showing persistent memory improves LLM accuracy by 26%",
+ "evidence": "26% accuracy improvement when agents have access to structured persistent memory vs. stateless operation.",
+ "status": "live",
+ "source_urls": ["https://mem0.ai/research"],
+ "sub_theme": "Agent Memory Infrastructure",
+ "merged_from": 2
+ },
+ {
+ "name": "Mem.ai",
+ "what": "AI-powered note-taking app that attempts to be a 'second brain' with AI-driven organization and recall, but has faced criticism for underpowered AI and missing basic features.",
+ "evidence": "Raised $29.1M total ($23.5M Series A led by OpenAI Startup Fund, valued at $110M post-money in Nov 2022). Revenue not publicly disclosed. Criticized for: no offline access, underpowered AI chat ('much inferior version of ChatGPT'), missing basic features (no highlighting, no scroll bar), slow performance. Faces competition from Microsoft OneNote, Google Keep, and Notion. Development appears stalled per user reports.",
+ "status": "live",
+ "source_urls": [
+ "https://medium.com/@theo-james/mem-ai-the-40m-second-brain-failure-burning-the-worlds-money-5f3176a34cbd"
+ ],
+ "sub_theme": "Personal knowledge management + AI",
+ "merged_from": 2
+ },
+ {
+ "name": "Memories.ai",
+ "what": "Building a Large Visual Memory Model (LVMM) for long-term visual memory, founded by two former Meta Reality Labs researchers",
+ "evidence": "Founded 2025 in San Francisco; $8M seed round led by Susa Ventures with Samsung Next participation; focuses on visual memory capabilities for AI",
+ "status": "live",
+ "source_urls": ["https://wellows.com/blog/ai-startups/"],
+ "sub_theme": "Memory and context infrastructure",
+ "merged_from": 2
+ },
+ {
+ "name": "Meta Proactive Chatbots (Project Omni)",
+ "what": "Meta's initiative to train AI chatbots to message users first on Messenger, WhatsApp, and Instagram, using conversation history for personalized re-engagement.",
+ "evidence": "Leaked documents obtained by Business Insider in July 2025. Internal project name 'Project Omni' at data labeling firm Alignerr. Chatbots only send follow-ups within 14 days after user-initiated conversation and if user sent at least 5 messages. Meta projected its generative AI products would generate $2B-$3B in 2025 revenue, up to $1.4T by 2035 (from unsealed court documents).",
+ "status": "announced",
+ "source_urls": [
+ "https://techcrunch.com/2025/07/03/meta-has-found-another-way-to-keep-you-engaged-chatbots-that-message-you-first/"
+ ],
+ "sub_theme": "Proactive consumer AI",
+ "merged_from": 2
+ },
+ {
+ "name": "METR - AI Developer Productivity Study",
+ "what": "Rigorous RCT showing AI tools made experienced open-source developers 19% slower, contradicting their own perception.",
+ "evidence": "16 experienced developers (avg 5 years on their projects, repos averaging 22k+ stars and 1M+ lines) worked on 246 real issues randomly assigned AI-allowed or AI-disallowed. Developers using AI took 19% longer. Developers expected AI to speed them up by 24%, and even after the study still believed it had sped them up by 20%. Less than 44% of AI-generated code was accepted. Tools used: primarily Cursor Pro with Claude 3.5/3.7 Sonnet.",
+ "status": "live",
+ "source_urls": [
+ "https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/"
+ ],
+ "sub_theme": "Academic research on agent effectiveness",
+ "merged_from": 2
+ },
+ {
+ "name": "METR Task Horizon Doubling",
+ "what": "METR tracks task completion time horizons for frontier AI models, finding a 7-month doubling time",
+ "evidence": "Task length that agents complete with 50% reliability has been doubling every 7 months for 6 years. Time Horizon 1.1 added 34% more tasks (228 total) and doubled 8-hour+ tasks (31 total).",
+ "status": "live",
+ "source_urls": ["https://metr.org/time-horizons/"],
+ "sub_theme": "Autonomous Agent Duration",
+ "merged_from": 2
+ },
+ {
+ "name": "METR task horizon doubling (7 months)",
+ "what": "Verified: Task completion time horizons doubling approximately every 7 months",
+ "evidence": "Confirmed by METR's own publications. Time Horizon 1.1 released Jan 2026 with expanded task suite.",
+ "status": "live",
+ "source_urls": ["https://metr.org/time-horizons/"],
+ "sub_theme": "Verification",
+ "merged_from": 2
+ },
+ {
+ "name": "Microsoft Agent Framework (AutoGen + Semantic Kernel)",
+ "what": "Unified open-source agent framework merging AutoGen's multi-agent innovation with Semantic Kernel's enterprise stability.",
+ "evidence": "Announced Oct 2025. Release Candidate 1.0 on Feb 19, 2026. GA targeted end of Q1 2026. Both AutoGen and Semantic Kernel placed in maintenance mode (bug fixes only, no new features). Supports Python and .NET. Consolidates AI workloads into single SDK with observability.",
+ "status": "announced",
+ "source_urls": [
+ "https://visualstudiomagazine.com/articles/2025/10/01/semantic-kernel-autogen--open-source-microsoft-agent-framework.aspx"
+ ],
+ "sub_theme": "Multi-agent frameworks",
+ "merged_from": 2
+ },
+ {
+ "name": "Microsoft Cortana",
+ "what": "Microsoft's voice assistant integrated into Windows, competing with Siri, Alexa, and Google Assistant.",
+ "evidence": "Peak of 145M monthly active users in 2017, but only 10% of Windows 10 users regularly engaged by 2023. Never exceeded 2% voice assistant market share (vs Siri 36%, Google 35%, Alexa 25%). Mobile app removed March 2021. Windows standalone app deprecated August 2023. Replaced by Copilot. Microsoft invested $10B in OpenAI instead.",
+ "status": "shut down",
+ "source_urls": [
+ "https://techcrunch.com/2023/08/04/microsoft-kills-cortana-in-windows-as-it-focuses-on-next-gen-ai/"
+ ],
+ "sub_theme": "Big tech personal assistants",
+ "merged_from": 2
+ },
+ {
+ "name": "Microsoft Recall Privacy Backlash",
+ "what": "Microsoft's Recall feature (screenshots every few seconds for AI context) faced severe backlash, was delayed, redesigned from plaintext to encrypted, and made opt-in",
+ "evidence": "Initial version stored all data in plaintext database; researchers found it captured passwords, financial data, private messages, medical records; Microsoft made it opt-in and added full database encryption; later Gaming Copilot also caught capturing gameplay images and sending data by default",
+ "status": "live",
+ "source_urls": [
+ "https://time.com/6980911/microsoft-copilot-recall-ai-features-privacy-concerns/"
+ ],
+ "sub_theme": "Trust and adoption curve",
+ "merged_from": 2
+ },
+ {
+ "name": "Model Context Protocol (MCP) - Adoption Trajectory",
+ "what": "MCP server downloads grew from ~100K in November 2024 to 8M+ by April 2025; remote MCP servers up nearly 4x since May 2025.",
+ "evidence": "PulseMCP registry lists 5,500+ servers. Enterprise infrastructure support from AWS, Cloudflare, Google Cloud, Microsoft Azure. Claude has 75+ connectors powered by MCP. Claude Code plugin directory launched early 2026 with 72+ plugins across 24 categories.",
+ "status": "live",
+ "source_urls": ["https://mcpmanager.ai/blog/mcp-adoption-statistics/"],
+ "sub_theme": "MCP ecosystem",
+ "merged_from": 2
+ },
+ {
+ "name": "Model Context Protocol (MCP) - Ecosystem Scale",
+ "what": "Open protocol created by Anthropic for connecting AI agents to external tools and data sources, now governed by the Linux Foundation.",
+ "evidence": "10,000+ active public MCP servers. 97M+ monthly SDK downloads. Adopted by ChatGPT, Cursor, Gemini, Microsoft Copilot, VS Code. Donated to Agentic AI Foundation (Linux Foundation) co-founded by Anthropic, Block, OpenAI with support from Google, Microsoft, AWS, Cloudflare, Bloomberg.",
+ "status": "live",
+ "source_urls": [
+ "https://www.anthropic.com/news/donating-the-model-context-protocol-and-establishing-of-the-agentic-ai-foundation"
+ ],
+ "sub_theme": "MCP ecosystem",
+ "merged_from": 2
+ },
+ {
+ "name": "n8n",
+ "what": "Open-source workflow automation platform with native AI agent nodes, LangChain integration, and MCP client/server support, self-hostable.",
+ "evidence": "150K+ GitHub stars in 2025. $40M ARR as of July 2025. Raised $180M at $2.5B valuation in October 2025 (led by Accel, with NVentures/Nvidia). 3,000+ enterprise customers (Vodafone, Delivery Hero, Microsoft). 200,000 active users. Revenue grew 5X after pivoting to AI-friendly approach in 2022. 500+ integrations with dedicated AI Agent nodes, memory, evaluations, and multi-agent orchestration.",
+ "status": "live",
+ "source_urls": ["https://github.com/n8n-io/n8n"],
+ "sub_theme": "Workflow automation with AI",
+ "merged_from": 2
+ },
+ {
+ "name": "NanoClaw",
+ "what": "Lightweight, container-isolated alternative to OpenClaw with scheduled jobs, built on Anthropic's Agents SDK, ~3,900 lines of code across 15 files.",
+ "evidence": "10.5K+ GitHub stars. Runs in Linux containers with filesystem isolation, not merely behind permission checks. Connects to WhatsApp, Telegram, Slack, Discord, Gmail. Entire codebase is auditable at ~3,900 LOC. Same scheduled-job and daily-briefing capabilities as OpenClaw but in a smaller, security-focused package.",
+ "status": "live",
+ "source_urls": ["https://github.com/qwibitai/nanoclaw"],
+ "sub_theme": "Self-hosted proactive agents",
+ "merged_from": 2
+ },
+ {
+ "name": "Narrative Clip (formerly Memoto)",
+ "what": "Wearable lifelogging camera that automatically took a photo every 30 seconds.",
+ "evidence": "Raised $550K on Kickstarter in 2012, then additional millions in VC. Launched Clip 1 and Clip 2 (with Bluetooth/WiFi). Entered voluntary dissolution in September 2016. Could not compete with smartphone cameras and live-streaming services (Periscope, Facebook Live). Most people did not want continuous photo capture.",
+ "status": "shut down",
+ "source_urls": [
+ "https://petapixel.com/2016/09/28/lifelogging-camera-maker-narrative-going-business/"
+ ],
+ "sub_theme": "Lifelogging and personal data capture",
+ "merged_from": 2
+ },
+ {
+ "name": "Netflix FM-Intent / IntentRec",
+ "what": "Netflix's recommendation framework that predicts user session intent using hierarchical multi-task learning, estimating latent intent from short- and long-term implicit signals.",
+ "evidence": "Uses hierarchical multi-task neural network architecture. Estimates latent user intent from short- and long-term implicit signals as proxies. Uses intent prediction to predict next item user will engage with. Published on Netflix Tech Blog and arxiv (updated May 2025).",
+ "status": "live",
+ "source_urls": [
+ "https://netflixtechblog.com/fm-intent-predicting-user-session-intent-with-hierarchical-multi-task-learning-94c75e18f4b8"
+ ],
+ "sub_theme": "Intent inference and task prediction",
+ "merged_from": 2
+ },
+ {
+ "name": "Notion 3.0 AI Agents",
+ "what": "Autonomous agents inside Notion that work for up to 20 minutes on multi-step tasks across hundreds of pages, including breaking projects into tasks and assigning them.",
+ "evidence": "Launched September 18, 2025. Personal Agent works autonomously for up to 20 minutes. Can build project launch plans, break them into tasks, assign them, and draft docs. Pulls context from connected tools (Slack, Google Drive, Teams) via native integrations and MCP. Notion 3.2 (January 2026) brought agents to mobile with multi-model support (GPT-5.2, Claude Opus 4.5, Gemini 3).",
+ "status": "live",
+ "source_urls": ["https://www.notion.com/releases/2025-09-18"],
+ "sub_theme": "Productivity tool AI agents",
+ "merged_from": 2
+ },
+ {
+ "name": "Notion AI",
+ "what": "AI features embedded in Notion's workspace platform, including autonomous agents that execute multi-step workflows across connected data sources.",
+ "evidence": "$500M annualized revenue as of Sep 2025 (up from $400M in 2024, $250M in 2023, $67M in 2022). 100M+ users worldwide, 20M+ monthly active users. AI adoption: crossed 50% of paying customers using AI features in 2025 (up from 10-20% in 2024). Notion 3.0 (Sep 2025) launched autonomous AI Agents with multi-model support (GPT-5, Claude Opus 4.1, o3). AI now bundled into Business/Enterprise plans rather than sold as add-on.",
+ "status": "live",
+ "source_urls": [
+ "https://www.cnbc.com/2025/09/18/notion-launches-ai-agent-as-it-crosses-500-million-in-annual-revenue.html"
+ ],
+ "sub_theme": "Enterprise knowledge + AI",
+ "merged_from": 2
+ },
+ {
+ "name": "Notion AI Agents (3.0)",
+ "what": "Notion launched autonomous AI agents that execute multi-step workflows across workspace data, with 100M+ users providing personal/team context",
+ "evidence": "Notion 3.0 launched September 2025 with autonomous agents; 100M+ users worldwide; multi-model support (GPT-5, Claude Opus 4.1, o3); January 2026 3.2 release brought agents to mobile with intelligent auto-model selection; Business plan $20/user/month required for AI features",
+ "status": "live",
+ "source_urls": ["https://www.notion.com/releases/2025-09-18"],
+ "sub_theme": "Memory and context infrastructure",
+ "merged_from": 2
+ },
+ {
+ "name": "Obsidian + AI Plugins (Copilot, Smart Connections)",
+ "what": "Community-built AI plugins for Obsidian that provide chat-based vault search, personal context processing, and agent mode with tool calling over local markdown files.",
+ "evidence": "Obsidian Copilot plugin provides: Project Mode (AI-ready context from folders/tags), Agent Mode with tool calling, support for markdown/PDF/image/YouTube/URL context, custom system prompts stored as markdown files. Smart Connections plugin offers semantic search across notes. All data stays local in user's vault. No revenue/funding data available (open-source community plugins). Active development through 2026.",
+ "status": "live",
+ "source_urls": ["https://www.obsidiancopilot.com/en"],
+ "sub_theme": "Personal knowledge management + AI",
+ "merged_from": 2
+ },
+ {
+ "name": "OpenAI Codex",
+ "what": "Cloud-sandboxed coding agent with two execution modes: cloud sandbox for parallel background tasks and terminal CLI for local execution.",
+ "evidence": "Available to ChatGPT Plus users from June 2025. GPT-5.2-Codex is latest model (Jan 2026). Sandbox disables internet access during execution, limiting agent to provided repo + pre-installed deps. Auto-detects setup scripts. Configurable internet access and approval controls.",
+ "status": "live",
+ "source_urls": ["https://openai.com/index/introducing-codex/"],
+ "sub_theme": "Coding agent products",
+ "merged_from": 2
+ },
+ {
+ "name": "OpenAI Codex - GPT-5-Codex autonomous duration",
+ "what": "OpenAI's Codex agent demonstrated multi-hour autonomous coding sessions on complex tasks.",
+ "evidence": "GPT-5-Codex worked independently for more than 7 hours at a time on large tasks during testing, iterating on implementation, fixing test failures, and delivering working code. GPT-5.2-Codex runs 24-hour autonomous tasks. Codex-1 achieves 37% accuracy on first attempts on SWE-Bench, scaling to 70.2% with retries and 85% after 8 attempts. GPT-5.3-Codex achieved state-of-the-art on SWE-Bench Pro (56.8%), though this means it still fails nearly half of professional-level tasks.",
+ "status": "live",
+ "source_urls": ["https://openai.com/index/introducing-codex/"],
+ "sub_theme": "Autonomy duration and trust metrics",
+ "merged_from": 2
+ },
+ {
+ "name": "OpenAI Codex 25-hour autonomous run",
+ "what": "Codex ran for 25 hours uninterrupted building a design tool from scratch",
+ "evidence": "Used ~13M tokens, generated ~30k lines of code. GPT-5.3-Codex achieves 77.3% on Terminal-Bench, 56.8% on SWE-Bench Pro.",
+ "status": "live",
+ "source_urls": ["https://openai.com/index/introducing-gpt-5-3-codex/"],
+ "sub_theme": "Autonomous Agent Duration",
+ "merged_from": 2
+ },
+ {
+ "name": "OpenAI Operator/ChatGPT Agent",
+ "what": "OpenAI's computer-using agent integrated into ChatGPT, accessing user connectors and data for agentic web tasks",
+ "evidence": "Launched as Operator January 2025; integrated into ChatGPT as agent July 17, 2025; powered by Computer-Using Agent (CUA) model; accesses user connectors and logged-in websites via takeover mode; OpenAI acknowledged prompt injection may never be fully solved",
+ "status": "live",
+ "source_urls": ["https://openai.com/index/introducing-chatgpt-agent/"],
+ "sub_theme": "Memory and context infrastructure",
+ "merged_from": 2
+ },
+ {
+ "name": "OpenClaw",
+ "what": "Open-source self-hosted AI agent with cron-scheduled daily briefings that pull from connected data sources (Gmail, Calendar, GitHub, RSS, Todoist, Linear, Stripe).",
+ "evidence": "264K GitHub stars as of March 2026, surpassed React as most-starred software project on GitHub on March 3, 2026. Reached 100K stars on January 30, 2026, and 250K by March 3. The daily-briefing-hub skill combines Google Calendar, Gmail/Outlook, weather, GitHub PR/CI status, Hacker News/RSS, and Todoist/ClickUp/Linear tasks into a single prioritized morning summary. Silently skips missing integrations. Codebase supports three schedule types: at (one-time), every (interval), and cron (standard expressions).",
+ "status": "live",
+ "source_urls": ["https://github.com/openclaw/openclaw"],
+ "sub_theme": "Self-hosted proactive agents",
+ "merged_from": 2
+ },
+ {
+ "name": "OpenClaw (daily briefing agent)",
+ "what": "Open-source personal AI assistant with 210K+ GitHub stars, configurable daily briefings from connected data",
+ "evidence": "Breakout project of 2026. Surged from 9K to 210K+ stars. Connects to 50+ messaging apps. Users configure cron jobs for morning briefings pulling calendar, email, CRM, weather. Created by PSPDFKit founder.",
+ "status": "live",
+ "source_urls": [
+ "https://gist.github.com/mberman84/63163d6839053fbf15091238e5ada5c2"
+ ],
+ "sub_theme": "Daily Briefing / Task Generation Products",
+ "merged_from": 2
+ },
+ {
+ "name": "Otter.ai",
+ "what": "Meeting transcription platform that evolved from passive note-taking to voice-activated AI meeting agents that learn from company-wide meeting data.",
+ "evidence": "$100M ARR milestone reached Mar 2025 (up from $81M end of 2024). Raised ~$70M total ($50M in 2021 including $40M led by Spectrum Equity). Mar 2025: launched industry's first voice AI meeting agent suite. Otter Meeting Agent participates in meetings as voice-activated participant, answers questions using real-time transcript + company meeting history. Features vocabulary learning and preference personalization.",
+ "status": "live",
+ "source_urls": [
+ "https://otter.ai/blog/otter-ai-caps-transformational-2025-with-100m-arr-milestone-industry-first-ai-meeting-agents-and-global-enterprise-expansion"
+ ],
+ "sub_theme": "Meeting AI with personal context",
+ "merged_from": 2
+ },
+ {
+ "name": "OWASP Prompt Injection #1 Vulnerability",
+ "what": "Prompt injection ranked #1 critical vulnerability in OWASP 2025 Top 10 for LLM Applications, appearing in 73% of production AI deployments assessed",
+ "evidence": "OpenAI stated in December 2025 that prompt injection 'is unlikely to ever be fully solved'; indirect prompt injection targets where AI systems collect information (documents, emails, web pages); an attacker can embed hidden commands that override user instructions to extract emails, steal personal data, or access passwords",
+ "status": "live",
+ "source_urls": [
+ "https://genai.owasp.org/llmrisk/llm01-prompt-injection/"
+ ],
+ "sub_theme": "Privacy infrastructure for AI",
+ "merged_from": 2
+ },
+ {
+ "name": "Pardon the Interruption (Journal of Management, 2020)",
+ "what": "Integrative review of work interruption research across domains, providing a taxonomy of interruption types, frequencies, and cognitive impacts that predates but directly applies to AI agent workflows.",
+ "evidence": "Published in Journal of Management (2020). Reviews interruption research across workplace contexts. Identifies that interruptions introduce new tasks on top of ongoing activities. Operators must handle multiple simultaneous stimuli and information sources. Provides foundational vocabulary for categorizing agent interruptions.",
+ "status": "live",
+ "source_urls": [
+ "https://journals.sagepub.com/doi/abs/10.1177/0149206319887428"
+ ],
+ "sub_theme": "Interruption patterns beyond coding",
+ "merged_from": 2
+ },
+ {
+ "name": "Perplexity Personal Computer Agent",
+ "what": "Perplexity extending cloud AI agent to desktop, accessing local files and apps with user approval and audit trail",
+ "evidence": "March 2026 announcement; Google Drive integration live for all users; Enterprise Pro/Max users can sync Drive to personal repository; Personal Computer runs in 'secure environment with clear safeguards,' requires user approval for sensitive actions, generates full audit trail, includes kill switch",
+ "status": "announced",
+ "source_urls": [
+ "https://www.technology.org/2026/03/13/perplexity-personal-computer-ai/"
+ ],
+ "sub_theme": "Memory and context infrastructure",
+ "merged_from": 2
+ },
+ {
+ "name": "Personal.ai",
+ "what": "Platform that builds personal AI models from user's data, facts, and opinions to remember and recall memories core to identity.",
+ "evidence": "Raised $7.8-23.8M (sources vary by methodology). $138K annual revenue as of Dec 2023. Founded 2020 in San Francisco. Uses personal AI models trained on individual user's data. Enables organizations to create AI teammates with proprietary knowledge. Latest funding: undisclosed Seed round Sep 2024.",
+ "status": "live",
+ "source_urls": [
+ "https://tracxn.com/d/companies/personal-ai/__km3jbW0uSOjjopNTO7_osOJmYRwGU7-zqIKWilUz1co"
+ ],
+ "sub_theme": "Personal knowledge management + AI",
+ "merged_from": 2
+ },
+ {
+ "name": "Plaid Financial Data Infrastructure",
+ "what": "Financial data aggregation platform demonstrating the value of personal data connectivity, with AI-powered enrichment and over $800M estimated annual revenue",
+ "evidence": "Crossed estimated $800M annual revenue in 2025; 220+ new products/features in 2025; AI-powered auto-repair enabled 2M+ successful user logins and reduced degradation fix time by 90%; launched LendScore credit risk score in October 2025 using real-time cash flow data",
+ "status": "live",
+ "source_urls": ["https://sacra.com/c/plaid/"],
+ "sub_theme": "Economics of personal data ingestion",
+ "merged_from": 2
+ },
+ {
+ "name": "PPP: Training Proactive and Personalized LLM Agents",
+ "what": "Multi-objective reinforcement learning approach optimizing three dimensions: productivity (task completion), proactivity (asking essential questions), and personalization (adapting to user preferences).",
+ "evidence": "Published November 2025. Identifies that existing work focuses primarily on task success but effective real-world agents require jointly optimizing productivity, proactivity, and personalization.",
+ "status": "live",
+ "source_urls": ["https://arxiv.org/abs/2511.02208"],
+ "sub_theme": "Academic research on proactive agents",
+ "merged_from": 2
+ },
+ {
+ "name": "Proactive Agent (OpenReview / ICLR 2025)",
+ "what": "Research paper formalizing proactive agents that anticipate user needs and take initiative by suggesting tasks without explicit requests, with ProactiveBench dataset.",
+ "evidence": "ProactiveBench dataset contains 6,790 events. Defines proactive agent as one that perceives environmental context, infers user intentions without explicit prompts, and autonomously suggests actions. Published October 2024, updated through 2025.",
+ "status": "live",
+ "source_urls": ["https://arxiv.org/abs/2410.12361"],
+ "sub_theme": "Academic research on proactive agents",
+ "merged_from": 2
+ },
+ {
+ "name": "ProAgentBench",
+ "what": "Benchmark for evaluating LLM agents for proactive assistance using real-world data.",
+ "evidence": "Published February 2026. Evaluates LLM agents specifically for proactive (not reactive) assistance capabilities.",
+ "status": "live",
+ "source_urls": ["https://arxiv.org/html/2602.04482v1"],
+ "sub_theme": "Academic research on proactive agents",
+ "merged_from": 2
+ },
+ {
+ "name": "PROBE Benchmark (Beyond Reactivity)",
+ "what": "Benchmark measuring proactive problem-solving in LLM agents, requiring agents to identify and resolve critical bottlenecks hidden in realistic workplace datastores.",
+ "evidence": "1,000 diverse samples. Even state-of-the-art LLMs and specialized agentic frameworks achieve no more than 40% success on end-to-end proactive tasks. Published October 2025.",
+ "status": "live",
+ "source_urls": ["https://arxiv.org/abs/2510.19771"],
+ "sub_theme": "Academic research on proactive agents",
+ "merged_from": 2
+ },
+ {
+ "name": "Rabbit R1",
+ "what": "AI hardware device with 'Large Action Model' intended to replace app-based phone interactions, which suffered massive user abandonment.",
+ "evidence": "Raised $64.7M over 5 rounds from Khosla Ventures and others; valued at $100-150M. Sold ~100,000-130,000 units after the CES 2024 announcement, but only 5,000 were used daily 5 months after launch (~95% abandonment). Supported only 6 apps (Spotify, Uber, DoorDash, etc.) and could not reliably book an Uber or fetch weather. MKBHD called it 'barely reviewable.' Entire interface discovered to be a single Android app. CEO Jesse Lyu admitted it launched prematurely. Reports of unpaid salaries and employee strikes. Still operational in late 2025 with improvements via OTA updates but a minimal user base.",
+ "status": "live",
+ "source_urls": [
+ "https://9to5google.com/2024/09/26/rabbit-5000-people-use-the-r1-daily/",
+ "https://cybernews.com/tech/the-story-of-rabbit-r1/"
+ ],
+ "sub_theme": "Personal AI hardware",
+ "merged_from": 3
+ },
+ {
+ "name": "RAG Cost Economics for Personal Data",
+ "what": "Embedding and indexing personal documents is cheap (~$0.001-0.01/doc), but operational staffing and infrastructure costs often exceed cloud bills",
+ "evidence": "10,000 documents can be embedded and indexed for under $100; RAG cuts fine-tuning spend by 60-80%; but operational staffing for small teams costs $750k+ annually; fine-tuning a 70B model costs $50k-200k in compute; Google embedding generation at $0.15 per 1M tokens",
+ "status": "live",
+ "source_urls": [
+ "https://thedataguy.pro/blog/2025/07/the-economics-of-rag-cost-optimization-for-production-systems/"
+ ],
+ "sub_theme": "Economics of personal data ingestion",
+ "merged_from": 2
+ },
+ {
+ "name": "Rakuten + Claude Code - Autonomous Complex Task",
+ "what": "Case study of a coding agent completing a complex task in a 12.5M-line codebase with minimal human intervention, illustrating the upper bound of current agent autonomy.",
+ "evidence": "Rakuten engineers tested Claude Code on implementing activation vector extraction in vLLM (12.5M lines, multiple languages). Agent completed the task in 7 hours of autonomous work with 99.9% numerical accuracy. Cited in Anthropic's 2026 Agentic Coding Trends Report.",
+ "status": "live",
+ "source_urls": [
+ "https://resources.anthropic.com/2026-agentic-coding-trends-report"
+ ],
+ "sub_theme": "Error correction: human as quality gate",
+ "merged_from": 2
+ },
+ {
+ "name": "RedMonk - 10 Things Developers Want from Agentic IDEs (2025)",
+ "what": "Analyst report categorizing what developer-practitioners actually demand from agent oversight, distinguishing between trust controls, permission systems, and audit trails.",
+ "evidence": "Developers want: fine-grained permissions for what agents can/cannot do autonomously; approval gates before destructive actions (rm -rf, database writes, deployments); configurable autonomy levels per task type; clear audit trails of every agent action. Microsoft and Red Hat's MCP approach requires least-privilege permissions and surfaces all sensitive operations to the user.",
+ "status": "live",
+ "source_urls": [
+ "https://redmonk.com/kholterhoff/2025/12/22/10-things-developers-want-from-their-agentic-ides-in-2025/"
+ ],
+ "sub_theme": "Taxonomy: approval gates (trust) vs context provision (knowledge)",
+ "merged_from": 2
+ },
+ {
+ "name": "Relyance AI Consumer Trust Survey",
+ "what": "Survey showing 4 in 5 consumers believe companies train AI on their data without disclosure; only ~1 in 10 willing to share sensitive data",
+ "evidence": "2025 survey; 82% see data loss threat; only around 10% very willing to share financial, communication, or biometric data; more than half not willing to share even in exchange for better digital experiences",
+ "status": "live",
+ "source_urls": ["https://www.relyance.ai/consumer-ai-trust-survey-2025"],
+ "sub_theme": "Trust and adoption curve",
+ "merged_from": 2
+ },
+ {
+ "name": "Renovate (Mend.io)",
+ "what": "Cross-platform automated dependency update tool supporting 90+ package managers, creating PRs for dependency updates with grouping, scheduling, and auto-merge.",
+ "evidence": "90+ package manager support (broader than Dependabot's 30+). Works across GitHub, GitLab, Bitbucket, Azure DevOps. Claims approximately 90% time savings for dependency updates (Mend ROI whitepaper). Supports grouping, scheduling, auto-merge policies.",
+ "status": "live",
+ "source_urls": ["https://github.com/renovatebot/renovate"],
+ "sub_theme": "Automated maintenance (agent without prompting)",
+ "merged_from": 2
+ },
+ {
+ "name": "Rewind AI / Limitless",
+ "what": "Desktop activity recorder that captured everything on your screen and made it searchable, later pivoted to AI wearable pendant.",
+ "evidence": "Raised $33M+ from a16z, First Round Capital, NEA. Rewind's local processing approach limited it to users' CPUs, preventing use of best LLMs. Collected ~11GB per user per month. Rebranded to Limitless April 2024 and launched $99 hardware pendant. Acquired by Meta December 2025; Rewind desktop app shut down, pendant sales halted.",
+ "status": "acquired",
+ "source_urls": [
+ "https://techcrunch.com/2025/12/05/meta-acquires-ai-device-startup-limitless/"
+ ],
+ "sub_theme": "Lifelogging and personal data capture",
+ "merged_from": 2
+ },
+ {
+ "name": "Samsung Personal Data Engine (PDE)",
+ "what": "Samsung's on-device Personal Data Engine powered by RDFox knowledge graph technology creates hyper-personalized experiences while keeping data on-device",
+ "evidence": "Unveiled at Galaxy UNPACKED 2025 with Galaxy S25 series; powered by RDFox from Oxford Semantic Technologies; secured by Knox Vault and KEEP encryption; drives Now Brief and Smart Gallery search; Galaxy S26 (February 2026) added hardware-based Privacy Display on Ultra model",
+ "status": "live",
+ "source_urls": [
+ "https://www.computerweekly.com/news/366618319/Samsung-unpacks-Galaxy-AIs-personal-data-engine"
+ ],
+ "sub_theme": "On-device personal AI",
+ "merged_from": 2
+ },
+ {
+ "name": "Screenpipe",
+ "what": "Open-source (MIT) alternative to Rewind AI that continuously captures screen and audio locally, with AI-powered search and recall.",
+ "evidence": "17.2K GitHub stars as of early 2026. 202 subscriptions, $3.5K MRR (all organic). Works on macOS, Windows, Linux. Uses local OpenAI Whisper for speech-to-text. 9 apps in marketplace, 5 more in development. Investors include Embedding VC, Founders Inc, Top Harvest Capital. Major companies (Microsoft, Intel, Oracle, GitHub, Alibaba Cloud) testing it. Has not reached product-market fit per founders.",
+ "status": "live",
+ "source_urls": ["https://screenpi.pe/"],
+ "sub_theme": "Screen/audio capture + AI recall",
+ "merged_from": 2
+ },
+ {
+ "name": "Simon Willison ChatGPT Memory Critique",
+ "what": "Prominent developer flagged ChatGPT's memory feature as creating an unsolicited 'dossier' effect, demonstrating the context collapse problem in personal AI",
+ "evidence": "May 2025; ChatGPT inferred user's location (Half Moon Bay) from prior conversations and inserted it into an unrelated image generation request; illustrates 'context collapse' where data from different spheres of user activity (work, family, hobbies) blurs together",
+ "status": "live",
+ "source_urls": [
+ "https://simonwillison.net/2025/May/21/chatgpt-new-memory/"
+ ],
+ "sub_theme": "Trust and adoption curve",
+ "merged_from": 2
+ },
+ {
+ "name": "Skyflow Privacy Vault",
+ "what": "Privacy-as-infrastructure company providing tokenized data vaults that can serve as a trust layer between personal data and AI models",
+ "evidence": "Raised $100M total equity including $30M Series B extension led by Khosla Ventures (April 2024); supports nearly 1 billion records; processes over 2 billion API calls per quarter; LLM Privacy Vault product detects sensitive data and replaces with deterministic tokens before sending to AI models",
+ "status": "live",
+ "source_urls": [
+ "https://www.skyflow.com/post/generative-ai-data-privacy-skyflow-llm-privacy-vault"
+ ],
+ "sub_theme": "Privacy infrastructure for AI",
+ "merged_from": 2
+ },
+ {
+ "name": "Solid Project (Tim Berners-Lee / Inrupt)",
+ "what": "Standards-based personal data pods with time-bound access grants to apps and AI agents, proposed as alternative to platform data hoards",
+ "evidence": "Originally released 2016; Inrupt launched 2018; Berners-Lee's 2025 book positions Solid as counter to AI built on platform data; developer community remains small; Berners-Lee acknowledges 'not ready for general adoption yet'; network effects remain the core obstacle",
+ "status": "live",
+ "source_urls": [
+ "https://en.wikipedia.org/wiki/Solid_(web_decentralization_project)"
+ ],
+ "sub_theme": "Data portability infrastructure",
+ "merged_from": 2
+ },
+ {
+ "name": "Spain AEPD Agentic AI Guide",
+ "what": "Spain's data protection authority published a 71-page regulatory analysis of how agentic AI systems create structural privacy risks distinct from conventional AI tools",
+ "evidence": "Published February 2026; titled 'Agentic Artificial Intelligence from the Perspective of Data Protection'; identifies persistent memory profiles, autonomous multi-service access, and consequential actions without human checkpoints as novel risks",
+ "status": "live",
+ "source_urls": [
+ "https://ppc.land/spains-data-watchdog-maps-the-hidden-gdpr-risks-of-agentic-ai/"
+ ],
+ "sub_theme": "Regulatory barriers",
+ "merged_from": 2
+ },
+ {
+ "name": "Spotify AI DJ",
+ "what": "AI-powered DJ that proactively selects music based on full listening history, temporal patterns, and mood inference without explicit song requests.",
+ "evidence": "Launched February 2023. DJ Requests (voice and text) added May 2025. Personalized prompt suggestions added October 2025 (e.g., 'reggaeton beats for an energetic afternoon'). Prompted Playlists beta (December 2025) uses entire listening history from day one. System processes billions of data points using collaborative filtering, content-based filtering, skip rates, playlist additions, and temporal preferences (morning/afternoon/evening, seasonal patterns).",
+ "status": "live",
+ "source_urls": [
+ "https://newsroom.spotify.com/2023-02-22/spotify-debuts-a-new-ai-dj-right-in-your-pocket/"
+ ],
+ "sub_theme": "Proactive consumer AI",
+ "merged_from": 2
+ },
+ {
+ "name": "Supermemory",
+ "what": "AI memory startup extracting 'memories' from unstructured data for application context, founded by a 19-year-old",
+ "evidence": "Raised $2.6M seed led by Susa Ventures, Browder Capital, and SF1.vc; backed by Google executives; October 2025",
+ "status": "live",
+ "source_urls": [
+ "https://techcrunch.com/2025/10/06/a-19-year-old-nabs-backing-from-google-execs-for-his-ai-memory-startup-supermemory/"
+ ],
+ "sub_theme": "Memory and context infrastructure",
+ "merged_from": 2
+ },
+ {
+ "name": "SWE-bench Verified - Current State of Autonomous Coding",
+ "what": "Primary benchmark for measuring autonomous coding agent performance on real GitHub issues.",
+ "evidence": "Top scores cluster at 77.8-80.9% on SWE-bench Verified (tightest top-tier race yet). Claude Opus 4.5 leads at 80.9% Verified but only 45.9% on SWE-bench Pro (same model, half the score). SWE-bench Pro is considered more reliable due to data contamination in Verified's 500 Python-only tasks.",
+ "status": "live",
+ "source_urls": ["https://www.swebench.com/verified.html"],
+ "sub_theme": "Benchmarks and limitations",
+ "merged_from": 2
+ },
+ {
+ "name": "SWE-bench Verified Leaderboard (Feb 2026)",
+ "what": "Industry-standard benchmark for autonomous issue resolution on real GitHub repos.",
+ "evidence": "Top scores as of Feb 2026: Claude Opus 4.5 at 80.9%, Sonar Foundation Agent at 79.2%, Gemini 3 Flash at ~76.2%. Average score across all models is 62.2%. SWE-bench Pro (harder, uncontaminated) shows GPT-5.3-Codex at 56.8%, meaning top models still fail ~43-50% of professional-level tasks.",
+ "status": "live",
+ "source_urls": ["https://www.swebench.com/"],
+ "sub_theme": "Benchmark performance",
+ "merged_from": 2
+ },
+ {
+ "name": "Sweep AI",
+ "what": "AI junior developer that transforms GitHub issues and Jira tickets into pull requests by reading the project, planning changes, writing code, and creating PRs.",
+ "evidence": "Raised $2.8M (as of November 2023 TechCrunch report). Team was 2 employees planning to expand to 5. Reads project context, plans changes, writes code, and creates PRs from issue descriptions.",
+ "status": "live",
+ "source_urls": [
+ "https://techcrunch.com/2023/11/02/sweep-aims-to-automate-basic-dev-tasks-using-large-language-models/"
+ ],
+ "sub_theme": "Autonomous coding agents",
+ "merged_from": 2
+ },
+ {
+ "name": "Tab AI",
+ "what": "Wearable AI pendant that listens to conversations all day and builds a persistent personal context model, functioning like always-on ChatGPT without prompting.",
+ "evidence": "Created by Avi Schiffmann (built COVID dashboard used by millions). Raised $1.9M seed at ~$20M valuation. Designed as disk-shaped necklace. Retains and builds on prior conversation context. Originally scheduled for winter/spring 2024 launch; availability status unclear as of early 2026.",
+ "status": "announced",
+ "source_urls": [
+ "https://venturebeat.com/ai/tabs-always-on-ai-pendant-just-got-funded-but-do-we-need-it/"
+ ],
+ "sub_theme": "Personal AI hardware",
+ "merged_from": 2
+ },
+ {
+ "name": "Turing Post - State of AI Coding: Context, Trust, and Subagents",
+ "what": "Industry analysis identifying that developer distrust of coding agents stems primarily from context gaps, not model capability.",
+ "evidence": "When asked why developers don't trust AI coding agents, most said: 'We don't trust the context the model has.' Missing context identified as the critical issue: 'The critical logic sleeps in a Jira ticket from 2019, or worse, it's tribal knowledge.' An AI agent cannot know context that exists only in a person's head.",
+ "status": "live",
+ "source_urls": ["https://www.turingpost.com/p/aisoftwarestack"],
+ "sub_theme": "Taxonomy: approval gates (trust) vs context provision (knowledge)",
+ "merged_from": 2
+ },
+ {
+ "name": "UK ICO Agentic AI Early Views",
+ "what": "The UK Information Commissioner's Office flagged data minimization and transparency as specific compliance risks for agentic AI systems",
+ "evidence": "ICO notes that when an agent's scope is uncertain, defining what data is 'necessary' becomes harder; complexity of agent data flows makes it difficult to explain processing to individuals; agents communicating with other agents create unobservable data flows",
+ "status": "live",
+ "source_urls": [
+ "https://www.insideprivacy.com/artificial-intelligence/ico-shares-early-views-on-agentic-ai-data-protection/"
+ ],
+ "sub_theme": "Regulatory barriers",
+ "merged_from": 2
+ },
+ {
+ "name": "Usercentrics acquires MCP Manager",
+ "what": "Global consent management leader acquired MCP Manager to extend consent infrastructure into agentic AI workflows",
+ "evidence": "Announced January 14, 2026; first major privacy company to extend consent guardrails into MCP-based AI workflows; addresses gap where regular MCPs don't include consent checks for data access",
+ "status": "live",
+ "source_urls": [
+ "https://usercentrics.com/press/usercentrics-acquires-mcp-manager/"
+ ],
+ "sub_theme": "Regulatory barriers",
+ "merged_from": 2
+ },
+ {
+ "name": "Vana Protocol",
+ "what": "EVM-compatible Layer 1 blockchain for personal data sovereignty, enabling user-owned AI through private data transactions with DataDAOs",
+ "evidence": "Mainnet launched December 2024; over 12 million data points onboarded through multiple DataDAOs; VRC-20 token standard for data-backed digital assets introduced April 2025; YZi Labs investment with CZ advisory role; VANA trading $8-15 in 2025",
+ "status": "live",
+ "source_urls": ["https://www.vana.org/"],
+ "sub_theme": "Data portability infrastructure",
+ "merged_from": 2
+ },
+ {
+ "name": "Windsurf - Parallel agent sessions (Wave 13)",
+ "what": "Windsurf added parallel Cascade agent sessions in early 2026 for simultaneous multi-task development.",
+ "evidence": "Wave 13 (early 2026) added parallel agent sessions with dedicated terminal profiles. Windsurf ranked #1 in LogRocket rankings (Feb 2026) assessing 50+ features. Windsurf was acquired by Cognition Labs; its $82M ARR across 350+ enterprise accounts contributed to Cognition's $155M combined ARR.",
+ "status": "live",
+ "source_urls": [
+ "https://dev.to/pockit_tools/cursor-vs-windsurf-vs-claude-code-in-2026-the-honest-comparison-after-using-all-three-3gof"
+ ],
+ "sub_theme": "Autonomous coding products",
+ "merged_from": 2
+ },
+ {
+ "name": "x.ai (Amy/Andrew scheduling AI)",
+ "what": "AI scheduling assistant that handled meeting coordination via email, using virtual personas 'Amy' and 'Andrew'.",
+ "evidence": "Founded by Dennis Mortensen. Raised $44M total ($2M seed 2014, $23M 2016, $10M 2017). Acquired by Bizzabo June 2021 for undisclosed amount. Scheduling tool sunset October 31, 2021. Technology folded into Bizzabo's event management platform rather than continuing as standalone product.",
+ "status": "acquired",
+ "source_urls": [
+ "https://www.globenewswire.com/news-release/2021/06/03/2241277/0/en/Bizzabo-Acquires-x-ai-to-Launch-AI-powered-Scheduling-Accelerate-Personalized-Event-Experiences.html"
+ ],
+ "sub_theme": "AI scheduling assistants",
+ "merged_from": 2
+ },
+ {
+ "name": "YC S25 Batch AI Concentration",
+ "what": "88% of YC S25 batch (141 of 169 startups) classified as AI-native, with 50%+ building agentic AI, mostly domain-specific copilots rather than general personal AI",
+ "evidence": "Summer 2025; 169 startups total; highest concentration of AI startups in YC history; domain-specific focus: insurance claim appeals, mortgage applications; doubling down on narrow high-impact applications rather than general personal assistants",
+ "status": "live",
+ "source_urls": [
+ "https://catalaize.substack.com/p/y-combinator-s25-batch-profile-and"
+ ],
+ "sub_theme": "YC and accelerator landscape",
+ "merged_from": 2
+ },
+ {
+ "name": "YC W26 Batch AI Agents and Healthcare",
+ "what": "YC W26 batch building infrastructure for AI agents to operate inside companies, with healthcare making up nearly 10% of batch",
+ "evidence": "Winter 2026; nearly 50% of batch identified as AI agent companies; healthcare gaining momentum including wearable technologies and drug discovery; examples include Corvera (CPG operations), Arcline (legal drafting)",
+ "status": "live",
+ "source_urls": ["https://www.tldl.io/blog/yc-ai-startups-2026"],
+ "sub_theme": "YC and accelerator landscape",
+ "merged_from": 2
+ },
+ {
+ "name": "YouGov AI Trust Survey",
+ "what": "35% of Americans use AI weekly but only 5% trust it deeply; 40% say they would never enter personal or financial information into an AI tool",
+ "evidence": "2025 survey; 48% cite data exposure as primary adoption barrier (outranking hallucinations); 51% of Gen Z report weekly AI usage vs 25% of baby boomers; trust in AI saw a 16-point increase in 2025 but from a low base",
+ "status": "live",
+ "source_urls": [
+ "https://yougov.com/en-us/articles/53701-most-americans-use-ai-but-still-dont-trust-it"
+ ],
+ "sub_theme": "Trust and adoption curve",
+ "merged_from": 2
+ },
+ {
+ "name": "Zapier AI / Zapier Agents / Zapier Canvas",
+ "what": "AI-powered workflow automation platform that generates workflows from natural language and converts visual process diagrams into functioning automations.",
+ "evidence": "8,000+ app integrations. Zapier Canvas converts visual diagrams into live automations. Natural language workflow creation (e.g., 'When I get a Gmail, summarize it and post to Slack'). Zapier Agents carry out specific workflows. Focus on AI orchestration, agents, copilot, human-in-the-loop, and MCP as of 2026.",
+ "status": "live",
+ "source_urls": ["https://zapier.com/"],
+ "sub_theme": "Workflow automation with AI",
+ "merged_from": 2
+ },
+ {
+ "name": "Zep",
+ "what": "Context engineering platform for AI agents combining chat history, business data, and user behavior.",
+ "evidence": "$500K seed (Apr 2024). YC W24. Tracks how facts change over time (temporal reasoning). Integrates structured business data with conversational history. Graph-based memory with relationship modeling. No significant follow-on funding disclosed as of March 2026.",
+ "status": "live",
+ "source_urls": ["https://www.getzep.com/"],
+ "sub_theme": "Agent memory infrastructure",
+ "merged_from": 2
+ }
+ ],
+ "gaps": [
+ "No public autonomous task completion rate benchmark exists for Cursor's agent mode or background agent feature. Cursor publishes no equivalent of SWE-bench results.",
+ "Cline has no published autonomous completion rate metrics despite 5M+ users. They created Cline Bench (Nov 2025) for evaluating agents but no public leaderboard results were found.",
+ "The METR study used early-2025 tools (Claude 3.5/3.7 Sonnet). METR announced in Feb 2026 they are redesigning their experiment, so no updated results with current models exist yet.",
+ "No rigorous academic study was found measuring autonomous agent operation duration specifically (as opposed to benchmark accuracy). Anthropic's telemetry analysis is the closest, but it measures user behavior rather than agent capability limits.",
+ "GitHub Copilot Workspace was sunset before publishing formal task completion benchmarks. The replacement Coding Agent has volume data (1.2M PRs/month) but no quality/completion rate data.",
+ "No data was found on how agent reliability changes as task duration increases (i.e., reliability curves over time). The compounding error rate calculation (85% per step = 20% over 10 steps) is theoretical, not empirically measured on real agent runs.",
+ "Token cost data for long-running autonomous sessions is sparse. GPT-5-Codex reports 93.7% fewer tokens than GPT-5 for simple turns but no data on cost of 7+ hour sessions.",
+ "GitHub Copilot specific user count and revenue figures for 2025-2026 were not available in search results (Microsoft bundles into broader GitHub/365 reporting).",
+ "Rift (coding agent tool) appears to be defunct or very low profile - no meaningful results found. It may have been absorbed or abandoned.",
+ "Specific accuracy rates or task completion metrics for Devin, Codex, and Copilot agent in production (outside of benchmarks) are not publicly disclosed.",
+ "Cost-per-task data for autonomous agents (API spend per completed coding task) is not systematically published by any vendor.",
+ "Zep's traction metrics (users, API calls, revenue) beyond the $500K seed are not publicly available, making it hard to assess product-market fit relative to Mem0.",
+ "Detailed technical comparison of memory architectures (Mem0 vs Zep vs Claude Code's built-in memory vs LangGraph checkpointing) lacks independent benchmarking.",
+ "Agent-to-agent protocol adoption numbers (how many production A2A deployments exist) are not available beyond the '150+ supporting organizations' figure, which measures intent not deployment.",
+ "Clara Labs: Search results show it is still operating (acquired by TopFunnel in 2019). Could not confirm any shutdown. May not qualify as a failure.",
+ "Exact acquisition price for Rewind/Limitless by Meta was not disclosed.",
+ "Exact acquisition price for x.ai by Bizzabo was not disclosed.",
+ "Rabbit R1 current financial status is unclear; reports of unpaid salaries but no confirmed shutdown or bankruptcy filing found.",
+ "Could not find specific cost-per-query data for Facebook M's human operators.",
+ "Did not find additional personal AI startups raising >$5M that shut down 2023-2026 beyond the ones listed. The landscape is large and many smaller failures go unreported.",
+ "Intercom Fin is not a failure; included for contrast. Could not find well-sourced enterprise AI assistant failures with specific numbers.",
+ "Magic Leap is tangential to 'personal AI assistant' theme; included because it represents ambient computing failure but it was primarily an AR/spatial computing bet.",
+ "No public data found on the specific breakdown of WHY developers interrupt agents (e.g., what % is 'wrong direction' vs 'missing context' vs 'safety concern'). Anthropic's autonomy paper shows frequency shifts but not a categorical breakdown of interruption reasons.",
+ "No quantitative data found distinguishing approval-gate interventions (trust) from context-provision interventions (knowledge) from error-correction interventions (capability). The Turing Post finding that developers distrust context rather than capability is qualitative, not measured.",
+ "Copilot's 70% rejection rate is not broken down by reason. We know 30% of suggestions are accepted but not why the other 70% are rejected (wrong code? wrong timing? partial match? style mismatch?).",
+ "No published research found measuring how much human-in-the-loop overhead could be replaced by personal data (preferences, coding style, project context) vs institutional knowledge vs real-time judgment.",
+ "Limited data on HITL patterns in non-coding autonomous agent workflows (e.g., customer service, data analysis, content creation) with the same level of rigor as the coding domain.",
+ "Anthropic's Constitutional AI paper shows principles CAN replace per-instance human labels, but there is no published measurement of how much residual human oversight Constitutional AI eliminates in production deployment vs standard RLHF.",
+ "No found data on the cost of human-in-the-loop interventions (developer time per interruption, context-switching penalty) in agent workflows specifically.",
+ "Personal.ai revenue and user numbers beyond 2023 ($138K ARR) are not publicly available; the company appears small relative to funding but specific current metrics are missing.",
+ "Mem.ai current user counts and revenue are not publicly disclosed; impossible to assess actual traction vs. the $110M post-money valuation.",
+ "Tab AI's actual shipping status and real-world user feedback are unclear; the product was announced in 2024 but confirmation of general availability is sparse.",
+ "Google Gemini's personal data integration (Gmail, Calendar, Drive context in Gemini Advanced) was not deeply researched in this wave; it belongs in a follow-up.",
+ "Microsoft Copilot's personal data context (across M365 apps, Microsoft Graph) was not covered in this wave and is a major player that warrants dedicated research.",
+ "No specific products were found that ingest cross-platform personal data (e.g., Spotify + GitHub + LinkedIn together) and generate task priorities or agent actions. This is the gap Vana would fill.",
+ "Screenpipe's actual user count (beyond 202 subscriptions) and daily active usage data are not available.",
+ "Privacy/regulatory outcomes for always-listening wearables (Tab, Limitless pendant) are not well-documented beyond general concerns.",
+ "Accuracy and hallucination rates for personal-data-augmented AI responses are not benchmarked across products (only Mem0 publishes comparative claims).",
+ "Specific data on what percentage of human interventions are context-related vs. approval-related",
+ "Published comparison of agent performance with vs. without personal user context",
+ "Revenue data for agent memory companies (Mem0, Zep)",
+ "Academic research specifically on personal data as agent context (most research focuses on task-specific context, not cross-platform personal data)",
+ "Specific adoption/usage statistics for Apple Intelligence features are not publicly reported; no data on what percentage of eligible users have enabled on-device AI features",
+ "No specific YC W26 or S25 startups found focused on personal data portability for AI agent context specifically; the intersection of data portability + AI agents appears to be a white space in accelerator batches",
+ "Cost-per-user economics for personal AI context (storage + retrieval + inference) at consumer scale are not well-documented; enterprise RAG costs are published but consumer personal AI unit economics are proprietary",
+ "No survey data found specifically measuring user willingness to share personal data with AI agents (as opposed to AI chatbots or companies generally); the agent-specific trust question appears unstudied",
+ "Personal data store companies (Digi.me, Meeco, MyDex, Solid) have no publicly available user count or revenue figures for 2025-2026, making it impossible to assess actual traction",
+ "No data found on how much personal context is needed before an AI agent becomes meaningfully more useful than a generic one (the minimum viable personalization threshold); the cold start research is theoretical, not empirical from deployed products",
+ "Healthcare personal data + AI agents is notably absent from findings; despite HIPAA and health data being among the most valuable personal data, no specific companies or products were found building personal health data as agent context",
+ "The intersection of MCP protocol adoption and personal data access patterns is undocumented; no data on how many MCP servers handle personal vs. enterprise data",
+ "No specific figures found on how much Google, Apple, or OpenAI spend on personal data storage and processing per user for their AI memory/context features",
+ "No hard accuracy or precision numbers found for OpenClaw daily briefing quality or user satisfaction rates.",
+ "ChatGPT Pulse was live for only ~3 months before being shelved; no usage metrics, retention data, or user feedback were published.",
+ "No revenue or pricing data found for OpenClaw or NanoClaw (both are open-source, likely no direct revenue model).",
+ "Limited data on what percentage of ChatGPT users actually use Scheduled Tasks, or how many tasks are active across the user base.",
+ "No products found that specifically generate coding agent prompts from personal data (the exact intersection Vana is exploring). The closest are OpenClaw daily briefings (which summarize but don't generate actionable coding tasks) and GitHub Copilot coding agent (which acts on issues but doesn't infer tasks from personal data).",
+ "No data found on failure rates or user abandonment for proactive/scheduled AI features across any platform.",
+ "Google CC (Your Day Ahead) is too new (December 2025 launch) to have published usage or retention metrics.",
+ "Gemini Goal Scheduled Actions (February 2026) has no published adoption data yet.",
+ "The PROBE benchmark's 40% ceiling for proactive agents suggests the technical problem is far from solved, but no commercial product has published comparable accuracy metrics for their proactive features.",
+ "No products found that combine personal data from multiple life domains (code + social + media + conversations) into coding-specific task generation. Each product stays in its lane: code tools generate code tasks, productivity tools generate productivity tasks, media tools recommend media."
+ ]
+}
diff --git a/research/personal-data-agents/findings/wave1-agent-autonomy.json b/research/personal-data-agents/findings/wave1-agent-autonomy.json
new file mode 100644
index 00000000..d7d3e686
--- /dev/null
+++ b/research/personal-data-agents/findings/wave1-agent-autonomy.json
@@ -0,0 +1,118 @@
+{
+ "agent_question": "Current autonomous agent duration and reliability benchmarks",
+ "findings": [
+ {
+ "name": "Anthropic - Measuring Agent Autonomy (Claude Code telemetry)",
+ "what": "Anthropic analyzed millions of Claude Code interactions to measure how long agents operate autonomously and how user trust evolves.",
+ "evidence": "Between Oct 2025 and Jan 2026, the 99.9th percentile turn duration nearly doubled from under 25 minutes to over 45 minutes. Median turn duration remained ~45 seconds (fluctuating 40-55s). Users with <50 sessions use full auto-approve ~20% of the time; by 750 sessions this rises to >40%. Experienced users interrupt Claude more often, not less, reflecting a shift from pre-approval to autonomous-with-intervention oversight. Software engineering accounts for ~50% of all agentic tool calls on the API. Only 0.8% of all actions are irreversible.",
+ "status": "live",
+ "source_url": "https://www.anthropic.com/research/measuring-agent-autonomy",
+ "sub_theme": "Autonomy duration and trust metrics"
+ },
+ {
+ "name": "Anthropic - Effective Harnesses for Long-Running Agents",
+ "what": "Anthropic engineering blog on the core challenge of multi-context-window agent sessions spanning hours or days.",
+ "evidence": "Long-running agents must work in discrete sessions, each starting with no memory of prior work. Context windows are limited, so agents need bridging mechanisms (e.g., claude-progress.txt + git history). Even frontier models like Opus 4.5 running in a loop across multiple context windows fail to build production-quality apps from high-level prompts alone, tending to attempt too much at once. The harness pattern uses different prompts for the first context window vs. subsequent ones.",
+ "status": "live",
+ "source_url": "https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents",
+ "sub_theme": "Autonomy duration and trust metrics"
+ },
+ {
+ "name": "OpenAI Codex - GPT-5-Codex autonomous duration",
+ "what": "OpenAI's Codex agent demonstrated multi-hour autonomous coding sessions on complex tasks.",
+ "evidence": "GPT-5-Codex worked independently for more than 7 hours at a time on large tasks during testing, iterating on implementation, fixing test failures, and delivering working code. GPT-5.2-Codex runs 24-hour autonomous tasks. Codex-1 achieves 37% accuracy on first attempts on SWE-Bench, scaling to 70.2% with retries and 85% after 8 attempts. GPT-5.3-Codex achieved state-of-the-art on SWE-Bench Pro (56.8%), though this means it still fails nearly half of professional-level tasks.",
+ "status": "live",
+ "source_url": "https://openai.com/index/introducing-codex/",
+ "sub_theme": "Autonomy duration and trust metrics"
+ },
+ {
+ "name": "Claude Opus 4.6 - SWE-bench Verified",
+ "what": "Latest Claude model achieves top-tier autonomous coding benchmark scores.",
+ "evidence": "Opus 4.6 reached 81.42% on SWE-bench Verified (with prompt modification). Opus 4.5 scored 80.9%. Sonnet 4.5 scored 77.2%, up from Sonnet 4's 72.7%. Note: OpenAI has stopped reporting Verified scores after finding training data contamination across all frontier models, recommending SWE-Bench Pro instead.",
+ "status": "live",
+ "source_url": "https://www.anthropic.com/research/swe-bench-sonnet",
+ "sub_theme": "Benchmark performance"
+ },
+ {
+ "name": "SWE-bench Verified Leaderboard (Feb 2026)",
+ "what": "Industry-standard benchmark for autonomous issue resolution on real GitHub repos.",
+ "evidence": "Top scores as of Feb 2026: Claude Opus 4.5 at 80.9%, Sonar Foundation Agent at 79.2%, Gemini 3 Flash at ~76.2%. Average score across all models is 62.2%. SWE-bench Pro (harder, uncontaminated) shows GPT-5.3-Codex at 56.8%, meaning top models still fail ~43-50% of professional-level tasks.",
+ "status": "live",
+ "source_url": "https://www.swebench.com/",
+ "sub_theme": "Benchmark performance"
+ },
+ {
+ "name": "Cognition Labs (Devin) - Real-world task completion",
+ "what": "First marketed 'AI software engineer' with widely scrutinized real-world completion rates.",
+ "evidence": "Original Devin resolved 13.86% of real GitHub issues on SWE-bench (7x improvement over prior 1.96% baseline). Independent testing showed 15-30% success rates in practice. Devin 2.0 (April 2025) claimed 83% more junior-level tasks per compute unit vs. 1.x. Pricing dropped from $500/month to $20/month. Cognition hit $155M ARR in July 2025 ($73M from Devin, $82M from acquired Windsurf). Raised ~$696M total, valued at $10.2B as of Sept 2025. Goldman Sachs piloted Devin alongside 12,000 developers, reporting 20% efficiency gains.",
+ "status": "live",
+ "source_url": "https://cognition.ai/blog/funding-growth-and-the-next-frontier-of-ai-coding-agents",
+ "sub_theme": "Autonomous coding products"
+ },
+ {
+ "name": "GitHub Copilot Coding Agent",
+ "what": "GitHub's autonomous agent that creates pull requests from issues, replacing the sunset Copilot Workspace.",
+ "evidence": "Copilot Workspace (launched as preview April 2024) was sunset May 30, 2025 and rebuilt into the Copilot Coding Agent, GA to all paid subscribers in Sept 2025. The coding agent contributes to ~1.2 million pull requests per month. Developers using Copilot complete tasks 55% faster (study of 4,800 developers). However, 29.1% of Python code generated contains potential security weaknesses. No public autonomous task completion rate benchmark was found.",
+ "status": "live",
+ "source_url": "https://github.com/newsroom/press-releases/agent-mode",
+ "sub_theme": "Autonomous coding products"
+ },
+ {
+ "name": "METR - AI Developer Productivity Study",
+ "what": "Rigorous RCT showing AI tools made experienced open-source developers 19% slower, contradicting their own perception.",
+ "evidence": "16 experienced developers (avg 5 years on their projects, repos averaging 22k+ stars and 1M+ lines) worked on 246 real issues randomly assigned AI-allowed or AI-disallowed. Developers using AI took 19% longer. Developers expected AI to speed them up by 24%, and even after the study still believed it had sped them up by 20%. Less than 44% of AI-generated code was accepted. Tools used: primarily Cursor Pro with Claude 3.5/3.7 Sonnet.",
+ "status": "live",
+ "source_url": "https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/",
+ "sub_theme": "Academic research on agent effectiveness"
+ },
+ {
+ "name": "Compounding error rate in multi-step agent workflows",
+ "what": "Research on why per-step accuracy is misleading for autonomous agent reliability.",
+ "evidence": "If an AI agent achieves 85% accuracy per action, a 10-step workflow only succeeds about 20% of the time (0.85^10 = 0.197). Graham Neubig found agents autonomously solve 30-40% of tasks without human intervention, with performance dropping sharply in domains requiring broader context: administrative work (0% success), financial analysis (8.3% success).",
+ "status": "live",
+ "source_url": "https://firstpagesage.com/seo-blog/agentic-ai-statistics/",
+ "sub_theme": "Academic research on agent effectiveness"
+ },
+ {
+ "name": "Claude Code - Sub-agent and background task architecture",
+ "what": "Claude Code supports parallel sub-agents and background tasks but with significant isolation constraints.",
+ "evidence": "Sub-agents work in isolation with their own context windows and no direct awareness of other sub-agents. They cannot collaborate in real-time; the main agent must wait for all reports before synthesizing. Sub-agents have temporary context windows and cannot ask clarifying questions. Background tasks (Ctrl+B) keep long-running processes active without blocking the main session. Sustained multi-day autonomy on real codebases is described as 'not a solved problem' due to context limitations and brittleness.",
+ "status": "live",
+ "source_url": "https://www.anthropic.com/news/enabling-claude-code-to-work-more-autonomously",
+ "sub_theme": "Agent architecture and limitations"
+ },
+ {
+ "name": "Windsurf - Parallel agent sessions (Wave 13)",
+ "what": "Windsurf added parallel Cascade agent sessions in early 2026 for simultaneous multi-task development.",
+ "evidence": "Wave 13 (early 2026) added parallel agent sessions with dedicated terminal profiles. Windsurf ranked #1 in LogRocket rankings (Feb 2026) assessing 50+ features. Windsurf was acquired by Cognition Labs; its $82M ARR across 350+ enterprise accounts contributed to Cognition's $155M combined ARR.",
+ "status": "live",
+ "source_url": "https://dev.to/pockit_tools/cursor-vs-windsurf-vs-claude-code-in-2026-the-honest-comparison-after-using-all-three-3gof",
+ "sub_theme": "Autonomous coding products"
+ },
+ {
+ "name": "Aider - SWE-bench performance (open source)",
+ "what": "Open-source CLI coding agent with published benchmark scores.",
+ "evidence": "Aider achieved 49.2% solve rate on SWE-bench Verified as of March 2026.",
+ "status": "live",
+ "source_url": "https://is4.ai/blog/our-blog-1/cline-vs-aider-comparison-2026-313",
+ "sub_theme": "Benchmark performance"
+ },
+ {
+ "name": "Enterprise agent adoption reality check (2025)",
+ "what": "Despite universal experimentation, mass autonomous agent adoption did not materialize in 2025.",
+ "evidence": "Nearly 99% of enterprise developers experimented with AI agents in 2025, but mass adoption never materialized. Bounded scope, human oversight, and specific workflows showed the most pragmatic results. The current practical sweet spot is Level 3-4: supervised autonomy where you provide goals and guardrails, the agent executes independently, and you approve/reject at decision points.",
+ "status": "live",
+ "source_url": "https://firstpagesage.com/seo-blog/agentic-ai-statistics/",
+ "sub_theme": "Industry adoption reality"
+ }
+ ],
+ "gaps": [
+ "No public autonomous task completion rate benchmark exists for Cursor's agent mode or background agent feature. Cursor publishes no equivalent of SWE-bench results.",
+ "Cline has no published autonomous completion rate metrics despite 5M+ users. They created Cline Bench (Nov 2025) for evaluating agents but no public leaderboard results were found.",
+ "The METR study used early-2025 tools (Claude 3.5/3.7 Sonnet). METR announced in Feb 2026 they are redesigning their experiment, so no updated results with current models exist yet.",
+ "No rigorous academic study was found measuring autonomous agent operation duration specifically (as opposed to benchmark accuracy). Anthropic's telemetry analysis is the closest, but it measures user behavior rather than agent capability limits.",
+ "GitHub Copilot Workspace was sunset before publishing formal task completion benchmarks. The replacement Coding Agent has volume data (1.2M PRs/month) but no quality/completion rate data.",
+ "No data was found on how agent reliability changes as task duration increases (i.e., reliability curves over time). The compounding error rate calculation (85% per step = 20% over 10 steps) is theoretical, not empirically measured on real agent runs.",
+ "Token cost data for long-running autonomous sessions is sparse. GPT-5-Codex reports 93.7% fewer tokens than GPT-5 for simple turns but no data on cost of 7+ hour sessions."
+ ]
+}
diff --git a/research/personal-data-agents/findings/wave1-competitive.json b/research/personal-data-agents/findings/wave1-competitive.json
new file mode 100644
index 00000000..05ff7b94
--- /dev/null
+++ b/research/personal-data-agents/findings/wave1-competitive.json
@@ -0,0 +1,175 @@
+{
+ "agent_question": "Competitive landscape for reducing human-in-the-loop in coding agents",
+ "research_date": "2026-03-17",
+ "findings": [
+ {
+ "name": "Model Context Protocol (MCP) - Ecosystem Scale",
+ "what": "Open protocol created by Anthropic for connecting AI agents to external tools and data sources, now governed by the Linux Foundation.",
+ "evidence": "10,000+ active public MCP servers. 97M+ monthly SDK downloads. Adopted by ChatGPT, Cursor, Gemini, Microsoft Copilot, VS Code. Donated to Agentic AI Foundation (Linux Foundation) co-founded by Anthropic, Block, OpenAI with support from Google, Microsoft, AWS, Cloudflare, Bloomberg.",
+ "status": "live",
+ "source_url": "https://www.anthropic.com/news/donating-the-model-context-protocol-and-establishing-of-the-agentic-ai-foundation",
+ "sub_theme": "MCP ecosystem"
+ },
+ {
+ "name": "Model Context Protocol (MCP) - Adoption Trajectory",
+ "what": "MCP server downloads grew from ~100K in November 2024 to 8M+ by April 2025; remote MCP servers up nearly 4x since May 2025.",
+ "evidence": "PulseMCP registry lists 5,500+ servers. Enterprise infrastructure support from AWS, Cloudflare, Google Cloud, Microsoft Azure. Claude has 75+ connectors powered by MCP. Claude Code plugin directory launched early 2026 with 72+ plugins across 24 categories.",
+ "status": "live",
+ "source_url": "https://mcpmanager.ai/blog/mcp-adoption-statistics/",
+ "sub_theme": "MCP ecosystem"
+ },
+ {
+ "name": "Google Agent2Agent Protocol (A2A)",
+ "what": "Open protocol for agent-to-agent communication, task delegation, and capability discovery between AI agents.",
+ "evidence": "Launched April 9, 2025 at Cloud Next. 150+ supporting organizations including Atlassian, PayPal, Salesforce, SAP, ServiceNow. Version 0.3 added gRPC support. Agents advertise capabilities via JSON 'Agent Cards'. Designed as complement to MCP (MCP = tools/context, A2A = agent coordination).",
+ "status": "live",
+ "source_url": "https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/",
+ "sub_theme": "Agent-to-agent protocols"
+ },
+ {
+ "name": "Anthropic Claude Code",
+ "what": "CLI-based autonomous coding agent with hooks, CLAUDE.md context files, memory system, and agent SDK.",
+ "evidence": "$2.5B ARR as of February 2026, reached in ~9 months from public launch (May 2025). Anthropic overall at $19B ARR in March 2026 ($380B valuation, $30B Series G). Uses 6-layer memory system loaded at session start. Hooks provide deterministic lifecycle control (session start, prompt submit). CLAUDE.md files provide static per-project context. Agent SDK powers the same execution loop externally.",
+ "status": "live",
+ "source_url": "https://www.uncoveralpha.com/p/anthropics-claude-code-is-having",
+ "sub_theme": "Coding agent products"
+ },
+ {
+ "name": "Cursor (Anysphere)",
+ "what": "AI-native code editor with codebase indexing, .cursorrules context, and cloud-based autonomous agents.",
+ "evidence": "$29.3B valuation, $3.4B total raised across 7 rounds. $1.2B ARR in 2025 (1,100% YoY growth from $100M). 1M+ daily active users, 360K paying customers. Indexes entire codebase with custom embedding model. .cursorrules file provides standing instructions for AI. Cloud Agents (new 2026) run autonomously on Cursor infrastructure, startable from browser/phone/Slack.",
+ "status": "live",
+ "source_url": "https://techcrunch.com/2025/06/05/cursors-anysphere-nabs-9-9b-valuation-soars-past-500m-arr/",
+ "sub_theme": "Coding agent products"
+ },
+ {
+ "name": "Cognition (Devin + Windsurf)",
+ "what": "Autonomous coding agent that handles full project lifecycle in a secure sandbox, plus IDE-based coding via acquired Windsurf.",
+ "evidence": "$10.2B valuation, $400M+ raised (Founders Fund led). ARR grew from $1M (Sep 2024) to $73M (Jun 2025). Windsurf acquisition (Jul 2025) more than doubled ARR. Total net burn under $20M across company history. Enterprise customers: Goldman Sachs, Citi, Dell, Cisco, Palantir. Devin Wiki and Devin Search added for codebase understanding. Multi-agent dispatch capability added early 2025.",
+ "status": "live",
+ "source_url": "https://techcrunch.com/2025/09/08/cognition-ai-defies-turbulence-with-a-400m-raise-at-10-2b-valuation/",
+ "sub_theme": "Coding agent products"
+ },
+ {
+ "name": "OpenAI Codex",
+ "what": "Cloud-sandboxed coding agent with two execution modes: cloud sandbox for parallel background tasks and terminal CLI for local execution.",
+ "evidence": "Available to ChatGPT Plus users from June 2025. GPT-5.2-Codex is latest model (Jan 2026). Sandbox disables internet access during execution, limiting agent to provided repo + pre-installed deps. Auto-detects setup scripts. Configurable internet access and approval controls.",
+ "status": "live",
+ "source_url": "https://openai.com/index/introducing-codex/",
+ "sub_theme": "Coding agent products"
+ },
+ {
+ "name": "GitHub Copilot Coding Agent",
+ "what": "Asynchronous autonomous coding agent embedded in GitHub, with agent mode in VS Code and recently GA'd CLI agent.",
+ "evidence": "GA for all paid Copilot subscribers as of Sep 25, 2025. Copilot CLI became GA Feb 25, 2026 with agentic capabilities: plans, builds, reviews, remembers across sessions. Agent mode expanding to JetBrains, Eclipse, Xcode. Excels at low-to-medium complexity tasks in well-tested codebases. Copilot Chat open-sourced in VS Code.",
+ "status": "live",
+ "source_url": "https://github.blog/news-insights/product-news/github-copilot-meet-the-new-coding-agent/",
+ "sub_theme": "Coding agent products"
+ },
+ {
+ "name": "Cline",
+ "what": "Open-source autonomous coding agent VS Code extension with Plan/Act modes and MCP integration.",
+ "evidence": "5M+ developers. $32M Series A (Jul 2025) from Emergence Capital. Founded 2024 by Saoud Rizwan. Samsung beta-testing for Device eXperience division. Free and open-source, requires external API keys. Only fully open-source, local-first agent purpose-built for production development.",
+ "status": "live",
+ "source_url": "https://cline.bot/",
+ "sub_theme": "Coding agent products"
+ },
+ {
+ "name": "Aider",
+ "what": "Open-source CLI AI pair programmer with repository map, automatic context management, and git-native workflow.",
+ "evidence": "39K+ GitHub stars, 4.1M+ installations, 15B tokens processed. 49.2% solve rate on SWE-bench Verified (with Claude 3.5 Sonnet, early 2026). Repository map creates condensed codebase overview. Automatically pulls context from related files. Every AI change gets its own git commit. Connects to 100+ models.",
+ "status": "live",
+ "source_url": "https://aider.chat/",
+ "sub_theme": "CLI-based agent tools"
+ },
+ {
+ "name": "Context7 (Upstash)",
+ "what": "MCP server that injects live, up-to-date library documentation into AI coding assistants' context windows.",
+ "evidence": "9,000+ libraries and frameworks indexed. Two core tools: resolve-library-id and query-docs. Integrates with Cursor, VS Code, Claude Code. Built by Upstash. CLI + Skills mode (no MCP required) or MCP mode. Solves the stale training data problem for library APIs.",
+ "status": "live",
+ "source_url": "https://upstash.com/blog/context7-llmtxt-cursor",
+ "sub_theme": "Code intelligence platforms"
+ },
+ {
+ "name": "Greptile",
+ "what": "AI code review and validation platform that indexes GitHub repositories for codebase-aware analysis.",
+ "evidence": "$25M Series A (Sep 2025) led by Benchmark at ~$180M valuation. $45.5M total raised. Founded by Daksh Gupta (Georgia Tech), YC W24. Analyzes repos and validates every code update. Independent code validation layer across repositories.",
+ "status": "live",
+ "source_url": "https://siliconangle.com/2025/09/23/greptile-bags-25m-funding-take-coderabbit-graphite-ai-code-validation/",
+ "sub_theme": "Code intelligence platforms"
+ },
+ {
+ "name": "Augment Code",
+ "what": "Enterprise AI coding platform focused on understanding large codebases, backed by Eric Schmidt.",
+ "evidence": "$252M raised at $977M valuation (Apr 2024). 188 employees as of Feb 2026. Backed by Eric Schmidt, Index Ventures, Sutter Hill, Lightspeed. Ranks 3rd among 144 active competitors by total funding. No new public funding rounds in 2025-2026.",
+ "status": "live",
+ "source_url": "https://techcrunch.com/2024/04/24/eric-schmidt-backed-augment-a-github-copilot-rival-launches-out-of-stealth-with-252m/",
+ "sub_theme": "Coding agent products"
+ },
+ {
+ "name": "Mem0",
+ "what": "Memory infrastructure layer for AI agents providing persistent, structured memory via API.",
+ "evidence": "$24M Series A (Oct 2025) led by Basis Set Ventures, with Peak XV, GitHub Fund, YC. 41K+ GitHub stars, 13M+ PyPI downloads. 80K+ developers on cloud service. API calls grew from 35M (Q1 2025) to 186M (Q3 2025), ~30% MoM. Natively integrated into CrewAI, Flowise, Langflow. AWS selected Mem0 as exclusive memory provider for their Agent SDK.",
+ "status": "live",
+ "source_url": "https://techcrunch.com/2025/10/28/mem0-raises-24m-from-yc-peak-xv-and-basis-set-to-build-the-memory-layer-for-ai-apps/",
+ "sub_theme": "Agent memory infrastructure"
+ },
+ {
+ "name": "Zep",
+ "what": "Context engineering platform for AI agents combining chat history, business data, and user behavior.",
+ "evidence": "$500K seed (Apr 2024). YC W24. Tracks how facts change over time (temporal reasoning). Integrates structured business data with conversational history. Graph-based memory with relationship modeling. No significant follow-on funding disclosed as of March 2026.",
+ "status": "live",
+ "source_url": "https://www.getzep.com/",
+ "sub_theme": "Agent memory infrastructure"
+ },
+ {
+ "name": "LangChain / LangSmith",
+ "what": "Most widely adopted AI agent framework (LangChain) with commercial observability and testing platform (LangSmith).",
+ "evidence": "$125M Series B (Oct 2025) at $1.25B valuation, led by IVP with Sequoia, Benchmark. 47M+ PyPI downloads (Jan 2026). ARR $12-16M+ as of mid-2025 (company says that figure is 'low for where we are today'). LangSmith monthly trace volume 12x'd YoY. Total raised: $160M across seed, Series A, Series B.",
+ "status": "live",
+ "source_url": "https://fortune.com/2025/10/20/exclusive-early-ai-darling-langchain-is-now-a-unicorn-with-a-fresh-125-million-in-funding/",
+ "sub_theme": "Multi-agent frameworks"
+ },
+ {
+ "name": "CrewAI",
+ "what": "Multi-agent automation platform for enterprises with structured role-based memory and RAG support.",
+ "evidence": "$18M total raised (Series A $12.5M, Oct 2024, led by Insight Partners). $3.2M revenue as of Jul 2025. Backed by Andrew Ng and Dharmesh Shah (HubSpot). Natively integrates Mem0 for memory. Fastest-growing multi-agent framework by adoption.",
+ "status": "live",
+ "source_url": "https://siliconangle.com/2024/10/22/agentic-ai-startup-crewai-closes-18m-funding-round/",
+ "sub_theme": "Multi-agent frameworks"
+ },
+ {
+ "name": "Microsoft Agent Framework (AutoGen + Semantic Kernel)",
+ "what": "Unified open-source agent framework merging AutoGen's multi-agent innovation with Semantic Kernel's enterprise stability.",
+ "evidence": "Announced Oct 2025. Release Candidate 1.0 on Feb 19, 2026. GA targeted end of Q1 2026. Both AutoGen and Semantic Kernel placed in maintenance mode (bug fixes only, no new features). Supports Python and .NET. Consolidates AI workloads into single SDK with observability.",
+ "status": "announced",
+ "source_url": "https://visualstudiomagazine.com/articles/2025/10/01/semantic-kernel-autogen--open-source-microsoft-agent-framework.aspx",
+ "sub_theme": "Multi-agent frameworks"
+ },
+ {
+ "name": "SWE-bench Verified - Current State of Autonomous Coding",
+ "what": "Primary benchmark for measuring autonomous coding agent performance on real GitHub issues.",
+ "evidence": "Top scores cluster at 77.8-80.9% on SWE-bench Verified (tightest top-tier race yet). Claude Opus 4.5 leads at 80.9% Verified but only 45.9% on SWE-bench Pro (same model, half the score). SWE-bench Pro is considered more reliable due to data contamination in Verified's 500 Python-only tasks.",
+ "status": "live",
+ "source_url": "https://www.swebench.com/verified.html",
+ "sub_theme": "Benchmarks and limitations"
+ },
+ {
+ "name": "Autonomous Agent Reliability Gap",
+ "what": "Compound failure rates and trust deficits remain the core barrier to reducing human-in-the-loop.",
+ "evidence": "At 85% per-action accuracy, a 10-step workflow succeeds only 20% of the time. 46% of developers actively distrust AI code accuracy, only 3% highly trust it. 66% say top frustration is 'almost right, but not quite'. 99% of enterprise devs experimented with agents in 2025 but mass adoption did not materialize. Merged PRs correlate with small, localized changes; failed PRs are invasive and sprawling.",
+ "status": "live",
+ "source_url": "https://medium.com/@vivek.babu/where-autonomous-coding-agents-fail-a-forensic-audit-of-real-world-prs-59d66e33efe9",
+ "sub_theme": "Benchmarks and limitations"
+ }
+ ],
+ "gaps": [
+ "GitHub Copilot specific user count and revenue figures for 2025-2026 were not available in search results (Microsoft bundles into broader GitHub/365 reporting).",
+ "Rift (coding agent tool) appears to be defunct or very low profile - no meaningful results found. It may have been absorbed or abandoned.",
+ "Specific accuracy rates or task completion metrics for Devin, Codex, and Copilot agent in production (outside of benchmarks) are not publicly disclosed.",
+ "Cost-per-task data for autonomous agents (API spend per completed coding task) is not systematically published by any vendor.",
+ "Zep's traction metrics (users, API calls, revenue) beyond the $500K seed are not publicly available, making it hard to assess product-market fit relative to Mem0.",
+ "Detailed technical comparison of memory architectures (Mem0 vs Zep vs Claude Code's built-in memory vs LangGraph checkpointing) lacks independent benchmarking.",
+ "Agent-to-agent protocol adoption numbers (how many production A2A deployments exist) are not available beyond the '150+ supporting organizations' figure, which measures intent not deployment."
+ ]
+}
diff --git a/research/personal-data-agents/findings/wave1-direct.json b/research/personal-data-agents/findings/wave1-direct.json
new file mode 100644
index 00000000..0b4c7168
--- /dev/null
+++ b/research/personal-data-agents/findings/wave1-direct.json
@@ -0,0 +1,107 @@
+{
+ "agent_question": "Direct research on agent autonomy metrics, memory infrastructure, and personal data context",
+ "findings": [
+ {
+ "name": "Anthropic Measuring Agent Autonomy",
+ "what": "Anthropic published real-world data on Claude Code autonomous operation duration and intervention frequency",
+ "evidence": "Between Oct 2025 and Jan 2026, 99.9th percentile turn duration nearly doubled from under 25 min to over 45 min. Average human interventions per session decreased from 5.4 to 3.3. Success rate on challenging tasks doubled.",
+ "status": "live",
+ "source_url": "https://www.anthropic.com/research/measuring-agent-autonomy",
+ "sub_theme": "Autonomous Agent Duration"
+ },
+ {
+ "name": "Claude Opus 4.6 Task Horizon",
+ "what": "Claude Opus 4.6 crossed a full work-day task horizon at 14.5 hours",
+ "evidence": "50% time horizon of approximately 14.5 hours (doubling every 123 days). Claude Opus 4.5 was at 4 hours 49 minutes. Average session length increased from 4 minutes (autocomplete era) to 23 minutes (agentic era).",
+ "status": "live",
+ "source_url": "https://getbeam.dev/blog/anthropic-agentic-coding-trends-2026.html",
+ "sub_theme": "Autonomous Agent Duration"
+ },
+ {
+ "name": "METR Task Horizon Doubling",
+ "what": "METR tracks task completion time horizons for frontier AI models, finding a 7-month doubling time",
+ "evidence": "Task length that agents complete with 50% reliability has been doubling every 7 months for 6 years. Time Horizon 1.1 added 34% more tasks (228 total) and doubled 8-hour+ tasks (31 total).",
+ "status": "live",
+ "source_url": "https://metr.org/time-horizons/",
+ "sub_theme": "Autonomous Agent Duration"
+ },
+ {
+ "name": "OpenAI Codex 25-hour autonomous run",
+ "what": "Codex ran for 25 hours uninterrupted building a design tool from scratch",
+ "evidence": "Used ~13M tokens, generated ~30k lines of code. GPT-5.3-Codex achieves 77.3% on Terminal-Bench, 56.8% on SWE-Bench Pro.",
+ "status": "live",
+ "source_url": "https://openai.com/index/introducing-gpt-5-3-codex/",
+ "sub_theme": "Autonomous Agent Duration"
+ },
+ {
+ "name": "Mem0 Memory Infrastructure",
+ "what": "AI agent memory infrastructure company providing persistent cross-session memory",
+ "evidence": "Raised $24M (Seed + Series A). 186M API calls in Q3 2025. 41K GitHub stars, 13M Python downloads. Graph-based memory with sub-second retrieval.",
+ "status": "live",
+ "source_url": "https://mem0.ai/series-a",
+ "sub_theme": "Agent Memory Infrastructure"
+ },
+ {
+ "name": "Mem0 Research: Memory Accuracy Boost",
+ "what": "Research showing persistent memory improves LLM accuracy by 26%",
+ "evidence": "26% accuracy improvement when agents have access to structured persistent memory vs. stateless operation.",
+ "status": "live",
+ "source_url": "https://mem0.ai/research",
+ "sub_theme": "Agent Memory Infrastructure"
+ },
+ {
+ "name": "Anthropic Agentic Coding Trends 2026",
+ "what": "Comprehensive report on how coding agents are reshaping software development",
+ "evidence": "Documents the shift from autocomplete to agentic coding. Average session 23 minutes. Background tasks, sub-agents, and hooks as autonomy mechanisms.",
+ "status": "live",
+ "source_url": "https://resources.anthropic.com/hubfs/2026%20Agentic%20Coding%20Trends%20Report.pdf",
+ "sub_theme": "Industry Reports"
+ },
+ {
+ "name": "Gartner Agentic AI Forecast",
+ "what": "Gartner predicts 80% of customer service issues resolved autonomously by 2029",
+ "evidence": "80% autonomous resolution rate, 30% operational cost reduction predicted. Current 2026 reality is more nuanced with HITL still required for edge cases.",
+ "status": "announced",
+ "source_url": "https://acuvate.com/blog/2026-agentic-ai-expert-predictions/",
+ "sub_theme": "Industry Reports"
+ },
+ {
+ "name": "Cognition Labs (Devin)",
+ "what": "Autonomous coding agent with $10.2B valuation and $73M ARR",
+ "evidence": "Raised $400M in Sept 2025 at $10.2B valuation. ARR grew from $1M (Sept 2024) to $73M (June 2025). Acquired Windsurf, combined ARR ~$155M. Customers include Goldman Sachs, Cisco, Palantir.",
+ "status": "live",
+ "source_url": "https://techcrunch.com/2025/09/08/cognition-ai-defies-turbulence-with-a-400m-raise-at-10-2b-valuation/",
+ "sub_theme": "Autonomous Agent Companies"
+ },
+ {
+ "name": "Limitless AI (formerly Rewind AI)",
+ "what": "Personal screen recording + AI recall that pivoted to hardware then was acquired by Meta",
+ "evidence": "Raised $15M at $350M valuation (May 2023) on $707K ARR (495x multiple). Total $33M+ raised from Altman, a16z, NEA. Acquired by Meta Dec 2025. Desktop app shut down Dec 19, 2025.",
+ "status": "acquired",
+ "source_url": "https://techcrunch.com/2025/12/05/meta-acquires-ai-device-startup-limitless/",
+ "sub_theme": "Personal Data + AI Products"
+ },
+ {
+ "name": "OpenClaw (daily briefing agent)",
+ "what": "Open-source personal AI assistant with 210K+ GitHub stars, configurable daily briefings from connected data",
+ "evidence": "Breakout project of 2026. Surged from 9K to 210K+ stars. Connects to 50+ messaging apps. Users configure cron jobs for morning briefings pulling calendar, email, CRM, weather. Created by PSPDFKit founder.",
+ "status": "live",
+ "source_url": "https://gist.github.com/mberman84/63163d6839053fbf15091238e5ada5c2",
+ "sub_theme": "Daily Briefing / Task Generation Products"
+ },
+ {
+ "name": "ChatGPT Agent Connectors",
+ "what": "ChatGPT's built-in connectors for email, calendar, and task management with autonomous scheduling",
+ "evidence": "Can summarize inbox, find meeting availability, schedule recurring tasks (e.g. weekly metrics reports every Monday). Connectors authenticated per user.",
+ "status": "live",
+ "source_url": "https://openai.com/index/introducing-chatgpt-agent/",
+ "sub_theme": "Daily Briefing / Task Generation Products"
+ }
+ ],
+ "gaps": [
+ "Specific data on what percentage of human interventions are context-related vs. approval-related",
+ "Published comparison of agent performance with vs. without personal user context",
+ "Revenue data for agent memory companies (Mem0, Zep)",
+ "Academic research specifically on personal data as agent context (most research focuses on task-specific context, not cross-platform personal data)"
+ ]
+}
diff --git a/research/personal-data-agents/findings/wave1-graveyard.json b/research/personal-data-agents/findings/wave1-graveyard.json
new file mode 100644
index 00000000..e0853662
--- /dev/null
+++ b/research/personal-data-agents/findings/wave1-graveyard.json
@@ -0,0 +1,119 @@
+{
+ "agent_question": "Failed personal AI and data-driven assistant products",
+ "findings": [
+ {
+ "name": "Humane AI Pin",
+ "what": "Wearable AI pin pitched as a 'beyond the smartphone' device, priced at $699 plus $24/month subscription.",
+ "evidence": "Raised $230M in VC. Sold ~10,000 units against a 100,000-unit plan. Returns outpaced sales from May-August 2024. Revenue of ~$9M lifetime. Marques Brownlee review calling it out viewed 8.5M+ times. HP acquired assets for $116M in February 2025, less than half the capital raised. User data deleted after Feb 28, 2025.",
+ "status": "shut down",
+ "source_url": "https://techcrunch.com/2025/02/18/humanes-ai-pin-is-dead-as-hp-buys-startups-assets-for-116m/",
+ "sub_theme": "AI hardware devices"
+ },
+ {
+ "name": "Rabbit R1",
+ "what": "AI hardware companion device promising to replace apps with a 'Large Action Model' for $199.",
+ "evidence": "Raised $64.7M total from Khosla Ventures and others. Sold ~100,000 units, but only 5,000 used daily (5% retention). Supported only 6 apps (Spotify, Uber, DoorDash, etc.). CEO Jesse Lyu admitted it launched prematurely. Reports of unpaid salaries and employee strikes. Could not reliably book an Uber or fetch weather.",
+ "status": "live",
+ "source_url": "https://9to5google.com/2024/09/26/rabbit-5000-people-use-the-r1-daily/",
+ "sub_theme": "AI hardware devices"
+ },
+ {
+ "name": "Facebook M",
+ "what": "Human-assisted AI assistant inside Facebook Messenger that could complete arbitrary tasks like restaurant reservations and shopping.",
+ "evidence": "Launched August 2015 to ~2,000 users in California. Never scaled beyond ~10,000 test users. Over 70% of requests required human operators ('M trainers'). Scaling to Messenger's 1.3B users would have required prohibitively large human workforce. Shut down January 2018 after ~2.5 years.",
+ "status": "shut down",
+ "source_url": "https://techcrunch.com/2018/01/08/facebook-is-shutting-down-its-standalone-personal-assistant-m/",
+ "sub_theme": "Human-in-the-loop AI assistants"
+ },
+ {
+ "name": "Builder.ai (Natasha)",
+ "what": "AI assistant 'Natasha' that promised anyone could build apps without code, backed by Microsoft and Qatar's sovereign wealth fund.",
+ "evidence": "Raised $445M total including $250M Series D. Valued at $1.2B. Filed for bankruptcy May 2025. Investigation revealed ~700 human engineers in India did most of the work, not AI. Claimed $220M in 2024 revenue; independent audit found ~$50M. Creditor Viola Credit seized $37M triggering insolvency.",
+ "status": "shut down",
+ "source_url": "https://techstartups.com/2025/05/24/builder-ai-a-microsoft-backed-ai-startup-once-valued-at-1-2-billion-files-for-bankruptcy-is-ai-becoming-another-com-bubble/",
+ "sub_theme": "Human-in-the-loop AI assistants"
+ },
+ {
+ "name": "Microsoft Cortana",
+ "what": "Microsoft's voice assistant integrated into Windows, competing with Siri, Alexa, and Google Assistant.",
+ "evidence": "Peak of 145M monthly active users in 2017, but only 10% of Windows 10 users regularly engaged by 2023. Never exceeded 2% voice assistant market share (vs Siri 36%, Google 35%, Alexa 25%). Mobile app removed March 2021. Windows standalone app deprecated August 2023. Replaced by Copilot. Microsoft invested $10B in OpenAI instead.",
+ "status": "shut down",
+ "source_url": "https://techcrunch.com/2023/08/04/microsoft-kills-cortana-in-windows-as-it-focuses-on-next-gen-ai/",
+ "sub_theme": "Big tech personal assistants"
+ },
+ {
+ "name": "Google Now / Google Assistant personal context features",
+ "what": "Google's proactive personal assistant that surfaced contextual cards from email, calendar, location, and personal data.",
+ "evidence": "Google Now launched 2012, gradually deprecated from 2016. Replaced by Google Assistant, which itself removed 17 'underutilized' features in January 2024 including: checking personal travel itineraries, asking about contacts, sending emails/payments/reservations by voice. Google Assistant itself deprecated March 2026, replaced by Gemini.",
+ "status": "shut down",
+ "source_url": "https://blog.google/products/assistant/google-assistant-update-january-2024/",
+ "sub_theme": "Big tech personal assistants"
+ },
+ {
+ "name": "Inflection AI (Pi)",
+ "what": "Personal AI chatbot focused on empathetic conversation rather than task completion, built by DeepMind co-founder Mustafa Suleyman.",
+ "evidence": "Raised $1.525B at $4B valuation. Had 1M daily active users and 6M monthly active users. In March 2024, Microsoft acqui-hired nearly all 70 employees and paid $650M to license its models. Pi still exists but with usage caps and a skeleton crew. Could not find a sustainable business model despite strong user engagement.",
+ "status": "acquired",
+ "source_url": "https://www.eesel.ai/blog/inflection-ai",
+ "sub_theme": "Personal AI chatbots"
+ },
+ {
+ "name": "Rewind AI / Limitless",
+ "what": "Desktop activity recorder that captured everything on your screen and made it searchable, later pivoted to AI wearable pendant.",
+ "evidence": "Raised $33M+ from a16z, First Round Capital, NEA. Rewind's local processing approach limited it to users' CPUs, preventing use of best LLMs. Collected ~11GB per user per month. Rebranded to Limitless April 2024 and launched $99 hardware pendant. Acquired by Meta December 2025; Rewind desktop app shut down, pendant sales halted.",
+ "status": "acquired",
+ "source_url": "https://techcrunch.com/2025/12/05/meta-acquires-ai-device-startup-limitless/",
+ "sub_theme": "Lifelogging and personal data capture"
+ },
+ {
+ "name": "Narrative Clip (formerly Memoto)",
+ "what": "Wearable lifelogging camera that automatically took a photo every 30 seconds.",
+ "evidence": "Raised $550K on Kickstarter in 2012, then additional millions in VC. Launched Clip 1 and Clip 2 (with Bluetooth/WiFi). Entered voluntary dissolution in September 2016. Could not compete with smartphone cameras and live-streaming services (Periscope, Facebook Live). Most people did not want continuous photo capture.",
+ "status": "shut down",
+ "source_url": "https://petapixel.com/2016/09/28/lifelogging-camera-maker-narrative-going-business/",
+ "sub_theme": "Lifelogging and personal data capture"
+ },
+ {
+ "name": "x.ai (Amy/Andrew scheduling AI)",
+ "what": "AI scheduling assistant that handled meeting coordination via email, using virtual personas 'Amy' and 'Andrew'.",
+ "evidence": "Founded by Dennis Mortensen. Raised $44M total ($2M seed 2014, $23M 2016, $10M 2017). Acquired by Bizzabo June 2021 for undisclosed amount. Scheduling tool sunset October 31, 2021. Technology folded into Bizzabo's event management platform rather than continuing as standalone product.",
+ "status": "acquired",
+ "source_url": "https://www.globenewswire.com/news-release/2021/06/03/2241277/0/en/Bizzabo-Acquires-x-ai-to-Launch-AI-powered-Scheduling-Accelerate-Personalized-Event-Experiences.html",
+ "sub_theme": "AI scheduling assistants"
+ },
+ {
+ "name": "Magic Leap",
+ "what": "Augmented reality headset company that promised ambient computing and spatial interfaces.",
+ "evidence": "Raised ~$4.5B total from Google, Qualcomm, Alibaba, AT&T, and Saudi Arabia's PIF. Magic Leap One (2018) priced at $2,300, failed to meet sales targets. Cut ~1,000 workers (half the workforce) in 2020. Pivoted from consumer to enterprise. Cut entire sales and marketing teams (75 jobs) in July 2024. Now pivoting to license optics technology rather than sell headsets.",
+ "status": "live",
+ "source_url": "https://www.roadtovr.com/magic-leap-layoff-2024-optics-pivot/",
+ "sub_theme": "Ambient computing hardware"
+ },
+ {
+ "name": "Forward Health (CarePods)",
+ "what": "AI-powered autonomous medical kiosks ('CarePods') for self-service primary care.",
+ "evidence": "Raised ~$400M+ (some sources say up to $657M including debt). Planned 3,200 CarePods in 2024; launched only 3, one removed shortly after. Blood draws frequently failed. Patients got stuck inside pods. Revenue under $100M since founding (2016). Shut down all operations November 2024.",
+ "status": "shut down",
+ "source_url": "https://www.fiercehealthcare.com/health-tech/primary-care-player-forward-shutters-after-raising-400m-rolling-out-carepods",
+ "sub_theme": "Autonomous AI agents in physical world"
+ },
+ {
+ "name": "Intercom Fin AI Agent",
+ "what": "AI customer support agent by Intercom, handling front-line customer service conversations.",
+ "evidence": "Handled 8M+ queries. First version (GPT-powered, 2023) achieved 23% resolution rate. Second version (Claude-powered, 2024) improved to 51% average resolution rate across thousands of customers. Still requires human escalation for ~49% of conversations. Pricing is $0.99 per resolved conversation.",
+ "status": "live",
+ "source_url": "https://thelettertwo.com/2024/10/12/intercom-releases-fin-2-ai-agent-switching-anthropic-from-openai/",
+ "sub_theme": "Enterprise AI agents"
+ }
+ ],
+ "gaps": [
+ "Clara Labs: Search results show it is still operating (acquired by TopFunnel in 2019). Could not confirm any shutdown. May not qualify as a failure.",
+ "Exact acquisition price for Rewind/Limitless by Meta was not disclosed.",
+ "Exact acquisition price for x.ai by Bizzabo was not disclosed.",
+ "Rabbit R1 current financial status is unclear; reports of unpaid salaries but no confirmed shutdown or bankruptcy filing found.",
+ "Could not find specific cost-per-query data for Facebook M's human operators.",
+ "Did not find additional personal AI startups raising >$5M that shut down 2023-2026 beyond the ones listed. The landscape is large and many smaller failures go unreported.",
+ "Intercom Fin is not a failure; included for contrast. Could not find well-sourced enterprise AI assistant failures with specific numbers.",
+ "Magic Leap is tangential to 'personal AI assistant' theme; included because it represents ambient computing failure but it was primarily an AR/spatial computing bet."
+ ]
+}
diff --git a/research/personal-data-agents/findings/wave1-hitl-taxonomy.json b/research/personal-data-agents/findings/wave1-hitl-taxonomy.json
new file mode 100644
index 00000000..66f46d07
--- /dev/null
+++ b/research/personal-data-agents/findings/wave1-hitl-taxonomy.json
@@ -0,0 +1,158 @@
+{
+ "agent_question": "Taxonomy of human-in-the-loop interventions in coding agent workflows",
+ "findings": [
+ {
+ "name": "Anthropic - Measuring Agent Autonomy in Practice",
+ "what": "Anthropic analyzed millions of Claude Code interactions to categorize autonomy patterns, interruption types, and how human oversight behavior evolves with experience.",
+ "evidence": "Newer users (<50 sessions) auto-approve ~20% of sessions; by 750 sessions this rises to >40%. Interruption rate increases from ~5% of work steps for new users to ~9% for experienced ones. On complex tasks, Claude self-stops for clarification >2x as often as humans interrupt it. 80% of tool calls come from agents with at least one safeguard in place. 73% of actions appear to have a human in the loop. Only 0.8% of actions are irreversible. Published February 2026.",
+ "status": "live",
+ "source_url": "https://www.anthropic.com/research/measuring-agent-autonomy",
+ "sub_theme": "Taxonomy: trust calibration and oversight strategy shift"
+ },
+ {
+ "name": "Anthropic - 2026 Agentic Coding Trends Report",
+ "what": "Report based on Claude Code usage data showing how developer-agent collaboration patterns have changed, including session length, multi-file edits, and autonomy delegation.",
+ "evidence": "78% of Claude Code sessions in Q1 2026 involve multi-file edits, up from 34% in Q1 2025. Average session length increased from 4 minutes (autocomplete era) to 23 minutes (agentic era). Tool calls per session average 47. 99.9th percentile turn duration nearly doubled from <25 min to >45 min between Oct 2025 and Jan 2026. Developer acceptance rate of agent changes is 89% when the agent provides a diff summary vs 62% for raw output. Only 0-20% of tasks can be fully delegated.",
+ "status": "live",
+ "source_url": "https://resources.anthropic.com/2026-agentic-coding-trends-report",
+ "sub_theme": "Taxonomy: trust calibration and oversight strategy shift"
+ },
+ {
+ "name": "Anthropic - Constitutional AI",
+ "what": "Research showing how human oversight can be compressed from thousands of individual preference labels to ~10 natural-language principles, with AI self-critique replacing per-instance human review.",
+ "evidence": "Standard RLHF requires tens of thousands of human preference labels. Constitutional AI reduces this to ~10 principles stated in natural language. The model self-critiques and revises its own outputs, then an AI preference model replaces human labelers in the RL phase. Constitutional RL produces Pareto improvements over RLHF (both more helpful and more harmless). Published December 2022.",
+ "status": "live",
+ "source_url": "https://arxiv.org/abs/2212.08073",
+ "sub_theme": "Alignment: replacing human feedback with structured knowledge"
+ },
+ {
+ "name": "Atlassian HULA Framework",
+ "what": "Human-in-the-Loop LLM-based Agents framework deployed at Atlassian for Jira issue resolution, with measured rates of human intervention at each stage of the coding pipeline.",
+ "evidence": "Deployed to 2,600 practitioners by Sept 2024 across 22,000+ eligible issues. Engineers used HULA on 663 work items in 2 months. Plan generation succeeded for 79% of items. Plan approval rate by humans: 82% (433/527). Code generated for 87% of approved plans. 25% reached pull request stage. 59% of generated PRs merged. ~900 total PRs merged. Unit test pass rate: 31% on SWE-bench. Code rated highly similar to human code in 45% of cases. Published November 2024.",
+ "status": "live",
+ "source_url": "https://arxiv.org/abs/2411.12924",
+ "sub_theme": "Taxonomy: staged intervention points in development pipelines"
+ },
+ {
+ "name": "Atlassian HULA - Challenges and Future Directions",
+ "what": "Follow-up paper identifying two major HITL challenges: high computational costs of unit testing for validation and variability in LLM-based code quality evaluation.",
+ "evidence": "Presented at MSR 2025 Industry Track. Highlights that human review remains necessary because automated evaluation (unit tests + GPT-based similarity scoring) is inconsistent and expensive. Suggests the human-in-the-loop burden cannot be eliminated purely by adding automated checks.",
+ "status": "live",
+ "source_url": "https://arxiv.org/abs/2506.11009",
+ "sub_theme": "Taxonomy: staged intervention points in development pipelines"
+ },
+ {
+ "name": "Cognition Devin - 2025 Performance Review",
+ "what": "Cognition published Devin's annual performance data showing PR merge rates as a proxy for how often human reviewers accept or reject autonomous agent work.",
+ "evidence": "PR merge rate doubled from 34% to 67% year-over-year. Devin merged hundreds of thousands of PRs total. 4x faster at problem-solving, 2x more efficient in resource consumption vs prior year. Described as 'senior-level at codebase understanding but junior at execution.' Customers include Goldman Sachs, Citi, Santander, Nubank. 33% of PRs are still rejected by human reviewers.",
+ "status": "live",
+ "source_url": "https://cognition.ai/blog/devin-annual-performance-review-2025",
+ "sub_theme": "Error correction: human as quality gate"
+ },
+ {
+ "name": "Cognition (Devin) - Funding and Scale",
+ "what": "Cognition's financial trajectory provides context for the market size of human-in-the-loop coding agent products.",
+ "evidence": "$10.2B valuation as of September 2025. $400M raise led by Founders Fund. ARR grew from $1M (Sept 2024) to $73M (June 2025). Combined ARR ~$150M after Windsurf acquisition. Net burn under $20M since founding. Usage-based pricing via Agent Compute Units (ACUs).",
+ "status": "live",
+ "source_url": "https://techcrunch.com/2025/09/08/cognition-ai-defies-turbulence-with-a-400m-raise-at-10-2b-valuation/",
+ "sub_theme": "Market context"
+ },
+ {
+ "name": "GitHub Copilot - Suggestion Acceptance Rates",
+ "what": "GitHub's data on how frequently developers accept, modify, or reject AI code suggestions provides the largest-scale dataset on human intervention frequency.",
+ "evidence": "Average acceptance rate: ~30% of suggestions shown. Rises from 28.9% in first 3 months to 32.1% in next 3 months. Java developers: 61% acceptance rate. 88% code retention rate for accepted suggestions. ZoomInfo enterprise study: 33% suggestion acceptance, 20% line acceptance, 72% developer satisfaction. 4.7M paid subscribers (Jan 2026). 20M cumulative users (July 2025). Used by ~90% of Fortune 100.",
+ "status": "live",
+ "source_url": "https://docs.github.com/en/copilot/concepts/copilot-usage-metrics/copilot-metrics",
+ "sub_theme": "Error correction: human as quality gate"
+ },
+ {
+ "name": "GitHub Copilot - Revenue Scale",
+ "what": "Copilot's financial scale contextualizes the market for tools where humans review every AI suggestion.",
+ "evidence": "4.7M paid subscribers as of Jan 2026. Paid subscriptions grew ~75% YoY. Nadella stated in 2024 that Copilot was a larger business than all of GitHub at the time of Microsoft's 2018 acquisition ($7.5B). Analyst estimates place ARR in mid-hundreds of millions to approaching $1B.",
+ "status": "live",
+ "source_url": "https://www.getpanto.ai/blog/github-copilot-statistics",
+ "sub_theme": "Market context"
+ },
+ {
+ "name": "Cursor (Anysphere) - RL-based Suggestion Filtering",
+ "what": "Cursor uses real-time reinforcement learning on user accept/reject signals to learn when to suggest and when to stay silent, directly encoding human intervention patterns into the model.",
+ "evidence": "Upgraded Tab model produces 21% fewer suggestions but 28% higher acceptance rate. Model receives reward on accept, penalty on reject, nothing on silence. Uses on-policy data from currently-deployed model, retraining multiple times per day. $1B+ ARR as of Nov 2025. $29.3B valuation. 1M+ daily active users (Dec 2025). 9,900% YoY ARR growth.",
+ "status": "live",
+ "source_url": "https://analyticsindiamag.com/ai-news-updates/cursor-is-using-real-time-reinforcement-learning-to-improve-suggestions-for-developers/",
+ "sub_theme": "Alignment: replacing human feedback with structured knowledge"
+ },
+ {
+ "name": "RedMonk - 10 Things Developers Want from Agentic IDEs (2025)",
+ "what": "Analyst report categorizing what developer-practitioners actually demand from agent oversight, distinguishing between trust controls, permission systems, and audit trails.",
+ "evidence": "Developers want: fine-grained permissions for what agents can/cannot do autonomously; approval gates before destructive actions (rm -rf, database writes, deployments); configurable autonomy levels per task type; clear audit trails of every agent action. Microsoft and Red Hat's MCP approach requires least-privilege permissions and surfaces all sensitive operations to the user.",
+ "status": "live",
+ "source_url": "https://redmonk.com/kholterhoff/2025/12/22/10-things-developers-want-from-their-agentic-ides-in-2025/",
+ "sub_theme": "Taxonomy: approval gates (trust) vs context provision (knowledge)"
+ },
+ {
+ "name": "From Prompt Engineering to Prompt Science (ACM / arXiv)",
+ "what": "Research framing prompt engineering as a human-in-the-loop alignment problem, where humans iteratively correct AI misalignment through prompt refinement rather than model retraining.",
+ "evidence": "Identifies four causes of misalignment: unpredictability, lack of transparency, value misalignment, and inherent complexity. Notes alignment is bi-directional (AI to human AND human to AI). A user's definition of alignment evolves over time as they discover new requirements. Proposes multi-phase verification as a systematic replacement for ad-hoc prompt tweaking. Published January 2024.",
+ "status": "live",
+ "source_url": "https://arxiv.org/abs/2401.04122",
+ "sub_theme": "Prompt engineering as human-in-the-loop"
+ },
+ {
+ "name": "CoPrompter (ACM IUI 2025)",
+ "what": "Framework that makes prompt-to-output alignment measurable by generating evaluation criteria from prompt requirements and letting users score outputs against a checklist.",
+ "evidence": "Generates evaluation criteria questions from prompt requirements. Users edit checklist to define alignment. Addresses the problem that prompt engineers cannot identify all points of misalignment through manual inspection alone. Published 2025.",
+ "status": "live",
+ "source_url": "https://dl.acm.org/doi/10.1145/3708359.3712102",
+ "sub_theme": "Prompt engineering as human-in-the-loop"
+ },
+ {
+ "name": "Turing Post - State of AI Coding: Context, Trust, and Subagents",
+ "what": "Industry analysis identifying that developer distrust of coding agents stems primarily from context gaps, not model capability.",
+ "evidence": "When asked why developers don't trust AI coding agents, most said: 'We don't trust the context the model has.' Missing context identified as the critical issue: 'The critical logic sleeps in a Jira ticket from 2019, or worse, it's tribal knowledge.' An AI agent cannot know context that exists only in a person's head.",
+ "status": "live",
+ "source_url": "https://www.turingpost.com/p/aisoftwarestack",
+ "sub_theme": "Taxonomy: approval gates (trust) vs context provision (knowledge)"
+ },
+ {
+ "name": "ByteBridge - From Human-in-the-Loop to Human-on-the-Loop",
+ "what": "Analysis of the transition from per-action approval (HITL) to exception-based oversight (HOTL), where agents run autonomously by default and humans intervene only on anomalies.",
+ "evidence": "HITL: human must review and approve before each action. HOTL: agents autonomous by default, human oversight available when it counts. Moving to HOTL doesn't mean abandoning HITL; it means using HITL strategically with optional interventions on demand. Published January 2026.",
+ "status": "live",
+ "source_url": "https://bytebridge.medium.com/from-human-in-the-loop-to-human-on-the-loop-evolving-ai-agent-autonomy-c0ae62c3bf91",
+ "sub_theme": "Taxonomy: trust calibration and oversight strategy shift"
+ },
+ {
+ "name": "Rakuten + Claude Code - Autonomous Complex Task",
+ "what": "Case study of a coding agent completing a complex task in a 12.5M-line codebase with minimal human intervention, illustrating the upper bound of current agent autonomy.",
+ "evidence": "Rakuten engineers tested Claude Code on implementing activation vector extraction in vLLM (12.5M lines, multiple languages). Agent completed the task in 7 hours of autonomous work with 99.9% numerical accuracy. Cited in Anthropic's 2026 Agentic Coding Trends Report.",
+ "status": "live",
+ "source_url": "https://resources.anthropic.com/2026-agentic-coding-trends-report",
+ "sub_theme": "Error correction: human as quality gate"
+ },
+ {
+ "name": "Devin - Task-Specific Performance (Independent Tests)",
+ "what": "Independent testing reveals that Devin's success rate varies dramatically by task type, with routine tasks succeeding far more often than complex ones.",
+ "evidence": "SWE-bench: solved 13.86% of issues end-to-end. Complex real-world tasks: ~15% completion without human assistance. Security vulnerability fixes: 1.5 min vs 30 min for humans (20x). File migrations: 3-4 hours vs 30-40 hours for humans (10x). The gap between routine and complex task completion rates explains why human oversight remains critical.",
+ "status": "live",
+ "source_url": "https://trickle.so/blog/devin-ai-review",
+ "sub_theme": "Error correction: human as quality gate"
+ },
+ {
+ "name": "Pardon the Interruption (Journal of Management, 2020)",
+ "what": "Integrative review of work interruption research across domains, providing a taxonomy of interruption types, frequencies, and cognitive impacts that predates but directly applies to AI agent workflows.",
+ "evidence": "Published in Journal of Management (2020). Reviews interruption research across workplace contexts. Identifies that interruptions introduce new tasks on top of ongoing activities. Operators must handle multiple simultaneous stimuli and information sources. Provides foundational vocabulary for categorizing agent interruptions.",
+ "status": "live",
+ "source_url": "https://journals.sagepub.com/doi/abs/10.1177/0149206319887428",
+ "sub_theme": "Interruption patterns beyond coding"
+ }
+ ],
+ "gaps": [
+ "No public data found on the specific breakdown of WHY developers interrupt agents (e.g., what % is 'wrong direction' vs 'missing context' vs 'safety concern'). Anthropic's autonomy paper shows frequency shifts but not a categorical breakdown of interruption reasons.",
+ "No quantitative data found distinguishing approval-gate interventions (trust) from context-provision interventions (knowledge) from error-correction interventions (capability). The Turing Post finding that developers distrust context rather than capability is qualitative, not measured.",
+ "Copilot's 70% rejection rate is not broken down by reason. We know 30% of suggestions are accepted but not why the other 70% are rejected (wrong code? wrong timing? partial match? style mismatch?).",
+ "No published research found measuring how much human-in-the-loop overhead could be replaced by personal data (preferences, coding style, project context) vs institutional knowledge vs real-time judgment.",
+ "Limited data on HITL patterns in non-coding autonomous agent workflows (e.g., customer service, data analysis, content creation) with the same level of rigor as the coding domain.",
+ "Anthropic's Constitutional AI paper shows principles CAN replace per-instance human labels, but there is no published measurement of how much residual human oversight Constitutional AI eliminates in production deployment vs standard RLHF.",
+    "No data found on the cost of human-in-the-loop interventions (developer time per interruption, context-switching penalty) in agent workflows specifically."
+ ]
+}
diff --git a/research/personal-data-agents/findings/wave1-products.json b/research/personal-data-agents/findings/wave1-products.json
new file mode 100644
index 00000000..c2e02ee5
--- /dev/null
+++ b/research/personal-data-agents/findings/wave1-products.json
@@ -0,0 +1,152 @@
+{
+ "agent_question": "Products combining personal data aggregation with AI agent capabilities",
+ "findings": [
+ {
+ "name": "ChatGPT Memory + Connectors",
+ "what": "OpenAI's ChatGPT retains persistent facts across conversations and now connects to Gmail, Google Drive, GitHub, Outlook, and other services for real-time personal context.",
+ "evidence": "800M+ weekly active users as of Sep 2025. $13B ARR as of Aug 2025. Memory launched Feb 2024; expanded Apr 2025 to reference all past conversations (not just saved memories). Free users get short-term continuity; Plus/Pro get long-term memory. Connectors (Gmail, Google Calendar, Google Drive, GitHub, Outlook, Teams, etc.) available to Team/Enterprise/Edu plans. MCP-based partner connectors (Stripe, Amplitude, Monday.com, etc.) launched 2025. 5M paying business users as of Aug 2025.",
+ "status": "live",
+ "source_url": "https://openai.com/index/memory-and-new-controls-for-chatgpt/",
+ "sub_theme": "LLM platforms with personal data ingestion"
+ },
+ {
+ "name": "ChatGPT Connectors (data integration layer)",
+ "what": "First-party integrations that let ChatGPT read from Gmail, Google Drive, SharePoint, Dropbox, Box, GitHub, Linear, HubSpot, and more, with synced indexing for some sources.",
+ "evidence": "Google Drive uses synced connector that indexes content in advance for fast retrieval. Most connectors are read-only; actions beyond reading require explicit user confirmation. Available to Team, Enterprise, and Edu plans, not free tier.",
+ "status": "live",
+ "source_url": "https://help.openai.com/en/articles/11487775-connectors-in-chatgpt",
+ "sub_theme": "LLM platforms with personal data ingestion"
+ },
+ {
+ "name": "Limitless AI (formerly Rewind AI)",
+ "what": "Screen recording and conversation capture AI that evolved from desktop app to wearable pendant, then was acquired by Meta.",
+ "evidence": "Founded 2022 as Rewind AI. Raised $15M at $350M valuation in May 2023 on $707K ARR (495x revenue multiple). Raised $27M total. Rebranded to Limitless Apr 2024 with pivot to wearable pendant. Acquired by Meta Dec 5, 2025 for estimated $200-400M (undisclosed official terms). Hardware sales ceased; existing customers moved to free Unlimited Plan for one year. Team joined Meta Reality Labs (Ray-Ban smart glasses).",
+ "status": "acquired",
+ "source_url": "https://techcrunch.com/2025/12/05/meta-acquires-ai-device-startup-limitless/",
+ "sub_theme": "Screen/audio capture + AI recall"
+ },
+ {
+ "name": "Screenpipe",
+ "what": "Open-source (MIT) alternative to Rewind AI that continuously captures screen and audio locally, with AI-powered search and recall.",
+ "evidence": "17.2K GitHub stars as of early 2026. 202 subscriptions, $3.5K MRR (all organic). Works on macOS, Windows, Linux. Uses local OpenAI Whisper for speech-to-text. 9 apps in marketplace, 5 more in development. Investors include Embedding VC, Founders Inc, Top Harvest Capital. Major companies (Microsoft, Intel, Oracle, GitHub, Alibaba Cloud) testing it. Has not reached product-market fit per founders.",
+ "status": "live",
+ "source_url": "https://screenpi.pe/",
+ "sub_theme": "Screen/audio capture + AI recall"
+ },
+ {
+ "name": "Humane AI Pin",
+ "what": "Wearable AI device with camera, projector, and LLM access intended to replace smartphones, which failed commercially and was shut down.",
+ "evidence": "Raised $230-242M from investors including Sam Altman and Marc Benioff. Valued at $884-984M in Mar 2023. Returns outpaced sales by summer 2024. Valuation fell to $25M by Oct 2024. Price cut from $699 to $499. Charging case recalled for battery fire risk. HP acquired assets for $116M in Feb 2025. All devices bricked Feb 28, 2025. Only customers who bought after Nov 15, 2024 got refunds.",
+ "status": "shut down",
+ "source_url": "https://techcrunch.com/2025/02/18/humanes-ai-pin-is-dead-as-hp-buys-startups-assets-for-116m/",
+ "sub_theme": "Personal AI hardware"
+ },
+ {
+ "name": "Rabbit R1",
+ "what": "AI hardware device with 'Large Action Model' intended to replace app-based phone interactions, which suffered massive user abandonment.",
+ "evidence": "Raised $64.7M over 5 rounds. Valued at $100-150M. 130,000 units sold after CES 2024 announcement. Only 5,000 daily active users 5 months after launch (95% abandonment). MKBHD called it 'barely reviewable.' Entire interface discovered to be a single Android app. Founder Jesse Lyu admitted it launched too early. Still operational in late 2025 with improvements via OTA updates but minimal user base.",
+ "status": "live",
+ "source_url": "https://cybernews.com/tech/the-story-of-rabbit-r1/",
+ "sub_theme": "Personal AI hardware"
+ },
+ {
+ "name": "Tab AI",
+ "what": "Wearable AI pendant that listens to conversations all day and builds a persistent personal context model, functioning like always-on ChatGPT without prompting.",
+ "evidence": "Created by Avi Schiffmann (built COVID dashboard used by millions). Raised $1.9M seed at ~$20M valuation. Designed as disk-shaped necklace. Retains and builds on prior conversation context. Originally scheduled for winter/spring 2024 launch; availability status unclear as of early 2026.",
+ "status": "announced",
+ "source_url": "https://venturebeat.com/ai/tabs-always-on-ai-pendant-just-got-funded-but-do-we-need-it/",
+ "sub_theme": "Personal AI hardware"
+ },
+ {
+ "name": "Glean",
+ "what": "Enterprise AI search platform that builds a personalized knowledge graph per employee from 100+ workplace data sources.",
+ "evidence": "$7.2B valuation after $150M Series F in Jun 2025 (up from $4.6B in Sep 2024). $100M+ ARR achieved in under 3 years (fiscal year ending Jan 2025). Integrates with Google Workspace, Microsoft 365, Slack, Salesforce, and more. Sep 2025: launched third-generation assistant with personal graph per employee (tracks projects, collaborators, work style). Personal graph enables agents that summarize weekly work or prepare performance reviews.",
+ "status": "live",
+ "source_url": "https://www.glean.com/press/glean-achieves-100m-arr-in-three-years-delivering-true-ai-roi-to-the-enterprise",
+ "sub_theme": "Enterprise knowledge + AI"
+ },
+ {
+ "name": "Dust.tt",
+ "what": "Enterprise AI agent platform using Claude and MCP protocol to create specialized agents connected to company knowledge and business applications.",
+ "evidence": "Raised $21.5M total ($16M Series A led by Sequoia Capital). Hit $6M ARR by Dec 2025. Uses Anthropic Claude models + MCP protocol. MCP integrations with Asana, Jira, GitHub, Google Drive, Gong, Gmail, Google Calendar, Salesforce, HubSpot, Notion. 2026 vision: multi-player agents where teams share context with AI teammates, infrastructure for managing thousands of agents.",
+ "status": "live",
+ "source_url": "https://venturebeat.com/ai/dust-hits-6m-arr-helping-enterprises-build-ai-agents-that-actually-do-stuff-instead-of-just-talking",
+ "sub_theme": "Enterprise knowledge + AI"
+ },
+ {
+ "name": "Notion AI",
+ "what": "AI features embedded in Notion's workspace platform, including autonomous agents that execute multi-step workflows across connected data sources.",
+ "evidence": "$500M annualized revenue as of Sep 2025 (up from $400M in 2024, $250M in 2023, $67M in 2022). 100M+ users worldwide, 20M+ monthly active users. AI adoption: crossed 50% of paying customers using AI features in 2025 (up from 10-20% in 2024). Notion 3.0 (Sep 2025) launched autonomous AI Agents with multi-model support (GPT-5, Claude Opus 4.1, o3). AI now bundled into Business/Enterprise plans rather than sold as add-on.",
+ "status": "live",
+ "source_url": "https://www.cnbc.com/2025/09/18/notion-launches-ai-agent-as-it-crosses-500-million-in-annual-revenue.html",
+ "sub_theme": "Enterprise knowledge + AI"
+ },
+ {
+ "name": "Otter.ai",
+ "what": "Meeting transcription platform that evolved from passive note-taking to voice-activated AI meeting agents that learn from company-wide meeting data.",
+ "evidence": "$100M ARR milestone reached Mar 2025 (up from $81M end of 2024). Raised ~$70M total ($50M in 2021 including $40M led by Spectrum Equity). Mar 2025: launched industry's first voice AI meeting agent suite. Otter Meeting Agent participates in meetings as voice-activated participant, answers questions using real-time transcript + company meeting history. Features vocabulary learning and preference personalization.",
+ "status": "live",
+ "source_url": "https://otter.ai/blog/otter-ai-caps-transformational-2025-with-100m-arr-milestone-industry-first-ai-meeting-agents-and-global-enterprise-expansion",
+ "sub_theme": "Meeting AI with personal context"
+ },
+ {
+ "name": "Mem.ai",
+ "what": "AI-powered note-taking app that attempts to be a 'second brain' with AI-driven organization and recall, but has faced criticism for underpowered AI and missing basic features.",
+ "evidence": "Raised $29.1M total ($23.5M Series A led by OpenAI Startup Fund, valued at $110M post-money in Nov 2022). Revenue not publicly disclosed. Criticized for: no offline access, underpowered AI chat ('much inferior version of ChatGPT'), missing basic features (no highlighting, no scroll bar), slow performance. Faces competition from Microsoft OneNote, Google Keep, and Notion. Development appears stalled per user reports.",
+ "status": "live",
+ "source_url": "https://medium.com/@theo-james/mem-ai-the-40m-second-brain-failure-burning-the-worlds-money-5f3176a34cbd",
+ "sub_theme": "Personal knowledge management + AI"
+ },
+ {
+ "name": "Personal.ai",
+ "what": "Platform that builds personal AI models from user's data, facts, and opinions to remember and recall memories core to identity.",
+ "evidence": "Raised $7.8-23.8M (sources vary by methodology). $138K annual revenue as of Dec 2023. Founded 2020 in San Francisco. Uses personal AI models trained on individual user's data. Enables organizations to create AI teammates with proprietary knowledge. Latest funding: undisclosed Seed round Sep 2024.",
+ "status": "live",
+ "source_url": "https://tracxn.com/d/companies/personal-ai/__km3jbW0uSOjjopNTO7_osOJmYRwGU7-zqIKWilUz1co",
+ "sub_theme": "Personal knowledge management + AI"
+ },
+ {
+ "name": "Mem0 (not Mem.ai)",
+ "what": "Universal memory layer infrastructure for AI applications that stores, updates, and recalls information across conversations, used by AI agent frameworks.",
+ "evidence": "Raised $24M (Seed led by Kindred Ventures, Series A led by Basis Set Ventures, with Peak XV, GitHub Fund, Y Combinator). 41K+ GitHub stars, 13M+ Python package downloads. Processed 186M API calls in Q3 2025 (up from 35M in Q1, ~30% MoM growth). Claims 26% higher response accuracy vs OpenAI's memory, 91% lower p95 latency, 90%+ token cost savings. Exclusive memory provider for AWS Agent SDK. Natively integrated by CrewAI, Flowise, Langflow.",
+ "status": "live",
+ "source_url": "https://techcrunch.com/2025/10/28/mem0-raises-24m-from-yc-peak-xv-and-basis-set-to-build-the-memory-layer-for-ai-apps/",
+ "sub_theme": "AI memory infrastructure"
+ },
+ {
+ "name": "MCP (Model Context Protocol) Ecosystem",
+ "what": "Anthropic's open protocol for connecting AI models to external data sources and tools, now adopted across the industry as a standard for personal and enterprise data access.",
+ "evidence": "5,800+ MCP servers and 300+ MCP clients as of early 2026. 1,000+ community-built servers covering Google Drive, Slack, databases, and custom systems. OpenAI adopted MCP for ChatGPT partner connectors. Microsoft building Azure API Center integration for MCP server registries. Streamable HTTP transport enables remote MCP servers in production. 2026 roadmap: triggers/events, streamed results, security/authorization hardening, registry/discovery infrastructure.",
+ "status": "live",
+ "source_url": "https://en.wikipedia.org/wiki/Model_Context_Protocol",
+ "sub_theme": "AI memory infrastructure"
+ },
+ {
+ "name": "Obsidian + AI Plugins (Copilot, Smart Connections)",
+ "what": "Community-built AI plugins for Obsidian that provide chat-based vault search, personal context processing, and agent mode with tool calling over local markdown files.",
+ "evidence": "Obsidian Copilot plugin provides: Project Mode (AI-ready context from folders/tags), Agent Mode with tool calling, support for markdown/PDF/image/YouTube/URL context, custom system prompts stored as markdown files. Smart Connections plugin offers semantic search across notes. All data stays local in user's vault. No revenue/funding data available (open-source community plugins). Active development through 2026.",
+ "status": "live",
+ "source_url": "https://www.obsidiancopilot.com/en",
+ "sub_theme": "Personal knowledge management + AI"
+ },
+ {
+ "name": "Apple Intelligence (Personal Context)",
+ "what": "Apple's on-device AI system that was supposed to give Siri access to emails, messages, photos, and personal data for contextual assistance, but has been repeatedly delayed.",
+ "evidence": "Announced WWDC 2024. Personal context features (Siri accessing emails, photos, messages, form-filling from driver's license photos, etc.) delayed from 2025 to spring 2026. Craig Federighi stated: 'when it comes to automating capabilities on devices in a reliable way, no one's doing it really well right now.' On-device model is ~3B parameters on Apple Silicon. As of Feb 2026, 'Siri testing isn't going well' per AppleInsider; features may slip past iOS 26.4. Apple has ~2.2B active devices that would receive this.",
+ "status": "announced",
+ "source_url": "https://www.cnbc.com/2025/03/07/apple-delays-siri-ai-improvements-to-2026.html",
+ "sub_theme": "Platform-native personal AI"
+ }
+ ],
+ "gaps": [
+ "Personal.ai revenue and user numbers beyond 2023 ($138K ARR) are not publicly available; the company appears small relative to funding but specific current metrics are missing.",
+ "Mem.ai current user counts and revenue are not publicly disclosed; impossible to assess actual traction vs. the $110M post-money valuation.",
+ "Tab AI's actual shipping status and real-world user feedback are unclear; the product was announced in 2024 but confirmation of general availability is sparse.",
+ "Google Gemini's personal data integration (Gmail, Calendar, Drive context in Gemini Advanced) was not deeply researched in this wave; it belongs in a follow-up.",
+ "Microsoft Copilot's personal data context (across M365 apps, Microsoft Graph) was not covered in this wave and is a major player that warrants dedicated research.",
+ "No specific products were found that ingest cross-platform personal data (e.g., Spotify + GitHub + LinkedIn together) and generate task priorities or agent actions. This is the gap Vana would fill.",
+ "Screenpipe's actual user count (beyond 202 subscriptions) and daily active usage data are not available.",
+ "Privacy/regulatory outcomes for always-listening wearables (Tab, Limitless pendant) are not well-documented beyond general concerns.",
+ "Accuracy and hallucination rates for personal-data-augmented AI responses are not benchmarked across products (only Mem0 publishes comparative claims)."
+ ]
+}
diff --git a/research/personal-data-agents/findings/wave2-knowledge-gaps.json b/research/personal-data-agents/findings/wave2-knowledge-gaps.json
new file mode 100644
index 00000000..c5f4f93c
--- /dev/null
+++ b/research/personal-data-agents/findings/wave2-knowledge-gaps.json
@@ -0,0 +1,320 @@
+{
+ "agent_question": "Missing angles in personal data as agent context research",
+ "findings": [
+ {
+ "name": "Spain AEPD Agentic AI Guide",
+ "what": "Spain's data protection authority published a 71-page regulatory analysis of how agentic AI systems create structural privacy risks distinct from conventional AI tools",
+ "evidence": "Published February 2026; titled 'Agentic Artificial Intelligence from the Perspective of Data Protection'; identifies persistent memory profiles, autonomous multi-service access, and consequential actions without human checkpoints as novel risks",
+ "status": "live",
+ "source_url": "https://ppc.land/spains-data-watchdog-maps-the-hidden-gdpr-risks-of-agentic-ai/",
+ "sub_theme": "Regulatory barriers"
+ },
+ {
+ "name": "UK ICO Agentic AI Early Views",
+ "what": "The UK Information Commissioner's Office flagged data minimization and transparency as specific compliance risks for agentic AI systems",
+ "evidence": "ICO notes that when an agent's scope is uncertain, defining what data is 'necessary' becomes harder; complexity of agent data flows makes it difficult to explain processing to individuals; agents communicating with other agents create unobservable data flows",
+ "status": "live",
+ "source_url": "https://www.insideprivacy.com/artificial-intelligence/ico-shares-early-views-on-agentic-ai-data-protection/",
+ "sub_theme": "Regulatory barriers"
+ },
+ {
+ "name": "CCPA Automated Decision-Making Regulations",
+ "what": "California finalized CCPA regulations specifically addressing automated decision-making technology (ADMT), creating new compliance requirements for AI agents processing personal data",
+ "evidence": "Approved September 22, 2025; compliance required from January 1, 2026 for some provisions, January 1, 2027 for ADMT consumer rights; CPPA issued over $100 million in enforcement actions in 2024; ADMT broadly defined as any technology that processes personal information to replace or substantially replace human decision-making",
+ "status": "live",
+ "source_url": "https://www.wiley.law/alert-California-Finalizes-Pivotal-CCPA-Regulations-on-AI-Cyber-Audits-and-Risk-Governance",
+ "sub_theme": "Regulatory barriers"
+ },
+ {
+ "name": "EU AI Act High-Risk Deadline August 2026",
+ "what": "Full enforcement of high-risk AI system requirements under EU AI Act Annex III takes effect August 2, 2026, with penalties up to 35M EUR or 7% of global turnover",
+ "evidence": "Requires conformity assessments, technical documentation, CE marking, EU database registration; specific attention to data minimization in large-scale ingestion models, purpose limitation across agentic workflows, transparency in conversational interfaces, accuracy in real-time data synthesis",
+ "status": "announced",
+ "source_url": "https://www.legalnodes.com/article/eu-ai-act-2026-updates-compliance-requirements-and-business-risks",
+ "sub_theme": "Regulatory barriers"
+ },
+ {
+ "name": "EDPB LLM Anonymization Report",
+ "what": "European Data Protection Board clarified that large language models rarely achieve anonymization standards",
+ "evidence": "April 2025 report; controllers deploying third-party LLMs must conduct comprehensive legitimate interests assessments; undermines the argument that LLM training anonymizes personal data",
+ "status": "live",
+ "source_url": "https://www.dpocentre.com/data-protection-ai-governance-2025-2026/",
+ "sub_theme": "Regulatory barriers"
+ },
+ {
+ "name": "Usercentrics acquires MCP Manager",
+ "what": "Global consent management leader acquired MCP Manager to extend consent infrastructure into agentic AI workflows",
+ "evidence": "Announced January 14, 2026; first major privacy company to extend consent guardrails into MCP-based AI workflows; addresses gap where regular MCPs don't include consent checks for data access",
+ "status": "live",
+ "source_url": "https://usercentrics.com/press/usercentrics-acquires-mcp-manager/",
+ "sub_theme": "Regulatory barriers"
+ },
+ {
+ "name": "Relyance AI Consumer Trust Survey",
+ "what": "Survey showing 4 in 5 consumers believe companies train AI on their data without disclosure; only ~1 in 10 willing to share sensitive data",
+ "evidence": "2025 survey; 82% see data loss threat; only around 10% very willing to share financial, communication, or biometric data; more than half not willing to share even in exchange for better digital experiences",
+ "status": "live",
+ "source_url": "https://www.relyance.ai/consumer-ai-trust-survey-2025",
+ "sub_theme": "Trust and adoption curve"
+ },
+ {
+ "name": "YouGov AI Trust Survey",
+ "what": "35% of Americans use AI weekly but only 5% trust it deeply; 40% say they would never enter personal or financial information into an AI tool",
+ "evidence": "2025 survey; 48% cite data exposure as primary adoption barrier (outranking hallucinations); 51% of Gen Z report weekly AI usage vs 25% of baby boomers; trust in AI saw a 16-point increase in 2025 but from a low base",
+ "status": "live",
+ "source_url": "https://yougov.com/en-us/articles/53701-most-americans-use-ai-but-still-dont-trust-it",
+ "sub_theme": "Trust and adoption curve"
+ },
+ {
+ "name": "ARF Data Sharing Trust Study",
+ "what": "Trust in AI surged 16 points in 2025 but consumers take an increasingly transactional view of data sharing, demanding clear value exchange",
+ "evidence": "Nearly 60% of consumers willing to share data for personalized shopping recommendations; willingness varies sharply by context; only 19% trust AI in finance",
+ "status": "live",
+ "source_url": "https://www.prnewswire.com/news-releases/trust-in-ai-surges-as-consumers-take-a-more-transactional-view-of-data-sharing-arf-study-finds-302667046.html",
+ "sub_theme": "Trust and adoption curve"
+ },
+ {
+ "name": "Simon Willison ChatGPT Memory Critique",
+ "what": "Prominent developer flagged ChatGPT's memory feature as creating an unsolicited 'dossier' effect, demonstrating the context collapse problem in personal AI",
+ "evidence": "May 2025; ChatGPT inferred user's location (Half Moon Bay) from prior conversations and inserted it into an unrelated image generation request; illustrates 'context collapse' where data from different spheres of user activity (work, family, hobbies) blurs together",
+ "status": "live",
+ "source_url": "https://simonwillison.net/2025/May/21/chatgpt-new-memory/",
+ "sub_theme": "Trust and adoption curve"
+ },
+ {
+ "name": "Microsoft Recall Privacy Backlash",
+ "what": "Microsoft's Recall feature (screenshots every few seconds for AI context) faced severe backlash, was delayed, redesigned from plaintext to encrypted, and made opt-in",
+ "evidence": "Initial version stored all data in plaintext database; researchers found it captured passwords, financial data, private messages, medical records; Microsoft made it opt-in and added full database encryption; later Gaming Copilot also caught capturing gameplay images and sending data by default",
+ "status": "live",
+ "source_url": "https://time.com/6980911/microsoft-copilot-recall-ai-features-privacy-concerns/",
+ "sub_theme": "Trust and adoption curve"
+ },
+ {
+ "name": "Apple Intelligence On-Device Foundation Models",
+ "what": "Apple deployed a ~3B parameter on-device model on Apple silicon, enabling AI features without sending user data to external servers, with zero API costs for developers",
+ "evidence": "Supports 16 languages (expanded from English-only); Foundation Models framework gives developers direct access at zero inference cost; 2.3 billion active Apple devices in 2025; Siri overhaul expected spring 2026 with multi-step task capability",
+ "status": "live",
+ "source_url": "https://machinelearning.apple.com/research/apple-foundation-models-2025-updates",
+ "sub_theme": "On-device personal AI"
+ },
+ {
+ "name": "Samsung Personal Data Engine (PDE)",
+ "what": "Samsung's on-device Personal Data Engine powered by RDFox knowledge graph technology creates hyper-personalized experiences while keeping data on-device",
+ "evidence": "Unveiled at Galaxy UNPACKED 2025 with Galaxy S25 series; powered by RDFox from Oxford Semantic Technologies; secured by Knox Vault and KEEP encryption; drives Now Brief and Smart Gallery search; Galaxy S26 (February 2026) added hardware-based Privacy Display on Ultra model",
+ "status": "live",
+ "source_url": "https://www.computerweekly.com/news/366618319/Samsung-unpacks-Galaxy-AIs-personal-data-engine",
+ "sub_theme": "On-device personal AI"
+ },
+ {
+ "name": "Google Personal Intelligence",
+ "what": "Google connected Gemini to Gmail, Photos, YouTube history, and Search history under the 'Personal Intelligence' brand, making it the largest-scale personal data AI integration",
+ "evidence": "Launched January 14, 2026 in Gemini app; expanded to AI Mode in Search on January 22, 2026; rolling out to free Gemini users in US as of March 2026; built on Gemini 3 Pro and Flash models; enables cross-referencing private emails with real-time market data, managing travel itineraries",
+ "status": "live",
+ "source_url": "https://blog.google/innovation-and-ai/products/gemini-app/personal-intelligence/",
+ "sub_theme": "On-device personal AI"
+ },
+ {
+ "name": "Data Transfer Initiative (DTI)",
+ "what": "Independent nonprofit (spun out of Google's Data Transfer Project in 2022) with Apple, Google, and Meta as founding partners, building open-source data portability infrastructure",
+ "evidence": "Apple and Google expanded direct photo/video transfer between Google Photos and iCloud Photos; EU DMA designates iOS and Android as services obligated to facilitate effective data portability; Apple and Google announced OS-level switching collaboration in late 2025",
+ "status": "live",
+ "source_url": "https://dtinit.org/blog/2024/07/10/DTI-members-new-photo-video-tool",
+ "sub_theme": "Data portability infrastructure"
+ },
+ {
+ "name": "Solid Project (Tim Berners-Lee / Inrupt)",
+ "what": "Standards-based personal data pods with time-bound access grants to apps and AI agents, proposed as alternative to platform data hoards",
+ "evidence": "Originally released 2016; Inrupt launched 2018; Berners-Lee's 2025 book positions Solid as counter to AI built on platform data; developer community remains small; Berners-Lee acknowledges 'not ready for general adoption yet'; network effects remain the core obstacle",
+ "status": "live",
+ "source_url": "https://en.wikipedia.org/wiki/Solid_(web_decentralization_project)",
+ "sub_theme": "Data portability infrastructure"
+ },
+ {
+ "name": "Digi.me / Meeco / MyDex Personal Data Stores",
+ "what": "First-generation personal data vault companies offering encrypted user-controlled storage with selective sharing",
+ "evidence": "Digi.me: end-to-end encrypted vault for financial, fitness, medical data; Meeco: ISO 27001 accredited, Zero Knowledge Value architecture, pioneering since 2012; MyDex: CIC (Community Interest Company) focused on health conditions and identity; all remain small-scale with limited consumer adoption after years of operation",
+ "status": "live",
+ "source_url": "https://pmc.ncbi.nlm.nih.gov/articles/PMC9921726/",
+ "sub_theme": "Data portability infrastructure"
+ },
+ {
+ "name": "RAG Cost Economics for Personal Data",
+ "what": "Embedding and indexing personal documents is cheap (~$0.001-0.01/doc), but operational staffing and infrastructure costs often exceed cloud bills",
+ "evidence": "10,000 documents can be embedded and indexed for under $100; RAG cuts fine-tuning spend by 60-80%; but operational staffing for small teams costs $750k+ annually; fine-tuning a 70B model costs $50k-200k in compute; Google embedding generation at $0.15 per 1M tokens",
+ "status": "live",
+ "source_url": "https://thedataguy.pro/blog/2025/07/the-economics-of-rag-cost-optimization-for-production-systems/",
+ "sub_theme": "Economics of personal data ingestion"
+ },
+ {
+ "name": "Data Monetization Market",
+ "what": "Global data monetization market valued at $3.75B in 2024, projected to reach $28.16B by 2033",
+ "evidence": "CAGR of 25.1%; North America 41.21% market share in 2024; 30% of large organizations expected to directly monetize data externally by 2025; 60% of companies cite compliance concerns as primary barrier; organizations commercializing data via APIs report recurring revenue growth exceeding 20% annually",
+ "status": "live",
+ "source_url": "https://www.grandviewresearch.com/industry-analysis/data-monetization-market",
+ "sub_theme": "Economics of personal data ingestion"
+ },
+ {
+ "name": "Cursor (Anysphere) Codebase Context at Scale",
+ "what": "AI code editor demonstrating the economics of personal codebase context: $500M ARR, 1M+ daily active users, $20-200/month tiers",
+ "evidence": "Over 1 million daily active users as of December 2025; $500M ARR within 2.5 years; valued at $29.3B; raised $3.4B across 7 funding rounds; RAG-based system analyzes entire codebase for context; Pro tier $20/month, Ultra tier $200/month for 20x usage",
+ "status": "live",
+ "source_url": "https://research.contrary.com/company/cursor",
+ "sub_theme": "Economics of personal data ingestion"
+ },
+ {
+ "name": "Plaid Financial Data Infrastructure",
+ "what": "Financial data aggregation platform demonstrating the value of personal data connectivity, with AI-powered enrichment and over $800M estimated annual revenue",
+ "evidence": "Crossed estimated $800M annual revenue in 2025; 220+ new products/features in 2025; AI-powered auto-repair enabled 2M+ successful user logins and reduced degradation fix time by 90%; launched LendScore credit risk score in October 2025 using real-time cash flow data",
+ "status": "live",
+ "source_url": "https://sacra.com/c/plaid/",
+ "sub_theme": "Economics of personal data ingestion"
+ },
+ {
+ "name": "Cold Start Personalization Research",
+ "what": "A single AI task can involve 20-30 preference dimensions but individual users care about only 2-4, making cold start a navigation problem in high-dimensional space",
+ "evidence": "February 2026 arXiv paper; with a limited interaction budget, the assistant cannot ask about all dimensions and must find the sparse subset relevant to each user within a handful of questions; strategies include onboarding elicitation, leveraging existing data (e.g., loyalty card history), and real-time streaming updates that turn cold start into a short-term condition",
+ "status": "live",
+ "source_url": "https://arxiv.org/html/2602.15012",
+ "sub_theme": "Cold start problem"
+ },
+ {
+ "name": "Skyflow Privacy Vault",
+ "what": "Privacy-as-infrastructure company providing tokenized data vaults that can serve as a trust layer between personal data and AI models",
+ "evidence": "Raised $100M total equity including $30M Series B extension led by Khosla Ventures (April 2024); supports nearly 1 billion records; processes over 2 billion API calls per quarter; LLM Privacy Vault product detects sensitive data and replaces with deterministic tokens before sending to AI models",
+ "status": "live",
+ "source_url": "https://www.skyflow.com/post/generative-ai-data-privacy-skyflow-llm-privacy-vault",
+ "sub_theme": "Privacy infrastructure for AI"
+ },
+ {
+ "name": "OWASP Prompt Injection #1 Vulnerability",
+ "what": "Prompt injection ranked #1 critical vulnerability in OWASP 2025 Top 10 for LLM Applications, appearing in 73% of production AI deployments assessed",
+ "evidence": "OpenAI stated in December 2025 that prompt injection 'is unlikely to ever be fully solved'; indirect prompt injection targets where AI systems collect information (documents, emails, web pages); an attacker can embed hidden commands that override user instructions to extract emails, steal personal data, or access passwords",
+ "status": "live",
+ "source_url": "https://genai.owasp.org/llmrisk/llm01-prompt-injection/",
+ "sub_theme": "Privacy infrastructure for AI"
+ },
+ {
+ "name": "Supermemory",
+ "what": "AI memory startup extracting 'memories' from unstructured data for application context, founded by a 19-year-old",
+ "evidence": "Raised $2.6M seed led by Susa Ventures, Browder Capital, and SF1.vc; backed by Google executives; October 2025",
+ "status": "live",
+ "source_url": "https://techcrunch.com/2025/10/06/a-19-year-old-nabs-backing-from-google-execs-for-his-ai-memory-startup-supermemory/",
+ "sub_theme": "Memory and context infrastructure"
+ },
+ {
+ "name": "Memories.ai",
+ "what": "Building a Large Visual Memory Model (LVMM) for long-term visual memory, founded by two former Meta Reality Labs researchers",
+ "evidence": "Founded 2025 in San Francisco; $8M seed round led by Susa Ventures with Samsung Next participation; focuses on visual memory capabilities for AI",
+ "status": "live",
+ "source_url": "https://wellows.com/blog/ai-startups/",
+ "sub_theme": "Memory and context infrastructure"
+ },
+ {
+ "name": "Dust.tt Enterprise AI Agents",
+ "what": "Enterprise AI agent platform connecting to company data (Notion, Google Drive, Slack, Intercom) for contextual assistance",
+ "evidence": "Founded by ex-Stripe acquisition founders; $21.5M raised including $16M Series A led by Sequoia; $7.3M ARR as of mid-2025 with 66-person team; 80,000 agents created, 12 million conversations in 2025; customers like Clay, Qonto achieve 70%+ weekly AI adoption rates",
+ "status": "live",
+ "source_url": "https://dust.tt/blog/dust-wrapped-2025",
+ "sub_theme": "Memory and context infrastructure"
+ },
+ {
+ "name": "Notion AI Agents (3.0)",
+ "what": "Notion launched autonomous AI agents that execute multi-step workflows across workspace data, with 100M+ users providing personal/team context",
+ "evidence": "Notion 3.0 launched September 2025 with autonomous agents; 100M+ users worldwide; multi-model support (GPT-5, Claude Opus 4.1, o3); January 2026 3.2 release brought agents to mobile with intelligent auto-model selection; Business plan $20/user/month required for AI features",
+ "status": "live",
+ "source_url": "https://www.notion.com/releases/2025-09-18",
+ "sub_theme": "Memory and context infrastructure"
+ },
+ {
+ "name": "Perplexity Personal Computer Agent",
+ "what": "Perplexity extending cloud AI agent to desktop, accessing local files and apps with user approval and audit trail",
+ "evidence": "March 2026 announcement; Google Drive integration live for all users; Enterprise Pro/Max users can sync Drive to personal repository; Personal Computer runs in 'secure environment with clear safeguards,' requires user approval for sensitive actions, generates full audit trail, includes kill switch",
+ "status": "announced",
+ "source_url": "https://www.technology.org/2026/03/13/perplexity-personal-computer-ai/",
+ "sub_theme": "Memory and context infrastructure"
+ },
+ {
+ "name": "OpenAI Operator/ChatGPT Agent",
+ "what": "OpenAI's computer-using agent integrated into ChatGPT, accessing user connectors and data for agentic web tasks",
+ "evidence": "Launched as Operator January 2025; integrated into ChatGPT as agent July 17, 2025; powered by Computer-Using Agent (CUA) model; accesses user connectors and logged-in websites via takeover mode; OpenAI acknowledged prompt injection may never be fully solved",
+ "status": "live",
+ "source_url": "https://openai.com/index/introducing-chatgpt-agent/",
+ "sub_theme": "Memory and context infrastructure"
+ },
+ {
+ "name": "Anthropic Claude Memory (Markdown-based)",
+ "what": "Anthropic chose transparent file-based memory (CLAUDE.md files) over vector databases, now available for Pro, Max, Team, and Enterprise users",
+ "evidence": "Launched September 2025 for Team/Enterprise; expanded to Pro and Max; 1M token context window generally available for Opus 4.6 and Sonnet 4.6; MCP protocol enables connections to Google Drive, Slack, GitHub, Postgres; Desktop Extensions for one-click MCP server installation",
+ "status": "live",
+ "source_url": "https://platform.claude.com/docs/en/agents-and-tools/tool-use/memory-tool",
+ "sub_theme": "Memory and context infrastructure"
+ },
+ {
+ "name": "ChatGPT Full Chat History Memory",
+ "what": "OpenAI expanded ChatGPT memory to reference all past chats, not just explicit memories, creating de facto personal data dossier",
+ "evidence": "April 2025: memory can reference all past chats; June 2025: rolled out to free users; Incognito mode available for all; privacy team lead says 'memories are only visible to you'; context collapse problem demonstrated by location inference across unrelated conversations",
+ "status": "live",
+ "source_url": "https://www.axios.com/2025/07/11/chatgpt-memory-update",
+ "sub_theme": "Memory and context infrastructure"
+ },
+ {
+ "name": "Humane AI Pin Shutdown",
+ "what": "AI wearable that promised ambient personal AI context failed on reliability, speed, battery life, and heat; sold to HP after burning $230M",
+ "evidence": "Launched late 2024; discontinued February 2025; sold team and IP to HP for $116M; customers' pins remotely disabled after cloud services shut down; reviews cited unreliable, slow, confusing experience",
+ "status": "shut down",
+ "source_url": "https://techcrunch.com/2025/12/09/top-ai-startups-that-shut-down-in-2025-what-founders-can-learn/",
+ "sub_theme": "Failed products and lessons"
+ },
+ {
+ "name": "Limitless (Rewind AI) Acquired by Meta",
+ "what": "Personal data recording pendant company acquired by Meta to build 'personal superintelligence' wearables, after pivoting from desktop screen recording",
+ "evidence": "Acquired December 2025; raised $33M+ from a16z, First Round, NEA; $15M at $350M valuation in May 2023 (495x multiple on $707K ARR); pendant sold for $99 with $20/month Pro plan; stopped selling hardware post-acquisition; customers moved to free Unlimited Plan",
+ "status": "acquired",
+ "source_url": "https://techcrunch.com/2025/12/05/meta-acquires-ai-device-startup-limitless/",
+ "sub_theme": "Failed products and lessons"
+ },
+ {
+ "name": "2025 AI Startup Shutdown Trends",
+ "what": "The market is aggressively filtering for companies with proprietary data advantage, real unit economics, and deep enterprise integration; thin GPT-wrappers are dying",
+ "evidence": "966 startups shut down in 2024 (25.6% increase from 769 in 2023); 2023-2024 cycle rewarded speed and UX leading to a long tail of thin GPT-wrapper products; 2025 data shows market shifted to require proprietary data advantage and real unit economics",
+ "status": "live",
+ "source_url": "https://simpleclosure.com/blog/posts/state-of-startup-shutdowns-2025/",
+ "sub_theme": "Failed products and lessons"
+ },
+ {
+ "name": "YC S25 Batch AI Concentration",
+ "what": "83% of YC S25 batch (141 of 169 startups) classified as AI-native, with 50%+ building agentic AI, mostly domain-specific copilots rather than general personal AI",
+ "evidence": "Summer 2025; 169 startups total; highest concentration of AI startups in YC history; domain-specific focus: insurance claim appeals, mortgage applications; doubling down on narrow high-impact applications rather than general personal assistants",
+ "status": "live",
+ "source_url": "https://catalaize.substack.com/p/y-combinator-s25-batch-profile-and",
+ "sub_theme": "YC and accelerator landscape"
+ },
+ {
+ "name": "YC W26 Batch AI Agents and Healthcare",
+ "what": "YC W26 batch building infrastructure for AI agents to operate inside companies, with healthcare making up nearly 10% of batch",
+ "evidence": "Winter 2026; nearly 50% of batch identified as AI agent companies; healthcare gaining momentum including wearable technologies and drug discovery; examples include Corvera (CPG operations), Arcline (legal drafting)",
+ "status": "live",
+ "source_url": "https://www.tldl.io/blog/yc-ai-startups-2026",
+ "sub_theme": "YC and accelerator landscape"
+ },
+ {
+ "name": "Vana Protocol",
+ "what": "EVM-compatible Layer 1 blockchain for personal data sovereignty, enabling user-owned AI through private data transactions with DataDAOs",
+ "evidence": "Mainnet launched December 2024; over 12 million data points onboarded through multiple DataDAOs; VRC-20 token standard for data-backed digital assets introduced April 2025; YZi Labs investment with CZ advisory role; VANA trading $8-15 in 2025",
+ "status": "live",
+ "source_url": "https://www.vana.org/",
+ "sub_theme": "Data portability infrastructure"
+ }
+ ],
+ "gaps": [
+ "Specific adoption/usage statistics for Apple Intelligence features are not publicly reported; no data on what percentage of eligible users have enabled on-device AI features",
+ "No specific YC W26 or S25 startups found focused on personal data portability for AI agent context specifically; the intersection of data portability + AI agents appears to be a white space in accelerator batches",
+ "Cost-per-user economics for personal AI context (storage + retrieval + inference) at consumer scale are not well-documented; enterprise RAG costs are published but consumer personal AI unit economics are proprietary",
+ "No survey data found specifically measuring user willingness to share personal data with AI agents (as opposed to AI chatbots or companies generally); the agent-specific trust question appears unstudied",
+ "Personal data store companies (Digi.me, Meeco, MyDex, Solid) have no publicly available user count or revenue figures for 2025-2026, making it impossible to assess actual traction",
+ "No data found on how much personal context is needed before an AI agent becomes meaningfully more useful than a generic one (the minimum viable personalization threshold); the cold start research is theoretical, not empirical from deployed products",
+ "Healthcare personal data + AI agents is notably absent from findings; despite HIPAA and health data being among the most valuable personal data, no specific companies or products were found building personal health data as agent context",
+ "The intersection of MCP protocol adoption and personal data access patterns is undocumented; no data on how many MCP servers handle personal vs. enterprise data",
+ "No specific figures found on how much Google, Apple, or OpenAI spend on personal data storage and processing per user for their AI memory/context features"
+ ]
+}
diff --git a/research/personal-data-agents/findings/wave2-product-shapes.json b/research/personal-data-agents/findings/wave2-product-shapes.json
new file mode 100644
index 00000000..81b73487
--- /dev/null
+++ b/research/personal-data-agents/findings/wave2-product-shapes.json
@@ -0,0 +1,217 @@
+{
+ "agent_question": "Product shapes for generating agent tasks from personal data",
+ "findings": [
+ {
+ "name": "OpenClaw",
+ "what": "Open-source self-hosted AI agent with cron-scheduled daily briefings that pull from connected data sources (Gmail, Calendar, GitHub, RSS, Todoist, Linear, Stripe).",
+ "evidence": "264K GitHub stars as of March 2026, surpassed React as most-starred software project on GitHub on March 3, 2026. Reached 100K stars on January 30, 2026, and 250K by March 3. The daily-briefing-hub skill combines Google Calendar, Gmail/Outlook, weather, GitHub PR/CI status, Hacker News/RSS, and Todoist/ClickUp/Linear tasks into a single prioritized morning summary. Silently skips missing integrations. Codebase supports three schedule types: at (one-time), every (interval), and cron (standard expressions).",
+ "status": "live",
+ "source_url": "https://github.com/openclaw/openclaw",
+ "sub_theme": "Self-hosted proactive agents"
+ },
+ {
+ "name": "NanoClaw",
+ "what": "Lightweight, container-isolated alternative to OpenClaw with scheduled jobs, built on Anthropic's Agents SDK, ~3,900 lines of code across 15 files.",
+ "evidence": "10.5K+ GitHub stars. Runs in Linux containers with filesystem isolation, not merely behind permission checks. Connects to WhatsApp, Telegram, Slack, Discord, Gmail. Entire codebase is auditable at ~3,900 LOC. Same scheduled-job and daily-briefing capabilities as OpenClaw but in a smaller, security-focused package.",
+ "status": "live",
+ "source_url": "https://github.com/qwibitai/nanoclaw",
+ "sub_theme": "Self-hosted proactive agents"
+ },
+ {
+ "name": "ChatGPT Scheduled Tasks",
+ "what": "OpenAI's built-in feature for scheduling recurring prompts that execute at predetermined times (daily, weekly, monthly) and deliver results via push notification or email.",
+ "evidence": "Launched January 2025 in beta for Plus, Pro, and Teams plans. Limit of 10 active tasks at a time. Tasks run independently of whether the user is online. Available on web, Android, iOS, macOS. Still in beta as of 2026.",
+ "status": "live",
+ "source_url": "https://help.openai.com/en/articles/10291617-scheduled-tasks-in-chatgpt",
+ "sub_theme": "Platform-native scheduled agents"
+ },
+ {
+ "name": "ChatGPT Pulse",
+ "what": "OpenAI's proactive daily briefing feature that researches overnight based on user memory and chat history, delivering personalized morning updates without a prompt.",
+ "evidence": "Launched September 25, 2025 as early preview for Pro users on iOS and Android. Synthesizes information from user memory, chat history, and direct feedback. Shelved in December 2025 when CEO Sam Altman issued 'Code Red' to refocus on core ChatGPT improvements amid competitive pressure.",
+ "status": "shut down",
+ "source_url": "https://techcrunch.com/2025/09/25/openai-launches-chatgpt-pulse-to-proactively-write-you-morning-briefs/",
+ "sub_theme": "Platform-native scheduled agents"
+ },
+ {
+ "name": "Google CC (Your Day Ahead)",
+ "what": "Google Labs experimental AI agent that sends a daily morning briefing email by connecting to Gmail, Google Calendar, and Google Drive without requiring a search or prompt.",
+ "evidence": "Launched December 16, 2025 in early access for U.S. and Canada users 18+. Built on Gemini models. Users steer CC by replying to the email. Includes thumbs up/down feedback. Can draft emails and create calendar links. Priority access given to Google AI Ultra and Gemini Advanced subscribers.",
+ "status": "live",
+ "source_url": "https://blog.google/technology/google-labs/cc-ai-agent/",
+ "sub_theme": "Platform-native scheduled agents"
+ },
+ {
+ "name": "Gemini Scheduled Actions + Goal Scheduled Actions",
+ "what": "Google Gemini's built-in feature for scheduling recurring AI tasks, with a newer 'Goal' variant where the AI monitors objectives and adjusts actions over time.",
+ "evidence": "Scheduled Actions available to Google AI Pro and Ultra subscribers. Limit of 10 active scheduled actions. Goal Scheduled Actions rolled out February 2026, introducing proactive AI that reviews outputs from previous instructions and adjusts next actions. New Android UI launched March 2026 for naming, describing, and scheduling tasks.",
+ "status": "live",
+ "source_url": "https://blog.google/products-and-platforms/products/gemini/scheduled-actions-gemini-app/",
+ "sub_theme": "Platform-native scheduled agents"
+ },
+ {
+ "name": "Meta Proactive Chatbots (Project Omni)",
+ "what": "Meta's initiative to train AI chatbots to message users first on Messenger, WhatsApp, and Instagram, using conversation history for personalized re-engagement.",
+ "evidence": "Leaked documents obtained by Business Insider in July 2025. Internal project name 'Project Omni' at data labeling firm Alignerr. Chatbots only send follow-ups within 14 days after user-initiated conversation and if user sent at least 5 messages. Meta projected its generative AI products would generate $2B-$3B in 2025 revenue, up to $1.4T by 2035 (from unsealed court documents).",
+ "status": "announced",
+ "source_url": "https://techcrunch.com/2025/07/03/meta-has-found-another-way-to-keep-you-engaged-chatbots-that-message-you-first/",
+ "sub_theme": "Proactive consumer AI"
+ },
+ {
+ "name": "Notion 3.0 AI Agents",
+ "what": "Autonomous agents inside Notion that work for up to 20 minutes on multi-step tasks across hundreds of pages, including breaking projects into tasks and assigning them.",
+ "evidence": "Launched September 18, 2025. Personal Agent works autonomously for up to 20 minutes. Can build project launch plans, break them into tasks, assign them, and draft docs. Pulls context from connected tools (Slack, Google Drive, Teams) via native integrations and MCP. Notion 3.2 (January 2026) brought agents to mobile with multi-model support (GPT-5.2, Claude Opus 4.5, Gemini 3).",
+ "status": "live",
+ "source_url": "https://www.notion.com/releases/2025-09-18",
+ "sub_theme": "Productivity tool AI agents"
+ },
+ {
+ "name": "Linear Triage Intelligence / Product Intelligence",
+ "what": "Linear's AI that auto-triages incoming issues by suggesting teams, projects, assignees, labels, flagging duplicates, and linking related issues based on backlog patterns.",
+ "evidence": "Product Intelligence launched August 14, 2025 as Technology Preview on Business and Enterprise plans. Auto-apply triage suggestions shipped September 19, 2025. Uses search, ranking, and LLM-based reasoning, drawing on existing backlog as a dataset. Can be configured to auto-apply suggested properties without human approval.",
+ "status": "live",
+ "source_url": "https://linear.app/changelog/2025-08-14-product-intelligence-technology-preview",
+ "sub_theme": "Productivity tool AI agents"
+ },
+ {
+ "name": "Zapier AI / Zapier Agents / Zapier Canvas",
+ "what": "AI-powered workflow automation platform that generates workflows from natural language and converts visual process diagrams into functioning automations.",
+ "evidence": "8,000+ app integrations. Zapier Canvas converts visual diagrams into live automations. Natural language workflow creation (e.g., 'When I get a Gmail, summarize it and post to Slack'). Zapier Agents carry out specific workflows. Focus on AI orchestration, agents, copilot, human-in-the-loop, and MCP as of 2026.",
+ "status": "live",
+ "source_url": "https://zapier.com/",
+ "sub_theme": "Workflow automation with AI"
+ },
+ {
+ "name": "n8n",
+ "what": "Open-source workflow automation platform with native AI agent nodes, LangChain integration, and MCP client/server support, self-hostable.",
+ "evidence": "150K+ GitHub stars in 2025. $40M ARR as of July 2025. Raised $180M at $2.5B valuation in October 2025 (led by Accel, with NVentures/Nvidia). 3,000+ enterprise customers (Vodafone, Delivery Hero, Microsoft). 200,000 active users. Revenue grew 5X after pivoting to AI-friendly approach in 2022. 500+ integrations with dedicated AI Agent nodes, memory, evaluations, and multi-agent orchestration.",
+ "status": "live",
+ "source_url": "https://github.com/n8n-io/n8n",
+ "sub_theme": "Workflow automation with AI"
+ },
+ {
+ "name": "GitHub Copilot Coding Agent",
+ "what": "GitHub's autonomous coding agent that takes assigned issues, spins up a GitHub Actions environment, implements changes, and creates pull requests.",
+ "evidence": "Available with Copilot Pro, Pro+, Business, and Enterprise plans. Runs in secure GitHub Actions-powered environment. Automates branch creation, commits, PR opening, and description writing. Targets low-to-medium complexity tasks. All PRs require independent human review; agent cannot approve or merge its own work. As of June 4, 2025, uses one premium request per model request.",
+ "status": "live",
+ "source_url": "https://docs.github.com/en/copilot/concepts/agents/coding-agent/about-coding-agent",
+ "sub_theme": "Autonomous coding agents"
+ },
+ {
+ "name": "Claude Code Action (GitHub)",
+ "what": "Anthropic's GitHub Action that responds to @claude mentions, issue assignments, and PR comments to implement code changes and answer questions.",
+ "evidence": "Available on GitHub as of February 2026 via Agent HQ for Copilot Pro+ and Enterprise customers. Can commit code and comment on pull requests. Intelligently detects when to activate based on workflow context.",
+ "status": "live",
+ "source_url": "https://github.com/anthropics/claude-code-action",
+ "sub_theme": "Autonomous coding agents"
+ },
+ {
+ "name": "Devin (Cognition Labs)",
+ "what": "Autonomous AI software developer that completes development tasks end-to-end, from understanding codebases to writing and testing code.",
+ "evidence": "$155M+ ARR, growing from $1M in September 2024 to ~$73M by June 2025. $10.2B valuation after $400M Series C in late 2025. Acquired Windsurf (tens of millions ARR, hundreds of enterprise customers) in July 2025. Key customers include Goldman Sachs, Palantir, Cisco, Mercado Libre. Devin 2.0 (April 2025) dropped pricing from $500/month to $20/month Core plan.",
+ "status": "live",
+ "source_url": "https://cognition.ai/blog/devin-annual-performance-review-2025",
+ "sub_theme": "Autonomous coding agents"
+ },
+ {
+ "name": "Sweep AI",
+ "what": "AI junior developer that transforms GitHub issues and Jira tickets into pull requests by reading the project, planning changes, writing code, and creating PRs.",
+ "evidence": "Raised $2.8M (as of November 2023 TechCrunch report). Team was 2 employees planning to expand to 5. Reads project context, plans changes, writes code, and creates PRs from issue descriptions.",
+ "status": "live",
+ "source_url": "https://techcrunch.com/2023/11/02/sweep-aims-to-automate-basic-dev-tasks-using-large-language-models/",
+ "sub_theme": "Autonomous coding agents"
+ },
+ {
+ "name": "Dependabot",
+ "what": "GitHub's built-in automated dependency update tool that creates pull requests for outdated or vulnerable dependencies without human prompting.",
+ "evidence": "846,000+ repositories with Dependabot configured (2025 GitHub Octoverse report). 137% year-over-year adoption growth. Supports 30+ ecosystems (npm, pip, Maven, Docker, Go, Terraform, GitHub Actions, pnpm, Bun, Helm, Swift, etc.). Free for all GitHub repositories. More than 75% reduction in remediation time for code maintenance tasks.",
+ "status": "live",
+ "source_url": "https://docs.renovatebot.com/bot-comparison/",
+ "sub_theme": "Automated maintenance (agent without prompting)"
+ },
+ {
+ "name": "Renovate (Mend.io)",
+ "what": "Cross-platform automated dependency update tool supporting 90+ package managers, creating PRs for dependency updates with grouping, scheduling, and auto-merge.",
+ "evidence": "90+ package manager support (broader than Dependabot's 30+). Works across GitHub, GitLab, Bitbucket, Azure DevOps. Claims approximately 90% time savings for dependency updates (Mend ROI whitepaper). Supports grouping, scheduling, auto-merge policies.",
+ "status": "live",
+ "source_url": "https://github.com/renovatebot/renovate",
+ "sub_theme": "Automated maintenance (agent without prompting)"
+ },
+ {
+ "name": "Limitless (formerly Rewind AI)",
+ "what": "AI wearable pendant that continuously records conversations and meetings, providing AI-powered summaries and searchable personal data.",
+ "evidence": "Raised $33M+ total from Sam Altman, a16z, First Round Capital, NEA. $15M at $350M valuation in May 2023 (495x multiple on $707K ARR). Pendant sold for $99, 100-hour battery life. Free tier: unlimited audio storage + 10 hours AI features/month. Pro: $20/month unlimited AI. Acquired by Meta on December 5, 2025; hardware sales stopped, subscription fees waived for existing users.",
+ "status": "acquired",
+ "source_url": "https://techcrunch.com/2025/12/05/meta-acquires-ai-device-startup-limitless/",
+ "sub_theme": "Personal data capture and AI"
+ },
+ {
+ "name": "Spotify AI DJ",
+ "what": "AI-powered DJ that proactively selects music based on full listening history, temporal patterns, and mood inference without explicit song requests.",
+ "evidence": "Launched February 2023. DJ Requests (voice and text) added May 2025. Personalized prompt suggestions added October 2025 (e.g., 'reggaeton beats for an energetic afternoon'). Prompted Playlists beta (December 2025) uses entire listening history from day one. System processes billions of data points using collaborative filtering, content-based filtering, skip rates, playlist additions, and temporal preferences (morning/afternoon/evening, seasonal patterns).",
+ "status": "live",
+ "source_url": "https://newsroom.spotify.com/2023-02-22/spotify-debuts-a-new-ai-dj-right-in-your-pocket/",
+ "sub_theme": "Proactive consumer AI"
+ },
+ {
+ "name": "Proactive Agent (OpenReview / ICLR 2025)",
+ "what": "Research paper formalizing proactive agents that anticipate user needs and take initiative by suggesting tasks without explicit requests, with ProactiveBench dataset.",
+ "evidence": "ProactiveBench dataset contains 6,790 events. Defines proactive agent as one that perceives environmental context, infers user intentions without explicit prompts, and autonomously suggests actions. Published October 2024, updated through 2025.",
+ "status": "live",
+ "source_url": "https://arxiv.org/abs/2410.12361",
+ "sub_theme": "Academic research on proactive agents"
+ },
+ {
+ "name": "PROBE Benchmark (Beyond Reactivity)",
+ "what": "Benchmark measuring proactive problem-solving in LLM agents, requiring agents to identify and resolve critical bottlenecks hidden in realistic workplace datastores.",
+ "evidence": "1,000 diverse samples. Even state-of-the-art LLMs and specialized agentic frameworks achieve no more than 40% success on end-to-end proactive tasks. Published October 2025.",
+ "status": "live",
+ "source_url": "https://arxiv.org/abs/2510.19771",
+ "sub_theme": "Academic research on proactive agents"
+ },
+ {
+ "name": "PPP: Training Proactive and Personalized LLM Agents",
+ "what": "Multi-objective reinforcement learning approach optimizing three dimensions: productivity (task completion), proactivity (asking essential questions), and personalization (adapting to user preferences).",
+ "evidence": "Published November 2025. Identifies that existing work focuses primarily on task success but effective real-world agents require jointly optimizing productivity, proactivity, and personalization.",
+ "status": "live",
+ "source_url": "https://arxiv.org/abs/2511.02208",
+ "sub_theme": "Academic research on proactive agents"
+ },
+ {
+ "name": "ProAgentBench",
+ "what": "Benchmark for evaluating LLM agents for proactive assistance using real-world data.",
+ "evidence": "Published February 2026. Evaluates LLM agents specifically for proactive (not reactive) assistance capabilities.",
+ "status": "live",
+ "source_url": "https://arxiv.org/html/2602.04482v1",
+ "sub_theme": "Academic research on proactive agents"
+ },
+ {
+ "name": "Netflix FM-Intent / IntentRec",
+ "what": "Netflix's recommendation framework that predicts user session intent using hierarchical multi-task learning, estimating latent intent from short- and long-term implicit signals.",
+ "evidence": "Uses hierarchical multi-task neural network architecture. Estimates latent user intent from short- and long-term implicit signals as proxies. Uses intent prediction to predict next item user will engage with. Published on Netflix Tech Blog and arxiv (updated May 2025).",
+ "status": "live",
+ "source_url": "https://netflixtechblog.com/fm-intent-predicting-user-session-intent-with-hierarchical-multi-task-learning-94c75e18f4b8",
+ "sub_theme": "Intent inference and task prediction"
+ },
+ {
+ "name": "Google Intent Extraction (Small Models, Big Results)",
+ "what": "Google Research method using small multimodal LLMs to understand sequences of user interactions on web and mobile, decomposing intent extraction into two stages.",
+ "evidence": "Presented at EMNLP 2025. Separates user intent understanding into: (1) summarizing each screen separately, then (2) extracting intent from the sequence of summaries. Makes intent extraction tractable for small on-device models.",
+ "status": "live",
+ "source_url": "https://research.google/blog/small-models-big-results-achieving-superior-intent-extraction-through-decomposition/",
+ "sub_theme": "Intent inference and task prediction"
+ }
+ ],
+ "gaps": [
+ "No hard accuracy or precision numbers found for OpenClaw daily briefing quality or user satisfaction rates.",
+ "ChatGPT Pulse was live for only ~3 months before being shelved; no usage metrics, retention data, or user feedback were published.",
+ "No revenue or pricing data found for OpenClaw or NanoClaw (both are open-source, likely no direct revenue model).",
+ "Limited data on what percentage of ChatGPT users actually use Scheduled Tasks, or how many tasks are active across the user base.",
+ "No products found that specifically generate coding agent prompts from personal data (the exact intersection Vana is exploring). The closest are OpenClaw daily briefings (which summarize but don't generate actionable coding tasks) and GitHub Copilot coding agent (which acts on issues but doesn't infer tasks from personal data).",
+ "No data found on failure rates or user abandonment for proactive/scheduled AI features across any platform.",
+ "Google CC (Your Day Ahead) is too new (December 2025 launch) to have published usage or retention metrics.",
+ "Gemini Goal Scheduled Actions (February 2026) has no published adoption data yet.",
+ "The PROBE benchmark's 40% ceiling for proactive agents suggests the technical problem is far from solved, but no commercial product has published comparable accuracy metrics for their proactive features.",
+ "No products found that combine personal data from multiple life domains (code + social + media + conversations) into coding-specific task generation. Each product stays in its lane: code tools generate code tasks, productivity tools generate productivity tasks, media tools recommend media."
+ ]
+}
diff --git a/research/personal-data-agents/findings/wave3-verification.json b/research/personal-data-agents/findings/wave3-verification.json
new file mode 100644
index 00000000..349f9baa
--- /dev/null
+++ b/research/personal-data-agents/findings/wave3-verification.json
@@ -0,0 +1,70 @@
+{
+ "agent_question": "Verification of key numerical claims from wave 1 and wave 2",
+ "findings": [
+ {
+ "name": "Cursor $29.3B valuation",
+ "what": "Verified: Cursor raised $2.3B Series D at $29.3B valuation (Nov 2025). Now in talks for $50B round.",
+ "evidence": "Confirmed by CNBC, BusinessWire, TechCrunch. ARR crossed $2B as of March 2026. $1B ARR in 24 months. $1.2B ARR in 2025 per Sacra.",
+ "status": "live",
+ "source_url": "https://www.cnbc.com/2025/11/13/cursor-ai-startup-funding-round-valuation.html",
+ "sub_theme": "Verification"
+ },
+ {
+ "name": "Claude Code $2.5B ARR",
+ "what": "Verified: Claude Code run-rate revenue above $2.5B, accounts for over half of Anthropic enterprise spending",
+ "evidence": "Confirmed by multiple sources. Anthropic at $14B total ARR. Claude Code reached $1B ARR within 6 months of May 2025 launch. $30B Series G at $380B valuation (Feb 2026).",
+ "status": "live",
+ "source_url": "https://www.saastr.com/anthropic-just-hit-14-billion-in-arr-up-from-1-billion-just-14-months-ago/",
+ "sub_theme": "Verification"
+ },
+ {
+ "name": "Humane AI Pin shutdown",
+ "what": "Verified: HP acquired Humane assets for $116M. AI Pins stopped functioning Feb 28, 2025.",
+ "evidence": "Confirmed by Fortune, TechCrunch, Axios. $230M raised. Device scathingly reviewed. Customers given <10 days notice before brick.",
+ "status": "shut down",
+ "source_url": "https://techcrunch.com/2025/02/18/humanes-ai-pin-is-dead-as-hp-buys-startups-assets-for-116m/",
+ "sub_theme": "Verification"
+ },
+ {
+ "name": "METR task horizon doubling (7 months)",
+ "what": "Verified: Task completion time horizons doubling approximately every 7 months",
+ "evidence": "Confirmed by METR's own publications. Time Horizon 1.1 released Jan 2026 with expanded task suite.",
+ "status": "live",
+ "source_url": "https://metr.org/time-horizons/",
+ "sub_theme": "Verification"
+ },
+ {
+ "name": "Anthropic agent autonomy metrics",
+ "what": "Verified: 99.9th percentile turn duration doubled (25 min to 45 min). Interventions decreased from 5.4 to 3.3.",
+ "evidence": "From Anthropic's own research publication. Auto-approve rates climb from 20% to 40%+.",
+ "status": "live",
+ "source_url": "https://www.anthropic.com/research/measuring-agent-autonomy",
+ "sub_theme": "Verification"
+ },
+ {
+ "name": "Cognition/Devin $10.2B valuation",
+ "what": "Verified: $400M raise at $10.2B valuation (Sept 2025). ARR $73M growing to ~$155M with Windsurf.",
+ "evidence": "Confirmed by CNBC, TechCrunch. Net burn under $20M total. Triple-digit YoY growth.",
+ "status": "live",
+ "source_url": "https://techcrunch.com/2025/09/08/cognition-ai-defies-turbulence-with-a-400m-raise-at-10-2b-valuation/",
+ "sub_theme": "Verification"
+ },
+ {
+ "name": "Mem0 $24M raise",
+ "what": "Verified: $24M total (Seed + Series A). 186M API calls Q3 2025.",
+ "evidence": "Confirmed by TechCrunch, company blog. Backed by YC, Peak XV, GitHub Fund.",
+ "status": "live",
+ "source_url": "https://techcrunch.com/2025/10/28/mem0-raises-24m-from-yc-peak-xv-and-basis-set-to-build-the-memory-layer-for-ai-apps/",
+ "sub_theme": "Verification"
+ },
+ {
+ "name": "Limitless AI / Meta acquisition",
+ "what": "Verified: Acquired by Meta Dec 2025. $33M+ total raised. Desktop app sunset Dec 19, 2025.",
+ "evidence": "Confirmed by TechCrunch. $350M valuation on $707K ARR (2023). Pendant discontinued.",
+ "status": "acquired",
+ "source_url": "https://techcrunch.com/2025/12/05/meta-acquires-ai-device-startup-limitless/",
+ "sub_theme": "Verification"
+ }
+ ],
+ "gaps": []
+}
diff --git a/research/personal-data-agents/personal-data-as-agent-context-landscape.csv b/research/personal-data-agents/personal-data-as-agent-context-landscape.csv
new file mode 100644
index 00000000..58095933
--- /dev/null
+++ b/research/personal-data-agents/personal-data-as-agent-context-landscape.csv
@@ -0,0 +1,41 @@
+Section,Company/Product,Category,Key Metrics,Status,How It Works,Relevance to Personal Data as Agent Context,Source
+1. Autonomous Coding Agents,Cursor (Anysphere),AI code editor,"$29.3B valuation; $2B+ ARR (Mar 2026); 1M+ DAU; 360K paying customers",Live,"Indexes entire codebase with custom embedding model; .cursorrules file provides standing AI instructions; Cloud Agents run autonomously on Cursor infrastructure; RL-based suggestion filtering retrains multiple times per day on user accept/reject signals","Cursor learns user preferences via implicit signals (accept/reject); .cursorrules is a form of personal context injection; no cross-platform personal data ingestion",https://www.cnbc.com/2025/11/13/cursor-ai-startup-funding-round-valuation.html
+1. Autonomous Coding Agents,Claude Code (Anthropic),CLI coding agent,"$2.5B ARR; $19B total Anthropic ARR; 99.9th percentile turn duration 45+ min",Live,"CLI agent with 6-layer memory system (CLAUDE.md files); hooks for lifecycle control; sub-agents with isolated context windows; background tasks via Ctrl+B; Agent SDK for external automation","CLAUDE.md files are static personal context; memory system captures session history but not cross-platform personal data; hooks enable deterministic context injection at session start",https://www.anthropic.com/research/measuring-agent-autonomy
+1. Autonomous Coding Agents,Devin (Cognition Labs),Autonomous coding agent,"$10.2B valuation; $155M+ combined ARR; PR merge rate 67%",Live,"Handles full project lifecycle in secure sandbox; multi-agent dispatch; Devin Wiki and Search for codebase understanding; acquired Windsurf for IDE-based coding","Operates on codebase context only; no personal data ingestion; 33% PR rejection rate suggests context gaps that personal data could partially address",https://techcrunch.com/2025/09/08/cognition-ai-defies-turbulence-with-a-400m-raise-at-10-2b-valuation/
+1. Autonomous Coding Agents,GitHub Copilot Coding Agent,Autonomous PR agent,"1.2M PRs/month; 4.7M paid subscribers; 55% faster task completion",Live,"Creates PRs from issues in GitHub Actions sandbox; agent mode expanding to JetBrains; Eclipse; Xcode; all PRs require independent human review; agent cannot approve or merge its own work","Learns from repository context; no personal data beyond codebase; 30% suggestion acceptance rate indicates large gap between agent output and developer intent",https://github.com/newsroom/press-releases/agent-mode
+1. Autonomous Coding Agents,OpenAI Codex,Cloud-sandboxed coding agent,"GPT-5.3-Codex: 56.8% SWE-Bench Pro; 25-hour autonomous run demonstrated",Live,"Two modes: cloud sandbox (parallel background tasks; no internet) and terminal CLI (local execution); auto-detects setup scripts; configurable internet access and approval controls","Cloud sandbox isolation prevents personal data access during execution; operates purely on repository context; 25-hour autonomous runs demonstrate long-horizon capability without personal context",https://openai.com/index/introducing-codex/
+1. Autonomous Coding Agents,Cline,Open-source VS Code agent,"5M+ developers; $32M Series A; Plan/Act modes",Live,"Open-source autonomous coding agent with Plan/Act modes; MCP integration; requires external API keys; Samsung beta-testing for device development","MCP integration enables connecting to personal data sources; open-source architecture allows custom context injection; no built-in personal data ingestion",https://cline.bot/
+1. Autonomous Coding Agents,Aider,Open-source CLI agent,"39K+ GitHub stars; 4.1M+ installations; 49.2% SWE-bench Verified",Live,"Repository map creates condensed codebase overview; automatically pulls context from related files; every AI change gets its own git commit; connects to 100+ models","Context management limited to repository files; no personal data integration; repository map is a form of automated context provision that reduces manual context-setting interventions",https://aider.chat/
+2. Agent Memory Infrastructure,Mem0,Memory API for AI agents,"$24M raised; 186M API calls Q3 2025; 41K GitHub stars; 13M PyPI downloads",Live,"Persistent structured memory via API; graph-based memory with sub-second retrieval; claims 26% higher response accuracy vs OpenAI memory; 91% lower p95 latency; exclusive memory provider for AWS Agent SDK","Captures session-derived memories; not personal data from external platforms; memory is agent-generated rather than user-imported; closest infrastructure to what cross-platform personal context would require",https://techcrunch.com/2025/10/28/mem0-raises-24m-from-yc-peak-xv-and-basis-set-to-build-the-memory-layer-for-ai-apps/
+2. Agent Memory Infrastructure,Zep,Context engineering platform,"$500K seed; YC W24",Live,"Combines chat history; business data; and user behavior; tracks how facts change over time (temporal reasoning); graph-based memory with relationship modeling","Temporal reasoning on evolving facts aligns with personal data that changes over time; limited traction data makes adoption assessment difficult",https://www.getzep.com/
+2. Agent Memory Infrastructure,LangChain / LangSmith,Agent framework + observability,"$1.25B valuation; $125M Series B; 47M+ PyPI downloads",Live,"Most adopted AI agent framework with commercial observability platform; LangSmith trace volume 12x year-over-year; provides checkpointing and memory abstractions for agent state","Framework-level memory support; enables custom personal data integration but provides no personal data itself; infrastructure layer that a personal data context system would plug into",https://fortune.com/2025/10/20/exclusive-early-ai-darling-langchain-is-now-a-unicorn-with-a-fresh-125-million-in-funding/
+2. Agent Memory Infrastructure,Supermemory,AI memory extraction,"$2.6M seed (Oct 2025); backed by Google executives",Live,"Extracts structured memories from unstructured data for application context","Early-stage company attempting to bridge unstructured personal data and structured agent memory; directly relevant to the personal-data-as-context hypothesis",https://techcrunch.com/2025/10/06/a-19-year-old-nabs-backing-from-google-execs-for-his-ai-memory-startup-supermemory/
+2. Agent Memory Infrastructure,Memories.ai,Visual memory model,"$8M seed; founded by ex-Meta Reality Labs researchers",Live,"Building a Large Visual Memory Model (LVMM) for long-term visual memory; Samsung Next participation","Visual personal data (photos; screenshots) as agent context; hardware company backing suggests wearable integration path",https://wellows.com/blog/ai-startups/
+3. Personal Data + AI Products,ChatGPT Memory + Connectors,LLM with personal data,"800M+ weekly active users; $13B ARR; 5M paying business users",Live,"Persistent memory across conversations; connectors to Gmail; Google Drive; GitHub; Outlook; Teams; MCP-based partner connectors (Stripe; Amplitude; Monday.com); full chat history reference since April 2025","Largest-scale personal data AI integration; single-platform (OpenAI ecosystem); connectors are read-only for most sources; actions beyond reading require user confirmation",https://openai.com/index/memory-and-new-controls-for-chatgpt/
+3. Personal Data + AI Products,Google Personal Intelligence,LLM with personal data,"Rolling out to free Gemini users in US (Mar 2026)",Live,"Connects Gemini to Gmail; Photos; YouTube history; Search history; enables cross-referencing private emails with real-time market data; built on Gemini 3 Pro and Flash","Largest-scale cross-Google-service personal data integration; limited to Google platforms; demonstrates feasibility of multi-source personal context for AI",https://blog.google/innovation-and-ai/products/gemini-app/personal-intelligence/
+3. Personal Data + AI Products,Apple Intelligence,On-device personal AI,"~3B parameter model; 2.2B+ active devices; repeatedly delayed",Announced,"On-device model on Apple Silicon; intended to give Siri access to emails; messages; photos; personal context features delayed from 2025 to spring 2026 and possibly beyond","Even Apple with 2.2B devices and full OS-level data access has not shipped reliable personal context AI; Craig Federighi: 'no one is doing it really well right now'",https://www.cnbc.com/2025/03/07/apple-delays-siri-ai-improvements-to-2026.html
+3. Personal Data + AI Products,Samsung Personal Data Engine,On-device knowledge graph,"Galaxy S25+ series; powered by RDFox; Knox Vault encryption",Live,"On-device knowledge graph using RDFox; drives Now Brief daily summary and Smart Gallery search; secured by Knox Vault and KEEP encryption; hardware-based Privacy Display on S26 Ultra","On-device personal data processing for AI context; limited to Samsung ecosystem data; demonstrates knowledge graph approach to personal data for AI",https://www.computerweekly.com/news/366618319/Samsung-unpacks-Galaxy-AIs-personal-data-engine
+3. Personal Data + AI Products,Glean,Enterprise knowledge AI,"$7.2B valuation; $100M+ ARR in <3 years",Live,"Builds personalized knowledge graph per employee from 100+ workplace data sources; personal graph tracks projects; collaborators; work style; agents summarize weekly work or prepare reviews","Enterprise-scoped personal context; per-employee knowledge graph is analogous to cross-platform personal data but limited to workplace sources",https://www.glean.com/press/glean-achieves-100m-arr-in-three-years-delivering-true-ai-roi-to-the-enterprise
+3. Personal Data + AI Products,Otter.ai,Meeting AI with personal context,"$100M ARR (Mar 2025); ~$70M total raised",Live,"Voice AI meeting agent participates as voice-activated attendee; answers questions using real-time transcript + company meeting history; vocabulary learning and preference personalization","Meeting context as personal data for AI; limited to audio/meeting domain; demonstrates personalization from accumulated user-specific data",https://otter.ai/blog/otter-ai-caps-transformational-2025-with-100m-arr-milestone-industry-first-ai-meeting-agents-and-global-enterprise-expansion
+3. Personal Data + AI Products,Screenpipe,Open-source screen capture + AI,"17.2K GitHub stars; $3.5K MRR; 202 subscriptions",Live,"Continuously captures screen and audio locally; local Whisper for speech-to-text; AI-powered search and recall; works on macOS; Windows; Linux","Continuous personal data capture for AI context; local-first architecture; has not reached product-market fit per founders; major companies testing",https://screenpi.pe/
+4. Data Portability Infrastructure,Data Transfer Initiative (DTI),Nonprofit data portability,"Apple; Google; Meta as founding partners",Live,"Independent nonprofit spun from Google Data Transfer Project in 2022; builds open-source data portability tools; Apple-Google photo/video transfer; EU DMA compliance","Direct infrastructure for moving personal data between platforms; currently focused on platform-to-platform transfer rather than platform-to-AI",https://dtinit.org/blog/2024/07/10/DTI-members-new-photo-video-tool
+4. Data Portability Infrastructure,Vana Protocol,L1 blockchain for personal data,"Mainnet Dec 2024; 12M+ data points onboarded; VANA trading $8-15",Live,"EVM-compatible Layer 1 for personal data sovereignty; DataDAOs for collective data management; VRC-20 token standard for data-backed digital assets","Directly enabling user-owned personal data portability; DataDAO structure enables consent-managed data access for AI agents",https://www.vana.org/
+4. Data Portability Infrastructure,Solid Project (Inrupt),Personal data pods,"Tim Berners-Lee; launched 2016; Inrupt 2018",Live,"Standards-based personal data pods with time-bound access grants; apps and AI agents request specific data access; user controls all permissions","Architecturally aligned with personal data as agent context; time-bound grants match agent session model; developer community remains small",https://en.wikipedia.org/wiki/Solid_(web_decentralization_project)
+4. Data Portability Infrastructure,MCP (Model Context Protocol),Agent-to-data protocol,"10K+ servers; 97M+ monthly SDK downloads; Linux Foundation governance",Live,"Open protocol for connecting AI models to external data sources and tools; adopted by ChatGPT; Cursor; Gemini; Microsoft Copilot; VS Code; 2026 roadmap includes triggers/events and security hardening","Primary protocol through which personal data could reach coding agents; 5800+ servers but no data on how many handle personal vs. enterprise data",https://www.anthropic.com/news/donating-the-model-context-protocol-and-establishing-of-the-agentic-ai-foundation
+4. Data Portability Infrastructure,Plaid,Financial data aggregation,"$800M+ estimated annual revenue (2025); 220+ new products in 2025",Live,"Financial data connectivity across 12K+ institutions; AI-powered auto-repair enabled 2M+ successful logins; LendScore credit risk from real-time cash flow","Demonstrates the economics of personal data connectivity at scale; financial domain only; AI-powered enrichment on aggregated personal data is a proven model",https://sacra.com/c/plaid/
+4. Data Portability Infrastructure,Skyflow,Privacy vault infrastructure,"$100M total raised; ~1B records; 2B+ API calls/quarter",Live,"Tokenized data vaults; LLM Privacy Vault detects sensitive data and replaces with deterministic tokens before sending to AI models","Privacy infrastructure layer between personal data and AI models; addresses the trust gap by making data access auditable and tokenized",https://www.skyflow.com/post/generative-ai-data-privacy-skyflow-llm-privacy-vault
+5. Failed Products,Humane AI Pin,AI wearable,"$230M raised; ~10K units sold; HP acquired assets for $116M",Shut down,"Wearable with camera; projector; LLM access; intended to replace smartphones; returns outpaced sales; charging case recalled for battery fires; all devices bricked Feb 28 2025","Hardware form factor failed; personal context features were dependent on cloud services that were shut down; demonstrates risk of proprietary personal data storage",https://techcrunch.com/2025/02/18/humanes-ai-pin-is-dead-as-hp-buys-startups-assets-for-116m/
+5. Failed Products,Rabbit R1,AI hardware device,"$64.7M raised; 130K units sold; 5K daily active users (5% retention)",Live (minimal),"Large Action Model intended to replace app-based interactions; supported only 6 apps; entire interface discovered to be a single Android app; CEO admitted premature launch","95% user abandonment; limited app integrations prevented meaningful personal data access; reliability failures in basic tasks preceded any personal context features",https://9to5google.com/2024/09/26/rabbit-5000-people-use-the-r1-daily/
+5. Failed Products,Microsoft Cortana,Voice assistant,"Peak 145M MAU; <2% voice assistant market share; deprecated Aug 2023",Shut down,"Integrated into Windows; competed with Siri/Alexa/Google Assistant; only 10% of Windows 10 users regularly engaged by 2023; replaced by Copilot","Had deep OS-level personal data access (calendar; email; files); personal context features did not drive sustained engagement; replaced by task-focused Copilot",https://techcrunch.com/2023/08/04/microsoft-kills-cortana-in-windows-as-it-focuses-on-next-gen-ai/
+5. Failed Products,Facebook M,Human-assisted AI assistant,"70%+ requests required human operators; ~10K test users; shut down Jan 2018",Shut down,"AI assistant inside Messenger for arbitrary tasks (restaurant reservations; shopping); launched Aug 2015 to ~2K users; human operators handled majority of requests","HITL-dependent architecture could not scale economically; personal context (Messenger history) was insufficient to automate arbitrary tasks without human fallback",https://techcrunch.com/2018/01/08/facebook-is-shutting-down-its-standalone-personal-assistant-m/
+5. Failed Products,Inflection AI (Pi),Personal AI chatbot,"$1.525B raised; 1M DAU; 6M MAU; acqui-hired by Microsoft for $650M",Acquired,"Empathetic conversational AI focused on personal connection rather than task completion; strong engagement but no sustainable business model","Strong user engagement with personal AI did not translate to revenue; personal context (conversation history) created engagement but not economic value",https://www.eesel.ai/blog/inflection-ai
+5. Failed Products,Limitless (Rewind AI),Screen/audio capture AI,"$33M+ raised; $350M valuation on $707K ARR; acquired by Meta Dec 2025",Acquired,"Desktop screen recording evolved to wearable pendant ($99); continuous personal data capture; acquired by Meta for Reality Labs (Ray-Ban smart glasses)","Most ambitious personal data capture product; 495x revenue multiple reflected investor belief in personal data value; could not find PMF independently; acquired as feature not product",https://techcrunch.com/2025/12/05/meta-acquires-ai-device-startup-limitless/
+5. Failed Products,ChatGPT Pulse,Proactive daily briefing,"Launched Sep 2025; shelved Dec 2025 after ~3 months",Shut down,"Proactive daily briefing using user memory and chat history; delivered personalized morning updates without prompting; shelved when CEO issued 'Code Red' to refocus on core ChatGPT","Directly tested personal-data-driven proactive AI; shelved after 3 months; no usage metrics published; deprioritized in favor of core product improvements",https://techcrunch.com/2025/09/25/openai-launches-chatgpt-pulse-to-proactively-write-you-morning-briefs/
+5. Failed Products,Builder.ai (Natasha),AI app builder,"$445M raised; $1.2B valuation; bankrupt May 2025",Shut down,"Claimed AI built apps; investigation revealed ~700 human engineers doing the work; claimed $220M revenue; audit found ~$50M","Extreme case of HITL masquerading as autonomous AI; demonstrates that personal context alone does not enable autonomous software creation",https://techstartups.com/2025/05/24/builder-ai-a-microsoft-backed-ai-startup-once-valued-at-1-2-billion-files-for-bankruptcy-is-ai-becoming-another-com-bubble/
+6. Proactive/Scheduled Agent Products,OpenClaw,Open-source proactive agent,"264K GitHub stars (Mar 2026); surpassed React as most-starred software project",Live,"Self-hosted AI agent with cron-scheduled daily briefings; pulls from Gmail; Calendar; GitHub; RSS; Todoist; Linear; Stripe; silently skips missing integrations; three schedule types: at; every; cron","Closest existing product to personal-data-driven proactive agents; aggregates cross-platform data for daily briefings; does not generate coding agent tasks from personal data",https://github.com/openclaw/openclaw
+6. Proactive/Scheduled Agent Products,NanoClaw,Lightweight proactive agent,"10.5K GitHub stars; ~3900 LOC across 15 files",Live,"Container-isolated alternative to OpenClaw built on Anthropic Agents SDK; connects to WhatsApp; Telegram; Slack; Discord; Gmail; scheduled jobs and daily briefings","Security-focused (Linux container isolation); auditable codebase; same daily-briefing pattern as OpenClaw in smaller package",https://github.com/qwibitai/nanoclaw
+6. Proactive/Scheduled Agent Products,ChatGPT Scheduled Tasks,Platform scheduled agent,"Limit of 10 active tasks; available to Plus/Pro/Teams",Live,"Recurring prompts at predetermined times (daily; weekly; monthly); delivers results via push notification or email; runs independently of user being online","Platform-native scheduled execution with access to ChatGPT memory and connectors; limited to 10 concurrent tasks; still in beta as of 2026",https://help.openai.com/en/articles/10291617-scheduled-tasks-in-chatgpt
+6. Proactive/Scheduled Agent Products,Gemini Scheduled Actions,Platform scheduled agent,"Available to AI Pro and Ultra subscribers; limit of 10 active",Live,"Scheduled recurring AI tasks; Goal Scheduled Actions (Feb 2026) add proactive monitoring where AI reviews previous outputs and adjusts next actions","Goal variant introduces feedback loops where the agent adapts based on prior results; closest platform feature to autonomous goal-directed personal agents",https://blog.google/products-and-platforms/products/gemini/scheduled-actions-gemini-app/
+6. Proactive/Scheduled Agent Products,Google CC (Your Day Ahead),Daily briefing agent,"Launched Dec 2025; US and Canada; early access",Live,"AI agent sends daily morning briefing email by connecting to Gmail; Google Calendar; Google Drive without requiring a prompt; users steer by replying to the email","Google-ecosystem daily briefing; limited to Google services; demonstrates platform interest in proactive personal data agents",https://blog.google/technology/google-labs/cc-ai-agent/
+6. Proactive/Scheduled Agent Products,Dependabot,Automated dependency updates,"846K+ repos configured; 137% YoY adoption growth",Live,"Creates PRs for outdated or vulnerable dependencies without human prompting; supports 30+ ecosystems; free for all GitHub repos; 75% reduction in remediation time","Proactive agent pattern in production at scale; no personal data required; demonstrates that scheduled autonomous code changes are accepted when scoped narrowly",https://docs.renovatebot.com/bot-comparison/
+6. Proactive/Scheduled Agent Products,Renovate (Mend.io),Automated dependency updates,"90+ package manager support; cross-platform",Live,"Cross-platform dependency update tool creating PRs with grouping; scheduling; and auto-merge policies; claims ~90% time savings for dependency updates","Broader platform support than Dependabot; demonstrates that proactive code maintenance agents succeed when task scope is well-defined and consequences are bounded",https://github.com/renovatebot/renovate
diff --git a/research/personal-data-agents/personal-data-as-agent-context-landscape.md b/research/personal-data-agents/personal-data-as-agent-context-landscape.md
new file mode 100644
index 00000000..15d46c53
--- /dev/null
+++ b/research/personal-data-agents/personal-data-as-agent-context-landscape.md
@@ -0,0 +1,505 @@
+# Personal Data as Agent Context: Landscape Report
+
+**Date:** March 17, 2026
+**Scope:** Whether personal data from connected platforms can reduce human-in-the-loop requirements for autonomous coding agents.
+**Supporting data:** [Landscape CSV](./personal-data-as-agent-context-landscape.csv)
+
+---
+
+## Part 1: Terminology
+
+### Autonomy and Oversight
+
+| Term | Definition |
+| ------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------- |
+| Task horizon | The maximum task duration at which a model completes tasks with 50% reliability. Tracked by METR since 2020. |
+| Human-in-the-loop (HITL) | A workflow where a human reviews and approves before each agent action. |
+| Human-on-the-loop (HOTL) | A workflow where agents run autonomously by default and humans intervene only on anomalies or flagged exceptions. |
+| Auto-approve | A setting where the user permits the agent to execute tool calls without per-action confirmation. |
+| Turn duration | The time an agent works continuously between user interactions. Distinct from session length. |
+| Session length | Total wall-clock time of a user-agent interaction, including both agent turns and human input. |
+| Compound error rate | The multiplicative effect of per-step error rates across multi-step workflows. At 85% per-action accuracy, a 10-step workflow succeeds about 20% of the time. |
+| SWE-bench Verified | Benchmark of 500 Python GitHub issues for measuring autonomous code generation. Data contamination concerns led OpenAI to stop reporting it. |
+| SWE-bench Pro | Harder, uncontaminated version of SWE-bench. Top scores as of March 2026: 56.8% (GPT-5.3-Codex). |
+
+### Memory and Context
+
+| Term | Definition |
+| ---------------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
+| Agent memory | Persistent state that survives across context windows or sessions. Implementations range from flat files to graph databases. |
+| Context window | The maximum number of tokens a model can process in a single inference call. Claude Opus 4.6: 1M tokens. |
+| MCP (Model Context Protocol) | Open protocol (Anthropic, now Linux Foundation) for connecting AI models to external data sources and tools. 10K+ servers. |
+| RAG | Retrieval-Augmented Generation. Fetching relevant documents at inference time rather than fine-tuning on them. |
+| Personal data | Data generated by or about a specific user across platforms: code history, messages, preferences, schedules, listening history. |
+| Cross-platform context | Personal data aggregated from multiple services (e.g., GitHub + Slack + Calendar + Spotify) into a unified view. |
+| Context collapse | When data from different spheres of a user's life (work, family, hobbies) bleeds together inappropriately. |
+| Cold start | The period before sufficient user data exists to provide meaningful personalization; estimated at 20-30 preference dimensions per task. |
+
+### Product Patterns
+
+| Term | Definition |
+| ------------------- | ------------------------------------------------------------------------------------------------------------------------- |
+| Daily briefing | A proactive agent that synthesizes data from connected sources and delivers a summary at a scheduled time without prompt. |
+| Scheduled task | A recurring prompt that executes at a predetermined time (daily, weekly, monthly) and delivers results asynchronously. |
+| Proactive agent | An agent that anticipates user needs and takes initiative without explicit requests. |
+| Goal-directed agent | An agent that monitors an objective over time and adjusts its actions based on prior results (Gemini Goal Actions). |
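
The scheduled-task pattern above reduces to computing the next fire time and delivering results asynchronously. A minimal stdlib sketch (the 07:00 daily briefing time is an arbitrary example, not any vendor's default):

```python
from datetime import datetime, time, timedelta

def next_run(now: datetime, at: time = time(7, 0)) -> datetime:
    """Next fire time for a daily scheduled task (e.g. a morning briefing)."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        candidate += timedelta(days=1)  # today's slot already passed; run tomorrow
    return candidate

now = datetime(2026, 3, 17, 9, 30)
print(next_run(now))  # -> 2026-03-18 07:00:00
```

A production scheduler (cron, or the `at`/`every`/`cron` types OpenClaw exposes) handles the same computation plus persistence and retries.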
+
+### Maturity Summary
+
+| Category | Maturity | Key Signal |
+| ------------------------------------ | ---------- | ------------------------------------------------------------------------------- |
+| Autonomous coding agents | Growth | $2B+ ARR (Cursor); 14.5-hour task horizons; 67% PR merge rate (Devin) |
+| Agent memory infrastructure | Early | $24M largest raise (Mem0); session memory only; no cross-platform personal data |
+| Personal data + AI (single-platform) | Deployed | 800M+ WAU (ChatGPT Memory); Google Personal Intelligence rolling out |
+| Personal data + AI (cross-platform) | Whitespace | No product combines data from multiple user platforms for agent context |
+| Data portability infrastructure | Standards | GDPR Art. 20; DMA; DTI; MCP 10K+ servers; Solid not ready for adoption |
+| Proactive/scheduled agents | Emerging | 264K stars (OpenClaw); platform features from Google, OpenAI, Gemini |
+| Privacy infrastructure for AI | Early | Skyflow $100M raised; Usercentrics acquired MCP Manager |
+| User trust in personal data + AI | Low | 40% would never share personal info; 82% believe undisclosed training |
+
+---
+
+## Part 2: The Autonomy Baseline
+
+### Task horizons are doubling every 7 months
+
+METR has tracked AI task completion time horizons since 2020. The length of task that frontier models complete with 50% reliability has been doubling approximately every 7 months for six consecutive years ([METR](https://metr.org/time-horizons/)).
+
+Claude Opus 4.6 crossed a full work-day task horizon at approximately 14.5 hours, up from 4 hours 49 minutes for Opus 4.5 ([Beam Dev](https://getbeam.dev/blog/anthropic-agentic-coding-trends-2026.html)). OpenAI's GPT-5.2-Codex runs 24-hour autonomous tasks. A demonstrated Codex run built a design tool from scratch over 25 hours, consuming approximately 13 million tokens and generating approximately 30,000 lines of code ([OpenAI](https://openai.com/index/introducing-gpt-5-3-codex/)).
+
+At the current doubling rate, multi-day autonomous task horizons arrive within 18 months. Multi-week horizons arrive within 3 years.
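
Those projections follow from simple exponential growth; a quick sketch using the figures above (14.5-hour baseline, 7-month doubling period):

```python
# Extrapolate the 50%-reliability task horizon under a fixed doubling period.
# Baseline and doubling period are the figures cited above.
BASELINE_HOURS = 14.5   # Claude Opus 4.6 task horizon
DOUBLING_MONTHS = 7     # METR's observed doubling period

def horizon_after(months: float) -> float:
    """Projected task horizon, in hours, `months` from the baseline."""
    return BASELINE_HOURS * 2 ** (months / DOUBLING_MONTHS)

print(f"{horizon_after(18) / 24:.1f} days")        # -> 3.6 days
print(f"{horizon_after(36) / (24 * 7):.1f} weeks")  # -> 3.0 weeks
```

The extrapolation assumes the trend holds, which six years of METR data support but do not guarantee.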
+
+### Benchmark scores cluster at the top, but the ceiling matters
+
+Top SWE-bench Verified scores cluster between 77% and 81%: Claude Opus 4.5 at 80.9%, Opus 4.6 at 81.42% (with prompt modification), Sonar Foundation Agent at 79.2% ([SWE-bench](https://www.swebench.com/verified.html)). The tightest top-tier race in the benchmark's history.
+
+SWE-bench Pro, considered more reliable due to data contamination issues in Verified, tells a different story. GPT-5.3-Codex achieves 56.8%, meaning the best model still fails nearly half of professional-level tasks ([OpenAI](https://openai.com/index/introducing-gpt-5-3-codex/)). Claude Opus 4.5 scores 45.9% on Pro, roughly half its Verified score ([SWE-bench](https://www.swebench.com/)).
+
+### The compound error problem remains
+
+At 85% per-action accuracy, a 10-step workflow succeeds approximately 20% of the time (0.85^10). This compound failure rate is the primary barrier to longer autonomous operation ([METR](https://metr.org/time-horizons/)).
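
The arithmetic: every step must succeed, so per-step accuracy multiplies. A minimal check, plus the per-step accuracy a 10-step workflow would need for 90% end-to-end reliability:

```python
def workflow_success(per_step_accuracy: float, steps: int) -> float:
    """End-to-end success when every step must succeed independently."""
    return per_step_accuracy ** steps

print(f"{workflow_success(0.85, 10):.1%}")  # -> 19.7%

# Per-step accuracy required for 90% end-to-end success across 10 steps
required = 0.90 ** (1 / 10)
print(f"{required:.1%}")                    # -> 99.0%
```

The second number is why marginal per-step improvements matter so much for long-horizon autonomy: closing the last percentage point of per-step accuracy is what unlocks multi-step reliability.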
+
+Devin's PR merge rate doubled from 34% to 67% year-over-year, but 33% of PRs are still rejected by human reviewers. Devin is described as "senior-level at codebase understanding but junior at execution" ([Cognition](https://cognition.ai/blog/devin-annual-performance-review-2025)). GitHub Copilot's suggestion acceptance rate averages 30%, though Java developers accept at 61% and the rate rises from 28.9% in the first 3 months to 32.1% in the next 3 months. 88% of accepted code is retained, suggesting that when the suggestion is right, developers keep it ([GitHub](https://docs.github.com/en/copilot/concepts/copilot-usage-metrics/copilot-metrics)).
+
+Independent forensic analysis of real-world agent PRs found that merged PRs correlate with small, localized changes; failed PRs are invasive and sprawling. 46% of developers actively distrust AI code accuracy. Only 3% highly trust it. 66% say their top frustration is "almost right, but not quite" ([Medium](https://medium.com/@vivek.babu/where-autonomous-coding-agents-fail-a-forensic-audit-of-real-world-prs-59d66e33efe9)).
+
+Devin's task-specific performance varies dramatically: security vulnerability fixes take 1.5 minutes vs 30 minutes for humans (20x speedup), and file migrations take 3-4 hours vs 30-40 hours (10x). But complex real-world tasks succeed approximately 15% of the time without human assistance ([Trickle](https://trickle.so/blog/devin-ai-review)). The gap between routine and complex task completion rates explains why human oversight remains critical.
+
+### The METR productivity study complicates the narrative
+
+A rigorous RCT by METR found that AI tools made 16 experienced open-source developers (averaging 5 years on their projects, repositories averaging 22K+ stars) 19% slower across 246 real issues. Developers expected AI to speed them up by 24% and, even after the study, still believed it had sped them up by 20%. Less than 44% of AI-generated code was accepted ([METR](https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/)). The study used early-2025 tools (Claude 3.5/3.7 Sonnet); METR announced in February 2026 that it is redesigning the experiment with current models.
+
+### Autonomy metrics from production
+
+Anthropic analyzed millions of Claude Code interactions between October 2025 and January 2026 ([Anthropic](https://www.anthropic.com/research/measuring-agent-autonomy)):
+
+- 99.9th percentile turn duration nearly doubled from under 25 minutes to over 45 minutes
+- Average human interventions per session decreased from 5.4 to 3.3
+- Users with fewer than 50 sessions use full auto-approve approximately 20% of the time; by 750 sessions this rises to over 40%
+- Experienced users interrupt Claude more often, not less, reflecting a shift from pre-approval to autonomous-with-intervention oversight
+- Only 0.8% of actions are irreversible
+- 73% of actions appear to have a human in the loop
+- On complex tasks, Claude self-stops for clarification more than 2x as often as humans interrupt it
+
+Average session length increased from 4 minutes in the autocomplete era to 23 minutes in the agentic era. 78% of Claude Code sessions in Q1 2026 involve multi-file edits, up from 34% in Q1 2025 ([Anthropic](https://resources.anthropic.com/2026-agentic-coding-trends-report)).
+
+Atlassian's HULA framework, deployed to 2,600 practitioners across 22,000+ eligible issues, measured human intervention at each stage of its coding pipeline: plan approval rate 82%, code generation for 87% of approved plans, 25% reached PR stage, 59% of generated PRs merged ([Atlassian/arXiv](https://arxiv.org/abs/2411.12924)).
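
Composing those stage rates (a simplification; Atlassian reports each stage separately) gives the end-to-end funnel:

```python
# HULA funnel over the ~22,000 eligible issues, composing the reported rates.
eligible = 22_000
reached_pr = eligible * 0.25  # 25% of eligible issues reached PR stage
merged = reached_pr * 0.59    # 59% of generated PRs merged

print(f"{merged / eligible:.1%} of eligible issues end in a merged PR")  # -> 14.7%
```

Roughly one in seven eligible issues ends in a merged PR, with a human checkpoint at every stage.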
+
+---
+
+## Part 3: What Human-in-the-Loop Provides
+
+### Taxonomy of human interventions
+
+Human interventions in agent workflows fall into four categories. No published study has measured the relative frequency of each category, though qualitative evidence points to context provision as the dominant one.
+
+**1. Intent clarification.** The human specifies what the agent is supposed to do. This includes initial task description, scope definition, and priority setting. Every coding agent session begins with this intervention. Prompt engineering research frames this as a bi-directional alignment problem where the user's definition of alignment evolves as they discover new requirements ([arXiv](https://arxiv.org/abs/2401.04122)).
+
+**2. Context provision.** The human supplies information the agent does not have: project conventions, architectural decisions, tribal knowledge, stakeholder preferences, historical decisions recorded in Jira tickets or Slack threads. Turing Post's industry analysis found that when developers explain why they distrust coding agents, the dominant response is: "We don't trust the context the model has." Missing context is identified as the critical issue: "The critical logic sleeps in a Jira ticket from 2019, or worse, it's tribal knowledge" ([Turing Post](https://www.turingpost.com/p/aisoftwarestack)).
+
+**3. Error correction.** The human identifies and corrects mistakes in agent output. GitHub Copilot's 70% suggestion rejection rate is the largest-scale dataset on this intervention type, though the rejection reasons (wrong code, wrong timing, partial match, style mismatch) are not broken down ([GitHub](https://docs.github.com/en/copilot/concepts/copilot-usage-metrics/copilot-metrics)). Devin's 33% PR rejection rate is another proxy ([Cognition](https://cognition.ai/blog/devin-annual-performance-review-2025)).
+
+**4. Approval gates.** The human grants permission for irreversible or high-consequence actions: database writes, deployments, production changes, external API calls, file deletions. RedMonk found that developers want fine-grained permissions for what agents can and cannot do autonomously, approval gates before destructive actions, configurable autonomy levels per task type, and clear audit trails ([RedMonk](https://redmonk.com/kholterhoff/2025/12/22/10-things-developers-want-from-their-agentic-ides-in-2025/)).
+
+### Intervention frequency is declining but not disappearing
+
+Average interventions per Claude Code session dropped from 5.4 to 3.3 between October 2025 and January 2026. Experienced users auto-approve at higher rates (40%+ vs 20% for new users) but also interrupt more frequently (9% of work steps vs 5% for new users). This pattern suggests experienced users delegate more freely on routine steps but intervene more precisely on specific steps that matter ([Anthropic](https://www.anthropic.com/research/measuring-agent-autonomy)).
+
+Only 0-20% of tasks can be fully delegated even with current frontier models ([Anthropic](https://resources.anthropic.com/2026-agentic-coding-trends-report)). Enterprise adoption confirms this: 99% of enterprise developers experimented with agents in 2025, but mass adoption of full autonomy did not materialize. The practical sweet spot is supervised autonomy where developers provide goals and guardrails, the agent executes independently, and the developer approves or rejects at decision points ([First Page Sage](https://firstpagesage.com/seo-blog/agentic-ai-statistics/)).
+
+### "Context is the bottleneck"
+
+The Turing Post finding deserves emphasis because it reframes the human-in-the-loop problem. If the primary reason developers intervene is to provide context the agent lacks, then the path to reducing interventions runs through better context rather than better models. Models are improving at approximately 7-month doubling intervals. Context provision infrastructure is comparatively underdeveloped.
+
+Anthropic's own engineering blog describes the core challenge of multi-context-window sessions: "Long-running agents must work in discrete sessions, each starting with no memory of prior work" ([Anthropic](https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents)). The bridging mechanism is a text file (claude-progress.txt) and git history. Even Opus 4.5, running in a loop across multiple context windows, fails to build production-quality apps from high-level prompts alone.
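
The bridging pattern is simple enough to sketch: a plain append-and-reload note file (the claude-progress.txt name is from the post; the timestamped note format here is an assumption):

```python
from pathlib import Path
from datetime import datetime, timezone

PROGRESS = Path("claude-progress.txt")  # name from the Anthropic post

def log_progress(note: str) -> None:
    """Append a timestamped note for the next session to read."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with PROGRESS.open("a") as f:
        f.write(f"[{stamp}] {note}\n")

def load_progress() -> str:
    """Loaded at session start, since the new context window has no memory."""
    return PROGRESS.read_text() if PROGRESS.exists() else ""

log_progress("Implemented auth middleware; tests for token refresh still failing")
print(load_progress())
```

That this is state-of-the-art bridging for multi-context-window agents underlines how underdeveloped context infrastructure is relative to the models themselves.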
+
+Developer acceptance rate of agent changes is 89% when the agent provides a diff summary vs 62% for raw output ([Anthropic](https://resources.anthropic.com/2026-agentic-coding-trends-report)). The difference is not model capability. It is context presentation.
+
+---
+
+## Part 4: Which Interventions Personal Data Can Replace
+
+### Context provision: yes, partially
+
+Personal data from connected platforms (GitHub history, Slack conversations, Jira tickets, calendar, communication patterns) directly addresses the "missing context" intervention type. A developer's GitHub contribution history encodes coding style preferences. Their Slack messages contain project decisions. Their calendar reveals availability and meeting context.
+
+No product currently ingests this cross-platform personal data and feeds it to a coding agent. The closest analogs:
+
+- **CLAUDE.md files** provide static per-project context that is manually authored
+- **Cursor's .cursorrules** serves the same function with different syntax
+- **ChatGPT's connectors** ingest Gmail, Google Drive, and GitHub but do not pipe that context to a coding agent
+- **Glean** builds per-employee knowledge graphs from 100+ workplace sources but is enterprise-scoped and does not target coding agents
+
+Mem0's research claims 26% higher response accuracy when agents have access to structured persistent memory vs. stateless operation ([Mem0](https://mem0.ai/research)). This is the closest published number to measuring the effect of personal context on agent performance, though it measures session-derived memory, not cross-platform personal data.
+
+### Approval gates: no
+
+Personal data does not replace the need for human approval of irreversible actions. Knowing a developer's preferences does not authorize the agent to deploy to production. Approval gates are trust mechanisms, not knowledge mechanisms.
+
+Only 0.8% of Claude Code actions are irreversible, but those 0.8% are the ones that matter most ([Anthropic](https://www.anthropic.com/research/measuring-agent-autonomy)). Dependabot and Renovate demonstrate that even well-understood, narrow-scope automated PRs for dependency updates still require human review before merge.
+
+### Error correction: partially
+
+Personal data can reduce error correction interventions in two ways. First, coding style preferences and project conventions embedded in personal data can prevent style-related errors before they occur (the "almost right, but not quite" problem that 66% of developers cite as their top frustration). Second, a developer's code review history encodes what they consider acceptable code quality.
+
+Personal data does not address logical errors, architectural mistakes, or misunderstanding of requirements. Cursor's RL-based suggestion filtering demonstrates the implicit version: the model retrains multiple times per day on accept/reject signals, producing 21% fewer suggestions but 28% higher acceptance rate ([Analytics India Magazine](https://analyticsindiamag.com/ai-news-updates/cursor-is-using-real-time-reinforcement-learning-to-improve-suggestions-for-developers/)).
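
Those two figures combine to roughly constant accepted volume: the filtering mostly removes suggestions that would have been rejected. A quick check (the 30% baseline acceptance rate is a hypothetical borrowed from the Copilot average cited earlier, not a Cursor figure):

```python
# Net effect of retraining on accept/reject signals, per the cited figures.
shown = 0.79            # 21% fewer suggestions shown
acceptance_lift = 1.28  # 28% higher acceptance rate

relative_accepted = shown * acceptance_lift
print(f"accepted volume: {relative_accepted:.2f}x")  # ~1.01x, roughly unchanged

# Under a hypothetical 30% baseline acceptance rate (assumption, see lead-in):
base = 0.30
rejected_before = 1 - base                               # 0.70
rejected_after = shown - shown * base * acceptance_lift  # shown minus accepted
print(f"rejected volume: {rejected_after / rejected_before:.2f}x")  # ~0.70x
```

The developer sees about the same number of useful suggestions while roughly 30% of the noise disappears.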
+
+### Intent inference: emerging
+
+Intent inference, where a system predicts what the user wants without explicit prompting, is the most speculative category. Netflix's FM-Intent framework predicts user session intent from short- and long-term implicit signals ([Netflix Tech Blog](https://research.netflix.com/publication/fm-intent-predicting-user-session-intent)). Google Research demonstrated that small multimodal LLMs can extract user intent from sequences of screen interactions ([Google Research](https://research.google/blog/small-models-big-results-achieving-superior-intent-extraction-through-decomposition/)). Research on proactive agents formalizes agents that perceive environmental context and infer user intentions without explicit prompts ([arXiv](https://arxiv.org/abs/2410.12361)).
+
+The PROBE benchmark, measuring proactive problem-solving in LLM agents, found that even state-of-the-art models achieve no more than 40% success on end-to-end proactive tasks ([arXiv](https://arxiv.org/abs/2510.19771)).
+
+No product generates coding agent prompts from personal data. The closest are OpenClaw daily briefings (which summarize but do not generate actionable coding tasks) and GitHub Copilot's coding agent (which acts on issues but does not infer tasks from personal data).
+
+### Constitutional AI: compressing human oversight into principles
+
+Anthropic's Constitutional AI research demonstrates that human oversight can be compressed from thousands of individual preference labels to approximately 10 natural-language principles. The model self-critiques and revises its own outputs, then an AI preference model replaces human labelers in the RL phase. Constitutional RL produces Pareto improvements over RLHF (both more helpful and more harmless) ([arXiv](https://arxiv.org/abs/2212.08073)).
+
+This is relevant because personal data could serve a similar compression function: instead of per-instance human feedback, a developer's cross-platform data encodes persistent preferences that the agent can reference. The difference is that Constitutional AI uses hand-authored principles, while personal data would provide empirically derived preferences.
+
+No published measurement exists of how much residual human oversight Constitutional AI eliminates in production deployment vs. standard RLHF.
+
+### The HITL to HOTL transition
+
+Industry analysis distinguishes between Human-in-the-Loop (HITL), where a human must review and approve before each action, and Human-on-the-Loop (HOTL), where agents run autonomously by default and humans intervene only on anomalies. Moving to HOTL does not mean abandoning HITL; it means using HITL strategically with optional interventions on demand ([ByteBridge](https://bytebridge.medium.com/from-human-in-the-loop-to-human-on-the-loop-evolving-ai-agent-autonomy-c0ae62c3bf91)).
+
+Anthropic's telemetry data shows this transition happening organically: experienced users (750+ sessions) auto-approve over 40% of sessions, yet interrupt individual steps at a higher rate than newer users (9% vs 5%). The oversight model shifts from "approve everything" to "trust by default, intervene precisely."
+
+Personal data could accelerate this transition by providing the agent with enough context to handle the routine interventions autonomously, concentrating human attention on the genuinely novel or high-stakes decisions.
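The oversight shift described above can be expressed as a simple gating policy: approve known, low-risk steps automatically and escalate everything else to a human. This is an illustrative sketch, not any product's implementation; the action names and the risk threshold are assumptions.

```python
# HOTL-style gating: autonomous by default, human intervention only on
# anomalies. Action names and the 0.8 threshold are illustrative assumptions.
TRUSTED_ACTIONS = {"read_file", "run_tests", "format_code"}

def needs_human(action: str, risk_score: float, threshold: float = 0.8) -> bool:
    """Return True when a human should review this step before it runs."""
    return action not in TRUSTED_ACTIONS or risk_score >= threshold
```

Under strict HITL, every call would return True; the HOTL transition is the narrowing of that condition to unfamiliar or high-risk steps.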
+
+---
+
+## Part 5: Products at the Intersection
+
+### Memory infrastructure
+
+The agent memory market is small but growing:
+
+| Company | Raised | Key Metric | What It Stores |
+| ----------- | ------ | ---------------------- | --------------------------------------------- |
+| Mem0 | $24M | 186M API calls/quarter | Session-derived structured memories |
+| Zep | $500K | YC W24 | Chat history + business data + temporal facts |
+| Supermemory | $2.6M | Seed stage | Extracted memories from unstructured data |
+| Memories.ai | $8M | Seed stage | Visual memory (LVMM) |
+| LangChain | $160M | 47M+ PyPI downloads | Framework-level memory abstractions |
+
+All of these store agent-generated or session-derived memory. None ingest cross-platform personal data from the user's own accounts.
+
+Anthropic chose transparent file-based memory (CLAUDE.md, markdown files readable by both human and model) over vector databases ([Anthropic](https://platform.claude.com/docs/en/agents-and-tools/tool-use/memory-tool)). Claude Code's 6-layer memory system loads at session start, providing static context without cross-session learning from personal data sources.
+
+### Personal AI products
+
+ChatGPT Memory + Connectors is the largest-scale personal data integration: 800M+ weekly active users, connectors to Gmail, Google Drive, GitHub, Outlook, and Teams, with MCP-based partner connectors to Stripe, Amplitude, Monday.com, and others. Memory expanded in April 2025 to reference all past conversations. Connectors are available to Team, Enterprise, and Edu plans ([OpenAI](https://openai.com/index/memory-and-new-controls-for-chatgpt/)).
+
+Google Personal Intelligence connects Gemini to Gmail, Photos, YouTube history, and Search history, rolling out to free Gemini users in the US as of March 2026 ([Google](https://blog.google/innovation-and-ai/products/gemini-app/personal-intelligence/)).
+
+Samsung's Personal Data Engine uses an on-device RDFox knowledge graph to drive the Now Brief daily summary and Smart Gallery search, secured by Knox Vault encryption ([Computer Weekly](https://www.computerweekly.com/news/366618319/Samsung-unpacks-Galaxy-AIs-personal-data-engine)).
+
+All three keep aggregation inside a single ecosystem: ChatGPT pulls data only through its own proprietary connectors, Google connects only to Google services, and Samsung only to Samsung device data. None aggregate across ecosystems.
+
+### Daily briefing agents
+
+The daily briefing pattern, where an agent proactively synthesizes data from connected sources at a scheduled time, emerged as a distinct product category in 2025-2026:
+
+- **OpenClaw**: 264K GitHub stars (March 2026), surpassing React as the most-starred software project. Self-hosted, connects to Gmail, Calendar, GitHub, RSS, Todoist, Linear, Stripe. Cron-scheduled daily briefings ([GitHub](https://github.com/openclaw/openclaw)).
+- **Google CC (Your Day Ahead)**: Launched December 2025, sends daily morning briefing email from Gmail, Calendar, and Drive without requiring a prompt ([Google](https://blog.google/technology/google-labs/cc-ai-agent/)).
+- **ChatGPT Pulse**: Launched September 2025, delivered personalized morning updates from user memory and chat history. Shelved December 2025 when Sam Altman issued "Code Red" to refocus on core ChatGPT ([TechCrunch](https://techcrunch.com/2025/09/25/openai-launches-chatgpt-pulse-to-proactively-write-you-morning-briefs/)).
+- **ChatGPT Scheduled Tasks**: Recurring prompts at predetermined times, limit of 10 active tasks, still in beta ([OpenAI](https://help.openai.com/en/articles/10291617-scheduled-tasks-in-chatgpt)).
+- **Gemini Scheduled Actions**: Including Goal Scheduled Actions (February 2026) where the AI reviews previous outputs and adjusts next actions ([Google](https://blog.google/products-and-platforms/products/gemini/scheduled-actions-gemini-app/)).
+
+OpenClaw's growth from 9K to 264K stars in under three months demonstrates strong demand for the data-driven proactive agent pattern. NanoClaw (10.5K stars) provides a smaller, security-focused alternative built on Anthropic's Agents SDK with Linux container isolation ([GitHub](https://github.com/qwibitai/nanoclaw)).
+
+---
+
+## Part 6: Competitive Landscape
+
+### Coding agents: market structure
+
+The autonomous coding agent market as of March 2026:
+
+| Product | Valuation/Scale | ARR | Model | Personal Context Mechanism |
+| -------------- | ------------------ | ------------- | ----------------- | -------------------------------------- |
+| Cursor | $29.3B | $2B+ | Multi-model | .cursorrules file; RL on accept/reject |
+| Claude Code | (Anthropic: $380B) | $2.5B | Claude Opus 4.6 | CLAUDE.md files; hooks; memory |
+| Devin | $10.2B | $155M+ | Multi-model | Devin Wiki; codebase indexing |
+| GitHub Copilot | (Microsoft) | Est. $500M-1B | Multi-model | Repository context |
+| Codex | (OpenAI) | Bundled | GPT-5.3-Codex | Repository context |
+| Cline | $32M raised | N/A (OSS) | Multi-model (BYO) | MCP integrations |
+| Aider | N/A (OSS) | N/A | Multi-model | Repository map |
+
+All coding agents operate on codebase context. None ingest cross-platform personal data. The personal context mechanisms are either static files (.cursorrules, CLAUDE.md), session-derived learning (Cursor's RL), or codebase indexing.
+
+### MCP ecosystem
+
+MCP is the primary protocol through which personal data could reach coding agents. As of March 2026:
+
+- 10,000+ active public MCP servers
+- 97M+ monthly SDK downloads
+- Adopted by ChatGPT, Cursor, Gemini, Microsoft Copilot, VS Code
+- Donated to Agentic AI Foundation (Linux Foundation) co-founded by Anthropic, Block, OpenAI with support from Google, Microsoft, AWS, Cloudflare, Bloomberg ([Anthropic](https://www.anthropic.com/news/donating-the-model-context-protocol-and-establishing-of-the-agentic-ai-foundation))
+- 2026 roadmap: triggers/events, streamed results, security/authorization hardening, registry/discovery infrastructure
+
+Google's Agent2Agent Protocol (A2A), launched April 2025 with 150+ supporting organizations, is designed as a complement to MCP: MCP handles tools and context, A2A handles agent-to-agent coordination ([Google](https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/)).
+
+No data exists on how many MCP servers handle personal data vs. enterprise data. Usercentrics acquired MCP Manager in January 2026 to extend consent infrastructure into MCP-based AI workflows, indicating that the personal data access question is active ([Usercentrics](https://usercentrics.com/press/usercentrics-acquires-mcp-manager/)).
+
+### Memory companies
+
+| Company | Category | Differentiation | Structural Constraint |
+| ----------- | ------------------- | --------------------------------------------------------- | ------------------------------------------ |
+| Mem0 | Memory API | Graph-based; sub-second retrieval; AWS exclusive provider | Stores session memories, not personal data |
+| Zep | Context engineering | Temporal reasoning; fact evolution tracking | Early stage; limited traction |
+| Supermemory | Memory extraction | Unstructured-to-structured conversion | Pre-revenue |
+| Memories.ai | Visual memory | LVMM for long-term visual data | Pre-revenue; hardware-adjacent |
+
+### Data portability
+
+| Entity | Approach | Scale | Constraint |
+| ------ | ----------------------------------------- | ------------------------------- | --------------------------------- |
+| DTI | Open-source platform-to-platform transfer | Apple, Google, Meta as partners | Transfer not AI-focused |
+| Vana | Blockchain-based data sovereignty | 12M+ data points; mainnet live | Crypto adoption barrier |
+| Solid | Standards-based data pods | Berners-Lee backed | "Not ready for general adoption" |
+| MCP | Protocol for AI-to-data connection | 10K+ servers | No personal data governance layer |
+| Plaid | Financial data aggregation | $800M+ revenue | Financial domain only |
+
+### Enterprise knowledge platforms
+
+Glean ($7.2B valuation, $100M+ ARR) builds a personalized knowledge graph per employee from 100+ workplace data sources. Its third-generation assistant (September 2025) tracks per-employee projects, collaborators, and work style, enabling agents that summarize weekly work or prepare performance reviews ([Glean](https://www.glean.com/press/glean-achieves-100m-arr-in-three-years-delivering-true-ai-roi-to-the-enterprise)).
+
+Notion ($500M annualized revenue, 100M+ users) launched autonomous AI agents in Notion 3.0 (September 2025) that work for up to 20 minutes on multi-step tasks across hundreds of pages. AI adoption crossed 50% of paying customers using AI features in 2025, up from 10-20% in 2024. Notion 3.2 (January 2026) added multi-model support (GPT-5.2, Claude Opus 4.5, Gemini 3) and agents on mobile ([Notion](https://www.notion.com/releases/2025-09-18), [CNBC](https://www.cnbc.com/2025/09/18/notion-launches-ai-agent-as-it-crosses-500-million-in-annual-revenue.html)).
+
+Dust.tt ($21.5M raised, $6M ARR) creates enterprise agents connected to company knowledge via MCP integrations with Asana, Jira, GitHub, Google Drive, Gong, Gmail, Salesforce, HubSpot, and Notion. Customers achieve 70%+ weekly AI adoption rates. 2026 vision: multi-player agents where teams share context with AI teammates ([VentureBeat](https://venturebeat.com/ai/dust-hits-6m-arr-helping-enterprises-build-ai-agents-that-actually-do-stuff-instead-of-just-talking)).
+
+All of these are enterprise-scoped and limited to work data. None aggregate personal data from outside the workplace.
+
+### Workflow automation platforms
+
+n8n (150K+ GitHub stars, $40M ARR, $2.5B valuation) and Zapier (8,000+ app integrations) provide AI-powered workflow automation with native agent nodes and MCP support. n8n's revenue grew 5x after pivoting to AI-friendly approaches in 2022. Zapier Canvas converts visual process diagrams into functioning automations. Both platforms demonstrate that cross-platform data orchestration works when the user explicitly defines workflows, but neither infers tasks or generates agent prompts from personal data ([n8n](https://github.com/n8n-io/n8n), [Zapier](https://zapier.com/)).
+
+CrewAI ($18M raised, backed by Andrew Ng and Dharmesh Shah) natively integrates Mem0 for memory in multi-agent automation. It is the fastest-growing multi-agent framework by adoption ([SiliconAngle](https://siliconangle.com/2024/10/22/agentic-ai-startup-crewai-closes-18m-funding-round/)).
+
+### The whitespace
+
+No product currently ingests cross-platform personal data (GitHub + ChatGPT + LinkedIn + Spotify together) and uses that combined context for AI agent autonomy. Products are either single-platform (ChatGPT Memory), enterprise-scoped (Glean), or domain-specific (Plaid for finance, Otter.ai for meetings). The cross-platform personal context layer is whitespace.
+
+---
+
+## Part 7: Product Shape Analysis
+
+### Reactive vs. proactive agents
+
+Current coding agents are reactive: they wait for a prompt, then execute. The proactive agent pattern, where an agent takes initiative based on available context without explicit prompting, is emerging in adjacent categories (daily briefings, scheduled tasks, dependency updates) but does not exist for coding.
+
+Dependabot (846K+ repos) and Renovate (90+ package managers) demonstrate that proactive, scheduled code changes are accepted by developers at scale when the scope is well-defined and consequences are bounded ([Renovate](https://docs.renovatebot.com/bot-comparison/)). These are narrow-scope proactive agents that succeed precisely because they do not require personal data or intent inference.
+
+### The daily briefing pattern
+
+OpenClaw's 264K stars demonstrate demand for the pattern: aggregate data from multiple sources, synthesize it, deliver it at a scheduled time. The pattern works because:
+
+1. It is additive (user can ignore it without consequence)
+2. It is time-bounded (runs once, delivers once)
+3. It is transparent (user can see all sources)
+4. It degrades gracefully (silently skips missing integrations)
+
+Google CC, ChatGPT Pulse, ChatGPT Scheduled Tasks, and Gemini Scheduled Actions all implement variants of this pattern. ChatGPT Pulse was shelved after 3 months, suggesting that platform prioritization, not user demand, killed it.
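The four properties above translate directly into a small aggregation loop; the source names and fetch functions here are hypothetical, and real briefing agents (OpenClaw's included) are far richer than this sketch:

```python
# Sketch of the daily-briefing pattern: fetch from whatever sources respond,
# synthesize into one document, deliver once. Sources are illustrative.

def run_briefing(sources):
    sections = []
    for name, fetch in sources.items():
        try:
            sections.append(f"## {name}\n{fetch()}")
        except Exception:
            continue  # degrade gracefully: silently skip missing integrations
    return "\n\n".join(sections)

def broken_github():
    raise ConnectionError("integration not configured")

briefing = run_briefing({
    "calendar": lambda: "3 meetings today",
    "github": broken_github,  # a failed connector does not block delivery
})
```

The try/except around each fetch is property 4 (graceful degradation); running the whole loop once on a schedule is properties 1 and 2.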
+
+### Scheduled task patterns
+
+Three distinct scheduling models exist in production:
+
+1. **One-time execution** (OpenClaw's `at` type, ChatGPT Scheduled Tasks): run once at a specified time
+2. **Interval-based** (OpenClaw's `every` type): run every N minutes/hours
+3. **Cron-based** (OpenClaw's `cron` type, Dependabot, Renovate): standard cron expressions for recurring schedules
+
+Gemini's Goal Scheduled Actions introduce a fourth pattern: **goal-directed scheduling** where the agent reviews outputs from previous runs and adjusts the next action. This is the closest production feature to autonomous, adaptive agents operating over time.
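The first two models can be sketched as types sharing a `next_run` interface; the `at` and `every` names follow OpenClaw's types as described above, while the field names and logic are illustrative assumptions (a cron variant would parse a standard five-field expression, elided here).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Illustrative sketch of production scheduling models; field names and
# next_run logic are assumptions, not any product's actual API.

@dataclass
class AtSchedule:
    """One-time execution (OpenClaw `at`, ChatGPT Scheduled Tasks)."""
    when: datetime
    def next_run(self, now: datetime,
                 last_run: Optional[datetime]) -> Optional[datetime]:
        return self.when if last_run is None else None  # fires exactly once

@dataclass
class EverySchedule:
    """Interval-based (OpenClaw `every`): run every N minutes/hours."""
    interval: timedelta
    def next_run(self, now: datetime,
                 last_run: Optional[datetime]) -> Optional[datetime]:
        return now if last_run is None else last_run + self.interval
```

Goal-directed scheduling breaks this interface: the next action depends on the *output* of the previous run, not just its timestamp, which is why it is the pattern closest to autonomous adaptive agents.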
+
+### Intent inference: research state
+
+Academic research on proactive agents is active but early:
+
+- ProactiveBench: 6,790 events for evaluating proactive agent behavior ([arXiv](https://arxiv.org/abs/2410.12361))
+- PROBE benchmark: state-of-the-art achieves no more than 40% on proactive tasks ([arXiv](https://arxiv.org/abs/2510.19771))
+- ProAgentBench: evaluating LLM agents for proactive assistance ([arXiv](https://arxiv.org/html/2602.04482v1))
+- PPP framework: multi-objective RL optimizing productivity, proactivity, and personalization jointly ([arXiv](https://arxiv.org/abs/2511.02208))
+- Netflix FM-Intent: predicts session intent from implicit signals in production at Netflix scale ([Netflix](https://research.netflix.com/publication/fm-intent-predicting-user-session-intent))
+- Google's intent decomposition: small on-device models extract user intent from screen interaction sequences ([Google Research](https://research.google/blog/small-models-big-results-achieving-superior-intent-extraction-through-decomposition/))
+
+The 40% PROBE ceiling indicates that proactive task generation from personal data is technically feasible but not reliable. No commercial product has published comparable accuracy metrics for proactive features.
+
+---
+
+## Part 8: The Graveyard
+
+### Pattern 1: Hardware fails
+
+**Humane AI Pin**: $230M raised, approximately 10,000 units sold, returns outpaced sales. HP acquired assets for $116M. All devices bricked February 28, 2025. Charging case recalled for battery fire risk ([TechCrunch](https://techcrunch.com/2025/02/18/humanes-ai-pin-is-dead-as-hp-buys-startups-assets-for-116m/)).
+
+**Rabbit R1**: $64.7M raised, 130,000 units sold, 5,000 daily active users 5 months after launch (5% retention). Entire interface discovered to be a single Android app. CEO admitted premature launch ([9to5Google](https://9to5google.com/2024/09/26/rabbit-5000-people-use-the-r1-daily/)).
+
+**Narrative Clip**: Kickstarted in 2012, automatic photo every 30 seconds. Dissolved September 2016. Could not compete with smartphone cameras and live-streaming services ([PetaPixel](https://petapixel.com/2016/09/28/lifelogging-camera-maker-narrative-going-business/)).
+
+**Magic Leap**: $4.5B raised. Magic Leap One (2018) at $2,300 failed to meet sales targets. Cut half the workforce in 2020. Pivoted from consumer to enterprise, then to licensing optics technology ([Road to VR](https://www.roadtovr.com/magic-leap-layoff-2024-optics-pivot/)).
+
+The pattern: dedicated hardware for personal AI context adds friction, cost, and failure modes without providing enough value over software-only approaches. Meta's acquisition of Limitless for Ray-Ban smart glasses integration suggests hardware may work as a peripheral rather than a standalone product.
+
+### Pattern 2: HITL does not scale economically
+
+**Facebook M**: Launched August 2015 to approximately 2,000 users. Over 70% of requests required human operators. Scaling to Messenger's 1.3 billion users would have required a prohibitively large human workforce. Shut down January 2018 ([TechCrunch](https://techcrunch.com/2018/01/08/facebook-is-shutting-down-its-standalone-personal-assistant-m/)).
+
+**Builder.ai (Natasha)**: $445M raised, $1.2B valuation. Investigation revealed approximately 700 human engineers in India doing work attributed to AI. Claimed $220M revenue; independent audit found approximately $50M. Bankrupt May 2025 ([TechStartups](https://techstartups.com/2025/05/24/builder-ai-a-microsoft-backed-ai-startup-once-valued-at-1-2-billion-files-for-bankruptcy-is-ai-becoming-another-com-bubble/)).
+
+**Forward Health (CarePods)**: $400M+ raised. Planned 3,200 autonomous medical kiosks; launched 3. Blood draws frequently failed. Patients got stuck inside pods. Shut down November 2024 ([Fierce Healthcare](https://www.fiercehealthcare.com/health-tech/primary-care-player-forward-shutters-after-raising-400m-rolling-out-carepods)).
+
+The pattern: products that depend on human operators behind the scenes cannot reach the unit economics needed for consumer scale. The 70% HITL rate for Facebook M and the hundreds of hidden engineers at Builder.ai illustrate the gap between marketed autonomy and actual capability.
+
+### Pattern 3: Personal context features get deprioritized
+
+**Microsoft Cortana**: Peak 145 million MAU, never exceeded 2% voice assistant market share. Had deep OS-level personal data access (calendar, email, files). Personal context features did not drive sustained engagement. Deprecated August 2023, replaced by the task-focused Copilot ([TechCrunch](https://techcrunch.com/2023/08/04/microsoft-kills-cortana-in-windows-as-it-focuses-on-next-gen-ai/)).
+
+**Google Now / Google Assistant**: Google Now (2012) proactively surfaced contextual cards from email, calendar, location. Gradually deprecated. Google Assistant removed 17 personal context features in January 2024 (travel itineraries, contact queries, email/payments by voice). Assistant itself deprecated March 2026, replaced by Gemini ([Google](https://blog.google/products/assistant/google-assistant-update-january-2024/)).
+
+**ChatGPT Pulse**: Launched September 2025, shelved December 2025 after 3 months when Altman issued "Code Red" to refocus on core product ([TechCrunch](https://techcrunch.com/2025/09/25/openai-launches-chatgpt-pulse-to-proactively-write-you-morning-briefs/)).
+
+**Apple Intelligence**: Personal context features (Siri accessing emails, photos, messages) delayed from 2025 to spring 2026 and possibly beyond. Craig Federighi: "when it comes to automating capabilities on devices in a reliable way, no one's doing it really well right now" ([CNBC](https://www.cnbc.com/2025/03/07/apple-delays-siri-ai-improvements-to-2026.html)).
+
+The pattern: large platform companies build personal context features, then deprioritize or remove them in favor of more tractable product directions. This has happened at Microsoft, Google, OpenAI, and Apple. Personal context AI may be a better fit for dedicated products than as features inside platforms.
+
+### Pattern 4: Strong engagement does not equal a business model
+
+**Inflection AI (Pi)**: $1.525B raised at $4B valuation. 1 million daily active users, 6 million monthly active users. Strong engagement with empathetic personal AI. No sustainable business model. Microsoft acqui-hired nearly all 70 employees and paid $650M to license models. Pi continues with usage caps and a skeleton crew ([eesel.ai](https://www.eesel.ai/blog/inflection-ai)).
+
+**Limitless (Rewind AI)**: $33M+ raised at $350M valuation on $707K ARR (495x revenue multiple). Pivoted from desktop screen recording to wearable pendant. Strong investor belief in personal data value. Could not find product-market fit independently. Acquired by Meta December 2025 ([TechCrunch](https://techcrunch.com/2025/12/05/meta-acquires-ai-device-startup-limitless/)).
+
+**Mem.ai**: $29.1M raised at $110M post-money valuation. Criticized for underpowered AI and missing basic features. Development appears stalled per user reports. Revenue not publicly disclosed ([Medium](https://medium.com/@theo-james/mem-ai-the-40m-second-brain-failure-burning-the-worlds-money-5f3176a34cbd)).
+
+The pattern: investor enthusiasm for personal data AI exceeds demonstrated unit economics. Limitless's 495x revenue multiple and Inflection's $4B valuation on zero revenue suggest the market prices personal data AI on potential rather than traction.
+
+### Startup mortality context
+
+966 startups shut down in 2024 (25.6% increase from 769 in 2023). The 2023-2024 cycle rewarded speed and UX, producing a long tail of thin GPT-wrapper products. The 2025 market shifted to require proprietary data advantage and real unit economics ([Simple Closure](https://simpleclosure.com/blog/posts/state-of-startup-shutdowns-2025/)).
+
+141 of 169 YC S25 startups (roughly 83%) were AI-native, with over 50% building agentic AI, mostly domain-specific copilots rather than general personal AI. Nearly 50% of YC W26 was identified as AI agent companies ([Catalaize](https://catalaize.substack.com/p/y-combinator-s25-batch-profile-and), [TLDL](https://www.tldl.io/blog/yc-ai-startups-2026)). No YC startup was found focused specifically on personal data portability for AI agent context.
+
+---
+
+## Part 9: Regulatory and Trust Barriers
+
+### Regulatory landscape
+
+**GDPR Article 20 (Data Portability)**: Gives EU residents the right to receive their personal data in a structured, commonly used, machine-readable format and transmit it to another controller. This right is the regulatory basis for cross-platform personal data aggregation.
+
+**EU Digital Markets Act (DMA)**: Designates iOS and Android as core platform services whose gatekeepers must facilitate effective data portability. Apple and Google announced OS-level switching collaboration in late 2025.
+
+**Utah Digital Choice Act**: State-level data portability legislation in the US.
+
+**EU AI Act**: Full enforcement of high-risk AI system requirements takes effect August 2, 2026. Penalties up to 35M EUR or 7% of global turnover. Requires conformity assessments, technical documentation, data minimization, purpose limitation, and transparency ([Legal Nodes](https://www.legalnodes.com/article/eu-ai-act-2026-updates-compliance-requirements-and-business-risks)).
+
+**CCPA Automated Decision-Making**: California finalized regulations on automated decision-making technology, broadly defined as any technology that processes personal information to replace or substantially replace human decision-making. Compliance required from January 1, 2026 for some provisions ([Wiley](https://www.wiley.law/alert-California-Finalizes-Pivotal-CCPA-Regulations-on-AI-Cyber-Audits-and-Risk-Governance)).
+
+**Spain AEPD**: Published a 71-page guide in February 2026 identifying persistent memory profiles, autonomous multi-service access, and consequential actions without human checkpoints as novel risks specific to agentic AI ([PPC.Land](https://ppc.land/spains-data-watchdog-maps-the-hidden-gdpr-risks-of-agentic-ai/)).
+
+**UK ICO**: Flagged data minimization and transparency as compliance risks for agentic AI, noting that agent-to-agent data flows create unobservable processing ([InsidePrivacy](https://www.insideprivacy.com/artificial-intelligence/ico-shares-early-views-on-agentic-ai-data-protection/)).
+
+**EDPB**: Clarified that LLMs rarely achieve anonymization standards; controllers deploying third-party LLMs must conduct comprehensive legitimate interests assessments ([DPO Centre](https://www.dpocentre.com/data-protection-ai-governance-2025-2026/)).
+
+Regulatory tailwinds exist for data portability (GDPR Art. 20, DMA, Utah). Regulatory headwinds exist for personal data processing by AI agents (EU AI Act, CCPA ADMT, EDPB anonymization standards).
+
+### User trust
+
+Trust data paints a mixed picture:
+
+- 35% of Americans use AI weekly but only 5% trust it deeply ([YouGov](https://yougov.com/en-us/articles/53701-most-americans-use-ai-but-still-dont-trust-it))
+- 48% cite data exposure as the primary adoption barrier, outranking hallucinations ([YouGov](https://yougov.com/en-us/articles/53701-most-americans-use-ai-but-still-dont-trust-it))
+- 40% say they would never enter personal or financial information into an AI tool ([YouGov](https://yougov.com/en-us/articles/53701-most-americans-use-ai-but-still-dont-trust-it))
+- 82% believe companies train AI on their data without disclosure ([Relyance AI](https://www.relyance.ai/consumer-ai-trust-survey-2025))
+- Only approximately 10% are very willing to share financial, communication, or biometric data ([Relyance AI](https://www.relyance.ai/consumer-ai-trust-survey-2025))
+- Nearly 60% are willing to share data for personalized shopping recommendations, but willingness varies sharply by context ([ARF](https://www.prnewswire.com/news-releases/trust-in-ai-surges-as-consumers-take-a-more-transactional-view-of-data-sharing-arf-study-finds-302667046.html))
+- Trust in AI surged 16 points in 2025 but from a low base ([ARF](https://www.prnewswire.com/news-releases/trust-in-ai-surges-as-consumers-take-a-more-transactional-view-of-data-sharing-arf-study-finds-302667046.html))
+
+Microsoft's Recall feature (screenshots every few seconds for AI context) faced severe backlash when researchers found it stored passwords, financial data, and medical records in plaintext. It was redesigned to be opt-in with full database encryption ([Time](https://time.com/6980911/microsoft-copilot-recall-ai-features-privacy-concerns/)).
+
+Simon Willison documented ChatGPT's "context collapse" problem: the system inferred his location (Half Moon Bay) from prior conversations and inserted it into an unrelated image generation request ([Simon Willison](https://simonwillison.net/2025/May/21/chatgpt-new-memory/)).
+
+OWASP ranked prompt injection as the #1 critical vulnerability in its 2025 Top 10 for LLM Applications, appearing in 73% of production AI deployments assessed. OpenAI stated in December 2025 that prompt injection "is unlikely to ever be fully solved" ([OWASP](https://genai.owasp.org/llmrisk/llm01-prompt-injection/)).
+
+### Cold start
+
+Cold start research indicates that a single AI task can involve 20-30 preference dimensions but individual users care about only 2-4, making cold start a navigation problem in high-dimensional space. Strategies include onboarding elicitation, leveraging existing data, and real-time streaming updates that turn cold start into a short-term condition ([arXiv](https://arxiv.org/html/2602.15012)).
+
+No data was found on the minimum viable personalization threshold: how much personal context is needed before an AI agent becomes measurably more useful than a generic one.
+
+### Privacy infrastructure
+
+Skyflow ($100M raised, approximately 1 billion records, 2B+ API calls per quarter) provides tokenized data vaults that detect sensitive data and replace it with deterministic tokens before sending to AI models. Its LLM Privacy Vault product intercepts data before it reaches the model, addressing the trust gap with auditable, tokenized access ([Skyflow](https://www.skyflow.com/post/generative-ai-data-privacy-skyflow-llm-privacy-vault)).
+
+Usercentrics acquired MCP Manager in January 2026 to extend consent guardrails into MCP-based AI workflows. This is the first major privacy company to address consent specifically in the context of MCP-based data access by AI agents ([Usercentrics](https://usercentrics.com/press/usercentrics-acquires-mcp-manager/)).
+
+First-generation personal data stores (Digi.me, Meeco, MyDex) have offered encrypted user-controlled storage with selective sharing since as early as 2012. Meeco holds ISO 27001 accreditation with Zero Knowledge Value architecture. MyDex operates as a Community Interest Company focused on health data. All remain small-scale with limited consumer adoption after years of operation. No publicly available user count or revenue figures exist for any of them in 2025-2026 ([PMC](https://pmc.ncbi.nlm.nih.gov/articles/PMC9921726/)).
+
+The Solid Project (Tim Berners-Lee, Inrupt) provides standards-based personal data pods with time-bound access grants. Berners-Lee's 2025 book positions Solid as a counter to AI built on platform data, but he acknowledges "not ready for general adoption yet." Network effects remain the core obstacle ([Wikipedia](https://en.wikipedia.org/wiki/Solid_%28web_decentralization_project%29)).
+
+These indicate that infrastructure for privacy-preserving personal data access by AI agents is being built at multiple layers (vaults, consent, pods), but none has achieved broad adoption.
+
+### The economics of personal data ingestion
+
+Embedding and indexing personal documents is inexpensive: 10,000 documents can be embedded and indexed for under $100. RAG cuts fine-tuning spend by 60-80%. Google charges $0.15 per 1 million tokens for embedding generation. However, operational staffing for small teams costs $750K+ annually, and fine-tuning a 70B model costs $50K-200K in compute ([The Data Guy](https://thedataguy.pro/blog/2025/07/the-economics-of-rag-cost-optimization-for-production-systems/)).
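The embedding figure can be sanity-checked against the cited $0.15 per 1M tokens rate; the corpus size is from the text, while the 800-tokens-per-document average is an assumption chosen for illustration:

```python
# Back-of-envelope embedding cost for a personal document corpus.
# $0.15 per 1M tokens is the cited rate; 800 tokens/doc is an assumption.
docs = 10_000
avg_tokens = 800
rate_per_m_tokens = 0.15

total_tokens = docs * avg_tokens                     # 8,000,000 tokens
cost = total_tokens / 1_000_000 * rate_per_m_tokens  # about $1.20
print(f"${cost:.2f}")
```

Even with much larger token estimates, embedding stays in the cents-to-dollars range; the real costs in the figures above are operational staffing and fine-tuning compute.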
+
+Plaid demonstrates the economics of personal data connectivity at scale: estimated $800M+ annual revenue, 220+ new products and features in 2025, AI-powered auto-repair enabling 2M+ successful user logins and reducing degradation fix time by 90%. Its LendScore credit risk product uses real-time cash flow data, showing that AI-powered enrichment on top of aggregated personal data is a proven business model in the financial domain ([Sacra](https://sacra.com/c/plaid/)).
+
+The global data monetization market was valued at $3.75B in 2024 and is projected to reach $28.16B by 2033 (25.1% CAGR). Organizations commercializing data via APIs report recurring revenue growth exceeding 20% annually. 60% of companies cite compliance concerns as the primary barrier ([Grand View Research](https://www.grandviewresearch.com/industry-analysis/data-monetization-market)).
+
+---
+
+## Sources
+
+All sources are cited inline. Key publications referenced:
+
+- Anthropic. "Measuring Agent Autonomy." February 2026. https://www.anthropic.com/research/measuring-agent-autonomy
+- Anthropic. "2026 Agentic Coding Trends Report." 2026. https://resources.anthropic.com/2026-agentic-coding-trends-report
+- Anthropic. "Effective Harnesses for Long-Running Agents." 2026. https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents
+- Anthropic. "Constitutional AI." December 2022. https://arxiv.org/abs/2212.08073
+- Anthropic. "MCP and the Agentic AI Foundation." 2026. https://www.anthropic.com/news/donating-the-model-context-protocol-and-establishing-of-the-agentic-ai-foundation
+- Atlassian. "HULA Framework." November 2024. https://arxiv.org/abs/2411.12924
+- ByteBridge. "From HITL to HOTL." January 2026. https://bytebridge.medium.com/from-human-in-the-loop-to-human-on-the-loop-evolving-ai-agent-autonomy-c0ae62c3bf91
+- Cognition. "Devin Annual Performance Review 2025." https://cognition.ai/blog/devin-annual-performance-review-2025
+- First Page Sage. "Agentic AI Statistics." 2026. https://firstpagesage.com/seo-blog/agentic-ai-statistics/
+- GitHub. "Copilot Metrics." https://docs.github.com/en/copilot/concepts/copilot-usage-metrics/copilot-metrics
+- Google. "Agent2Agent Protocol." April 2025. https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/
+- Google. "Personal Intelligence." January 2026. https://blog.google/innovation-and-ai/products/gemini-app/personal-intelligence/
+- Google Research. "Small Models, Big Results." EMNLP 2025. https://research.google/blog/small-models-big-results-achieving-superior-intent-extraction-through-decomposition/
+- Grand View Research. "Data Monetization Market." https://www.grandviewresearch.com/industry-analysis/data-monetization-market
+- Mem0. "Series A." October 2025. https://techcrunch.com/2025/10/28/mem0-raises-24m-from-yc-peak-xv-and-basis-set-to-build-the-memory-layer-for-ai-apps/
+- METR. "Time Horizons." 2026. https://metr.org/time-horizons/
+- METR. "AI Developer Productivity Study." July 2025. https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
+- Netflix. "FM-Intent." https://research.netflix.com/publication/fm-intent-predicting-user-session-intent
+- OpenAI. "Introducing GPT-5.3-Codex." 2026. https://openai.com/index/introducing-gpt-5-3-codex/
+- OpenAI. "Memory and New Controls for ChatGPT." https://openai.com/index/memory-and-new-controls-for-chatgpt/
+- OWASP. "LLM01 Prompt Injection." 2025. https://genai.owasp.org/llmrisk/llm01-prompt-injection/
+- PROBE Benchmark. October 2025. https://arxiv.org/abs/2510.19771
+- RedMonk. "10 Things Developers Want from Agentic IDEs." December 2025. https://redmonk.com/kholterhoff/2025/12/22/10-things-developers-want-from-their-agentic-ides-in-2025/
+- Relyance AI. "Consumer AI Trust Survey." 2025. https://www.relyance.ai/consumer-ai-trust-survey-2025
+- Simple Closure. "State of Startup Shutdowns 2025." https://simpleclosure.com/blog/posts/state-of-startup-shutdowns-2025/
+- SWE-bench. https://www.swebench.com/
+- Turing Post. "State of AI Coding." 2025. https://www.turingpost.com/p/aisoftwarestack
+- YouGov. "AI Trust Survey." 2025. https://yougov.com/en-us/articles/53701-most-americans-use-ai-but-still-dont-trust-it
diff --git a/research/personal-data-agents/personal-data-as-agent-context-memo.md b/research/personal-data-agents/personal-data-as-agent-context-memo.md
new file mode 100644
index 00000000..ac1c5365
--- /dev/null
+++ b/research/personal-data-agents/personal-data-as-agent-context-memo.md
@@ -0,0 +1,72 @@
+# Personal Data as Agent Context: What the Market Shows
+
+**Date:** March 17, 2026
+**Audience:** Leadership team evaluating personal data as a product direction for AI agent autonomy
+**Supporting data:** [Landscape Report](./personal-data-as-agent-context-landscape.md) | [Landscape CSV](./personal-data-as-agent-context-landscape.csv)
+
+---
+
+## What this research covers
+
+This memo summarizes landscape research into whether personal data from connected platforms (GitHub, ChatGPT, LinkedIn, Spotify, and others) can reduce human-in-the-loop requirements for autonomous coding agents. It covers 158 deduplicated findings across coding agents, memory infrastructure, personal AI products, data portability, failed products, and proactive agent patterns. The framing question: how much of what you would learn in 3 years of building personal-data-powered agent autonomy does market research already show today?
+
+---
+
+## 1. Agent autonomy is scaling fast, but context, not capability, is the stated bottleneck
+
+Task horizons are doubling every 7 months (METR, tracking since 2020). Claude Opus 4.6 crossed 14.5 hours. OpenAI Codex demonstrated a 25-hour autonomous run. At the current rate, multi-day horizons arrive within 18 months.
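The 18-month claim follows directly from the doubling trend; a quick extrapolation using the starting horizon and doubling period cited above:

```javascript
// Extrapolate METR's trend: horizon(t) = h0 * 2^(t / doubling period).
const h0 = 14.5; // hours, the Claude Opus 4.6 figure cited above
const doublingMonths = 7;
const horizonAt = (months) => h0 * 2 ** (months / doublingMonths);
console.log((horizonAt(18) / 24).toFixed(1)); // ≈ 3.6 days, i.e. multi-day within 18 months
```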
+
+Human interventions per Claude Code session dropped from 5.4 to 3.3 in three months. But when developers explain why they still intervene, the dominant answer is not "the model made a mistake." It is: "We don't trust the context the model has." Turing Post found that "the critical logic sleeps in a Jira ticket from 2019, or worse, it's tribal knowledge." Anthropic's own data shows developer acceptance of agent changes jumps from 62% to 89% when the agent presents a diff summary, a context presentation change rather than a capability improvement.
+
+The implication: as models get better at execution, the remaining interventions concentrate around missing context. Personal data is one source of that missing context.
+
+## 2. No product combines cross-platform personal data with coding agent context
+
+Every coding agent (Cursor at $29.3B valuation, Claude Code at $2.5B ARR, Devin at $10.2B, Copilot at 4.7M subscribers) operates on codebase context only. Personal context mechanisms are limited to static files (.cursorrules, CLAUDE.md) or implicit learning (Cursor's RL on accept/reject signals).
+
+Memory infrastructure companies (Mem0, $24M raised, 186M API calls per quarter) store session-derived memories, not personal data from external platforms. Enterprise knowledge platforms (Glean at $7.2B valuation, Notion at $500M revenue) build per-employee graphs from workplace sources, not cross-platform personal data.
+
+ChatGPT connects to Gmail, Drive, and GitHub, but does not pipe that data to a coding agent. Google Personal Intelligence connects to Gmail and Photos but stays within Google's ecosystem. Samsung's Personal Data Engine stays on-device within Samsung hardware.
+
+The cross-platform personal context layer, aggregating GitHub + ChatGPT + LinkedIn + Spotify and making it available as coding agent context, does not exist in any product.
+
+## 3. The daily briefing pattern proves demand for data-driven proactive agents
+
+OpenClaw surged from 9K to 264K GitHub stars in under 3 months, surpassing React as the most-starred software project. It aggregates Gmail, Calendar, GitHub, RSS, Todoist, Linear, and Stripe into cron-scheduled daily briefings.
+
+Google launched CC (Your Day Ahead) in December 2025. OpenAI launched ChatGPT Pulse in September 2025, then shelved it after 3 months when priorities shifted. Gemini added Goal Scheduled Actions in February 2026, where the agent reviews prior outputs and adjusts future actions.
+
+The pattern works because it is additive (ignorable without consequence), time-bounded, transparent, and degrades gracefully. These same properties would apply to a personal-data-powered coding agent that surfaces tasks or context at scheduled intervals.
+
+No product generates coding agent prompts from personal data. OpenClaw summarizes; it does not generate actionable coding tasks.
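To make the gap concrete, a minimal sketch of what generating coding-agent prompts from aggregated personal-data signals could look like. Every name, field, and rule below is a hypothetical illustration, not any existing product's API:

```javascript
// Hypothetical: map aggregated personal-data signals to coding-agent task prompts.
// Signal shapes, sources, and rules are illustrative assumptions only.
function briefingToAgentPrompt(signals) {
  const tasks = [];
  for (const s of signals) {
    if (s.source === "github" && s.kind === "stale_pr") {
      tasks.push(`Rebase and re-run CI on ${s.repo}#${s.number} (idle ${s.idleDays} days).`);
    }
    if (s.source === "chatgpt" && s.kind === "recurring_question") {
      tasks.push(`Draft a README section answering: "${s.question}"`);
    }
  }
  return tasks.length
    ? `Proposed tasks for today (ignore any that don't apply):\n- ${tasks.join("\n- ")}`
    : null; // degrade gracefully: no briefing when there is nothing to say
}
```

The `null` return and the "ignore any that don't apply" framing mirror the additive, ignorable-without-consequence properties the briefing pattern depends on.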
+
+## 4. The graveyard warns against four specific failure modes
+
+| Failure Mode | Examples | Pattern |
+| ----------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------- |
+| Hardware fails | Humane AI Pin ($230M burned); Rabbit R1 (5% retention); Narrative Clip (dissolved) | Dedicated hardware for personal AI adds friction and cost without sufficient value over software |
+| HITL does not scale | Facebook M (70% human-operated; shut down); Builder.ai ($445M raised; bankrupt) | Products dependent on human operators behind the scenes cannot reach consumer unit economics |
+| Personal context gets deprioritized | Cortana (deprecated); Google Assistant (17 features removed); ChatGPT Pulse (shelved after 3 months); Apple Intelligence (repeatedly delayed) | Large platforms build personal context features, then deprioritize them for more tractable directions |
+| Engagement does not equal revenue | Inflection Pi ($1.5B raised; 1M DAU; no business model; acqui-hired); Limitless ($350M valuation on $707K ARR; acquired by Meta) | Investor enthusiasm for personal data AI exceeds demonstrated unit economics |
+
+The deprioritization pattern is the most relevant: Microsoft, Google, OpenAI, and Apple have all built and then pulled back from personal context AI features. This pattern suggests personal context AI may work better as a dedicated product than as a feature inside a platform.
+
+## 5. Regulatory tailwinds and trust headwinds coexist
+
+GDPR Article 20, the EU Digital Markets Act, and the Utah Digital Choice Act create legal rights and obligations around personal data portability. These are tailwinds for any product that helps users move their data to AI agents.
+
+Trust data is less favorable: 40% of Americans say they would never enter personal information into an AI tool. 82% believe companies train AI on their data without disclosure. Microsoft Recall's plaintext screenshot storage triggered a redesign. OpenAI states prompt injection "is unlikely to ever be fully solved."
+
+Cold start research shows a single AI task involves 20-30 preference dimensions but users care about 2-4. No published data exists on the minimum viable personalization threshold: how much personal data is needed before an agent becomes measurably more useful.
+
+Privacy infrastructure is being built (Skyflow: $100M raised, tokenized data vaults for AI; Usercentrics: acquired MCP Manager for consent in agent workflows) but is early-stage.
+
+---
+
+## What remains unmeasured
+
+- No published comparison of agent performance with vs. without personal user context
+- No data on which interventions are context-related vs. approval-related (frequency breakdown)
+- No product-market fit signal from any product combining cross-platform personal data with coding agent context
+- No empirical minimum viable personalization threshold
+- Cost-per-user economics for personal AI context at consumer scale are proprietary across all players
diff --git a/research/personal-data-agents/personal-data-as-agent-context.zip b/research/personal-data-agents/personal-data-as-agent-context.zip
new file mode 100644
index 00000000..151ac887
Binary files /dev/null and b/research/personal-data-agents/personal-data-as-agent-context.zip differ
diff --git a/research/personal-data-agents/research-plan.md b/research/personal-data-agents/research-plan.md
new file mode 100644
index 00000000..bd2f7e58
--- /dev/null
+++ b/research/personal-data-agents/research-plan.md
@@ -0,0 +1,106 @@
+# Research Plan: Personal Data as Agent Context
+
+## Core question
+
+Can a user's personal data (from connected platforms like GitHub,
+ChatGPT, LinkedIn, Spotify) reduce the need for human-in-the-loop
+intervention in autonomous coding agents, enabling longer autonomous
+operation or auto-initiated task sequences?
+
+## Sub-questions
+
+### 1. Current state of autonomous agent duration and reliability
+
+**What to find:** Published benchmarks on how long coding agents
+(Claude Code, Codex, Devin, Cursor, Windsurf) operate autonomously
+before requiring human intervention. Anthropic's data on task
+duration scaling with model capability. SWE-bench scores, GAIA
+benchmarks, real-world deployment data.
+**Why it matters:** Establishes the baseline. If agents already
+operate for hours without intervention, the marginal value of
+personal data context is different than if they stall after minutes.
+**Good looks like:** Specific numbers on autonomous task duration,
+success rates, intervention frequency.
+
+### 2. What "human in the loop" actually provides
+
+**What to find:** Taxonomy of human interventions in agent workflows.
+Research or practitioner reports breaking down WHY humans intervene:
+intent clarification, context provision, error correction, approval
+gates, preference expression, priority setting.
+**Why it matters:** If personal data can only substitute for 1 of 6
+intervention types, the value proposition is narrow. If it covers 4
+of 6, the proposition is transformative.
+**Good looks like:** Categorized breakdown with relative frequency
+data.
+
+### 3. Which interventions personal data can substitute for
+
+**What to find:** Existing research or products that use personal
+data, preference models, or user history to reduce human
+intervention in automated workflows. Not limited to coding: include
+email triage, calendar management, recommendation systems, personal
+AI assistants.
+**Why it matters:** Maps the hypothesis to evidence.
+**Good looks like:** Concrete examples where user data replaced
+a human decision point.
+
+### 4. Existing products at the intersection
+
+**What to find:** Products that combine personal data aggregation
+with AI agent autonomy. Memory systems (ChatGPT memory, Mem.ai,
+Rewind/Limitless), personal AI assistants (Rabbit R1, Humane AI
+Pin, Personal.ai), preference learning systems, context-aware
+agents. Include failed products.
+**Why it matters:** Shows what's been tried, what worked, what
+didn't.
+**Good looks like:** Product names, funding, user counts, status,
+specific capabilities.
+
+### 5. Competitive landscape for reducing human-in-the-loop
+
+**What to find:** Tools, frameworks, and approaches specifically
+aimed at reducing human intervention in coding agents. MCP servers,
+context providers, memory systems, codebase indexing, task
+planners. Companies building "agent infrastructure."
+**Why it matters:** Maps where Vana Connect would sit competitively.
+**Good looks like:** Named companies, approaches, funding,
+differentiation.
+
+### 6. Potential product shapes
+
+**What to find:** Existing implementations or proposals for
+generating agent prompts or tasks from personal data. "Daily
+briefing" products, automated task queues, intent inference
+systems. Academic research on user modeling for task prediction.
+**Why it matters:** Informs what a "vana connect" feature that
+generates agent prompts from personal data could look like.
+**Good looks like:** Concrete product descriptions, research
+papers, user behavior data.
+
+### 7. The graveyard
+
+**What to find:** Products and companies that tried to be a
+"personal AI" or "data-driven autonomous assistant" and failed.
+Why they failed: data quality, user trust, cold start, wrong
+abstraction level.
+**Why it matters:** Failure modes are the most valuable data.
+**Good looks like:** Named failures with specific reasons.
+
+## Proposed report sections
+
+1. Terminology and definitions
+2. Autonomous agent capabilities (current state)
+3. Anatomy of human-in-the-loop interventions
+4. Personal data as context: what it can and cannot substitute
+5. Existing products and approaches
+6. Competitive landscape
+7. Product shape analysis
+8. The graveyard: failures and lessons
+9. Sources
+
+## Parallelization
+
+Wave 1 (parallel): Sub-questions 1, 2, 4, 5, 7
+Wave 2 (after wave 1 merge): Sub-questions 3, 6 (depend on
+findings from wave 1)
diff --git a/scripts/assert-homebrew-formula-sync.mjs b/scripts/assert-homebrew-formula-sync.mjs
new file mode 100644
index 00000000..d1259332
--- /dev/null
+++ b/scripts/assert-homebrew-formula-sync.mjs
@@ -0,0 +1,135 @@
+// Verifies the Homebrew tap formula references the given release tag's assets
+// and matches their published SHA-256 checksums (requires the `gh` CLI).
+import fs from "node:fs/promises";
+import path from "node:path";
+import process from "node:process";
+import { execFileSync } from "node:child_process";
+
+const DEFAULT_RELEASE_REPO = "vana-com/vana-connect";
+const DEFAULT_TAP_PATH = "/home/tnunamak/code/homebrew-vana/Formula/vana.rb";
+
+function parseArgs(argv) {
+ const options = {
+ releaseRepo: DEFAULT_RELEASE_REPO,
+ formulaPath: DEFAULT_TAP_PATH,
+ releaseTag: "",
+ };
+
+ for (let index = 0; index < argv.length; index += 1) {
+ const arg = argv[index];
+ switch (arg) {
+ case "--release-repo":
+ options.releaseRepo = argv[++index];
+ break;
+ case "--formula-path":
+ options.formulaPath = argv[++index];
+ break;
+ case "--release-tag":
+ options.releaseTag = argv[++index];
+ break;
+ default:
+ throw new Error(`Unknown argument: ${arg}`);
+ }
+ }
+
+ if (!options.releaseTag) {
+ throw new Error("--release-tag is required");
+ }
+
+ return options;
+}
+
+async function main() {
+ const options = parseArgs(process.argv.slice(2));
+ const formulaPath = path.resolve(options.formulaPath);
+ const formula = await fs.readFile(formulaPath, "utf8");
+
+ const assetChecksums = getReleaseAssetChecksums({
+ releaseRepo: options.releaseRepo,
+ releaseTag: options.releaseTag,
+ });
+
+ assertFormulaContains(
+ formula,
+ options.releaseTag,
+ "vana-darwin-arm64.tar.gz",
+ assetChecksums.get("vana-darwin-arm64.tar.gz"),
+ );
+ assertFormulaContains(
+ formula,
+ options.releaseTag,
+ "vana-darwin-x64.tar.gz",
+ assetChecksums.get("vana-darwin-x64.tar.gz"),
+ );
+ assertFormulaContains(
+ formula,
+ options.releaseTag,
+ "vana-linux-x64.tar.gz",
+ assetChecksums.get("vana-linux-x64.tar.gz"),
+ );
+
+ process.stdout.write(
+ `Homebrew formula matches ${options.releaseRepo}@${options.releaseTag}\n`,
+ );
+}
+
+function getReleaseAssetChecksums({ releaseRepo, releaseTag }) {
+ const output = execFileSync(
+ "gh",
+ ["release", "view", releaseTag, "--repo", releaseRepo, "--json", "assets"],
+ {
+ encoding: "utf8",
+ stdio: ["ignore", "pipe", "pipe"],
+ },
+ );
+ const release = JSON.parse(output);
+ const checksums = new Map();
+ for (const asset of release.assets ?? []) {
+ if (!asset.name.endsWith(".sha256")) {
+ continue;
+ }
+
+ const checksumText = execFileSync(
+ "gh",
+ [
+ "release",
+ "download",
+ releaseTag,
+ "--repo",
+ releaseRepo,
+ "--pattern",
+ asset.name,
+ "--output",
+ "-",
+ ],
+ {
+ encoding: "utf8",
+ stdio: ["ignore", "pipe", "pipe"],
+ },
+ ).trim();
+ const [sha256, assetName] = checksumText.split(/\s+/);
+ checksums.set(assetName, sha256);
+ }
+ return checksums;
+}
+
+function assertFormulaContains(formula, releaseTag, assetName, expectedSha) {
+ if (!expectedSha) {
+ throw new Error(`Missing published checksum for ${assetName}`);
+ }
+
+ if (!formula.includes(`/releases/download/${releaseTag}/${assetName}`)) {
+ throw new Error(
+ `Formula does not reference ${assetName} from ${releaseTag}`,
+ );
+ }
+
+ if (!formula.includes(`sha256 "${expectedSha}"`)) {
+ throw new Error(`Formula checksum mismatch for ${assetName}`);
+ }
+}
+
+main().catch((error) => {
+ process.stderr.write(
+ `${error instanceof Error ? (error.stack ?? error.message) : String(error)}\n`,
+ );
+ process.exitCode = 1;
+});
diff --git a/scripts/assert-pack-contents.mjs b/scripts/assert-pack-contents.mjs
new file mode 100644
index 00000000..a95c6c82
--- /dev/null
+++ b/scripts/assert-pack-contents.mjs
@@ -0,0 +1,56 @@
+// Asserts the npm tarball (via `npm pack --json --dry-run`) contains the
+// required CLI and runtime entry points and is not suspiciously small.
+import { spawnSync } from "node:child_process";
+
+const requiredPaths = [
+ "dist/cli/bin.js",
+ "dist/cli/index.js",
+ "dist/runtime/managed-playwright.js",
+];
+
+const raw = runNpmPackDryRun();
+
+const manifest = JSON.parse(raw);
+if (!Array.isArray(manifest) || manifest.length === 0) {
+ throw new Error("npm pack --json --dry-run returned no manifest data.");
+}
+
+const [packResult] = manifest;
+const files = Array.isArray(packResult.files) ? packResult.files : [];
+const filePaths = new Set(files.map((file) => file.path));
+
+for (const requiredPath of requiredPaths) {
+ if (!filePaths.has(requiredPath)) {
+ throw new Error(
+ `Packed npm tarball is missing required file: ${requiredPath}`,
+ );
+ }
+}
+
+if (files.length < 20) {
+ throw new Error(
+ `Packed npm tarball unexpectedly small: ${files.length} files.`,
+ );
+}
+
+console.log(
+ `npm pack validation passed with ${files.length} files and required CLI/runtime entries present.`,
+);
+
+function runNpmPackDryRun() {
+ const npmCommand = process.platform === "win32" ? "npm.cmd" : "npm";
+ const result = spawnSync(npmCommand, ["pack", "--json", "--dry-run"], {
+ encoding: "utf8",
+ shell: process.platform === "win32",
+ });
+
+ if (result.error) {
+ throw result.error;
+ }
+
+ if (result.status !== 0) {
+ throw new Error(
+ `npm pack --json --dry-run failed with code ${result.status ?? "unknown"}\n${result.stderr ?? ""}`,
+ );
+ }
+
+ return result.stdout;
+}
diff --git a/scripts/assert-release-demo-assets.mjs b/scripts/assert-release-demo-assets.mjs
new file mode 100644
index 00000000..6c03eb0c
--- /dev/null
+++ b/scripts/assert-release-demo-assets.mjs
@@ -0,0 +1,99 @@
+// Asserts a GitHub release carries every expected demo GIF asset
+// (requires the `gh` CLI).
+import { execFileSync } from "node:child_process";
+
+const REQUIRED_ASSETS = [
+ "help.gif",
+ "data-help.gif",
+ "setup.gif",
+ "status.gif",
+ "doctor.gif",
+ "logs.gif",
+ "sources.gif",
+ "data-list.gif",
+ "data-list-empty.gif",
+ "data-show-github.gif",
+ "data-show-github-missing.gif",
+ "data-path-github.gif",
+ "connect-github-no-input.gif",
+ "connect-github-session-reuse-no-input.gif",
+ "connect-shop-no-input.gif",
+ "connect-shop.gif",
+ "connect-steam.gif",
+ "connect-steam-no-input.gif",
+ "connect-github-success.gif",
+];
+
+function getArgMap(argv) {
+ const args = new Map();
+
+ for (let index = 2; index < argv.length; index += 1) {
+ const token = argv[index];
+ if (!token.startsWith("--")) {
+ continue;
+ }
+
+ const key = token.slice(2);
+ const next = argv[index + 1];
+
+ if (next && !next.startsWith("--")) {
+ args.set(key, next);
+ index += 1;
+ } else {
+ args.set(key, "true");
+ }
+ }
+
+ return args;
+}
+
+function main() {
+ const args = getArgMap(process.argv);
+ const repo = args.get("repo") ?? "vana-com/vana-connect";
+ const tag = args.get("tag");
+
+ if (!tag) {
+ throw new Error("Missing required --tag argument.");
+ }
+
+ const output = execFileSync(
+ "gh",
+ [
+ "release",
+ "view",
+ tag,
+ "--repo",
+ repo,
+ "--json",
+ "assets",
+ "--jq",
+ ".assets[].name",
+ ],
+ { encoding: "utf8" },
+ );
+
+ const assetNames = new Set(
+ output
+ .split("\n")
+ .map((value) => value.trim())
+ .filter(Boolean),
+ );
+
+ const missing = REQUIRED_ASSETS.filter((name) => !assetNames.has(name));
+ if (missing.length > 0) {
+ throw new Error(
+ `Release ${tag} is missing demo assets: ${missing.join(", ")}`,
+ );
+ }
+
+ process.stdout.write(
+ `[release] Demo assets present for ${tag}: ${REQUIRED_ASSETS.join(", ")}\n`,
+ );
+}
+
+try {
+ main();
+} catch (error) {
+ process.stderr.write(
+ `${error instanceof Error ? error.message : String(error)}\n`,
+ );
+ process.exitCode = 1;
+}
diff --git a/scripts/assert-sea-artifact.mjs b/scripts/assert-sea-artifact.mjs
new file mode 100644
index 00000000..fcb94524
--- /dev/null
+++ b/scripts/assert-sea-artifact.mjs
@@ -0,0 +1,163 @@
+// Validates a built SEA artifact: required files on disk and in the archive,
+// checksum match, minimum size, and (on macOS) the code signature.
+import { createHash } from "node:crypto";
+import { execFileSync } from "node:child_process";
+import { promises as fsp } from "node:fs";
+import path from "node:path";
+
+const args = parseArgs(process.argv.slice(2));
+
+const artifactDir = path.resolve(requiredArg(args, "artifact-dir"));
+const archivePath = path.resolve(requiredArg(args, "archive"));
+const checksumPath = path.resolve(requiredArg(args, "checksum"));
+const platform = requiredArg(args, "platform");
+const binaryName =
+ args.get("binary-name") ?? (platform === "win32" ? "vana.exe" : "vana");
+
+await assertExists(
+ artifactDir,
+ `Artifact directory was not found: ${artifactDir}`,
+);
+await assertExists(
+ archivePath,
+ `Artifact archive was not found: ${archivePath}`,
+);
+await assertExists(
+ checksumPath,
+ `Artifact checksum file was not found: ${checksumPath}`,
+);
+
+const requiredDirectoryEntries = [
+ binaryName,
+ "app/sea-entry.cjs",
+ "app/package.json",
+ "app/dist/cli/bin.js",
+ "app/dist/cli/main.js",
+ "app/dist/runtime/managed-playwright.js",
+];
+
+for (const relativePath of requiredDirectoryEntries) {
+ const candidate = path.join(artifactDir, relativePath);
+ await assertExists(
+ candidate,
+ `SEA artifact directory is missing required file: ${candidate}`,
+ );
+}
+
+const archiveEntries = listArchiveEntries({ archivePath, platform });
+for (const relativePath of requiredDirectoryEntries) {
+ const expectedSuffix = `/${relativePath.replaceAll("\\", "/")}`;
+ const hasEntry = archiveEntries.some(
+ (entry) => entry === relativePath || entry.endsWith(expectedSuffix),
+ );
+ if (!hasEntry) {
+ throw new Error(
+ `SEA archive is missing required entry: ${relativePath}\nArchive: ${archivePath}`,
+ );
+ }
+}
+
+const expectedDigest = (await fsp.readFile(checksumPath, "utf8"))
+ .trim()
+ .split(/\s+/)[0];
+if (!expectedDigest) {
+ throw new Error(`Checksum file did not contain a digest: ${checksumPath}`);
+}
+
+const actualDigest = await sha256(archivePath);
+if (expectedDigest !== actualDigest) {
+ throw new Error(
+ `SEA archive checksum mismatch for ${archivePath}\nExpected: ${expectedDigest}\nActual: ${actualDigest}`,
+ );
+}
+
+const archiveStat = await fsp.stat(archivePath);
+if (archiveStat.size < 1024 * 100) {
+ throw new Error(
+ `SEA archive is unexpectedly small: ${archivePath} (${archiveStat.size} bytes)`,
+ );
+}
+
+if (platform === "darwin") {
+ execFileSync(
+ "codesign",
+ ["--verify", "--verbose=2", path.join(artifactDir, binaryName)],
+ {
+ stdio: "inherit",
+ },
+ );
+}
+
+console.log(
+ `SEA artifact validation passed for ${path.basename(archivePath)} with ${archiveEntries.length} archive entries.`,
+);
+
+function parseArgs(argv) {
+ const parsed = new Map();
+ for (let index = 0; index < argv.length; index += 1) {
+ const arg = argv[index];
+ if (!arg.startsWith("--")) {
+ continue;
+ }
+
+ const key = arg.slice(2);
+ const next = argv[index + 1];
+ const value = next && !next.startsWith("--") ? next : "true";
+ parsed.set(key, value);
+ if (value !== "true") {
+ index += 1;
+ }
+ }
+ return parsed;
+}
+
+function requiredArg(argsMap, key) {
+ const value = argsMap.get(key);
+ if (!value || value === "true") {
+ throw new Error(`Missing required argument: --${key}`);
+ }
+ return value;
+}
+
+async function assertExists(targetPath, message) {
+ try {
+ await fsp.access(targetPath);
+ } catch {
+ throw new Error(message);
+ }
+}
+
+function listArchiveEntries({ archivePath, platform }) {
+ if (platform === "win32") {
+ const raw = execFileSync(
+ "powershell",
+ [
+ "-NoProfile",
+ "-Command",
+ `Add-Type -AssemblyName System.IO.Compression.FileSystem; [System.IO.Compression.ZipFile]::OpenRead('${escapePowerShellPath(archivePath)}').Entries | ForEach-Object { $_.FullName }`,
+ ],
+ {
+ encoding: "utf8",
+ },
+ );
+ return raw
+ .split(/\r?\n/)
+ .map((entry) => entry.trim())
+ .filter(Boolean);
+ }
+
+ const raw = execFileSync("tar", ["-tzf", archivePath], {
+ encoding: "utf8",
+ });
+ return raw
+ .split(/\r?\n/)
+ .map((entry) => entry.trim())
+ .filter(Boolean);
+}
+
+function escapePowerShellPath(input) {
+ return input.replace(/'/g, "''");
+}
+
+async function sha256(filePath) {
+ const buffer = await fsp.readFile(filePath);
+ return createHash("sha256").update(buffer).digest("hex");
+}
diff --git a/scripts/build-sea.mjs b/scripts/build-sea.mjs
new file mode 100644
index 00000000..82ccec95
--- /dev/null
+++ b/scripts/build-sea.mjs
@@ -0,0 +1,279 @@
+// Builds the single-executable (SEA) launcher and app payload, optionally
+// smoke-tests the binary, then archives it and writes a SHA-256 checksum.
+import { spawn } from "node:child_process";
+import { createHash } from "node:crypto";
+import { promises as fsp } from "node:fs";
+import path from "node:path";
+import { fileURLToPath } from "node:url";
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+const repoRoot = path.resolve(__dirname, "..");
+
+const args = new Map();
+for (let index = 2; index < process.argv.length; index += 1) {
+ const arg = process.argv[index];
+ if (!arg.startsWith("--")) {
+ continue;
+ }
+
+ const key = arg.slice(2);
+ const next = process.argv[index + 1];
+ const value = next && !next.startsWith("--") ? next : "true";
+ args.set(key, value);
+ if (value !== "true") {
+ index += 1;
+ }
+}
+
+const artifactDir = path.join(repoRoot, "artifacts", "sea");
+const scratchDir = path.join(repoRoot, ".sea-work", "build-sea");
+const platform = args.get("platform") ?? process.platform;
+const arch = args.get("arch") ?? process.arch;
+const binaryName =
+ args.get("binary-name") ?? (platform === "win32" ? "vana.exe" : "vana");
+const targetName = args.get("artifact-name") ?? `vana-${platform}-${arch}`;
+const archiveFormat =
+ args.get("archive-format") ?? (platform === "win32" ? "zip" : "tar.gz");
+const targetDir = path.join(artifactDir, targetName);
+const outputPath = path.resolve(
+ repoRoot,
+ args.get("output") ?? path.join(targetDir, binaryName),
+);
+const archivePath = path.join(artifactDir, `${targetName}.${archiveFormat}`);
+const checksumPath = `${archivePath}.sha256`;
+const appPayloadPath = path.join(path.dirname(outputPath), "app");
+
+const distCliMain = path.join(repoRoot, "dist", "cli", "main.js");
+await assertExists(
+ distCliMain,
+ "dist/cli/main.js was not found. Run `pnpm build` first.",
+);
+
+await fsp.mkdir(artifactDir, { recursive: true });
+await removePath(scratchDir);
+await fsp.mkdir(scratchDir, { recursive: true });
+await removePath(path.dirname(outputPath));
+await fsp.mkdir(path.dirname(outputPath), { recursive: true });
+
+const launcherPath = path.join(scratchDir, "launcher.cjs");
+const configPath = path.join(scratchDir, "sea-config.json");
+await writeLauncher(launcherPath);
+await buildLauncher(outputPath, launcherPath, configPath);
+await signLauncher(outputPath);
+await stageAppPayload(appPayloadPath);
+
+if (args.has("smoke")) {
+ const smokeHome = path.join(scratchDir, "home");
+ await removePath(smokeHome);
+ await fsp.mkdir(smokeHome, { recursive: true });
+ await run(outputPath, ["status", "--json"], {
+ cwd: repoRoot,
+ env: { ...process.env, HOME: smokeHome },
+ });
+}
+
+await createArchive({
+ archiveFormat,
+ archivePath,
+ targetParentDir: path.dirname(path.dirname(outputPath)),
+ targetName: path.basename(path.dirname(outputPath)),
+});
+
+const archiveDigest = await sha256(archivePath);
+await fsp.writeFile(
+ checksumPath,
+ `${archiveDigest} ${path.basename(archivePath)}\n`,
+ "utf8",
+);
+
+process.stdout.write(`Built SEA launcher: ${outputPath}\n`);
+process.stdout.write(`Built app payload: ${appPayloadPath}\n`);
+process.stdout.write(`Built release archive: ${archivePath}\n`);
+process.stdout.write(`Built release checksum: ${checksumPath}\n`);
+
+async function buildLauncher(outputFile, mainFile, configFile) {
+ const config = {
+ main: mainFile,
+ output: outputFile,
+ disableExperimentalSEAWarning: true,
+ };
+
+ await fsp.writeFile(
+ configFile,
+ `${JSON.stringify(config, null, 2)}\n`,
+ "utf8",
+ );
+ await run(process.execPath, ["--build-sea", configFile], {
+ cwd: repoRoot,
+ });
+}
+
+async function signLauncher(outputFile) {
+ if (platform !== "darwin") {
+ return;
+ }
+
+ await run("codesign", ["--force", "--sign", "-", outputFile], {
+ cwd: repoRoot,
+ });
+}
+
+async function writeLauncher(outputFile) {
+ const launcher = [
+ "const fs = require('node:fs');",
+ "const path = require('node:path');",
+ "const { createRequire } = require('node:module');",
+ "",
+ "(async () => {",
+ " const execPath = fs.realpathSync(process.execPath);",
+ " const appRoot = process.env.VANA_APP_ROOT || path.join(path.dirname(execPath), 'app');",
+ " const appEntryPath = path.join(appRoot, 'sea-entry.cjs');",
+ "",
+ " if (!fs.existsSync(appEntryPath)) {",
+ " console.error(`Vana app payload was not found at ${appEntryPath}. Reinstall vana or repair the installation.`);",
+ " process.exitCode = 1;",
+ " return;",
+ " }",
+ "",
+ " const appRequire = createRequire(appEntryPath);",
+ " const { runCli } = appRequire(appEntryPath);",
+ " const exitCode = await runCli(process.argv);",
+ " if (typeof exitCode === 'number') {",
+ " process.exitCode = exitCode;",
+ " }",
+ "})().catch((error) => {",
+ " console.error(error instanceof Error ? (error.stack || error.message) : String(error));",
+ " process.exitCode = 1;",
+ "});",
+ "",
+ ].join("\n");
+
+ await fsp.writeFile(outputFile, launcher, "utf8");
+}
+
+async function stageAppPayload(outputDir) {
+ await removePath(outputDir);
+ await fsp.mkdir(outputDir, { recursive: true });
+
+ await fsp.cp(path.join(repoRoot, "dist"), path.join(outputDir, "dist"), {
+ recursive: true,
+ force: true,
+ });
+
+ const rootPackage = JSON.parse(
+ await fsp.readFile(path.join(repoRoot, "package.json"), "utf8"),
+ );
+ const appPackage = {
+ name: "@opendatalabs/connect-app",
+ version: rootPackage.version,
+ private: true,
+ type: "module",
+ dependencies: rootPackage.dependencies,
+ };
+ await fsp.writeFile(
+ path.join(outputDir, "package.json"),
+ `${JSON.stringify(appPackage, null, 2)}\n`,
+ "utf8",
+ );
+
+ await run(getNpmCommand(), ["install", "--omit=dev", "--ignore-scripts"], {
+ cwd: outputDir,
+ env: {
+ ...process.env,
+ HUSKY: "0",
+ npm_config_fund: "false",
+ npm_config_audit: "false",
+ PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD: "1",
+ },
+ });
+
+ const appEntryPath = path.join(outputDir, "sea-entry.cjs");
+ await fsp.writeFile(
+ appEntryPath,
+ [
+ "const path = require('node:path');",
+ "const { pathToFileURL } = require('node:url');",
+ "",
+ "exports.runCli = async function runCli(argv) {",
+ " const cliMainPath = path.join(__dirname, 'dist', 'cli', 'main.js');",
+ " const { runCli } = await import(pathToFileURL(cliMainPath).href);",
+ " return runCli(argv);",
+ "};",
+ "",
+ ].join("\n"),
+ "utf8",
+ );
+}
+
+function getNpmCommand() {
+ return process.platform === "win32" ? "npm.cmd" : "npm";
+}
+
+async function removePath(targetPath) {
+ await fsp.rm(targetPath, { recursive: true, force: true });
+}
+
+async function assertExists(filePath, message) {
+ try {
+ await fsp.access(filePath);
+ } catch {
+ throw new Error(message);
+ }
+}
+
+async function run(command, commandArgs, options = {}) {
+ await new Promise((resolve, reject) => {
+ const useShell =
+ process.platform === "win32" && /\.(cmd|bat)$/i.test(command);
+ const child = spawn(command, commandArgs, {
+ shell: useShell,
+ stdio: "inherit",
+ ...options,
+ });
+
+ child.on("error", reject);
+ child.on("exit", (code) => {
+ if (code === 0) {
+ resolve();
+ return;
+ }
+ reject(new Error(`${command} exited with code ${code ?? "unknown"}`));
+ });
+ });
+}
+
+async function sha256(filePath) {
+ const buffer = await fsp.readFile(filePath);
+ return createHash("sha256").update(buffer).digest("hex");
+}
+
+async function createArchive({
+ archiveFormat,
+ archivePath,
+ targetParentDir,
+ targetName,
+}) {
+ if (archiveFormat === "tar.gz") {
+ await run("tar", ["-czf", archivePath, "-C", targetParentDir, targetName], {
+ cwd: repoRoot,
+ });
+ return;
+ }
+
+ if (archiveFormat === "zip") {
+ await fsp.rm(archivePath, { force: true });
+ await run(
+ process.platform === "win32" ? "powershell" : "pwsh",
+ [
+ "-NoLogo",
+ "-NoProfile",
+ "-Command",
+ `Compress-Archive -Path '${path.join(targetParentDir, targetName)}' -DestinationPath '${archivePath}' -Force`,
+ ],
+ {
+ cwd: targetParentDir,
+ },
+ );
+ return;
+ }
+
+ throw new Error(`Unsupported archive format: ${archiveFormat}`);
+}
diff --git a/scripts/capture-cli-transcripts.mjs b/scripts/capture-cli-transcripts.mjs
new file mode 100644
index 00000000..6302b3bc
--- /dev/null
+++ b/scripts/capture-cli-transcripts.mjs
@@ -0,0 +1,223 @@
+import fs from "node:fs";
+import fsp from "node:fs/promises";
+import os from "node:os";
+import path from "node:path";
+import { execFileSync } from "node:child_process";
+import { fileURLToPath } from "node:url";
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+const repoRoot = path.resolve(__dirname, "..");
+const transcriptsMd = path.join(repoRoot, "docs", "CLI-TRANSCRIPTS.md");
+const fixturesRoot = path.join(repoRoot, "docs", "vhs", "fixtures");
+
+async function main() {
+ const tempRoot = await fsp.mkdtemp(
+ path.join(os.tmpdir(), "vana-transcripts-"),
+ );
+ const workingHome = path.join(tempRoot, "home");
+ const freshHome = path.join(tempRoot, "fresh-home");
+ await prepareFixtures(workingHome);
+ await fsp.mkdir(freshHome, { recursive: true });
+ const connectorsDir = resolveDataConnectorsDir();
+ const binDir = path.join(tempRoot, "bin");
+ await prepareDemoBin(binDir);
+
+ const seededEnv = {
+ ...process.env,
+ HOME: workingHome,
+ PATH: `${binDir}${path.delimiter}${process.env.PATH ?? ""}`,
+ VANA_DEMO_FAST_SUCCESS: "1",
+ ...(connectorsDir ? { VANA_DATA_CONNECTORS_DIR: connectorsDir } : {}),
+ };
+ const freshEnv = {
+ ...process.env,
+ HOME: freshHome,
+ PATH: `${binDir}${path.delimiter}${process.env.PATH ?? ""}`,
+ ...(connectorsDir ? { VANA_DATA_CONNECTORS_DIR: connectorsDir } : {}),
+ };
+ const seededInputEnv = {
+ ...seededEnv,
+ };
+ delete seededInputEnv.VANA_DEMO_FAST_SUCCESS;
+
+ const commands = [
+ { marker: "help", argv: ["vana"], env: seededEnv },
+ { marker: "data-help", argv: ["vana", "data"], env: seededEnv },
+ { marker: "setup", argv: ["vana", "setup"], env: seededEnv },
+ { marker: "status", argv: ["vana", "status"], env: seededEnv },
+ { marker: "doctor", argv: ["vana", "doctor"], env: seededEnv },
+ { marker: "logs", argv: ["vana", "logs"], env: seededEnv },
+ { marker: "sources", argv: ["vana", "sources"], env: seededEnv },
+ { marker: "data-list", argv: ["vana", "data", "list"], env: seededEnv },
+ {
+ marker: "data-list-empty",
+ argv: ["vana", "data", "list"],
+ env: freshEnv,
+ },
+ {
+ marker: "data-show-github",
+ argv: ["vana", "data", "show", "github"],
+ env: seededEnv,
+ allowFailure: true,
+ },
+ {
+ marker: "data-show-github-missing",
+ argv: ["vana", "data", "show", "github"],
+ env: freshEnv,
+ allowFailure: true,
+ },
+ {
+ marker: "data-path-github",
+ argv: ["vana", "data", "path", "github"],
+ env: seededEnv,
+ allowFailure: true,
+ },
+ {
+ marker: "connect-github-success",
+ argv: ["vana", "connect", "github"],
+ env: seededEnv,
+ allowFailure: true,
+ },
+ {
+ marker: "connect-github-no-input",
+ argv: ["vana", "connect", "github", "--no-input"],
+ env: freshEnv,
+ allowFailure: true,
+ },
+ {
+ marker: "connect-github-session-reuse-no-input",
+ argv: ["vana", "connect", "github", "--no-input"],
+ env: seededInputEnv,
+ allowFailure: true,
+ },
+ {
+ marker: "connect-shop-no-input",
+ argv: ["vana", "connect", "shop", "--no-input"],
+ env: seededEnv,
+ allowFailure: true,
+ },
+ {
+ marker: "connect-shop",
+ argv: ["vana", "connect", "shop"],
+ env: seededEnv,
+ allowFailure: true,
+ },
+ {
+ marker: "connect-steam",
+ argv: ["vana", "connect", "steam"],
+ env: seededEnv,
+ allowFailure: true,
+ },
+ {
+ marker: "connect-steam-no-input",
+ argv: ["vana", "connect", "steam", "--no-input"],
+ env: seededEnv,
+ allowFailure: true,
+ },
+ ];
+
+ let mdContent = await fsp.readFile(transcriptsMd, "utf8");
+
+ for (const command of commands) {
+ const cmdLine = `$ ${command.argv.join(" ")}`;
+ const output = normalizeTranscript(
+ run(command.argv, command.env, command.allowFailure),
+ );
+ const block = `\`\`\`\n${cmdLine}\n\n${output.trimEnd()}\n\`\`\``;
+
+    // Marker pair in docs/CLI-TRANSCRIPTS.md. These are HTML comments, which
+    // is why they vanish when the Markdown is rendered; the exact format is
+    // assumed here and must match the markers already present in the doc.
+    const beginTag = `<!-- transcript:${command.marker}:begin -->`;
+    const endTag = `<!-- transcript:${command.marker}:end -->`;
+ const pattern = new RegExp(
+ `${escapeRegex(beginTag)}[\\s\\S]*?${escapeRegex(endTag)}`,
+ );
+
+ if (!pattern.test(mdContent)) {
+ process.stderr.write(
+ `[transcript] WARNING: marker ${command.marker} not found in CLI-TRANSCRIPTS.md\n`,
+ );
+ continue;
+ }
+
+ mdContent = mdContent.replace(pattern, `${beginTag}\n${block}\n${endTag}`);
+ process.stdout.write(`[transcript] updated ${command.marker}\n`);
+ }
+
+ await fsp.writeFile(transcriptsMd, mdContent, "utf8");
+ process.stdout.write(
+ `[transcript] wrote ${path.relative(repoRoot, transcriptsMd)}\n`,
+ );
+
+ await fsp.rm(tempRoot, { recursive: true, force: true });
+}
+
+function escapeRegex(s) {
+ return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
+}
+
+function prepareFixtures(homeRoot) {
+ execFileSync("node", ["./scripts/prepare-vhs-fixtures.mjs"], {
+ cwd: repoRoot,
+ env: {
+ ...process.env,
+ ...(homeRoot ? { VANA_VHS_HOME_ROOT: homeRoot } : {}),
+ },
+ stdio: "inherit",
+ });
+}
+
+async function prepareDemoBin(binDir) {
+ await fsp.mkdir(binDir, { recursive: true });
+ const launcherPath = path.join(binDir, "vana");
+ const launcher = `#!/usr/bin/env bash
+set -euo pipefail
+exec node "${path.join(repoRoot, "dist", "cli", "bin.js")}" "$@"
+`;
+ await fsp.writeFile(launcherPath, launcher, "utf8");
+ await fsp.chmod(launcherPath, 0o755);
+}
+
+function run(argv, env, allowFailure = false) {
+ try {
+ return execFileSync(argv[0], argv.slice(1), {
+ cwd: repoRoot,
+ env,
+ encoding: "utf8",
+ stdio: ["ignore", "pipe", "pipe"],
+ });
+ } catch (error) {
+ const stdout = error.stdout?.toString?.() ?? "";
+ const stderr = error.stderr?.toString?.() ?? "";
+ if (allowFailure) {
+ return `${stdout}${stderr}`.trimEnd() + "\n";
+ }
+ throw new Error(`${stdout}${stderr}`.trim());
+ }
+}
+
+function normalizeTranscript(output) {
+ return output.replace(
+ /(~\/\.vana\/logs\/(?:run|fetch|setup)-[A-Za-z0-9_-]+)-\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}-\d{3}Z\.log/g,
+ "$1-.log",
+ );
+}
+
+function resolveDataConnectorsDir() {
+ const fixtureRepo = path.join(fixturesRoot, "demo-data-connectors");
+ if (fs.existsSync(path.join(fixtureRepo, "registry.json"))) {
+ return fixtureRepo;
+ }
+
+ if (process.env.VANA_DATA_CONNECTORS_DIR) {
+ return process.env.VANA_DATA_CONNECTORS_DIR;
+ }
+
+ const siblingRepo = path.resolve(repoRoot, "..", "data-connectors");
+ return fs.existsSync(siblingRepo) ? siblingRepo : null;
+}
+
+main().catch((error) => {
+ process.stderr.write(
+ `${error instanceof Error ? (error.stack ?? error.message) : String(error)}\n`,
+ );
+ process.exitCode = 1;
+});
diff --git a/scripts/check-demo-freshness.sh b/scripts/check-demo-freshness.sh
new file mode 100755
index 00000000..1eacf849
--- /dev/null
+++ b/scripts/check-demo-freshness.sh
@@ -0,0 +1,40 @@
+#!/usr/bin/env bash
+# Check if the demo GIF may be stale relative to CLI output changes.
+# Run from repo root. Exits 0 always (advisory, not blocking).
+
+set -euo pipefail
+
+GIF="docs/assets/demo.gif"
+TAPE="docs/vhs/demo.tape"
+
+# Source paths that affect what the CLI prints in human mode.
+# If these change and the GIF doesn't, it's probably stale.
+WATCHED_PATHS=(
+ "src/cli/"
+ "docs/vhs/fixtures/"
+ "docs/vhs/demo.tape"
+)
+
+if [ ! -f "$GIF" ]; then
+ echo "demo-freshness: GIF does not exist yet. Render with: vhs $TAPE"
+ exit 0
+fi
+
+gif_commit=$(git log -1 --format=%H -- "$GIF" 2>/dev/null || true)
+
+if [ -z "$gif_commit" ]; then
+ echo "demo-freshness: GIF exists but is not tracked by git."
+ exit 0
+fi
+
+# `|| true` guards against `head` closing the pipe early: under `set -o pipefail`
+# a SIGPIPE from git would otherwise abort the script, breaking the "exits 0" contract.
+changed=$(git log --oneline "$gif_commit"..HEAD -- "${WATCHED_PATHS[@]}" 2>/dev/null | head -5 || true)
+
+if [ -n "$changed" ]; then
+ echo "demo-freshness: GIF may be stale. CLI output files changed since last GIF update."
+ echo ""
+ echo " Last GIF update: $(git log -1 --format='%h %s (%cr)' -- "$GIF")"
+ echo " Changes since:"
+ echo "$changed" | sed 's/^/ /'
+ echo ""
+ echo " Re-render: vhs $TAPE"
+fi
diff --git a/scripts/clean-build.mjs b/scripts/clean-build.mjs
new file mode 100644
index 00000000..a68df329
--- /dev/null
+++ b/scripts/clean-build.mjs
@@ -0,0 +1,11 @@
+import { promises as fsp } from "node:fs";
+import path from "node:path";
+import { fileURLToPath } from "node:url";
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+const repoRoot = path.resolve(__dirname, "..");
+
+await Promise.all([
+ fsp.rm(path.join(repoRoot, "dist"), { recursive: true, force: true }),
+ fsp.rm(path.join(repoRoot, "tsconfig.tsbuildinfo"), { force: true }),
+]);
diff --git a/scripts/collect-release-assets.mjs b/scripts/collect-release-assets.mjs
new file mode 100644
index 00000000..def428c9
--- /dev/null
+++ b/scripts/collect-release-assets.mjs
@@ -0,0 +1,80 @@
+import { readdirSync, statSync } from "node:fs";
+import path from "node:path";
+
+function getArgMap(argv) {
+ const args = new Map();
+
+ for (let index = 2; index < argv.length; index += 1) {
+ const token = argv[index];
+ if (!token.startsWith("--")) {
+ continue;
+ }
+
+ const key = token.slice(2);
+ const next = argv[index + 1];
+
+ if (next && !next.startsWith("--")) {
+ args.set(key, next);
+ index += 1;
+ } else {
+ args.set(key, "true");
+ }
+ }
+
+ return args;
+}
+
+function listFiles(dir, predicate = () => true) {
+ try {
+ return readdirSync(dir)
+ .map((name) => path.join(dir, name))
+ .filter((filePath) => statSync(filePath).isFile())
+ .filter(predicate)
+ .sort();
+ } catch {
+ return [];
+ }
+}
+
+const args = getArgMap(process.argv);
+const releaseDir = args.get("release-dir") ?? "artifacts/release";
+const packageManagersDir =
+ args.get("package-managers-dir") ?? "artifacts/package-managers";
+const demoPreviewDir = args.get("demo-preview-dir") ?? "artifacts/demo-preview";
+
+const releaseFiles = listFiles(
+ releaseDir,
+ (filePath) =>
+ filePath.endsWith(".tar.gz") ||
+ filePath.endsWith(".zip") ||
+ filePath.endsWith(".sha256"),
+);
+const packageManagerFiles = listFiles(path.join(packageManagersDir, "homebrew"))
+ .concat(
+ listFiles(path.join(packageManagersDir, "winget"), (filePath) =>
+ filePath.endsWith(".yaml"),
+ ),
+ )
+ .sort();
+const demoPreviewFiles = [
+ ...listFiles(
+ path.join(demoPreviewDir, "docs", "transcripts"),
+ (filePath) => filePath.endsWith(".txt") || filePath.endsWith(".md"),
+ ),
+ ...listFiles(
+ path.join(demoPreviewDir, "transcripts"),
+ (filePath) => filePath.endsWith(".txt") || filePath.endsWith(".md"),
+ ),
+ ...listFiles(
+ path.join(demoPreviewDir, "docs", "vhs"),
+ (filePath) => filePath.endsWith(".gif") || filePath.endsWith(".svg"),
+ ),
+ ...listFiles(
+ path.join(demoPreviewDir, "vhs"),
+ (filePath) => filePath.endsWith(".gif") || filePath.endsWith(".svg"),
+ ),
+].sort();
+
+for (const file of releaseFiles.concat(packageManagerFiles, demoPreviewFiles)) {
+ process.stdout.write(`${file}\n`);
+}
diff --git a/scripts/generate-package-manager-metadata.mjs b/scripts/generate-package-manager-metadata.mjs
new file mode 100644
index 00000000..52b6e236
--- /dev/null
+++ b/scripts/generate-package-manager-metadata.mjs
@@ -0,0 +1,237 @@
+import { promises as fsp } from "node:fs";
+import path from "node:path";
+import { fileURLToPath } from "node:url";
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+const repoRoot = path.resolve(__dirname, "..");
+
+const args = new Map();
+for (let index = 2; index < process.argv.length; index += 1) {
+ const arg = process.argv[index];
+ if (!arg.startsWith("--")) {
+ continue;
+ }
+
+ const key = arg.slice(2);
+ const next = process.argv[index + 1];
+ const value = next && !next.startsWith("--") ? next : "true";
+ args.set(key, value);
+ if (value !== "true") {
+ index += 1;
+ }
+}
+
+const releaseTag = args.get("release-tag");
+const packageVersion = args.get("package-version");
+const releaseRepo = args.get("release-repo") ?? "vana-com/vana-connect";
+const artifactsDir = path.resolve(
+ repoRoot,
+ args.get("artifacts-dir") ?? "artifacts/release",
+);
+const outputDir = path.resolve(
+ repoRoot,
+ args.get("output-dir") ?? "artifacts/package-managers",
+);
+
+if (!releaseTag || !packageVersion) {
+ throw new Error(
+ "--release-tag and --package-version are required to generate package-manager metadata.",
+ );
+}
+
+await fsp.rm(outputDir, { recursive: true, force: true });
+await fsp.mkdir(outputDir, { recursive: true });
+
+const releaseBaseUrl = `https://github.com/${releaseRepo}/releases/download/${releaseTag}`;
+const assets = await loadReleaseAssets(artifactsDir);
+
+await generateHomebrewFormula({
+ outputDir,
+ releaseBaseUrl,
+ packageVersion,
+ assets,
+});
+
+await generateWingetManifest({
+ outputDir,
+ releaseBaseUrl,
+ packageVersion,
+ assets,
+});
+
+process.stdout.write(
+ `Generated package-manager metadata in ${outputDir} for ${releaseTag} (${packageVersion})\n`,
+);
+
+async function loadReleaseAssets(baseDir) {
+ const files = await fsp.readdir(baseDir);
+ const assetMap = new Map();
+
+ for (const file of files) {
+ if (!file.endsWith(".sha256")) {
+ continue;
+ }
+
+ const checksumFile = path.join(baseDir, file);
+ const checksumContents = await fsp.readFile(checksumFile, "utf8");
+ const [sha256, assetName] = checksumContents.trim().split(/\s+/);
+ assetMap.set(assetName, { sha256 });
+ }
+
+ return {
+ linuxX64: requireAsset(assetMap, "vana-linux-x64.tar.gz"),
+ darwinX64: requireAsset(assetMap, "vana-darwin-x64.tar.gz"),
+ darwinArm64: requireAsset(assetMap, "vana-darwin-arm64.tar.gz"),
+ win32X64: requireAsset(assetMap, "vana-win32-x64.zip"),
+ };
+}
+
+function requireAsset(assetMap, assetName) {
+ const asset = assetMap.get(assetName);
+ if (!asset) {
+ throw new Error(`Missing checksum metadata for ${assetName}`);
+ }
+
+ return {
+ name: assetName,
+ sha256: asset.sha256,
+ };
+}
+
+async function generateHomebrewFormula({
+ outputDir,
+ releaseBaseUrl,
+ packageVersion,
+ assets,
+}) {
+ const formulaDir = path.join(outputDir, "homebrew");
+ await fsp.mkdir(formulaDir, { recursive: true });
+ const formulaPath = path.join(formulaDir, "vana.rb");
+ const formula = `class Vana < Formula
+ desc "Vana Connect CLI"
+ homepage "https://github.com/${releaseRepo}"
+ version "${packageVersion}"
+ license "MIT"
+
+ on_macos do
+ if Hardware::CPU.arm?
+ url "${releaseBaseUrl}/${assets.darwinArm64.name}"
+ sha256 "${assets.darwinArm64.sha256}"
+ else
+ url "${releaseBaseUrl}/${assets.darwinX64.name}"
+ sha256 "${assets.darwinX64.sha256}"
+ end
+ end
+
+ on_linux do
+ url "${releaseBaseUrl}/${assets.linuxX64.name}"
+ sha256 "${assets.linuxX64.sha256}"
+ end
+
+ def install
+ payload_root =
+ if (buildpath/"vana").exist? && (buildpath/"app").directory?
+ buildpath
+ else
+ child = Dir.children(buildpath)
+ .reject { |entry| entry.start_with?(".") }
+ .find { |entry| File.directory?(buildpath/entry) }
+ raise "Unable to locate Vana payload root" unless child
+
+ buildpath/child
+ end
+
+ libexec.install payload_root/"app"
+ libexec.install payload_root/"vana"
+ (bin/"vana").write_env_script libexec/"vana", VANA_APP_ROOT: libexec/"app"
+ end
+
+ test do
+ assert_match "runtime", shell_output("#{bin}/vana status --json")
+ end
+end
+`;
+ await fsp.writeFile(formulaPath, formula, "utf8");
+}
+
+async function generateWingetManifest({
+ outputDir,
+ releaseBaseUrl,
+ packageVersion,
+ assets,
+}) {
+ const packageIdentifier = "Vana.Connect";
+ const manifestVersion = "1.10.0";
+ const wingetDir = path.join(
+ outputDir,
+ "winget",
+ packageIdentifier,
+ packageVersion,
+ );
+ await fsp.mkdir(wingetDir, { recursive: true });
+
+ const versionManifest = `PackageIdentifier: ${packageIdentifier}
+PackageVersion: ${packageVersion}
+DefaultLocale: en-US
+ManifestType: version
+ManifestVersion: ${manifestVersion}
+`;
+
+ const defaultLocaleManifest = `PackageIdentifier: ${packageIdentifier}
+PackageVersion: ${packageVersion}
+PackageLocale: en-US
+Publisher: Vana
+PublisherUrl: https://vana.org
+PublisherSupportUrl: https://github.com/${releaseRepo}/issues
+PackageName: Vana Connect
+PackageUrl: https://github.com/${releaseRepo}
+License: MIT
+LicenseUrl: https://github.com/${releaseRepo}/blob/main/LICENSE
+ShortDescription: Install and run the Vana Connect CLI for data portability workflows.
+Description: Vana Connect is a local-first CLI for connecting supported data sources, collecting exports, and syncing them to your Personal Server when available.
+Tags:
+ - cli
+ - data-portability
+ - vana
+ - automation
+ReleaseNotesUrl: https://github.com/${releaseRepo}/releases/tag/${releaseTag}
+ManifestType: defaultLocale
+ManifestVersion: ${manifestVersion}
+`;
+
+ const installerManifest = `PackageIdentifier: ${packageIdentifier}
+PackageVersion: ${packageVersion}
+InstallerType: zip
+NestedInstallerType: portable
+Commands:
+ - vana
+UpgradeBehavior: install
+Installers:
+ - Architecture: x64
+ InstallerUrl: ${releaseBaseUrl}/${assets.win32X64.name}
+ InstallerSha256: ${assets.win32X64.sha256}
+ NestedInstallerFiles:
+ - RelativeFilePath: vana-win32-x64/vana.exe
+ PortableCommandAlias: vana
+ManifestType: installer
+ManifestVersion: ${manifestVersion}
+`;
+
+ await Promise.all([
+ fsp.writeFile(
+ path.join(wingetDir, `${packageIdentifier}.yaml`),
+ versionManifest,
+ "utf8",
+ ),
+ fsp.writeFile(
+ path.join(wingetDir, `${packageIdentifier}.locale.en-US.yaml`),
+ defaultLocaleManifest,
+ "utf8",
+ ),
+ fsp.writeFile(
+ path.join(wingetDir, `${packageIdentifier}.installer.yaml`),
+ installerManifest,
+ "utf8",
+ ),
+ ]);
+}
diff --git a/scripts/prepare-vhs-fixtures.mjs b/scripts/prepare-vhs-fixtures.mjs
new file mode 100644
index 00000000..09c6f134
--- /dev/null
+++ b/scripts/prepare-vhs-fixtures.mjs
@@ -0,0 +1,336 @@
+import fs from "node:fs/promises";
+import path from "node:path";
+import { fileURLToPath } from "node:url";
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+const repoRoot = path.resolve(__dirname, "..");
+const fixturesRoot = path.join(repoRoot, "docs", "vhs", "fixtures");
+const homeRoot = process.env.VANA_VHS_HOME_ROOT
+ ? path.resolve(process.env.VANA_VHS_HOME_ROOT)
+ : path.join(fixturesRoot, "demo-home");
+const demoDataConnectorsRoot = path.join(fixturesRoot, "demo-data-connectors");
+const dataConnectRoot = path.join(homeRoot, ".vana");
+
+async function main() {
+ await fs.rm(homeRoot, { recursive: true, force: true });
+ await fs.rm(demoDataConnectorsRoot, { recursive: true, force: true });
+
+ await seedDemoHome();
+ await seedDemoDataConnectors();
+
+ process.stdout.write(
+ `Prepared VHS fixtures at ${homeRoot} with demo connectors at ${demoDataConnectorsRoot}\n`,
+ );
+}
+
+async function seedDemoHome() {
+ await fs.mkdir(path.join(dataConnectRoot, "connectors", "github"), {
+ recursive: true,
+ });
+ await fs.mkdir(path.join(dataConnectRoot, "connectors", "shop"), {
+ recursive: true,
+ });
+ await fs.mkdir(path.join(dataConnectRoot, "connectors", "spotify"), {
+ recursive: true,
+ });
+ await fs.mkdir(
+ path.join(dataConnectRoot, "browsers", "chromium-1200", "chrome-linux64"),
+ {
+ recursive: true,
+ },
+ );
+ await fs.mkdir(path.join(dataConnectRoot, "logs"), { recursive: true });
+ await fs.mkdir(
+ path.join(dataConnectRoot, "browser-profiles", "github-playwright"),
+ {
+ recursive: true,
+ },
+ );
+
+ await fs.writeFile(
+ path.join(dataConnectRoot, "connectors", "github", "github-playwright.js"),
+ "// demo fixture\n",
+ "utf8",
+ );
+ await fs.writeFile(
+ path.join(dataConnectRoot, "connectors", "shop", "shop-playwright.js"),
+ "// demo fixture\n",
+ "utf8",
+ );
+ await fs.writeFile(
+ path.join(
+ dataConnectRoot,
+ "connectors",
+ "spotify",
+ "spotify-playwright.js",
+ ),
+ "// demo fixture\n",
+ "utf8",
+ );
+
+ const browserPath = path.join(
+ dataConnectRoot,
+ "browsers",
+ "chromium-1200",
+ "chrome-linux64",
+ "chrome",
+ );
+ await fs.writeFile(browserPath, "", "utf8");
+ await fs.chmod(browserPath, 0o755);
+
+ const state = {
+ version: 1,
+ sources: {
+ github: {
+ sessionPresent: true,
+ lastRunAt: "2026-03-14T13:10:03.677Z",
+ lastRunOutcome: "connected_local_only",
+ dataState: "collected_local",
+ lastResultPath: path.join(dataConnectRoot, "last-result.json"),
+ lastLogPath: path.join(dataConnectRoot, "logs", "run-github-demo.log"),
+ },
+ shop: {
+ lastRunAt: "2026-03-14T13:11:10.000Z",
+ lastRunOutcome: "legacy_auth",
+ dataState: "none",
+ lastLogPath: path.join(dataConnectRoot, "logs", "run-shop-demo.log"),
+ },
+ steam: {
+ lastRunAt: "2026-03-14T13:12:00.000Z",
+ lastRunOutcome: "connector_unavailable",
+ dataState: "none",
+ lastError: "No connector is available for steam right now.",
+ lastLogPath: path.join(dataConnectRoot, "logs", "fetch-steam-demo.log"),
+ },
+ spotify: {
+ lastRunAt: "2026-03-13T21:23:00.000Z",
+ lastRunOutcome: "connected_local_only",
+ dataState: "collected_local",
+ lastResultPath: path.join(dataConnectRoot, "spotify-result.json"),
+ },
+ },
+ };
+
+ await fs.writeFile(
+ path.join(dataConnectRoot, "vana-connect-state.json"),
+ `${JSON.stringify(state, null, 2)}\n`,
+ "utf8",
+ );
+
+ await fs.writeFile(
+ path.join(dataConnectRoot, "last-result.json"),
+ `${JSON.stringify(
+ {
+ profile: { username: "tnunamak" },
+ repositories: [{ name: "vana-connect" }, { name: "data-connectors" }],
+ starred: [],
+ },
+ null,
+ 2,
+ )}\n`,
+ "utf8",
+ );
+ await fs.writeFile(
+ path.join(dataConnectRoot, "spotify-result.json"),
+ `${JSON.stringify(
+ {
+ profile: { username: "tnunamak" },
+ playlists: [{ name: "Data Portability" }, { name: "Build Flow" }],
+ },
+ null,
+ 2,
+ )}\n`,
+ "utf8",
+ );
+ await fs.writeFile(
+ path.join(dataConnectRoot, "logs", "run-github-demo.log"),
+ "[runtime] GitHub demo log\n[data] status=Complete\n",
+ "utf8",
+ );
+ await fs.writeFile(
+ path.join(dataConnectRoot, "logs", "run-shop-demo.log"),
+ "[runtime] Shop demo log\n[data] status=Manual step required\n",
+ "utf8",
+ );
+ await fs.writeFile(
+ path.join(dataConnectRoot, "logs", "fetch-steam-demo.log"),
+ "[runtime] Steam demo log\n[data] status=Connector unavailable\n",
+ "utf8",
+ );
+}
+
+async function seedDemoDataConnectors() {
+ await fs.mkdir(
+ path.join(demoDataConnectorsRoot, "skills", "vana-connect", "scripts"),
+ { recursive: true },
+ );
+ await fs.mkdir(path.join(demoDataConnectorsRoot, "connectors", "github"), {
+ recursive: true,
+ });
+ await fs.mkdir(path.join(demoDataConnectorsRoot, "connectors", "shop"), {
+ recursive: true,
+ });
+ await fs.mkdir(path.join(demoDataConnectorsRoot, "connectors", "spotify"), {
+ recursive: true,
+ });
+
+ await fs.writeFile(
+ path.join(demoDataConnectorsRoot, "registry.json"),
+ `${JSON.stringify(
+ {
+ connectors: [
+ {
+ id: "github",
+ name: "GitHub",
+ company: "github",
+ description:
+ "Exports your GitHub profile, repositories, and starred repositories using Playwright browser automation.",
+ files: {
+ script: "connectors/github/github-playwright.js",
+ },
+ },
+ {
+ id: "shop",
+ name: "Shop",
+ company: "shop",
+ description:
+ "Exports your Shop app order history using Playwright browser automation.",
+ files: {
+ script: "connectors/shop/shop-playwright.js",
+ },
+ },
+ {
+ id: "spotify",
+ name: "Spotify",
+ company: "spotify",
+ description:
+ "Exports your Spotify playlists using Playwright browser automation.",
+ files: {
+ script: "connectors/spotify/spotify-playwright.js",
+ },
+ },
+ ],
+ },
+ null,
+ 2,
+ )}\n`,
+ "utf8",
+ );
+
+ await fs.writeFile(
+ path.join(
+ demoDataConnectorsRoot,
+ "connectors",
+ "github",
+ "github-playwright.js",
+ ),
+ `const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
+
+(async () => {
+ await page.setData("status", "Checking GitHub login...");
+
+ if (process.env.VANA_DEMO_FAST_SUCCESS !== "1") {
+ await page.requestInput({
+ message: "Log in to GitHub",
+ schema: {
+ type: "object",
+ properties: {
+ username: { type: "string" },
+ password: { type: "string", format: "password" },
+ },
+ },
+ });
+ }
+
+ const demoDelay = process.env.VANA_DEMO_FAST_SUCCESS === "1" ? 800 : 120;
+ await delay(demoDelay);
+ await page.setData(
+ "status",
+ "Login confirmed. Collecting data in background...",
+ );
+ await page.setProgress({
+ phase: { step: 1, total: 3, label: "Profile" },
+ message: "Fetching profile...",
+ });
+ await delay(demoDelay);
+ await page.setProgress({
+ phase: { step: 2, total: 3, label: "Repositories" },
+ message: "Fetched 2 repositories",
+ count: 2,
+ });
+ await delay(demoDelay);
+ await page.setProgress({
+ phase: { step: 3, total: 3, label: "Starred" },
+ message: "Fetched 0 starred repositories",
+ count: 0,
+ });
+ await delay(demoDelay);
+
+ return {
+ profile: { username: "tnunamak" },
+ repositories: [{ name: "vana-connect" }, { name: "data-connectors" }],
+ starred: [],
+ exportSummary: {
+ count: 2,
+ label: "items",
+ details: "2 repositories, 0 starred",
+ },
+ };
+})();
+`,
+ "utf8",
+ );
+
+ await fs.writeFile(
+ path.join(
+ demoDataConnectorsRoot,
+ "connectors",
+ "shop",
+ "shop-playwright.js",
+ ),
+ `(async () => {
+ await page.showBrowser("https://shop.app/account/order-history");
+ await page.promptUser(
+ "Finish signing in to Shop in the browser window.",
+ async () => false,
+ 1,
+ );
+})();
+`,
+ "utf8",
+ );
+
+ await fs.writeFile(
+ path.join(
+ demoDataConnectorsRoot,
+ "connectors",
+ "spotify",
+ "spotify-playwright.js",
+ ),
+ `(async () => {
+ await page.requestInput({
+ message: "Connect Spotify",
+ schema: {
+ type: "object",
+ properties: {
+ email: { type: "string" },
+ },
+ },
+ });
+
+ return {
+ profile: { username: "tnunamak" },
+ playlists: [{ name: "Data Portability" }, { name: "Build Flow" }],
+ };
+})();
+`,
+ "utf8",
+ );
+}
+
+main().catch((error) => {
+ process.stderr.write(
+ `${error instanceof Error ? (error.stack ?? error.message) : String(error)}\n`,
+ );
+ process.exitCode = 1;
+});
diff --git a/scripts/render-vhs.mjs b/scripts/render-vhs.mjs
new file mode 100644
index 00000000..a5c10f7b
--- /dev/null
+++ b/scripts/render-vhs.mjs
@@ -0,0 +1,307 @@
+import fs from "node:fs";
+import os from "node:os";
+import path from "node:path";
+import { execFileSync } from "node:child_process";
+import { fileURLToPath } from "node:url";
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+const repoRoot = path.resolve(__dirname, "..");
+const tapesDir = path.join(repoRoot, "docs", "vhs");
+const fixturesRoot = path.join(tapesDir, "fixtures");
+const linuxSeaBinaryPath = path.join(
+ repoRoot,
+ "artifacts",
+ "sea",
+ "vana-linux-x64",
+ "vana",
+);
+const VHS_DOCKER_IMAGE = "ghcr.io/charmbracelet/vhs:latest";
+const DEFAULT_TAPE_TIMEOUT_MS = 180_000;
+
+/**
+ * Each tape entry specifies the tape file name and which environment it needs.
+ *
+ * Environment types:
+ * "seeded" — fixture HOME with data, VANA_DEMO_FAST_SUCCESS=1
+ * "fresh" — empty HOME (no prior state)
+ * "seeded-input" — fixture HOME with data, NO VANA_DEMO_FAST_SUCCESS
+ */
+const tapes = [
+ { tape: "help.tape", env: "seeded" },
+ { tape: "data-help.tape", env: "seeded" },
+ { tape: "setup.tape", env: "seeded" },
+ { tape: "status.tape", env: "seeded" },
+ { tape: "doctor.tape", env: "seeded" },
+ { tape: "logs.tape", env: "seeded" },
+ { tape: "sources.tape", env: "seeded" },
+ { tape: "sources-github.tape", env: "seeded" },
+ { tape: "collect.tape", env: "seeded" },
+ { tape: "collect-github.tape", env: "seeded" },
+ { tape: "server-status.tape", env: "seeded" },
+ { tape: "server-sync.tape", env: "seeded" },
+ { tape: "server-data.tape", env: "seeded" },
+ { tape: "data-list.tape", env: "seeded" },
+ { tape: "data-list-empty.tape", env: "fresh" },
+ { tape: "data-show-github.tape", env: "seeded" },
+ { tape: "data-show-github-missing.tape", env: "fresh" },
+ { tape: "data-path-github.tape", env: "seeded" },
+ { tape: "connect-github-no-input.tape", env: "fresh" },
+ {
+ tape: "connect-github-session-reuse-no-input.tape",
+ env: "seeded-input",
+ },
+ { tape: "connect-shop-no-input.tape", env: "seeded" },
+ { tape: "connect-shop.tape", env: "seeded" },
+ { tape: "connect-steam.tape", env: "seeded" },
+ { tape: "connect-steam-no-input.tape", env: "seeded" },
+ // Runs last — mutates fixture state by writing a new result file
+ { tape: "connect-github-success.tape", env: "seeded", resetFixtures: true },
+];
+
+async function main() {
+ prepareFixtures();
+ const connectorsDir = resolveDataConnectorsDir();
+ const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), "vana-vhs-"));
+ const binDir = path.join(tempRoot, "bin");
+ prepareBinDir(binDir);
+
+ // Create a fresh HOME for "fresh" env tapes
+ const freshHome = path.join(tempRoot, "fresh-home");
+ fs.mkdirSync(freshHome, { recursive: true });
+
+ const fixtureHome = path.join(fixturesRoot, "demo-home");
+
+ const basePath = `${binDir}${path.delimiter}${process.env.PATH ?? ""}`;
+ const baseEnvFields = {
+ ...(connectorsDir ? { VANA_DATA_CONNECTORS_DIR: connectorsDir } : {}),
+ };
+
+ // Build the three environment variants
+ const envs = {
+ seeded: {
+ ...process.env,
+ HOME: fixtureHome,
+ PATH: basePath,
+ VANA_DEMO_FAST_SUCCESS: "1",
+ ...baseEnvFields,
+ },
+ fresh: {
+ ...process.env,
+ HOME: freshHome,
+ PATH: basePath,
+ ...baseEnvFields,
+ },
+ "seeded-input": {
+ ...process.env,
+ HOME: fixtureHome,
+ PATH: basePath,
+ ...baseEnvFields,
+ },
+ };
+ // Ensure VANA_DEMO_FAST_SUCCESS is NOT set for seeded-input
+ delete envs["seeded-input"].VANA_DEMO_FAST_SUCCESS;
+
+ const runner = resolveRunner({ tempRoot, binDir, connectorsDir });
+
+ try {
+ for (const entry of tapes) {
+ if (entry.resetFixtures) {
+ process.stdout.write(
+ `[vhs] re-preparing fixtures before ${entry.tape}\n`,
+ );
+ prepareFixtures();
+ }
+ const tapePath = path.join(tapesDir, entry.tape);
+ const outputPath = tapePath.replace(/\.tape$/, ".gif");
+ if (fs.existsSync(outputPath)) {
+ fs.rmSync(outputPath, { force: true });
+ }
+ const env = envs[entry.env];
+ process.stdout.write(`[vhs] rendering ${entry.tape} (${entry.env})\n`);
+ runTape(runner, tapePath, env);
+ if (!fs.existsSync(outputPath)) {
+ throw new Error(
+ `VHS did not produce ${path.relative(repoRoot, outputPath)}.`,
+ );
+ }
+ process.stdout.write(
+ `[vhs] rendered ${path.relative(repoRoot, outputPath)}\n`,
+ );
+ }
+ } finally {
+ fs.rmSync(tempRoot, { recursive: true, force: true });
+ }
+}
+
+function prepareFixtures() {
+ execFileSync("node", ["./scripts/prepare-vhs-fixtures.mjs"], {
+ cwd: repoRoot,
+ stdio: "inherit",
+ });
+}
+
+function prepareBinDir(binDir) {
+ fs.mkdirSync(binDir, { recursive: true });
+ const launcherPath = path.join(binDir, "vana");
+ const launcherTarget = fs.existsSync(linuxSeaBinaryPath)
+ ? linuxSeaBinaryPath
+ : path.join(repoRoot, "dist", "cli", "bin.js");
+ const launcherExec = fs.existsSync(linuxSeaBinaryPath)
+ ? `exec "${launcherTarget}" "$@"`
+ : `exec node "${launcherTarget}" "$@"`;
+ fs.writeFileSync(
+ launcherPath,
+ `#!/usr/bin/env bash
+set -euo pipefail
+${launcherExec}
+`,
+ "utf8",
+ );
+ fs.chmodSync(launcherPath, 0o755);
+}
+
+function resolveRunner({ tempRoot }) {
+ if (commandExists("vhs")) {
+ return { command: "vhs", args: [] };
+ }
+ if (commandExists("docker")) {
+ if (!fs.existsSync(linuxSeaBinaryPath)) {
+ throw new Error(
+ `Docker-based VHS rendering requires ${path.relative(
+ repoRoot,
+ linuxSeaBinaryPath,
+ )}. Build it first with \`pnpm build:sea -- --artifact-name vana-linux-x64 --platform linux --arch x64 --archive-format tar.gz --binary-name vana\`.`,
+ );
+ }
+ ensureDockerImage(VHS_DOCKER_IMAGE);
+ return {
+ command: "docker",
+ isDocker: true,
+ baseArgs: [
+ "run",
+ "--rm",
+ // Run as the host user so VHS doesn't create root-owned files
+ // in mounted volumes (fixture HOME, temp dir). Without this,
+ // re-preparing fixtures fails with EACCES on .cache/ etc.
+ "--user",
+ `${process.getuid()}:${process.getgid()}`,
+ "-v",
+ `${repoRoot}:${repoRoot}`,
+ "-v",
+ `${tempRoot}:${tempRoot}`,
+ "-w",
+ repoRoot,
+ ],
+ };
+ }
+ throw new Error(
+ "VHS is not available. Install `vhs` or Docker to render demo tapes.",
+ );
+}
+
+function runTape(runner, tapePath, env) {
+ const timeout = resolveTapeTimeout();
+ try {
+ if (runner.isDocker) {
+ // Docker needs env vars passed explicitly via -e flags so VHS
+ // inside the container sees HOME, PATH, and other overrides.
+ const dockerEnvArgs = [];
+ const forwardKeys = [
+ "HOME",
+ "PATH",
+ "VANA_DEMO_FAST_SUCCESS",
+ "VANA_DATA_CONNECTORS_DIR",
+ ];
+ for (const key of forwardKeys) {
+ if (env[key] != null) {
+ dockerEnvArgs.push("-e", `${key}=${env[key]}`);
+ }
+ }
+ execFileSync(
+ runner.command,
+ [...runner.baseArgs, ...dockerEnvArgs, VHS_DOCKER_IMAGE, tapePath],
+ { cwd: repoRoot, stdio: "inherit", timeout },
+ );
+ } else {
+ execFileSync(runner.command, [...runner.args, tapePath], {
+ cwd: repoRoot,
+ env,
+ stdio: "inherit",
+ timeout,
+ });
+ }
+ } catch (error) {
+ if (isTimeoutError(error)) {
+ throw new Error(
+ `VHS timed out after ${timeout}ms while rendering ${path.basename(tapePath)}.`,
+ );
+ }
+ throw error;
+ }
+}
+
+function commandExists(command) {
+ try {
+ execFileSync("bash", ["-lc", `command -v ${command}`], {
+ stdio: "ignore",
+ });
+ return true;
+ } catch {
+ return false;
+ }
+}
+
+function ensureDockerImage(image) {
+ try {
+ execFileSync("docker", ["image", "inspect", image], {
+ stdio: "ignore",
+ });
+ } catch {
+ process.stdout.write(`[vhs] pulling ${image}\n`);
+ execFileSync("docker", ["pull", image], {
+ stdio: "inherit",
+ });
+ }
+}
+
+function resolveDataConnectorsDir() {
+ const fixtureRepo = path.join(fixturesRoot, "demo-data-connectors");
+ if (fs.existsSync(path.join(fixtureRepo, "registry.json"))) {
+ return fixtureRepo;
+ }
+
+ if (process.env.VANA_DATA_CONNECTORS_DIR) {
+ return process.env.VANA_DATA_CONNECTORS_DIR;
+ }
+
+ const siblingRepo = path.resolve(repoRoot, "..", "data-connectors");
+ return fs.existsSync(siblingRepo) ? siblingRepo : null;
+}
+
+function resolveTapeTimeout() {
+ const raw = process.env.VANA_VHS_TIMEOUT_MS;
+ if (!raw) {
+ return DEFAULT_TAPE_TIMEOUT_MS;
+ }
+
+ const value = Number(raw);
+ return Number.isFinite(value) && value > 0 ? value : DEFAULT_TAPE_TIMEOUT_MS;
+}
+
+function isTimeoutError(error) {
+ return (
+ typeof error === "object" &&
+ error !== null &&
+ "code" in error &&
+ (error.code === "ETIMEDOUT" || error.signal === "SIGTERM")
+ );
+}
+
+try {
+ main();
+} catch (error) {
+ process.stderr.write(
+ `${error instanceof Error ? (error.stack ?? error.message) : String(error)}\n`,
+ );
+ process.exitCode = 1;
+}
diff --git a/scripts/report-runtime-footprint.mjs b/scripts/report-runtime-footprint.mjs
new file mode 100644
index 00000000..af3f7b02
--- /dev/null
+++ b/scripts/report-runtime-footprint.mjs
@@ -0,0 +1,153 @@
+import fs from "node:fs/promises";
+import path from "node:path";
+import os from "node:os";
+import { createRequire } from "node:module";
+
+const require = createRequire(import.meta.url);
+
+async function main() {
+ const dataHome = path.join(os.homedir(), ".vana");
+ const browserCacheDir =
+ process.env.PLAYWRIGHT_BROWSERS_PATH ?? path.join(dataHome, "browsers");
+
+ const report = {
+ generatedAt: new Date().toISOString(),
+ paths: {
+ dataHome,
+ browserCacheDir,
+ browserProfilesDir: path.join(dataHome, "browser-profiles"),
+ connectorsDir: path.join(dataHome, "connectors"),
+ logsDir: path.join(dataHome, "logs"),
+ },
+ sizes: {
+ dataHome: await describePath(dataHome),
+ browserCacheDir: await describePath(browserCacheDir),
+ browserProfilesDir: await describePath(
+ path.join(dataHome, "browser-profiles"),
+ ),
+ connectorsDir: await describePath(path.join(dataHome, "connectors")),
+ logsDir: await describePath(path.join(dataHome, "logs")),
+ packageRuntime: {
+ playwright: await describeNodeModule("playwright"),
+ playwrightCore: await describePlaywrightCore(),
+ chromiumBidi: await describeNodeModule("chromium-bidi"),
+ },
+ },
+ };
+
+ process.stdout.write(`${JSON.stringify(report, null, 2)}\n`);
+}
+
+async function describeNodeModule(
+ packageName,
+ fallbackSpecifier = packageName,
+) {
+ try {
+ const packageRoot = resolvePackageRoot(packageName, fallbackSpecifier);
+ const info = await describePath(packageRoot);
+ return {
+ packageRoot,
+ ...info,
+ };
+ } catch {
+ return {
+ exists: false,
+ bytes: 0,
+ files: 0,
+ };
+ }
+}
+
+async function describePlaywrightCore() {
+ try {
+ const playwrightRoot = resolvePackageRoot("playwright", "playwright");
+ const siblingRoot = path.resolve(playwrightRoot, "..", "playwright-core");
+ const info = await describePath(siblingRoot);
+ if (info.exists) {
+ return {
+ packageRoot: siblingRoot,
+ ...info,
+ };
+ }
+ } catch {
+ // Fall through to generic resolution below.
+ }
+
+ return describeNodeModule(
+ "playwright-core",
+ "playwright-core/lib/server/registry/index",
+ );
+}
+
+function resolvePackageRoot(packageName, fallbackSpecifier) {
+ try {
+ return path.dirname(require.resolve(`${packageName}/package.json`));
+ } catch {
+ const entryPath = require.resolve(fallbackSpecifier);
+ let current = path.dirname(entryPath);
+
+ while (current !== path.dirname(current)) {
+ const packageJsonPath = path.join(current, "package.json");
+ try {
+ require(packageJsonPath);
+ return current;
+ } catch {
+ current = path.dirname(current);
+ }
+ }
+
+ throw new Error(`Could not resolve package root for ${packageName}`);
+ }
+}
+
+async function describePath(targetPath) {
+ try {
+ const stats = await fs.stat(targetPath);
+ if (!stats.isDirectory()) {
+ return {
+ exists: true,
+ bytes: stats.size,
+ files: 1,
+ };
+ }
+
+ const { bytes, files } = await walkSize(targetPath);
+ return {
+ exists: true,
+ bytes,
+ files,
+ };
+ } catch {
+ return {
+ exists: false,
+ bytes: 0,
+ files: 0,
+ };
+ }
+}
+
+async function walkSize(root) {
+ let bytes = 0;
+ let files = 0;
+ const entries = await fs.readdir(root, { withFileTypes: true });
+
+ for (const entry of entries) {
+ const entryPath = path.join(root, entry.name);
+ if (entry.isDirectory()) {
+ const nested = await walkSize(entryPath);
+ bytes += nested.bytes;
+ files += nested.files;
+ continue;
+ }
+
+ if (entry.isFile()) {
+ const stats = await fs.stat(entryPath);
+ bytes += stats.size;
+ files += 1;
+ }
+ }
+
+ return { bytes, files };
+}
+
+await main();
diff --git a/scripts/spinner-demo.mjs b/scripts/spinner-demo.mjs
new file mode 100644
index 00000000..503599fb
--- /dev/null
+++ b/scripts/spinner-demo.mjs
@@ -0,0 +1,174 @@
+/**
+ * Final two + variants. Pick one.
+ * Usage: node scripts/spinner-demo.mjs
+ * Ctrl+C to exit.
+ */
+
+const GREEN = "\x1b[38;2;0;213;11m";
+const BLUE = "\x1b[38;2;65;65;252m";
+const DIM = "\x1b[2m";
+const BOLD = "\x1b[1m";
+const RESET = "\x1b[0m";
+
+const CHECK = `${GREEN}✓${RESET}`;
+
+function style(char, isDim) {
+ return isDim ? `${DIM}${BLUE}${char}${RESET}` : `${BLUE}${char}${RESET}`;
+}
+
+const spinners = [
+ {
+ name: "Equal beats + breath (original)",
+ frames: [
+ { char: "·", duration: 200, dim: true },
+ { char: "✧", duration: 130, dim: false },
+ { char: "✦", duration: 350, dim: false },
+ { char: "✧", duration: 130, dim: false },
+ { char: "·", duration: 100, dim: true },
+ { char: "✧", duration: 130, dim: false },
+ { char: "✦", duration: 350, dim: false },
+ { char: "✧", duration: 130, dim: false },
+ { char: "·", duration: 400, dim: true },
+ ],
+ },
+ {
+ name: "Equal beats + shorter breath",
+ frames: [
+ { char: "·", duration: 200, dim: true },
+ { char: "✧", duration: 130, dim: false },
+ { char: "✦", duration: 350, dim: false },
+ { char: "✧", duration: 130, dim: false },
+ { char: "·", duration: 100, dim: true },
+ { char: "✧", duration: 130, dim: false },
+ { char: "✦", duration: 350, dim: false },
+ { char: "✧", duration: 130, dim: false },
+ { char: "·", duration: 250, dim: true },
+ ],
+ },
+ {
+ name: "Dark pause (original, 2.5s)",
+ frames: [
+ { char: " ", duration: 300, dim: true },
+ { char: "·", duration: 150, dim: true },
+ { char: "✧", duration: 120, dim: false },
+ { char: "✦", duration: 200, dim: false },
+ { char: "✧", duration: 100, dim: false },
+ { char: "·", duration: 80, dim: true },
+ { char: "✧", duration: 120, dim: false },
+ { char: "✦", duration: 500, dim: false },
+ { char: "✧", duration: 120, dim: false },
+ { char: "·", duration: 150, dim: true },
+ { char: " ", duration: 200, dim: true },
+ ],
+ },
+ {
+ name: "Dark pause shorter (2s)",
+ frames: [
+ { char: " ", duration: 180, dim: true },
+ { char: "·", duration: 120, dim: true },
+ { char: "✧", duration: 120, dim: false },
+ { char: "✦", duration: 200, dim: false },
+ { char: "✧", duration: 100, dim: false },
+ { char: "·", duration: 80, dim: true },
+ { char: "✧", duration: 120, dim: false },
+ { char: "✦", duration: 450, dim: false },
+ { char: "✧", duration: 120, dim: false },
+ { char: "·", duration: 100, dim: true },
+ { char: " ", duration: 120, dim: true },
+ ],
+ },
+ {
+ name: "Dark pause tight (1.7s)",
+ frames: [
+ { char: " ", duration: 120, dim: true },
+ { char: "·", duration: 100, dim: true },
+ { char: "✧", duration: 100, dim: false },
+ { char: "✦", duration: 180, dim: false },
+ { char: "✧", duration: 80, dim: false },
+ { char: "·", duration: 70, dim: true },
+ { char: "✧", duration: 100, dim: false },
+ { char: "✦", duration: 400, dim: false },
+ { char: "✧", duration: 100, dim: false },
+ { char: "·", duration: 80, dim: true },
+ { char: " ", duration: 80, dim: true },
+ ],
+ },
+ // Hybrid: equal beats structure but with dark pause between cycles
+ {
+ name: "Equal beats + dark pause (2.2s)",
+ frames: [
+ { char: " ", duration: 150, dim: true },
+ { char: "·", duration: 120, dim: true },
+ { char: "✧", duration: 130, dim: false },
+ { char: "✦", duration: 320, dim: false },
+ { char: "✧", duration: 120, dim: false },
+ { char: "·", duration: 90, dim: true },
+ { char: "✧", duration: 130, dim: false },
+ { char: "✦", duration: 320, dim: false },
+ { char: "✧", duration: 120, dim: false },
+ { char: "·", duration: 120, dim: true },
+ { char: " ", duration: 150, dim: true },
+ ],
+ },
+];
+
+const states = spinners.map(() => ({
+ frameIndex: 0,
+ frameElapsed: 0,
+}));
+
+function render() {
+ const lines = [];
+ lines.push(
+ `${BOLD}Final Two + Variants${RESET} ${DIM}— Ctrl+C to exit${RESET}`,
+ );
+
+ for (let i = 0; i < spinners.length; i++) {
+ const st = states[i];
+ const frame = spinners[i].frames[st.frameIndex];
+ const displayChar = frame.char === " " ? " " : style(frame.char, frame.dim);
+ const label = `${DIM}${spinners[i].name}${RESET}`;
+ lines.push("");
+ lines.push(` ${CHECK} Profile ${displayChar} Repositories ${label}`);
+ }
+
+ lines.push("");
+ return lines.join("\n");
+}
+
+const TICK = 15;
+
+function tick() {
+ for (let i = 0; i < spinners.length; i++) {
+ const st = states[i];
+ st.frameElapsed += TICK;
+ if (st.frameElapsed >= spinners[i].frames[st.frameIndex].duration) {
+ st.frameElapsed = 0;
+ st.frameIndex = (st.frameIndex + 1) % spinners[i].frames.length;
+ }
+ }
+}
+
+let lastLineCount = 0;
+
+function draw() {
+ if (lastLineCount > 0) {
+ process.stdout.write(`\x1b[${lastLineCount}A`);
+ }
+ const output = render();
+ lastLineCount = output.split("\n").length;
+ process.stdout.write(output + "\n");
+}
+
+draw();
+
+const interval = setInterval(() => {
+ tick();
+ draw();
+}, TICK);
+
+process.on("SIGINT", () => {
+ clearInterval(interval);
+ process.stdout.write("\n");
+ process.exit(0);
+});
diff --git a/scripts/test-install-github-release.sh b/scripts/test-install-github-release.sh
new file mode 100644
index 00000000..7c8e9811
--- /dev/null
+++ b/scripts/test-install-github-release.sh
@@ -0,0 +1,89 @@
+#!/usr/bin/env sh
+set -eu
+
+REPO="${VANA_RELEASE_REPO:-vana-com/vana-connect}"
+BRANCH="${VANA_INSTALLER_BRANCH:-feat/connect-cli-v1}"
+VERSION="${VANA_VERSION:-}"
+SOURCE="${VANA_CONNECT_SOURCE:-github}"
+
+while [ "$#" -gt 0 ]; do
+ case "$1" in
+ --version)
+ VERSION="$2"
+ shift 2
+ ;;
+ --branch)
+ BRANCH="$2"
+ shift 2
+ ;;
+ --repo)
+ REPO="$2"
+ shift 2
+ ;;
+ --source)
+ SOURCE="$2"
+ shift 2
+ ;;
+ *)
+ echo "Unknown argument: $1" >&2
+ exit 1
+ ;;
+ esac
+done
+
+if [ -z "$VERSION" ]; then
+ echo "Usage: $0 --version <version> [--branch <branch>] [--repo <repo>] [--source <source>]" >&2
+ exit 1
+fi
+
+TMP_ROOT="$(mktemp -d)"
+cleanup() {
+ rm -rf "$TMP_ROOT"
+}
+trap cleanup EXIT INT TERM
+
+HOME_DIR="$TMP_ROOT/home"
+BIN_DIR="$TMP_ROOT/bin"
+INSTALL_ROOT="$TMP_ROOT/install"
+mkdir -p "$HOME_DIR" "$BIN_DIR" "$INSTALL_ROOT"
+
+INSTALLER_URL="https://raw.githubusercontent.com/${REPO}/${BRANCH}/install/install.sh"
+
+echo "Installing vana from ${REPO}@${VERSION}"
+curl -fsSL "$INSTALLER_URL" |
+ HOME="$HOME_DIR" sh -s -- \
+ --version "$VERSION" \
+ --bin-dir "$BIN_DIR" \
+ --install-root "$INSTALL_ROOT"
+
+PATH="$BIN_DIR:$PATH"
+export HOME="$HOME_DIR"
+export VANA_APP_ROOT="$INSTALL_ROOT/current/app"
+
+echo "Checking status"
+"$BIN_DIR/vana" status --json >/dev/null
+
+echo "Checking sources"
+"$BIN_DIR/vana" sources --json >/dev/null
+
+echo "Checking piped human output"
+"$BIN_DIR/vana" sources | head -n 8 >/dev/null
+
+echo "Checking non-interactive connect for ${SOURCE}"
+set +e
+CONNECT_OUTPUT="$("$BIN_DIR/vana" connect "$SOURCE" --json --no-input 2>&1)"
+CONNECT_EXIT_CODE=$?
+set -e
+printf '%s\n' "$CONNECT_OUTPUT"
+
+if [ "$CONNECT_EXIT_CODE" -ne 0 ] && [ "$CONNECT_EXIT_CODE" -ne 1 ]; then
+ echo "Unexpected vana exit code: ${CONNECT_EXIT_CODE}" >&2
+ exit 1
+fi
+
+if ! printf '%s\n' "$CONNECT_OUTPUT" | grep -Eq '"status":"(needs_input|legacy_auth|connected_local_only|connected_and_ingested)"'; then
+ echo "Unexpected connect outcome for ${SOURCE}" >&2
+ exit 1
+fi
+
+echo "GitHub release installer smoke test passed"
diff --git a/scripts/test-install-unix.sh b/scripts/test-install-unix.sh
new file mode 100644
index 00000000..54d41b19
--- /dev/null
+++ b/scripts/test-install-unix.sh
@@ -0,0 +1,24 @@
+#!/usr/bin/env sh
+set -eu
+
+ROOT_DIR="$(CDPATH= cd -- "$(dirname "$0")/.." && pwd)"
+ARTIFACT_DIR="$ROOT_DIR/artifacts/sea"
+WORK_DIR="$ROOT_DIR/.sea-work/test-install-unix"
+RELEASE_DIR="$WORK_DIR/local-release/test-release"
+HOME_DIR="$WORK_DIR/home"
+
+mkdir -p "$RELEASE_DIR" "$HOME_DIR"
+
+cp "$ARTIFACT_DIR/vana-linux-x64.tar.gz" "$RELEASE_DIR/vana-linux-x64.tar.gz"
+cp "$ARTIFACT_DIR/vana-linux-x64.tar.gz.sha256" "$RELEASE_DIR/vana-linux-x64.tar.gz.sha256"
+
+VANA_VERSION=test-release \
+VANA_RELEASE_BASE_URL="file://$WORK_DIR/local-release" \
+VANA_INSTALL_ROOT="$HOME_DIR/root" \
+VANA_INSTALL_BIN_DIR="$HOME_DIR/bin" \
+HOME="$HOME_DIR" \
+sh "$ROOT_DIR/install/install.sh"
+
+PATH="$HOME_DIR/bin:$PATH" HOME="$HOME_DIR" "$HOME_DIR/bin/vana" status --json >/dev/null
+
+echo "Unix installer smoke test passed"
diff --git a/scripts/test-install-windows.ps1 b/scripts/test-install-windows.ps1
new file mode 100644
index 00000000..97a4801b
--- /dev/null
+++ b/scripts/test-install-windows.ps1
@@ -0,0 +1,35 @@
+$ErrorActionPreference = "Stop"
+
+$RootDir = Split-Path -Parent $PSScriptRoot
+$ArtifactDir = Join-Path $RootDir "artifacts\sea"
+$WorkDir = Join-Path $RootDir ".sea-work\test-install-windows"
+$ReleaseDir = Join-Path $WorkDir "local-release\test-release"
+$HomeDir = Join-Path $WorkDir "home"
+
+New-Item -ItemType Directory -Force -Path $ReleaseDir | Out-Null
+New-Item -ItemType Directory -Force -Path $HomeDir | Out-Null
+
+Copy-Item -Path (Join-Path $ArtifactDir "vana-win32-x64.zip") -Destination (Join-Path $ReleaseDir "vana-win32-x64.zip") -Force
+Copy-Item -Path (Join-Path $ArtifactDir "vana-win32-x64.zip.sha256") -Destination (Join-Path $ReleaseDir "vana-win32-x64.zip.sha256") -Force
+
+$BinDir = Join-Path $HomeDir "bin"
+$InstallRoot = Join-Path $HomeDir "root"
+
+$env:VANA_VERSION = "test-release"
+$env:VANA_RELEASE_BASE_URL = $WorkDir + "\local-release"
+$env:VANA_INSTALL_ROOT = $InstallRoot
+$env:VANA_INSTALL_BIN_DIR = $BinDir
+$env:HOME = $HomeDir
+
+try {
+ & (Join-Path $RootDir "install\install.ps1")
+ $env:PATH = "$BinDir;$env:PATH"
+ & (Join-Path $BinDir "vana.cmd") status --json | Out-Null
+ Write-Host "Windows installer smoke test passed"
+}
+finally {
+ Remove-Item Env:VANA_VERSION -ErrorAction SilentlyContinue
+ Remove-Item Env:VANA_RELEASE_BASE_URL -ErrorAction SilentlyContinue
+ Remove-Item Env:VANA_INSTALL_ROOT -ErrorAction SilentlyContinue
+ Remove-Item Env:VANA_INSTALL_BIN_DIR -ErrorAction SilentlyContinue
+}
diff --git a/scripts/watch-release-lane.mjs b/scripts/watch-release-lane.mjs
new file mode 100644
index 00000000..1e22a751
--- /dev/null
+++ b/scripts/watch-release-lane.mjs
@@ -0,0 +1,376 @@
+import { execFileSync } from "node:child_process";
+import fs from "node:fs";
+import path from "node:path";
+import process from "node:process";
+
+const DEFAULT_REPO = "vana-com/vana-connect";
+const DEFAULT_TAP_REPO = "vana-com/homebrew-vana";
+const DEFAULT_TAP_WORKFLOW = "sync-formula.yml";
+const DEFAULT_TAP_LOCAL_PATH = "/home/tnunamak/code/homebrew-vana";
+const DEFAULT_POLL_MS = 30_000;
+const DEFAULT_TIMEOUT_MS = 45 * 60_000;
+
+function main() {
+ const options = parseArgs(process.argv.slice(2));
+ const branch = options.branch ?? getCurrentBranch();
+ const headSha = options.headSha ?? getTrackedHeadSha(branch);
+ const releaseTag = options.releaseTag ?? `canary-${slugify(branch)}`;
+
+ log(`Watching release lane for ${branch} @ ${headSha.slice(0, 7)}`);
+
+ if (!options.syncOnly) {
+ waitForWorkflow({
+ repo: options.repo,
+ workflow: "CI",
+ branch,
+ headSha,
+ pollMs: options.pollMs,
+ timeoutMs: options.timeoutMs,
+ });
+ waitForWorkflow({
+ repo: options.repo,
+ workflow: "Canary Release",
+ branch,
+ headSha,
+ pollMs: options.pollMs,
+ timeoutMs: options.timeoutMs,
+ });
+ }
+
+ if (options.skipTap) {
+ log("Skipping tap sync.");
+ return;
+ }
+
+ const previousTapRunId = getLatestRunId({
+ repo: options.tapRepo,
+ workflow: "Sync Formula",
+ });
+
+ log(`Triggering formula sync for ${releaseTag}`);
+ execCommand("gh", [
+ "workflow",
+ "run",
+ options.tapWorkflow,
+ "--repo",
+ options.tapRepo,
+ "-f",
+ `release_tag=${releaseTag}`,
+ ]);
+
+ const tapRun = waitForNewWorkflowRun({
+ repo: options.tapRepo,
+ workflow: "Sync Formula",
+ previousRunId: previousTapRunId,
+ pollMs: options.pollMs,
+ timeoutMs: options.timeoutMs,
+ });
+ waitForRunCompletion({
+ repo: options.tapRepo,
+ runId: tapRun.databaseId,
+ pollMs: options.pollMs,
+ timeoutMs: options.timeoutMs,
+ });
+
+ if (fs.existsSync(options.tapLocalPath)) {
+ log(`Refreshing local tap at ${options.tapLocalPath}`);
+ execCommand("git", ["-C", options.tapLocalPath, "pull", "--ff-only"]);
+ const formulaPath = path.join(options.tapLocalPath, "Formula", "vana.rb");
+ if (fs.existsSync(formulaPath)) {
+ const lines = fs
+ .readFileSync(formulaPath, "utf8")
+ .split("\n")
+ .slice(0, 24);
+ log(`Current tap formula preview:\n${lines.join("\n")}`);
+ log(`Checking Homebrew formula sync for ${releaseTag}`);
+ execCommand("node", [
+ "./scripts/assert-homebrew-formula-sync.mjs",
+ "--release-tag",
+ releaseTag,
+ "--release-repo",
+ options.repo,
+ "--formula-path",
+ formulaPath,
+ ]);
+ }
+ }
+
+ if (options.skipVerify) {
+ log("Skipping published installer verification.");
+ return;
+ }
+
+ log(`Running published installer verification for ${releaseTag}`);
+ execCommand("sh", [
+ "./scripts/test-install-github-release.sh",
+ "--version",
+ releaseTag,
+ "--branch",
+ branch,
+ "--repo",
+ options.repo,
+ ]);
+ log(`Checking published demo assets for ${releaseTag}`);
+ execCommand("node", [
+ "./scripts/assert-release-demo-assets.mjs",
+ "--tag",
+ releaseTag,
+ "--repo",
+ options.repo,
+ ]);
+}
+
+function parseArgs(argv) {
+ const options = {
+ repo: DEFAULT_REPO,
+ tapRepo: DEFAULT_TAP_REPO,
+ tapWorkflow: DEFAULT_TAP_WORKFLOW,
+ tapLocalPath: DEFAULT_TAP_LOCAL_PATH,
+ pollMs: DEFAULT_POLL_MS,
+ timeoutMs: DEFAULT_TIMEOUT_MS,
+ syncOnly: false,
+ skipTap: false,
+ skipVerify: false,
+ branch: undefined,
+ headSha: undefined,
+ releaseTag: undefined,
+ };
+
+ for (let index = 0; index < argv.length; index += 1) {
+ const arg = argv[index];
+ switch (arg) {
+ case "--":
+ break;
+ case "--repo":
+ options.repo = argv[++index];
+ break;
+ case "--branch":
+ options.branch = argv[++index];
+ break;
+ case "--head-sha":
+ options.headSha = argv[++index];
+ break;
+ case "--release-tag":
+ options.releaseTag = argv[++index];
+ break;
+ case "--tap-repo":
+ options.tapRepo = argv[++index];
+ break;
+ case "--tap-workflow":
+ options.tapWorkflow = argv[++index];
+ break;
+ case "--tap-local-path":
+ options.tapLocalPath = argv[++index];
+ break;
+ case "--poll-ms":
+ options.pollMs = Number(argv[++index]);
+ break;
+ case "--timeout-ms":
+ options.timeoutMs = Number(argv[++index]);
+ break;
+ case "--sync-only":
+ options.syncOnly = true;
+ break;
+ case "--skip-tap":
+ options.skipTap = true;
+ break;
+ case "--skip-verify":
+ options.skipVerify = true;
+ break;
+ default:
+ throw new Error(`Unknown argument: ${arg}`);
+ }
+ }
+
+ return options;
+}
+
+function waitForWorkflow({
+ repo,
+ workflow,
+ branch,
+ headSha,
+ pollMs,
+ timeoutMs,
+}) {
+ const startedAt = Date.now();
+ while (Date.now() - startedAt < timeoutMs) {
+ const run = getWorkflowRun({ repo, workflow, branch, headSha });
+ if (!run) {
+ log(`Waiting for ${workflow} run for ${headSha.slice(0, 7)}...`);
+ sleep(pollMs);
+ continue;
+ }
+
+ if (run.status !== "completed") {
+ log(`${workflow}: ${run.status} (${run.url})`);
+ sleep(pollMs);
+ continue;
+ }
+
+ if (run.conclusion !== "success") {
+ throw new Error(`${workflow} failed: ${run.url}`);
+ }
+
+ log(`${workflow}: success`);
+ return run;
+ }
+
+ throw new Error(`Timed out waiting for ${workflow}`);
+}
+
+function waitForNewWorkflowRun({
+ repo,
+ workflow,
+ previousRunId,
+ pollMs,
+ timeoutMs,
+}) {
+ const startedAt = Date.now();
+ while (Date.now() - startedAt < timeoutMs) {
+ const run = getLatestRun({ repo, workflow });
+ if (run && (!previousRunId || run.databaseId > previousRunId)) {
+ log(`Detected ${workflow} run ${run.databaseId}`);
+ return run;
+ }
+
+ log(`Waiting for ${workflow} run to appear...`);
+ sleep(pollMs);
+ }
+
+ throw new Error(`Timed out waiting for ${workflow} run`);
+}
+
+function waitForRunCompletion({ repo, runId, pollMs, timeoutMs }) {
+ const startedAt = Date.now();
+ let lastStatus = "unknown";
+ while (Date.now() - startedAt < timeoutMs) {
+ const workflowRun = JSON.parse(
+ execCommand("gh", [
+ "run",
+ "view",
+ String(runId),
+ "--repo",
+ repo,
+ "--json",
+ "status,conclusion,url",
+ ]),
+ );
+ lastStatus = workflowRun.status;
+ if (workflowRun.status !== "completed") {
+ log(`Workflow run ${runId}: ${workflowRun.status} (${workflowRun.url})`);
+ sleep(pollMs);
+ continue;
+ }
+
+ if (workflowRun.conclusion !== "success") {
+ throw new Error(`Workflow run ${runId} failed: ${workflowRun.url}`);
+ }
+
+ log(`Workflow run ${runId}: success`);
+ return;
+ }
+
+ throw new Error(
+ `Timed out waiting for workflow run ${runId} (last status: ${lastStatus}). Increase --timeout-ms if the workflow is healthy but slow.`,
+ );
+}
+
+function getWorkflowRun({ repo, workflow, branch, headSha }) {
+ const runs = JSON.parse(
+ execCommand("gh", [
+ "run",
+ "list",
+ "--repo",
+ repo,
+ "--workflow",
+ workflow,
+ "--branch",
+ branch,
+ "--commit",
+ headSha,
+ "--limit",
+ "1",
+ "--json",
+ "databaseId,workflowName,status,conclusion,headSha,url,displayTitle",
+ ]),
+ );
+ return runs[0] ?? null;
+}
+
+function getLatestRun({ repo, workflow }) {
+ const runs = JSON.parse(
+ execCommand("gh", [
+ "run",
+ "list",
+ "--repo",
+ repo,
+ "--workflow",
+ workflow,
+ "--limit",
+ "1",
+ "--json",
+ "databaseId,workflowName,status,conclusion,headSha,url,displayTitle",
+ ]),
+ );
+ return runs[0] ?? null;
+}
+
+function getLatestRunId({ repo, workflow }) {
+ return getLatestRun({ repo, workflow })?.databaseId ?? null;
+}
+
+function getCurrentBranch() {
+ // Resolve the checked-out branch of the local clone.
+ return execCommand("git", ["rev-parse", "--abbrev-ref", "HEAD"]).trim();
+}
+
+function getCurrentHeadSha() {
+ return execCommand("git", ["rev-parse", "HEAD"]).trim();
+}
+
+function getTrackedHeadSha(branch) {
+ try {
+ return execCommand("git", ["rev-parse", `origin/${branch}`]).trim();
+ } catch {
+ return getCurrentHeadSha();
+ }
+}
+
+function slugify(value) {
+ return value.replace(/[/_]/g, "-").replace(/[^a-zA-Z0-9-]/g, "");
+}
+
+function execCommand(command, args, options = {}) {
+ try {
+ return execFileSync(command, args, {
+ encoding: "utf8",
+ stdio: ["ignore", "pipe", "pipe"],
+ ...options,
+ });
+ } catch (error) {
+ if (options.allowFailure) {
+ return "";
+ }
+ const stderr = error.stderr?.toString?.() ?? "";
+ const stdout = error.stdout?.toString?.() ?? "";
+ throw new Error(
+ `Command failed: ${command} ${args.join(" ")}\n${stdout}${stderr}`.trim(),
+ );
+ }
+}
+
+function sleep(ms) {
+ Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
+}
+
+function log(message) {
+ process.stdout.write(`[release] ${message}\n`);
+}
+
+main();
diff --git a/skills/connect-data/SKILL.md b/skills/connect-data/SKILL.md
new file mode 100644
index 00000000..e9cf83b3
--- /dev/null
+++ b/skills/connect-data/SKILL.md
@@ -0,0 +1,247 @@
+---
+name: vana-connect
+description: >
+ Connect personal data from any web platform using browser automation.
+ Use when: (1) user wants to connect a data source like ChatGPT, Instagram,
+ Spotify, or any platform, (2) user says "connect my [platform]",
+ (3) user wants to generate or update their profile from connected data.
+ Also triggers on: "create a connector for [platform]".
+---
+
+# Connect
+
+Connect personal data from web platforms using the `vana` CLI and local browser automation.
+
+## Setup
+
+Prefer an installed `vana` CLI on `PATH`:
+
+```bash
+command -v vana
+```
+
+If that succeeds, use:
+
+```bash
+vana
+```
+
+If `vana` is unavailable, install the current published canary. Prefer one of:
+
+macOS with Homebrew:
+
+```bash
+brew tap vana-com/vana
+brew install vana
+```
+
+macOS and Linux:
+
+```bash
+curl -fsSL https://raw.githubusercontent.com/vana-com/vana-connect/feat/connect-cli-v1/install/install.sh | sh -s -- --version canary-feat-connect-cli-v1
+```
+
+Only if the installed CLI path is unavailable or blocked, fall back to:
+
+```bash
+npx -y @opendatalabs/connect@canary
+```
+
+If the user is explicitly testing local changes, fall back to:
+
+```bash
+node /home/tnunamak/code/vana-connect/dist/cli/bin.js
+```
+
+If neither path is available, follow `SETUP.md` in this folder.
+
+Before connecting a source, check runtime state with:
+
+```bash
+vana status --json
+```
+
+If the user needs install, path, or upgrade diagnostics, use:
+
+```bash
+vana doctor
+```
+
+If the user needs recent setup, fetch, or run logs, use:
+
+```bash
+vana logs
+vana logs <source>
+```
+
+If the runtime is missing, tell the user: "I need to do a one-time setup first. This downloads a browser engine and some dependencies into `~/.vana/` and usually takes about a minute." Then run:
+
+```bash
+vana setup --yes
+```
+
+## Flow
+
+### 1. Explore available sources
+
+```bash
+vana sources --json
+```
+
+This is the source of truth for what the CLI can currently connect. Prefer it over inspecting repo files manually.
+
+If the requested platform is present, use the CLI flow below.
+
+**If no connector exists for the platform,** tell the user you'll build one — this involves researching the platform's data APIs, writing the extraction code, and testing it. Let them know it'll take a bit and they're welcome to do something else while you work. Then read `CREATE.md` and follow it.
+
+### 2. Check the source's auth mode
+
+Each source has an `authMode` in `vana sources --json`:
+
+- `interactive` — prompts for credentials in the terminal. **You can handle this directly.**
+- `legacy` — opens a headed browser window. **You cannot do this. Tell the user to run it in their own terminal.**
+- `automated` — no auth needed. Fully autonomous.
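
The triage above can be scripted against the sources listing. The JSON below is a hypothetical sample for illustration, not the documented schema — check the real `vana sources --json` output for the actual shape:

```bash
# Hypothetical sample of `vana sources --json` output — shape assumed.
sources='[{"name":"github","authMode":"interactive"},{"name":"steam","authMode":"legacy"}]'
# Sources you can handle directly are the interactive and automated ones.
printf '%s\n' "$sources" | grep -Eo '"authMode":"(interactive|automated)"'
```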
+
+### 3. Connect with the CLI
+
+**For `interactive` or `automated` sources:** Use IPC mode with `run_in_background`.
+
+IMPORTANT: You MUST use `run_in_background: true` for the connect command. The process will block while waiting for credentials. If you run it in the foreground, your bash call will hang and you won't be able to respond to prompts.
+
+Step 1: Start the connect in the background.
+
+```bash
+vana connect <source> --json --ipc 2>&1
+```
+
+Use `run_in_background: true` for this command.
+
+Step 2: Immediately read the background task output. Look for a `needs-input` JSON line containing `pendingInputPath` and `responseInputPath`. If you see `connected` instead, the saved session worked and you're done.
+
+Step 3: If credentials are needed, read the pending input file to see what fields are required:
+
+```bash
+cat <pendingInputPath>
+```
+
+Step 4: Ask the user for the required credentials.
+
+Step 5: Write the response file (use the exact `responseInputPath` from the event):
+
+```bash
+echo '{"username":"value","password":"value"}' > <responseInputPath>
+```
+
+Step 6: Check the background task output again. The connector may prompt again (e.g. for 2FA). If so, repeat steps 3-5 with the new pending input file. If the task completes, you're done.
+
+Note: The connector polls for up to 5 minutes per prompt. If the user takes longer, it will time out and you'll need to rerun.
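
The round-trip in steps 3-5 can be sketched end to end. The paths and field names below are hypothetical stand-ins — the real `pendingInputPath` and `responseInputPath` come from the `needs-input` event:

```bash
# Hypothetical stand-ins for the paths carried by the needs-input event.
pending="$(mktemp)"
response="${pending}.response"
# Simulate the pending input file the CLI would write (field layout assumed).
printf '%s\n' '{"fields":["username","password"]}' > "$pending"
cat "$pending"   # step 3: see which fields are required
# Step 5: write the response file; the connector polls for it to appear.
printf '%s\n' '{"username":"alice","password":"example"}' > "$response"
```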
+
+**For `legacy` (browser) sources:** You cannot connect these. Tell the user:
+
+> Run `vana connect <source>` in your terminal. It will open a browser for you to log in. Say "done" when finished.
+
+**For a quick status check without connecting:** Use the `--json --no-input` probe. This checks if a saved session works without prompting:
+
+```bash
+vana connect <source> --json --no-input
+```
+
+Outcomes: `needs_input`, `legacy_auth`, `connected_local_only`, or `connected_and_ingested`.
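
For scripting, the same outcome check used by `scripts/test-install-github-release.sh` works on the captured probe output. The sample line below stands in for the real CLI output:

```bash
# In practice: out="$(vana connect <source> --json --no-input 2>&1)".
# The JSON line here is a hard-coded sample with an assumed shape.
out='{"status":"connected_and_ingested"}'
if printf '%s\n' "$out" | grep -Eq '"status":"(connected_local_only|connected_and_ingested)"'; then
  echo "session ok"
else
  echo "manual step required"
fi
```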
+
+If the user specifically wants to inspect current state before rerunning, use:
+
+```bash
+vana status
+```
+
+### 4. Handle outcomes
+
+The CLI emits structured JSON events in `--json` mode.
+
+Key outcomes:
+
+- `needs_input`
+ The connector needs a live login or another manual step. Explain that you'll rerun interactively.
+- `legacy_auth`
+ The connector still depends on `showBrowser` / `promptUser`. Explain that this source still needs a headed/manual session path and may not work in fully headless batch mode yet.
+- `connected_local_only`
+ Data was collected locally but no Personal Server target was available.
+- `connected_and_ingested`
+ Data was collected and synced to the Personal Server.
+
+If setup, fetch, or run output is truncated, use:
+
+```bash
+vana logs <source>
+```
+
+Prefer that over manually hunting through `~/.vana/logs/` or rerunning blindly.
+
+After a successful connect, prefer the CLI data surfaces over raw file inspection when possible:
+
+```bash
+vana data list
+vana data show <source>
+vana data path <source>
+vana logs <source>
+```
+
+### 4. Validate, present results, and offer to contribute
+
+If you built or modified a connector, immediately run validation — before presenting results to the user:
+
+```bash
+node scripts/validate.cjs <connector-dir>/<platform>-playwright.js --check-result ~/.vana/last-result.json
+```
+
+Fix any issues the validator reports. The validator checks debug code, login method diversity, schema descriptions, data cleanliness, and more — it is the quality gate. Iterate until validation passes.
+
+Then read the result file and summarize for the user in human terms (see "Communicating with the user" below).
+
+If you built a new connector (not one from the registry), ask the user:
+
+> "Want to share this connector so others can connect their [Platform] data too? Contributing means the community helps maintain it when [Platform] changes their site."
+
+If yes, run `node scripts/validate.cjs <connector-dir>/<platform>-playwright.js --contribute`. If no, move on.
+
+### 5. Suggest what to do with the data
+
+After the contribution question is resolved (or if using an existing connector), suggest use cases from `RECIPES.md`: user profile generation, personal knowledge base, data backup, cross-platform synthesis, activity analytics.
+
+## Communicating with the user
+
+The user can't see what you're doing behind the scenes. Keep them informed at key moments:
+
+1. **Before asking for credentials**, explain the approach and reassure on privacy:
+ - "I'll connect to [Platform] using a local browser on your machine. Your credentials stay local — nothing is sent to any server except [Platform] itself."
+ - If using an API key: "This uses [Platform]'s API key. You can find it at [location]. The key stays on your machine."
+
+2. **During long operations** (building a connector, collecting paginated data), give brief progress updates. Don't go silent for more than ~30 seconds.
+
+3. **After collection**, summarize results in human terms — not file paths:
+ - Good: "Connected! I collected 249 issues, 63 projects, 9 teams, and your profile from Linear."
+ - Bad: "Data saved to ~/.vana/last-result.json"
+ - Prefer the CLI outcome plus the result file. Build the summary from `exportSummary` and the scoped keys.
+
+4. **On failure**, explain what went wrong and what the user can do:
+ - Auth failed → "Login didn't work. Can you double-check your credentials?"
+ - Platform API changed → "The connector couldn't find the expected data. The platform may have changed their site."
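+
+The step-3 summary above can be sketched as a tiny script. This is illustrative only: the sample object stands in for the parsed contents of `~/.vana/last-result.json`, and the `linear.*` keys and counts are hypothetical.
+
+```javascript
+// Stand-in for JSON.parse(fs.readFileSync("~/.vana/last-result.json")).
+const result = {
+  "linear.issues": new Array(249),
+  "linear.projects": new Array(63),
+  exportSummary: { count: 249, label: "issues" },
+};
+
+// Scoped "platform.scope" keys drive the human-readable summary.
+const scopes = Object.keys(result)
+  .filter((k) => k.includes("."))
+  .map((k) => `${result[k].length} ${k.split(".")[1]}`);
+
+console.log(`Connected! Collected ${scopes.join(", ")} from Linear.`);
+```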
+
+## Rules
+
+1. **Ask before saving** -- no writes to user profile without approval
+2. **Never log credentials** -- no echo, print, or output of secrets
+3. **One platform at a time**
+4. **Check session first** -- try without credentials if a browser profile exists
+5. **Read connectors before running them**
+6. **Use the CLI as the primary interface** -- only drop to raw scripts when debugging or updating connector internals
+
+## CLI fallback order
+
+Use this order when choosing the CLI entrypoint:
+
+1. installed `vana`
+2. official installer path for the current canary
+3. `npx -y @opendatalabs/connect@canary`
+4. `node /home/tnunamak/code/vana-connect/dist/cli/bin.js` only for local development or debugging
diff --git a/skills/create-connector/SKILL.md b/skills/create-connector/SKILL.md
new file mode 100644
index 00000000..37b38719
--- /dev/null
+++ b/skills/create-connector/SKILL.md
@@ -0,0 +1,225 @@
+# Creating a Connector
+
+Build a data connector for a platform that isn't in the registry yet.
+
+## Prerequisites
+
+- `reference/PAGE-API.md` -- full `page` object API
+- `reference/PATTERNS.md` -- data extraction approaches and code examples
+
+All `node scripts/...` commands refer to `skills/vana-connect/scripts/` in the data-connectors repo. Use the `vana` CLI to exercise connectors; only fall back to raw scripts when debugging connector internals.
+
+## Connector Format
+
+Scripts are plain JavaScript (CJS), no imports, no require. The runner injects a `page` object. The script body must be an async IIFE preceded by a blank line (the runner matches `\n(async`).
+
+```javascript
+(async () => {
+ // connector logic here
+ await page.setData("result", { "platform.scope": data });
+})();
+```
+
+## Reference Connectors
+
+| Platform | Strategy | Rung | Notes |
+| --------- | --------------- | ---- | ------------------------------------ |
+| Reddit | In-page fetch | 1 | OAuth-like endpoints, JSON responses |
+| Twitter/X | Network capture | 2 | GraphQL via captureNetwork |
+| Instagram | In-page fetch | 1 | Cookie auth, pagination |
+| LinkedIn | In-page fetch | 1 | Voyager API, CSRF token required |
+| GitHub | DOM extraction | 3 | Server-rendered, no client API |
+| Spotify | In-page fetch | 1 | Well-documented public API |
+
+Look at existing connectors in `~/.vana/connectors/` for working examples.
+
+---
+
+## Step 1 -- Research the Platform
+
+Map the platform's login flow, data APIs, and auth mechanism before writing code.
+
+### Verify by inspecting, not by guessing
+
+Navigate to the platform's login page and take a screenshot before writing any login code. List every login option visible on the page (email, Google, Apple, SSO, etc.) and ask the user which one they use. Your training data about a platform's auth flow may be outdated.
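+
+As a sketch, the "list every login option" step boils down to filtering the clickable text on the login page. The regex and sample texts below are illustrative, not from any real platform; in a connector you would gather the texts with `page.evaluate()` on the actual page:
+
+```javascript
+// Sample button/link texts such as a login page might render.
+const elementTexts = [
+  "Continue with Google",
+  "Continue with Apple",
+  "Sign in with email",
+  "Privacy policy",
+];
+
+// Keep only entries that look like login options, then ask the user
+// which one they actually use.
+const loginOptions = elementTexts.filter((t) =>
+  /sign in|log in|continue with|sso/i.test(t),
+);
+
+console.log(loginOptions.join(" | "));
+```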
+
+### Web search queries
+
+- `"<platform> API endpoints"`, `"<platform> graphql endpoint"`
+- `"<platform> internal API"`, `"<platform> developer API"`
+- `"<platform> data export"`, `"<platform> GDPR data download"`
+- `"<platform> scraper github"` -- open-source scrapers reveal known API patterns
+
+### What to identify
+
+- **Login URL** and **available login methods** (inspect the actual page)
+- **Login form selectors** -- `input[name="..."]`, `input[type="password"]`, `button[type="submit"]`. Note multi-step flows.
+- **Logged-in indicator** -- CSS selector or API response confirming auth. Becomes `connectSelector` in metadata.
+- **Data endpoints** -- REST, GraphQL, or DOM targets
+- **Auth mechanism** -- cookies, CSRF tokens, bearer tokens, session storage
+- **Data categories** -- each becomes a `platform.scope` key (e.g. `reddit.profile`)
+
+### Extraction strategy
+
+Pick the approach with the best user experience. See `reference/PATTERNS.md` for details and code examples. Max 2 attempts per approach before moving to the next.
+
+---
+
+## Step 2 -- Scaffold and Implement
+
+```bash
+node scripts/scaffold.cjs <platform> [company]
+```
+
+### Auth pattern
+
+Two credential sources: `process.env` (automated runs) and `page.requestInput()` (interactive). Try env first, fall back to requestInput. If the platform has multiple login options (discovered via screenshot in Step 1), include a `method` field listing the options you observed:
+
+```javascript
+let username = process.env.USER_LOGIN_PLATFORMNAME || "";
+let password = process.env.USER_PASSWORD_PLATFORMNAME || "";
+
+if (!username || !password) {
+ const creds = await page.requestInput({
+ message: "Enter your Platform credentials.",
+ schema: {
+ type: "object",
+ properties: {
+ method: {
+ type: "string",
+ title: "Login method",
+ description: "List the options you found on the login page",
+ },
+ username: { type: "string", title: "Email or username" },
+ password: { type: "string", title: "Password" },
+ },
+ required: ["username", "password"],
+ },
+ });
+ username = creds.username;
+ password = creds.password;
+ // Use creds.method to route to the right login flow
+}
+```
+
+### Login implementation
+
+```javascript
+const loginStr = JSON.stringify(username);
+const passStr = JSON.stringify(password);
+
+await page.goto("https://platform.com/login");
+await page.sleep(2000);
+
+await page.evaluate(`
+ (() => {
+ const u = document.querySelector('input[name="username"], input[type="email"]');
+ const p = document.querySelector('input[type="password"]');
+ if (u) { u.focus(); u.value = ${loginStr}; u.dispatchEvent(new Event('input', {bubbles:true})); }
+ if (p) { p.focus(); p.value = ${passStr}; p.dispatchEvent(new Event('input', {bubbles:true})); }
+ })()
+`);
+await page.sleep(500);
+await page.evaluate(`document.querySelector('button[type="submit"]')?.click()`);
+await page.sleep(3000);
+```
+
+**Adaptations:**
+
+- **Multi-step login**: split into two evaluate+sleep sequences with a navigation between.
+- **React/Vue apps** that ignore `.value =`: use the native setter pattern:
+ ```javascript
+ const nativeInputValueSetter = Object.getOwnPropertyDescriptor(
+ window.HTMLInputElement.prototype, 'value'
+ ).set;
+ nativeInputValueSetter.call(input, ${loginStr});
+ input.dispatchEvent(new Event('input', { bubbles: true }));
+ ```
+- **2FA**: use `page.requestInput()` to ask for the code.
+
+### Key rules
+
+- `page.evaluate()` takes a string, not a function. Pass variables via `JSON.stringify()`.
+- Use ARIA roles, data attributes, semantic HTML for selectors. The validator flags obfuscated class names.
+- Rate-limit API calls with `page.sleep()` (300-1000 ms) between requests.
+- Use scoped result keys: `platform.scope` format (e.g. `spotify.playlists`).
+- Include `exportSummary: { count, label, details }` in the result.
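+
+A minimal sketch of the final result shape these rules describe: scoped keys plus `exportSummary`. The `platform.*` keys and values are hypothetical.
+
+```javascript
+// Hypothetical collected data for a platform with two scopes.
+const profile = { name: "alice" };
+const items = [{ title: "First" }, { title: "Second" }];
+
+const result = {
+  "platform.profile": profile, // one "platform.scope" key per data category
+  "platform.items": items,
+  exportSummary: {
+    count: items.length,
+    label: "items",
+    details: "Profile plus " + items.length + " items",
+  },
+};
+
+// In a real connector this would be: await page.setData("result", result);
+console.log(result.exportSummary.count);
+```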
+
+### Page API quick reference
+
+```
+page.goto(url) Navigate
+page.evaluate(jsString) Run JS in browser, return result
+page.sleep(ms) Wait
+page.requestInput({ message, schema }) Ask user for data (credentials, 2FA)
+page.setData(key, value) 'result' for data, 'error' for failures
+page.setProgress({ phase, message }) Progress reporting
+page.closeBrowser() Close browser, extract cookies
+page.httpFetch(url, options?) Node.js HTTP (auto-injects cookies)
+page.captureNetwork({ key, urlPattern }) Intercept network requests
+page.getCapturedResponse(key) Retrieve captured response
+page.screenshot() Base64 JPEG screenshot
+```
+
+Full API: `reference/PAGE-API.md`
+
+---
+
+## Step 3 -- Test
+
+Run the connector and validate in one step:
+
+```bash
+node scripts/validate.cjs <connector-dir>/<platform>-playwright.js && \
+  vana connect <source> && \
+  node scripts/validate.cjs <connector-dir>/<platform>-playwright.js --check-result ~/.vana/last-result.json
+```
+
+The validator checks structure, output quality, debug code, data cleanliness, schema descriptions, and login method diversity. Fix all reported issues and re-run.
+
+If an extraction approach fails after 2 attempts, move to the next rung (see `reference/PATTERNS.md`). Use `page.screenshot()` to see what the browser shows.
+
+---
+
+## Step 4 -- Enrich Schemas
+
+Schemas are an API contract — app developers build against them.
+
+### Generate the skeleton
+
+```bash
+node scripts/generate-schemas.cjs ~/.vana/last-result.json [output-dir]
+```
+
+### Enrich from what you know
+
+- Add `description` to every field and `format` hints where applicable (`date-time`, `uri`, `email`). The validator checks description coverage.
+- Mark fields `required` only if guaranteed for all users. Use `additionalProperties: true`.
+- Write a meaningful top-level `description` — not "GitHub profile data" but "GitHub user profile including bio, follower counts, and repository statistics."
+
+Before (from `generate-schemas.cjs`):
+
+```json
+{ "type": "string" }
+```
+
+After (enriched):
+
+```json
+{
+ "type": "string",
+ "format": "date-time",
+ "description": "When the issue was created (ISO 8601)"
+}
+```
+
+---
+
+## Step 5 -- Register and Contribute
+
+```bash
+node scripts/register.cjs <connector-dir>/<platform>-playwright.js
+node scripts/validate.cjs <connector-dir>/<platform>-playwright.js --contribute
+```
+
+The validator runs all checks including secret scanning before creating a PR. All checks must pass — the validator is the quality gate.
diff --git a/skills/create-connector/reference/PAGE-API.md b/skills/create-connector/reference/PAGE-API.md
new file mode 100644
index 00000000..4d3f5821
--- /dev/null
+++ b/skills/create-connector/reference/PAGE-API.md
@@ -0,0 +1,254 @@
+# Page API Reference
+
+The `page` object is injected as a global in connector scripts. It is NOT raw Playwright — it's a custom API provided by the DataConnect Playwright runner.
+
+## Methods
+
+### Navigation & Browser Control
+
+#### `page.goto(url)`
+
+Navigate to a URL.
+
+```javascript
+await page.goto("https://www.linkedin.com/feed/");
+```
+
+#### `page.showBrowser(url?)`
+
+Switch to headed mode (visible browser window). Optionally navigate to a URL.
+
+```javascript
+await page.showBrowser("https://platform.com/login");
+```
+
+#### `page.goHeadless()`
+
+Switch to headless mode (browser disappears). Call this after login is confirmed, before data extraction.
+
+```javascript
+await page.goHeadless();
+```
+
+#### `page.closeBrowser()`
+
+Close the browser entirely. Use when you're done with browser interactions but still need the process alive for HTTP work.
+
+#### `page.sleep(ms)`
+
+Wait for a specified number of milliseconds.
+
+```javascript
+await page.sleep(2000); // wait 2 seconds
+```
+
+### JavaScript Execution
+
+#### `page.evaluate(jsString)`
+
+Execute JavaScript in the browser context and return the result. **Takes a string, not a function.**
+
+To pass variables from the connector scope into the browser context, use `JSON.stringify()`:
+
+```javascript
+// Simple evaluation
+const title = await page.evaluate(`document.title`);
+
+// With interpolated variables
+const endpoint = "/api/me";
+const endpointStr = JSON.stringify(endpoint);
+const data = await page.evaluate(`
+ (async () => {
+ const resp = await fetch(${endpointStr}, { credentials: 'include' });
+ return await resp.json();
+ })()
+`);
+
+// DOM inspection
+const isLoggedIn = await page.evaluate(`
+ (() => {
+ return !!document.querySelector('.logged-in-indicator');
+ })()
+`);
+```
+
+### Data Communication
+
+#### `page.setData(key, value)`
+
+Send data to the host app. Three key types:
+
+| Key | Purpose |
+| ---------- | ---------------------------------- |
+| `'status'` | Display a status message in the UI |
+| `'error'` | Report an error (stops execution) |
+| `'result'` | Send the final export result |
+
+```javascript
+await page.setData("status", "Fetching profile...");
+await page.setData("error", "Failed to fetch data: " + errorMessage);
+await page.setData("result", resultObject);
+```
+
+#### `page.setProgress({phase, message, count})`
+
+Structured progress reporting for the UI.
+
+```javascript
+await page.setProgress({
+ phase: { step: 1, total: 3, label: "Fetching profile" },
+ message: "Downloaded 50 of 200 items...",
+ count: 50,
+});
+```
+
+- `phase.step` / `phase.total` — drives the step indicator ("Step 1 of 3")
+- `phase.label` — short label for the current phase
+- `message` — human-readable progress text
+- `count` — numeric count for progress tracking
+
+### User Interaction
+
+#### `page.requestInput({ message, schema })`
+
+Ask the user for structured input (credentials, 2FA codes, API keys). Returns a promise that resolves with the user's response matching the schema.
+
+```javascript
+const creds = await page.requestInput({
+ message: "Enter your Platform credentials",
+ schema: {
+ type: "object",
+ properties: {
+ username: { type: "string", title: "Email or username" },
+ password: { type: "string", title: "Password" },
+ },
+ required: ["username", "password"],
+ },
+});
+// creds.username, creds.password
+```
+
+#### `page.promptUser(message, checkFn, pollInterval)`
+
+Show a prompt to the user and poll a check function until it returns truthy. Use for browser-based login flows where the user logs in manually.
+
+```javascript
+await page.promptUser(
+ 'Please log in to LinkedIn. Click "Done" when you see your feed.',
+ async () => {
+ return await checkLoginStatus();
+ },
+ 2000, // poll every 2 seconds
+);
+```
+
+The prompt displays in the DataConnect UI with a "Done" button. The `checkFn` is called every `pollInterval` ms. When it returns truthy, the prompt is dismissed and execution continues.
+
+### HTTP Requests (Node.js-side)
+
+#### `page.httpFetch(url, options?)`
+
+Make HTTP requests from Node.js, bypassing browser CORS restrictions. If the browser was previously open, session cookies are automatically injected for matching domains.
+
+Returns: `{ ok, status, headers, text, json, error }`
+
+```javascript
+// After page.closeBrowser() — cookies are auto-injected
+const resp = await page.httpFetch("https://api.platform.com/v1/me");
+if (resp.ok) {
+ const data = resp.json; // already parsed
+}
+
+// With custom headers (e.g., API key auth)
+const gqlResp = await page.httpFetch("https://api.platform.com/graphql", {
+ method: "POST",
+ headers: {
+ "Content-Type": "application/json",
+ Authorization: "Bearer " + apiKey,
+ },
+ body: JSON.stringify({ query: "{ viewer { id } }" }),
+ timeout: 30000, // default: 30s
+});
+```
+
+**Typical flow:** Login via browser → `page.closeBrowser()` (extracts cookies) → `page.httpFetch()` for all API calls. This bypasses CORS entirely since requests come from Node.js, not the browser.
+
+### Screenshots
+
+#### `page.screenshot()`
+
+Take a JPEG screenshot of the current page. Returns base64-encoded string. Useful for debugging login flows and verifying page state.
+
+```javascript
+const b64 = await page.screenshot();
+await page.setData("status", "[DEBUG] Screenshot taken");
+```
+
+### Network Capture
+
+#### `page.captureNetwork({urlPattern, bodyPattern, key})`
+
+Register a network request interceptor. Captures responses matching the criteria.
+
+```javascript
+await page.captureNetwork({
+ urlPattern: "instagram.com/graphql/query", // URL substring match
+ bodyPattern: "User", // Response body substring match
+ key: "user_data", // Retrieval key
+});
+```
+
+#### `page.getCapturedResponse(key)`
+
+Retrieve a captured network response. Returns the parsed JSON body or `null`.
+
+```javascript
+const response = await page.getCapturedResponse("user_data");
+if (response) {
+ const userData = response.data.user;
+}
+```
+
+#### `page.clearNetworkCaptures()`
+
+Clear all registered network captures.
+
+## Important Notes
+
+1. **`page.evaluate()` takes a STRING, not a function.** This is the most common mistake. The string is evaluated in the browser context.
+
+2. **Variable passing:** You cannot use closures. Variables from the connector scope must be serialized:
+
+ ```javascript
+ // WRONG — variable not available in browser context
+ const url = "/api/data";
+ await page.evaluate(`fetch(url)`);
+
+ // CORRECT — interpolate the value
+ const url = "/api/data";
+ await page.evaluate(`fetch(${JSON.stringify(url)})`);
+ ```
+
+3. **Async evaluate:** Wrap async code in an IIFE:
+
+ ```javascript
+ const data = await page.evaluate(`
+ (async () => {
+ const resp = await fetch('/api/data');
+ return await resp.json();
+ })()
+ `);
+ ```
+
+4. **Error handling in evaluate:** Always try-catch inside the evaluated string:
+ ```javascript
+ const result = await page.evaluate(`
+ (async () => {
+ try {
+ const resp = await fetch('/api/data', { credentials: 'include' });
+ if (!resp.ok) return { _error: resp.status };
+ return await resp.json();
+ } catch(e) { return { _error: e.message }; }
+ })()
+ `);
+ ```
diff --git a/skills/create-connector/reference/PATTERNS.md b/skills/create-connector/reference/PATTERNS.md
new file mode 100644
index 00000000..53812d40
--- /dev/null
+++ b/skills/create-connector/reference/PATTERNS.md
@@ -0,0 +1,470 @@
+# Data Extraction Patterns
+
+## Choosing an approach
+
+Research the platform first. The right extraction strategy depends on the platform's auth model and what's easiest for a normal (non-technical) user. Consider:
+
+- **Does the user already have a browser session?** Most do. Browser login → extract data is the most natural UX. The user just logs in like they normally would.
+- **Does the platform offer API keys or personal tokens?** Some users will prefer this (quick, no browser), but many won't know what an API key is. If you use this approach, guide the user clearly.
+- **Is the platform's API on the same origin as the app?** If yes, in-page fetch works. If not (CORS), use `closeBrowser()` + `httpFetch()` to make requests from Node.js with extracted cookies.
+- **Does the platform only render data in the DOM (no usable API)?** DOM extraction always works as a last resort.
+
+There is no fixed ordering. Pick the approach that gives the best user experience for the specific platform. You can combine approaches (e.g., browser login for auth + httpFetch for data extraction).
+
+### Available tools
+
+| Tool | What it does | When to use |
+| --------------------------- | --------------------------------------- | ----------------------------------------------- |
+| `page.evaluate(js)` | Run JS in the browser page | In-page fetch, DOM extraction, login detection |
+| `page.closeBrowser()` | Close browser, extract session cookies | Before switching to httpFetch |
+| `page.httpFetch(url, opts)` | Node.js HTTP with auto-injected cookies | Cross-origin APIs (bypasses CORS), API key auth |
+| `page.captureNetwork(...)` | Intercept network responses | Platforms that load data during page bootstrap |
+| `page.requestInput(...)` | Ask user for structured input | Credentials, API keys, 2FA codes |
+
+### API key auth pattern
+
+If the platform supports API keys and that's the best UX for the user:
+
+```javascript
+let apiKey = process.env.API_KEY_PLATFORMNAME || "";
+if (!apiKey) {
+ const input = await page.requestInput({
+ message: "Enter your Platform API key (find it at Settings → API)",
+ schema: {
+ type: "object",
+ properties: {
+ apiKey: { type: "string", title: "API Key" },
+ },
+ required: ["apiKey"],
+ },
+ });
+ apiKey = input.apiKey;
+}
+
+await page.closeBrowser();
+const resp = await page.httpFetch("https://api.platform.com/v1/me", {
+ headers: { Authorization: "Bearer " + apiKey },
+});
+```
+
+### Browser login + httpFetch pattern
+
+If the user should log in via browser and the API is cross-origin:
+
+```javascript
+// 1. Navigate to login page, wait for user to log in
+await page.goto("https://platform.com/login");
+// ... login detection logic ...
+
+// 2. Close browser — cookies are extracted automatically
+await page.closeBrowser();
+
+// 3. Make API calls from Node.js with session cookies (no CORS)
+const resp = await page.httpFetch("https://api.platform.com/v1/me");
+```
+
+---
+
+## Extraction Ladder
+
+If you're unsure which approach works, try each rung. Max 2 attempts per rung before moving on.
+
+## Rung 1: In-Page Fetch
+
+**Try first.** Use `fetch()` or `XMLHttpRequest` from `page.evaluate()` to call the platform's API with the browser's existing session cookies.
+**Example:** LinkedIn, ChatGPT, Spotify
+
+### How to discover APIs:
+
+1. Open the platform in Chrome
+2. Open DevTools > Network tab
+3. Filter by XHR/Fetch
+4. Browse the platform — watch for JSON responses
+5. Note the endpoint URLs, required headers, auth mechanisms
+
+### Implementation — API fetch helper:
+
+```javascript
+const fetchApi = async (endpoint) => {
+ const endpointStr = JSON.stringify(endpoint);
+ try {
+ return await page.evaluate(`
+ (async () => {
+ try {
+ // Get CSRF token from cookies (platform-specific)
+ const csrfToken = (document.cookie.match(/JSESSIONID="?([^";]+)/) || [])[1] || '';
+ const resp = await fetch(${endpointStr}, {
+ headers: { 'csrf-token': csrfToken },
+ credentials: 'include'
+ });
+ if (!resp.ok) return { _error: resp.status };
+ return await resp.json();
+ } catch(e) { return { _error: e.message }; }
+ })()
+ `);
+ } catch (e) {
+ return { _error: e.message || String(e) };
+ }
+};
+
+// Usage
+const data = await fetchApi("/api/v1/me");
+if (data._error) {
+ await page.setData("error", "API failed: " + data._error);
+ return;
+}
+```
+
+### Auth token extraction (ChatGPT pattern):
+
+Some platforms embed auth tokens in the page source:
+
+```javascript
+const token = await page.evaluate(`
+ (() => {
+ try {
+ // Look for auth tokens in script tags
+ const bootstrapEl = document.getElementById('client-bootstrap');
+ if (bootstrapEl) {
+ const data = JSON.parse(bootstrapEl.textContent);
+ return data.accessToken || null;
+ }
+ return null;
+ } catch { return null; }
+ })()
+`);
+
+// Use token in API calls
+const tokenStr = JSON.stringify(token);
+const data = await page.evaluate(`
+ (async () => {
+ const resp = await fetch('/backend-api/conversations', {
+ headers: { 'Authorization': 'Bearer ' + ${tokenStr} }
+ });
+ return await resp.json();
+ })()
+`);
+```
+
+**When to move on:**
+
+- `fetch()` returns 401/403 with `credentials: 'include'`
+- Response is HTML (login page redirect) instead of JSON
+- CORS error in browser console ("Failed to fetch", "blocked by CORS policy")
+- Auth token not found in cookies, localStorage, sessionStorage, or page source
+
+**CORS workaround — try `httpFetch` before Rung 2:** If Rung 1 fails due to CORS (the API is on a different origin from the app), try `page.closeBrowser()` + `page.httpFetch()`. This extracts session cookies from the browser and makes Node.js-side requests — no CORS. Only move to Rung 2 if `httpFetch` also fails (e.g., cookies are TLS-bound or the server rejects non-browser requests).
+
+```javascript
+// After login is confirmed:
+await page.closeBrowser(); // extracts cookies
+
+const resp = await page.httpFetch("https://api.platform.com/graphql", {
+ method: "POST",
+ headers: { "Content-Type": "application/json" },
+ body: JSON.stringify({ query: "{ viewer { id name } }" }),
+});
+if (resp.ok && resp.json) {
+ // httpFetch works — use it for all data collection
+}
+```
+
+### Parallel API calls:
+
+```javascript
+const [profileData, positionsData] = await Promise.all([
+ fetchApi("/api/profile"),
+ fetchApi("/api/positions"),
+]);
+```
+
+### Paginated API calls:
+
+```javascript
+const allItems = [];
+let offset = 0;
+const limit = 50;
+
+while (true) {
+ await page.setProgress({
+ phase: { step: 2, total: 3, label: "Fetching items" },
+ message: `Fetched ${allItems.length} items so far...`,
+ count: allItems.length,
+ });
+
+ const data = await fetchApi(`/api/items?offset=${offset}&limit=${limit}`);
+ if (data._error) break;
+
+ const items = data.elements || [];
+ allItems.push(...items);
+
+ if (items.length < limit) break; // last page
+ offset += limit;
+ await page.sleep(500); // rate limiting
+}
+```
+
+---
+
+## Rung 2: Network Capture
+
+**Try if Rung 1 failed.** Register `captureNetwork` _before_ navigating to intercept API responses during page bootstrap, before the app switches to WebSocket or other transports.
+**Example:** Instagram, Twitter/X
+
+### Implementation:
+
+```javascript
+// 1. Register capture BEFORE navigating
+await page.captureNetwork({
+ urlPattern: "instagram.com/graphql/query", // URL substring to match
+ bodyPattern: "PolarisProfilePage", // Response body substring
+ key: "profile_data", // Key for retrieval
+});
+
+// 2. Navigate to trigger the request
+await page.goto("https://www.instagram.com/username/");
+await page.sleep(3000); // wait for requests to fire
+
+// 3. Retrieve captured response
+const response = await page.getCapturedResponse("profile_data");
+if (response) {
+ const user = response.data?.user;
+ // Process user data...
+}
+```
+
+### Multiple captures:
+
+```javascript
+// Register multiple captures
+await page.captureNetwork({
+ urlPattern: "/graphql",
+ bodyPattern: "UserProfile",
+ key: "user",
+});
+await page.captureNetwork({
+ urlPattern: "/graphql",
+ bodyPattern: "UserMedia",
+ key: "media",
+});
+
+await page.goto("https://platform.com/profile");
+await page.sleep(3000);
+
+const userResp = await page.getCapturedResponse("user");
+const mediaResp = await page.getCapturedResponse("media");
+```
+
+**When to move on to Rung 3:**
+
+- `getCapturedResponse()` returns null after navigation + 5s wait
+- Captured data is not useful (only static config, not user data)
+- Platform uses a query allowlist (captured credentials can't make arbitrary API calls)
+
+---
+
+## Rung 3: DOM Extraction
+
+**The most reliable rung.** Navigate to pages and extract data from the rendered DOM. If data is visible in the browser, it can be scraped. Works regardless of auth mechanism, including WebSocket-based SPAs.
+
+### Selector strategy (critical):
+
+Use ARIA roles, data attributes, semantic HTML, and tag structure for selectors. The validator flags obfuscated class names.
+
+- Tag structure: `main > section`, `h2`, `p`
+- ARIA roles: `[role="main"]`, `[aria-label*="repositories"]`
+- Data attributes: `[data-testid="profile-name"]`, `[itemprop="name"]`
+- Semantic HTML: `nav`, `article`, `header`, `aside`
+- Text content matching via JS
+
+### Implementation:
+
+```javascript
+const profileData = await page.evaluate(`
+ (() => {
+ // Use stable selectors
+ const name = (document.querySelector('span[itemprop="name"]')?.textContent || '').trim();
+ const bio = (document.querySelector('div[data-bio-text]')?.textContent || '').trim();
+
+ // Use structural selectors as fallback
+ const stats = document.querySelectorAll('nav a span');
+ const followers = stats.length > 0 ? stats[0]?.textContent?.trim() : '';
+
+ return { name, bio, followers };
+ })()
+`);
+```
+
+### Pagination via DOM:
+
+```javascript
+const allItems = [];
+let pageNum = 1;
+const maxPages = 20;
+
+while (pageNum <= maxPages) {
+ await page.goto(`https://platform.com/items?page=${pageNum}`);
+ await page.sleep(1500);
+
+ const items = await page.evaluate(`
+ (() => {
+ const rows = document.querySelectorAll('[data-testid="item-row"]');
+ return Array.from(rows).map(row => ({
+ title: (row.querySelector('h3')?.textContent || '').trim(),
+ url: row.querySelector('a')?.href || '',
+ }));
+ })()
+ `);
+
+ if (!items || items.length === 0) break;
+ allItems.push(...items);
+
+ // Check for next page
+ const hasNext = await page.evaluate(
+ `!!document.querySelector('a[rel="next"]')`,
+ );
+ if (!hasNext) break;
+
+ pageNum++;
+ await page.sleep(500);
+}
+```
+
+---
+
+## Putting It Together: The Extraction Ladder
+
+When building a new connector, try each rung in order. A single test call tells you whether to continue or move on.
+
+```javascript
+// Rung 1: try in-page fetch
+const probe = await page.evaluate(`
+ (async () => {
+ try {
+ const r = await fetch('/api/v1/me', { credentials: 'include' });
+ if (!r.ok) return { _failed: true, status: r.status };
+ const ct = r.headers.get('content-type') || '';
+ if (!ct.includes('json')) return { _failed: true, reason: 'not-json' };
+ return await r.json();
+ } catch(e) { return { _failed: true, error: e.message }; }
+ })()
+`);
+
+if (!probe._failed) {
+ // Rung 1 works -- use fetchApi pattern for all data collection
+} else {
+ // Rung 1 failed -- go to Rung 2 or 3
+
+ // Rung 2: captureNetwork (must be set up BEFORE navigating)
+ await page.captureNetwork({ key: "api", urlPattern: "api.platform.com" });
+ await page.goto("https://platform.com/dashboard");
+ await page.sleep(5000);
+ const captured = await page.getCapturedResponse("api");
+
+ if (captured && captured.data) {
+ // Rung 2 works -- use network capture pattern
+ } else {
+ // Rung 3: DOM extraction -- always works
+ const data = await page.evaluate(`
+ (() => {
+ // Read data from the rendered page
+ const items = document.querySelectorAll('[data-testid="item"]');
+ return Array.from(items).map(el => ({
+ title: (el.querySelector('h3')?.textContent || '').trim(),
+ // ...
+ }));
+ })()
+ `);
+ }
+}
+```
+
+---
+
+## Platform Characteristics That Affect Strategy
+
+### WebSocket-based SPAs
+
+Some platforms (e.g., real-time collaboration tools, project management apps) load data over **WebSocket** after the initial page render, not via HTTP fetch calls. This has major implications:
+
+- **`captureNetwork` captures nothing** — network capture only intercepts HTTP requests, not WebSocket frames.
+- **In-page `fetch()` won't find same-origin API endpoints** — the platform may not have REST/GraphQL endpoints accessible from the browser page context at all.
+- **`httpFetch` with extracted cookies often fails** — if the API is behind Cloudflare or similar bot protection, cookies are bound to the browser's TLS context and won't replay from Node.js.
+
+**How to detect:** After login, open DevTools Network tab and filter by WS/WebSocket. If the app loads data over a WebSocket connection rather than XHR/fetch, you're dealing with this pattern.
+
+**What works:** API keys (if the platform offers them) or DOM extraction (Rung 3). The extraction ladder's Rungs 1–2 will fail — recognize the pattern early and skip to what works.
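+
+One way to confirm the pattern programmatically is to wrap the page's `WebSocket` constructor before the app boots and log incoming frames. This is a sketch written against a generic WebSocket-like class so the idea is shown outside a browser; in practice you would run the equivalent inside `page.evaluate()`:
+
+```javascript
+// Wrap a WebSocket-like constructor so every incoming frame is logged.
+// If frames carry the app's data as JSON, you're in WebSocket-SPA territory.
+const instrumentWebSocket = (WS, log) =>
+  class extends WS {
+    constructor(...args) {
+      super(...args);
+      this.addEventListener("message", (event) => {
+        // A short prefix per frame is enough to spot JSON payloads.
+        log(String(event.data).slice(0, 120));
+      });
+    }
+  };
+
+// In the page: window.WebSocket = instrumentWebSocket(window.WebSocket, console.log);
+```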
+
+### Cloudflare-protected APIs
+
+Some platforms use Cloudflare (or similar CDN/bot protection) that binds session cookies to the browser's TLS fingerprint. Symptoms:
+
+- Browser login works fine, cookies are extracted successfully
+- `httpFetch` with those cookies returns 401/403
+- The same cookies work in the browser but not from Node.js
+
+**What works:** In-page `fetch()` (if same-origin), API keys, or DOM extraction. The `closeBrowser()` + `httpFetch()` strategy is non-viable for these platforms.
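+
+If you do attempt the cookie-replay route before ruling it out, the header itself is simple to build. This is a sketch (the `httpFetch` call is shown only as a comment); the point is that a 401/403 from a well-formed Cookie header is the Cloudflare symptom described above, not a bug in your code:
+
+```javascript
+// Build a Cookie header from cookies extracted out of the browser context.
+const toCookieHeader = (cookies) =>
+  cookies.map(({ name, value }) => `${name}=${value}`).join("; ");
+
+// const res = await httpFetch(url, { headers: { Cookie: toCookieHeader(cookies) } });
+// A 401/403 here despite valid cookies suggests TLS-bound protection;
+// fall back to in-page fetch or DOM extraction.
+```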
+
+---
+
+## Common Patterns
+
+### Login detection:
+
+Use URL-based detection as the primary signal. DOM selectors are supplementary.
+
+```javascript
+const checkLoginStatus = async () => {
+ try {
+ return await page.evaluate(`
+ (() => {
+ const path = window.location.pathname;
+
+ // URL-based (primary)
+ if (/\\/(login|signin|sign-in|auth|sso|callback)/.test(path)) return false;
+ if (path === '/') return false;
+ if (!window.location.hostname.includes('PLATFORM_DOMAIN')) return false;
+
+ if (!!document.querySelector('input[type="password"]')) return false;
+
+ // DOM-based (supplementary) -- use a selector specific to the app shell
+ // Good: meta[name='user-login'][content], button[data-testid='user-widget-link']
+ // Bad: aside, nav, main (too generic, matches marketing pages)
+ return !!document.querySelector('LOGGED_IN_SELECTOR');
+ })()
+ `);
+ } catch (e) {
+ return false; // navigation in progress (e.g. OAuth redirect)
+ }
+};
+```
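+
+`checkLoginStatus` is typically polled while the user completes login in the browser. A generic poller like this works; `waitFor` and the timings are illustrative, not part of the page API:
+
+```javascript
+// Poll an async predicate until it returns true or the deadline passes.
+const waitFor = async (predicate, { timeoutMs = 120000, intervalMs = 1000 } = {}) => {
+  const deadline = Date.now() + timeoutMs;
+  while (Date.now() < deadline) {
+    if (await predicate()) return true;
+    await new Promise((resolve) => setTimeout(resolve, intervalMs));
+  }
+  return false;
+};
+
+// const loggedIn = await waitFor(checkLoginStatus, { timeoutMs: 180000 });
+```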
+
+### Dismissing popups/modals:
+
+```javascript
+// Dismiss cookie banners, upgrade prompts, etc.
+await page.evaluate(`
+ (() => {
+ const dismissSelectors = [
+ 'button[aria-label="Close"]',
+ 'button[aria-label="Dismiss"]',
+ '[data-testid="close-button"]',
+ ];
+ for (const sel of dismissSelectors) {
+ const btn = document.querySelector(sel);
+ if (btn) { btn.click(); break; }
+ }
+ })()
+`);
+await page.sleep(500);
+```
+
+### Safe text extraction:
+
+```javascript
+// Always guard against null/undefined
+const getText = (selector) =>
+ `(document.querySelector('${selector}')?.textContent || '').trim()`;
+
+const name = await page.evaluate(getText("h1.profile-name"));
+```
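+
+The template above interpolates the selector into a single-quoted string, which breaks if the selector itself contains a quote (common when selectors are built from data, e.g. attribute values). A small escape helper avoids that; `quoteForEval` and `getTextSafe` are illustrative names:
+
+```javascript
+// Escape a value for safe interpolation into a single-quoted JS string.
+const quoteForEval = (s) => s.replace(/\\/g, "\\\\").replace(/'/g, "\\'");
+
+const getTextSafe = (selector) =>
+  `(document.querySelector('${quoteForEval(selector)}')?.textContent || '').trim()`;
+
+// const name = await page.evaluate(getTextSafe(`a[title="O'Brien"]`));
+```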
diff --git a/skills/next-prompt/SKILL.md b/skills/next-prompt/SKILL.md
new file mode 100644
index 00000000..d0859432
--- /dev/null
+++ b/skills/next-prompt/SKILL.md
@@ -0,0 +1,119 @@
+---
+name: next-prompt
+description: >
+ Generate and execute the next agent prompt from connected personal data.
+ Use when: the user says "what should I work on", "vana next", "what's next",
+ or when the agent has no pending task. Also triggers after completing a task
+ when autopilot is enabled.
+---
+
+# Next Prompt
+
+Generate the next task from connected personal data and user-defined guidance.
+
+## Config
+
+Read `~/.vana/next-prompt.md` for the user's guidance. If it doesn't exist, create it:
+
+```markdown
+# Next Prompt
+
+## Priorities
+
+- (edit this list to steer your agent)
+
+## Standing instructions
+
+- Prefer small, completable tasks
+
+## Notify me when
+
+- Something needs my decision
+- You're about to take an irreversible action
+```
+
+## Flow
+
+### 1. Gather context and check freshness
+
+```bash
+vana status --json
+```
+
+This tells you what sources are connected, their sync state, and `lastCollectedAt` timestamps.
+
+Check freshness: if any source's `lastCollectedAt` is more than 24 hours old, suggest recollection before generating prompts:
+
+- For one source: `vana collect <source>`
+- For all stale sources: `vana collect` (re-collects all connected sources)
+
+Only suggest recollection; don't block on it. Work with whatever data is available.
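+
+The freshness rule can be sketched as a small check over the parsed `vana status --json` output. This is a sketch; the exact shape of the status payload may differ, but each source entry carries a `lastCollectedAt` ISO timestamp as described above:
+
+```javascript
+// List sources whose last collection is older than maxAgeHours.
+const staleSources = (sources, maxAgeHours = 24, now = Date.now()) =>
+  sources.filter((s) => {
+    if (!s.lastCollectedAt) return false; // never collected: a connect problem, not staleness
+    return now - Date.parse(s.lastCollectedAt) > maxAgeHours * 60 * 60 * 1000;
+  });
+```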
+
+### 2. Read connected data
+
+For each connected source, read the result file:
+
+```bash
+ls ~/.vana/results/
+```
+
+Read each JSON file. Look for timestamped data from the last 24 hours. Common timestamp fields:
+
+- `created_at`, `updated_at`, `timestamp`, `date`
+- `create_time` (ChatGPT conversations)
+- `startedAt`, `endedAt` (Oura, activity data)
+
+If timestamps aren't available, treat all data as current.
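+
+A recency filter over those mixed field names can be sketched like this (an illustrative helper, not part of the CLI; the field list mirrors the one above, and `create_time`-style unix-seconds values are tolerated):
+
+```javascript
+const TS_FIELDS = [
+  "created_at", "updated_at", "timestamp", "date",
+  "create_time", "startedAt", "endedAt",
+];
+
+// Keep records from the last 24 hours; records with no recognized
+// timestamp field are treated as current, per the rule above.
+const isRecent = (record, now = Date.now()) => {
+  const field = TS_FIELDS.find((f) => record[f] != null);
+  if (!field) return true;
+  const raw = record[field];
+  const t = typeof raw === "number"
+    ? raw * (raw < 1e12 ? 1000 : 1) // unix seconds vs. milliseconds
+    : Date.parse(raw);
+  return now - t <= 24 * 60 * 60 * 1000;
+};
+```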
+
+### 3. Read guidance
+
+```bash
+cat ~/.vana/next-prompt.md
+```
+
+### 4. Reason and generate
+
+Based on the connected data and guidance, generate 1-3 prioritized suggestions. Each suggestion should be:
+
+- Specific enough to execute without further clarification
+- Aligned with the user's stated priorities
+- Informed by what the data shows (recent activity, pending items, time-sensitive things)
+
+Present them:
+
+```
+Based on your data and priorities:
+
+1. [Most important action with reasoning]
+2. [Second action]
+3. [Third action]
+
+Pick one, or say "go" and I'll start on #1.
+```
+
+### 5. Execute or wait
+
+If the user picks one or says "go", execute it as your next task.
+
+If `~/.vana/next-prompt.md` says not to notify for this type of task, skip the prompt and execute directly.
+
+## What to look for in each source
+
+**GitHub:** Recent commits (what was worked on), open issues, PRs awaiting review, dependency alerts, repos with recent activity vs. stale repos.
+
+**ChatGPT:** Recent conversation topics (what the user is thinking about), saved memories (stated preferences and goals), repeated questions (knowledge gaps or recurring concerns).
+
+**LinkedIn:** Unread messages (especially from contacts matching "anchor customer" or similar priority labels), profile views, job-relevant activity.
+
+**Spotify:** Listening patterns can indicate work state (focus music = deep work, podcasts = learning, silence = meetings or away).
+
+**Shop/Uber:** Time-sensitive receipts, returns windows, upcoming trips.
+
+## Rules
+
+1. Never fabricate data. Only reference what's actually in the result files.
+2. Respect the notify/don't-notify preferences in the config.
+3. If no data is connected, list the unconnected sources and tell the user to connect them in their own terminal. Do NOT run `vana connect` yourself — that is a separate skill (`connect-data`) and most sources require a headed browser you cannot access.
+4. Work with whatever data IS available. Do not block on missing sources.
+5. Weight time-sensitive items higher (messages aging toward an SLA, expiring deadlines).
+6. Don't repeat suggestions the user has already dismissed.
diff --git a/skills/registry.json b/skills/registry.json
new file mode 100644
index 00000000..690fae4c
--- /dev/null
+++ b/skills/registry.json
@@ -0,0 +1,35 @@
+{
+ "skills": [
+ {
+ "id": "connect-data",
+ "name": "Connect Data",
+ "description": "How to connect platforms and collect personal data using the vana CLI",
+ "version": "1.0.0",
+ "files": {
+ "skill": "connect-data/SKILL.md"
+ }
+ },
+ {
+ "id": "create-connector",
+ "name": "Create Connector",
+ "description": "How to build, test, validate, and contribute a new data connector",
+ "version": "1.0.0",
+ "files": {
+ "skill": "create-connector/SKILL.md",
+ "supplemental": [
+ "create-connector/reference/PAGE-API.md",
+ "create-connector/reference/PATTERNS.md"
+ ]
+ }
+ },
+ {
+ "id": "next-prompt",
+ "name": "Next Prompt",
+ "description": "Generate and execute the next agent prompt from connected personal data",
+ "version": "1.0.0",
+ "files": {
+ "skill": "next-prompt/SKILL.md"
+ }
+ }
+ ]
+}
diff --git a/src/cli/bin.ts b/src/cli/bin.ts
new file mode 100644
index 00000000..10ad4e20
--- /dev/null
+++ b/src/cli/bin.ts
@@ -0,0 +1,17 @@
+#!/usr/bin/env node
+
+import { runCli } from "./index.js";
+
+for (const stream of [process.stdout, process.stderr]) {
+ stream.on("error", (error: NodeJS.ErrnoException) => {
+ if (error.code === "EPIPE") {
+ process.exit(0);
+ }
+ throw error;
+ });
+}
+
+const exitCode = await runCli(process.argv);
+if (typeof exitCode === "number") {
+ process.exitCode = exitCode;
+}
diff --git a/src/cli/index.ts b/src/cli/index.ts
new file mode 100644
index 00000000..599133c6
--- /dev/null
+++ b/src/cli/index.ts
@@ -0,0 +1,4662 @@
+import fs from "node:fs";
+import fsp from "node:fs/promises";
+import path from "node:path";
+import { createRequire } from "node:module";
+import { spawn, execSync } from "node:child_process";
+import os from "node:os";
+
+import { confirm, input, password, select } from "@inquirer/prompts";
+import { Command, CommanderError } from "commander";
+
+// Vana-branded theme for inquirer prompts — matches brand palette
+const VANA_BLUE = "\x1b[38;2;65;65;252m";
+const VANA_MUTED = "\x1b[38;2;112;112;112m";
+const RESET = "\x1b[0m";
+const BOLD = "\x1b[1m";
+const BOLD_RESET = "\x1b[22m";
+const vanaPromptTheme = {
+ theme: {
+ prefix: { idle: `${VANA_BLUE}?${RESET}`, done: `${VANA_BLUE}✓${RESET}` },
+ style: {
+ answer: (text: string) => `${BOLD}${text}${BOLD_RESET}`,
+ message: (text: string, status: "idle" | "done" | "loading") =>
+ status === "done" ? `${VANA_MUTED}${text}${RESET}` : text,
+ highlight: (text: string) => `${VANA_BLUE}${text}${RESET}`,
+ help: (text: string) => `${VANA_MUTED}${text}${RESET}`,
+ error: (text: string) => `\x1b[38;2;231;0;11m${text}${RESET}`,
+ },
+ },
+};
+
+import {
+ createConnectRenderer,
+ createHumanRenderer,
+ formatDisplayPath,
+ formatRelativeTime,
+} from "./render/index.js";
+import type { ConnectRenderer } from "./render/connect-renderer.js";
+import {
+ CliOutcomeStatus,
+ migrateLegacyDataHome,
+ getBrowserProfilesDir,
+ getConnectorCacheDir,
+ getLogsDir,
+ getSessionsDir,
+ getSourceResultPath,
+ readCliState,
+ readCliConfig,
+ updateCliConfig,
+ updateSourceState,
+} from "../core/index.js";
+import type {
+ CliChannel,
+ CliEvent,
+ CliInstallMethod,
+ CliOutcome,
+ CliStatus,
+ SourceStatus,
+} from "../core/cli-types.js";
+import type { AvailableSource } from "../connectors/registry.js";
+import {
+ fetchConnectorToCache,
+ listAvailableSources,
+ readCachedConnectorMetadata,
+} from "../connectors/registry.js";
+import {
+ detectPersonalServerTarget,
+ ingestResult,
+} from "../personal-server/index.js";
+import {
+ findDataConnectorsDir,
+ ManagedPlaywrightRuntime,
+} from "../runtime/index.js";
+import {
+ listAvailableSkills,
+ installSkill,
+ readInstalledSkills,
+} from "../skills/index.js";
+import {
+ queryStatus,
+ querySources,
+ queryDataList,
+ queryDataShow,
+ queryDoctor,
+} from "./queries.js";
+
+interface GlobalOptions {
+ json?: boolean;
+ noInput?: boolean;
+ ipc?: boolean;
+ yes?: boolean;
+ quiet?: boolean;
+ detach?: boolean;
+}
+
+interface SourceLabelMap {
+ [source: string]: string;
+}
+
+interface SourceMetadataMap {
+ [source: string]: {
+ name: string;
+ company?: string;
+ description?: string;
+ authMode?: "automated" | "interactive" | "legacy";
+ };
+}
+
+function cleanDescription(desc: string): string {
+ return desc
+ .replace(/ using Playwright browser automation\.?/i, ".")
+ .replace(/^Exports\b\s*(your\s+)?/i, "Your ");
+}
+
+interface Emitter {
+ event(event: CliEvent | CliOutcome): void;
+ info(message: string): void;
+ blank(): void;
+ title(message: string): void;
+ success(message: string): void;
+ section(message: string): void;
+ keyValue(label: string, value: string, tone?: RenderTone): void;
+ detail(message: string): void;
+ next(command: string): void;
+ bullet(message: string): void;
+ sourceTitle(
+ name: string,
+ badges?: Array<{ text: string; tone?: RenderTone }>,
+ ): void;
+ badge(text: string, tone?: RenderTone): string;
+ code(text: string): string;
+}
+
+type RenderTone = "accent" | "success" | "warning" | "error" | "muted" | "info";
+const require = createRequire(import.meta.url);
+
+type SourceStatusDetail =
+ | {
+ kind: "text";
+ message: string;
+ }
+ | {
+ kind: "row";
+ label: string;
+ value: string;
+ tone?: RenderTone;
+ };
+
+export async function runCli(argv = process.argv): Promise<number> {
+ // Migrate ~/.dataconnect → ~/.vana, symlink old path for DataConnect compat
+ if (migrateLegacyDataHome()) {
+ process.stderr.write("Moved your data to ~/.vana.\n\n");
+ }
+
+ const normalizedArgv = normalizeArgv(argv);
+ if (normalizedArgv.length <= 2) {
+ normalizedArgv.push("--help");
+ }
+ const parsedOptions = extractGlobalOptions(normalizedArgv);
+ const cliVersion = getCliVersion();
+ const program = new Command();
+ program
+ .name("vana")
+ .description("Connect sources, collect data, and inspect it locally.")
+ .version(cliVersion, "-v, --version", "Print CLI version")
+ .showSuggestionAfterError(true)
+ .addHelpText(
+ "after",
+ `
+Quick start:
+ vana connect <source>       Connect a source and collect data
+ vana sources                Browse available sources
+ vana status                 Check system health
+
+Data:
+ vana data list              List collected datasets
+ vana data show <source>     Inspect a dataset
+
+Server:
+ vana server                 Personal Server status and management
+
+Agent:
+ vana mcp                    Start MCP server (for Claude Code, Cursor, etc.)
+ vana skills list            List available agent skills
+ vana skills install <name>  Install a skill for your agent
+
+Background:
+ vana connect <source> --detach   Connect in the background
+ vana schedule add           Schedule daily collection
+ vana schedule list          Show scheduled tasks
+
+More:
+ vana doctor                 Detailed diagnostics
+ vana logs [source]          View run logs
+ vana setup                  Install or repair runtime
+`,
+ );
+ program.exitOverride();
+
+ program
+ .command("version")
+ .description("Print CLI version")
+ .option("--json", "Output machine-readable JSON")
+ .action(async () => {
+ if (parsedOptions.json) {
+ process.stdout.write(
+ `${JSON.stringify({
+ cliVersion,
+ channel: getCliChannel(cliVersion),
+ installMethod: getCliInstallMethod(),
+ })}\n`,
+ );
+ process.exitCode = 0;
+ return;
+ }
+
+ process.stdout.write(
+ `${cliVersion} (${getCliChannel(cliVersion)}, ${formatInstallMethodLabel(getCliInstallMethod()).toLowerCase()})\n`,
+ );
+ process.exitCode = 0;
+ });
+
+ const connectCommand = program
+ .command("connect [source]")
+ .description("Connect a source and collect data")
+ .option("--json", "Output machine-readable JSON")
+ .option("--no-input", "Fail instead of prompting for input")
+ .option("--ipc", "Use file-based IPC for credential prompts (for agents)")
+ .option("--yes", "Approve safe setup prompts automatically")
+ .option("--quiet", "Reduce non-essential output")
+ .option("--detach", "Run in the background")
+ .action(async (source?: string) => {
+ if (parsedOptions.detach && source) {
+ process.exitCode = await runDetached("connect", source, parsedOptions);
+ return;
+ }
+ process.exitCode = source
+ ? await runConnect(source, parsedOptions)
+ : await runConnectEntry(parsedOptions);
+ });
+ connectCommand.addHelpText(
+ "after",
+ `
+Examples:
+ vana connect
+ vana connect github
+ vana connect github --json --no-input
+ vana connect github --json --ipc
+`,
+ );
+
+ const sourcesCommand = program
+ .command("sources [source]")
+ .description("List supported sources, or show detail for one source")
+ .option("--json", "Output machine-readable JSON")
+ .action(async (source?: string) => {
+ process.exitCode = source
+ ? await runSourceDetail(source, parsedOptions)
+ : await runList(parsedOptions);
+ });
+ sourcesCommand.addHelpText(
+ "after",
+ `
+Examples:
+ vana sources
+ vana sources github
+ vana sources --json | jq '.sources'
+`,
+ );
+
+ const collectCommand = program
+ .command("collect [source]")
+ .description("Re-collect data from a previously connected source")
+ .option("--json", "Output machine-readable JSON")
+ .option("--no-input", "Fail instead of prompting for input")
+ .option("--ipc", "Use file-based IPC for credential prompts (for agents)")
+ .option("--yes", "Approve safe setup prompts automatically")
+ .option("--quiet", "Reduce non-essential output")
+ .option("--detach", "Run in the background")
+ .option("--all", "Collect from all connected sources")
+ .action(async (source?: string) => {
+ if (parsedOptions.detach && source) {
+ process.exitCode = await runDetached("collect", source, parsedOptions);
+ return;
+ }
+ process.exitCode = source
+ ? await runCollect(source, parsedOptions)
+ : await runCollectAll(parsedOptions);
+ });
+ collectCommand.addHelpText(
+ "after",
+ `
+Examples:
+ vana collect github
+ vana collect
+ vana collect --json
+`,
+ );
+
+ const statusCommand = program
+ .command("status")
+ .description("Show runtime and Personal Server status")
+ .option("--json", "Output machine-readable JSON")
+ .action(async () => {
+ process.exitCode = await runStatus(parsedOptions);
+ });
+ statusCommand.addHelpText(
+ "after",
+ `
+Examples:
+ vana status
+ vana status --json | jq
+`,
+ );
+
+ const doctorCommand = program
+ .command("doctor")
+ .description("Inspect local CLI, runtime, and install health")
+ .option("--json", "Output machine-readable JSON")
+ .action(async () => {
+ process.exitCode = await runDoctor(parsedOptions);
+ });
+ doctorCommand.addHelpText(
+ "after",
+ `
+Examples:
+ vana doctor
+ vana doctor --json | jq
+`,
+ );
+
+ const setupCommand = program
+ .command("setup")
+ .description("Install or repair the local runtime")
+ .option("--json", "Output machine-readable JSON")
+ .option("--yes", "Approve safe setup prompts automatically")
+ .action(async () => {
+ process.exitCode = await runSetup(parsedOptions);
+ });
+ setupCommand.addHelpText(
+ "after",
+ `
+Examples:
+ vana setup
+ vana setup --yes
+`,
+ );
+
+ const data = program
+ .command("data")
+ .description("Inspect collected datasets, paths, and summaries");
+ data.addHelpText(
+ "after",
+ `
+Examples:
+ vana data list
+ vana data show github
+ vana data path github --json
+`,
+ );
+ data.action(() => {
+ data.outputHelp();
+ process.exitCode = 0;
+ });
+
+ const dataListCommand = data
+ .command("list")
+ .description("List locally available collected datasets")
+ .option("--json", "Output machine-readable JSON")
+ .action(async () => {
+ process.exitCode = await runDataList(parsedOptions);
+ });
+ dataListCommand.addHelpText(
+ "after",
+ `
+Examples:
+ vana data list
+ vana data list --json | jq '.datasets'
+`,
+ );
+
+ const dataShowCommand = data
+ .command("show <source>")
+ .description("Show a collected dataset")
+ .option("--json", "Output machine-readable JSON")
+ .action(async (source: string) => {
+ process.exitCode = await runDataShow(source, parsedOptions);
+ });
+ dataShowCommand.addHelpText(
+ "after",
+ `
+Examples:
+ vana data show github
+ vana data show github --json | jq '.summary'
+`,
+ );
+
+ const dataPathCommand = data
+ .command("path ")
+ .description("Print the local path for a collected dataset")
+ .option("--json", "Output machine-readable JSON")
+ .action(async (source: string) => {
+ process.exitCode = await runDataPath(source, parsedOptions);
+ });
+ dataPathCommand.addHelpText(
+ "after",
+ `
+Examples:
+ vana data path github
+ vana data path github --json | jq -r '.path'
+`,
+ );
+
+ const logsCommand = program
+ .command("logs [source]")
+ .description("Inspect stored connector run logs")
+ .option("--json", "Output machine-readable JSON")
+ .action(async (source?: string) => {
+ process.exitCode = await runLogs(source, parsedOptions);
+ });
+ logsCommand.addHelpText(
+ "after",
+ `
+Examples:
+ vana logs
+ vana logs github
+ vana logs github --json | jq
+`,
+ );
+
+ const server = program
+ .command("server")
+ .description("Manage Personal Server connection")
+ .option("--json", "Output machine-readable JSON");
+ server.addHelpText(
+ "after",
+ `
+Examples:
+ vana server
+ vana server set-url http://localhost:8080
+ vana server set-url https://ps-abc123.server.vana.org
+ vana server clear-url
+`,
+ );
+ server.action(async () => {
+ process.exitCode = await runServerStatus(parsedOptions);
+ });
+
+ server
+ .command("status")
+ .description("Show Personal Server status")
+ .option("--json", "Output machine-readable JSON")
+ .action(async () => {
+ process.exitCode = await runServerStatus(parsedOptions);
+ });
+
+ server
+ .command("set-url <url>")
+ .description("Save a Personal Server URL")
+ .option("--json", "Output machine-readable JSON")
+ .action(async (url: string) => {
+ process.exitCode = await runServerSetUrl(url, parsedOptions);
+ });
+
+ server
+ .command("clear-url")
+ .description("Remove the saved Personal Server URL")
+ .option("--json", "Output machine-readable JSON")
+ .action(async () => {
+ process.exitCode = await runServerClearUrl(parsedOptions);
+ });
+
+ server
+ .command("sync")
+ .description("Sync all local-only datasets to your Personal Server")
+ .option("--json", "Output machine-readable JSON")
+ .action(async () => {
+ process.exitCode = await runServerSync(parsedOptions);
+ });
+
+ server
+ .command("data [scope]")
+ .description("List scopes stored in your Personal Server")
+ .option("--json", "Output machine-readable JSON")
+ .action(async (scope?: string) => {
+ process.exitCode = await runServerData(scope, parsedOptions);
+ });
+
+ program
+ .command("mcp")
+ .description("Start MCP server for agent integration")
+ .action(async () => {
+ const { startMcpServer } = await import("./mcp-server.js");
+ await startMcpServer();
+ });
+
+ const skill = program.command("skills").description("Manage agent skills");
+ skill.addHelpText(
+ "after",
+ `
+Examples:
+ vana skills list
+ vana skills install connect-data
+ vana skills show connect-data
+`,
+ );
+ skill.action(() => {
+ skill.outputHelp();
+ process.exitCode = 0;
+ });
+
+ skill
+ .command("list")
+ .description("List available agent skills")
+ .option("--json", "Output as JSON")
+ .action(async () => {
+ process.exitCode = await runSkillList(parsedOptions);
+ });
+
+ skill
+ .command("install <name>")
+ .description("Install a skill for your agent")
+ .action(async (name: string) => {
+ process.exitCode = await runSkillInstall(name, parsedOptions);
+ });
+
+ skill
+ .command("show <name>")
+ .description("Show skill details")
+ .action(async (name: string) => {
+ process.exitCode = await runSkillShow(name, parsedOptions);
+ });
+
+ // --- Schedule commands ---
+ const schedule = program
+ .command("schedule")
+ .description("Manage scheduled data collection");
+ schedule.addHelpText(
+ "after",
+ `
+Examples:
+ vana schedule add
+ vana schedule add --every 12h
+ vana schedule list
+ vana schedule remove
+`,
+ );
+ schedule.action(() => {
+ schedule.outputHelp();
+ process.exitCode = 0;
+ });
+
+ schedule
+ .command("add")
+ .description("Add a scheduled collection")
+ .option(
+ "--every ",
+ "Collection interval (e.g. 24h, 12h, 1h)",
+ "24h",
+ )
+ .action(async (opts: { every: string }) => {
+ process.exitCode = await runScheduleAdd(opts.every, parsedOptions);
+ });
+
+ schedule
+ .command("list")
+ .description("Show scheduled tasks")
+ .option("--json", "Output machine-readable JSON")
+ .action(async () => {
+ process.exitCode = await runScheduleList(parsedOptions);
+ });
+
+ schedule
+ .command("remove")
+ .description("Remove the scheduled collection")
+ .action(async () => {
+ process.exitCode = await runScheduleRemove(parsedOptions);
+ });
+
+ try {
+ await program.parseAsync(normalizedArgv);
+ } catch (error) {
+ if (error instanceof CommanderError) {
+ if (
+ error.code === "commander.help" ||
+ error.code === "commander.helpDisplayed" ||
+ error.code === "commander.version"
+ ) {
+ process.exitCode = error.exitCode;
+ return Number(process.exitCode ?? 0);
+ }
+ // Commander already printed to stderr; just set exit code.
+ process.exitCode = error.exitCode;
+ return Number(process.exitCode ?? 1);
+ }
+ throw error;
+ }
+ return Number(process.exitCode ?? 0);
+}
+
+async function runConnect(
+ rawSource: string,
+ options: GlobalOptions,
+): Promise<number> {
+ const source = rawSource.toLowerCase();
+ const runtime = new ManagedPlaywrightRuntime();
+ const emit = createEmitter(options);
+ const renderer: ConnectRenderer | null =
+ !options.json && !options.quiet ? createConnectRenderer() : null;
+ const registrySources = await loadRegistrySources();
+ const sourceLabels = createSourceLabelMap(registrySources);
+ const displayName = displaySource(source, sourceLabels);
+ let setupLogPath: string | undefined;
+ let fetchLogPath: string | undefined;
+ let runLogPath: string | undefined;
+ let terminalExitCode: number | null = null;
+
+ try {
+ // Title
+ renderer?.title(displayName);
+
+ const target = await detectPersonalServerTarget();
+
+ // --- Phase 1: Runtime check (silent if installed) ---
+ if (runtime.state !== "installed") {
+ if (options.noInput) {
+ emit.event({
+ type: "outcome",
+ status: CliOutcomeStatus.SETUP_REQUIRED,
+ source,
+ });
+ renderer?.fail(
+ `${displayName} needs a local browser runtime. Run without --no-input to install.`,
+ );
+ return 1;
+ }
+
+ if (!options.yes) {
+ renderer?.cleanup();
+ process.stderr.write("\n");
+ process.stderr.write("Vana Connect needs a local browser runtime.\n\n");
+ process.stderr.write("This will install:\n");
+ process.stderr.write(" \u2022 Connector runner\n");
+ process.stderr.write(" \u2022 Chromium browser engine\n");
+ process.stderr.write(" \u2022 Local files under ~/.vana/\n\n");
+ process.stderr.write("Your credentials stay on this machine.\n\n");
+
+ const shouldContinue = await confirm({
+ message: "Continue?",
+ default: true,
+ ...vanaPromptTheme,
+ });
+ if (!shouldContinue) {
+ renderer?.fail("Cancelled.");
+ emit.event({
+ type: "outcome",
+ status: CliOutcomeStatus.SETUP_REQUIRED,
+ source,
+ reason: "setup_declined",
+ });
+ return 1;
+ }
+ process.stderr.write("\n");
+ }
+
+ const installResult = await runtime.ensureInstalled(Boolean(options.yes));
+ setupLogPath = installResult.logPath;
+ emit.event({
+ type: "setup-complete",
+ runtime: installResult.runtime,
+ logPath: installResult.logPath,
+ });
+ renderer?.scopeDone("Runtime ready");
+ } else {
+ emit.event({
+ type: "setup-check",
+ runtime: runtime.state,
+ });
+ }
+
+ // --- Phase 2: Connector fetch (silent if cached/fast) ---
+ let fetched: Awaited<
+ ReturnType<ManagedPlaywrightRuntime["fetchConnector"]>
+ >;
+ try {
+ fetched = await runtime.fetchConnector(source);
+ } catch (firstError) {
+ const firstMessage =
+ firstError instanceof Error ? firstError.message : "";
+ const isChecksumError =
+ firstMessage.toLowerCase().includes("checksum") ||
+ firstMessage.toLowerCase().includes("mismatch");
+
+ // Auto-retry on stale cache: clear cached connector and re-fetch
+ // from remote (skip local data-connectors dir which may be stale).
+ if (isChecksumError) {
+ try {
+ const cacheDir = getConnectorCacheDir();
+ const sourceCacheDir = path.join(cacheDir, source);
+ await fsp.rm(sourceCacheDir, { recursive: true, force: true });
+ const resolution = await fetchConnectorToCache(
+ source,
+ cacheDir,
+ undefined, // force remote fetch, skip local data-connectors
+ );
+ fetched = {
+ connectorPath: resolution.connectorPath,
+ logPath: "",
+ version: resolution.version,
+ };
+ } catch (retryError) {
+ const retryMessage =
+ retryError instanceof Error
+ ? retryError.message
+ : `Could not fetch ${displayName} connector.`;
+ const message = formatHumanSourceMessage(
+ retryMessage,
+ source,
+ displayName,
+ );
+ await updateSourceState(source, {
+ connectorInstalled: false,
+ lastRunAt: new Date().toISOString(),
+ lastRunOutcome: CliOutcomeStatus.CONNECTOR_UNAVAILABLE,
+ dataState: "none",
+ lastError: message,
+ lastResultPath: null,
+ lastLogPath: getErrorLogPath(retryError),
+ });
+ renderer?.fail(`${displayName} connector could not be verified.`);
+ renderer?.detail(
+ `Try again later, or report: https://github.com/vana-com/data-connectors/issues`,
+ );
+ emit.event({
+ type: "outcome",
+ status: CliOutcomeStatus.CONNECTOR_UNAVAILABLE,
+ source,
+ reason: message,
+ });
+ return 1;
+ }
+ } else {
+ const message = formatHumanSourceMessage(
+ firstMessage ||
+ `No connector is available for ${displayName} right now.`,
+ source,
+ displayName,
+ );
+ await updateSourceState(source, {
+ connectorInstalled: false,
+ lastRunAt: new Date().toISOString(),
+ lastRunOutcome: CliOutcomeStatus.CONNECTOR_UNAVAILABLE,
+ dataState: "none",
+ lastError: message,
+ lastResultPath: null,
+ lastLogPath: getErrorLogPath(firstError),
+ });
+ renderer?.fail(`${displayName} is not available.`);
+ renderer?.detail(`See what’s ready: vana sources`);
+ emit.event({
+ type: "outcome",
+ status: CliOutcomeStatus.CONNECTOR_UNAVAILABLE,
+ source,
+ reason: message,
+ });
+ return 1;
+ }
+ }
+ fetchLogPath = fetched.logPath;
+ const sourceDetails = registrySources.find((item) => item.id === source);
+ const resolution = {
+ source,
+ connectorPath: fetched.connectorPath,
+ } as const;
+ emit.event({
+ type: "connector-resolved",
+ source: resolution.source,
+ connectorPath: resolution.connectorPath,
+ logPath: fetched.logPath,
+ });
+
+ // --- Phase 3: Pre-connection validation (silent) ---
+ const profilePath = path.join(
+ getBrowserProfilesDir(),
+ `${path.basename(resolution.connectorPath, path.extname(resolution.connectorPath))}`,
+ );
+
+ if (
+ sourceDetails?.authMode === "legacy" &&
+ !options.noInput &&
+ process.platform === "linux" &&
+ !process.env.DISPLAY &&
+ !process.env.WAYLAND_DISPLAY
+ ) {
+ const message =
+ "This source needs a manual browser step, but no local display server is available.";
+ await updateSourceState(resolution.source, {
+ connectorInstalled: true,
+ sessionPresent: fs.existsSync(profilePath),
+ lastRunAt: new Date().toISOString(),
+ lastRunOutcome: CliOutcomeStatus.LEGACY_AUTH,
+ dataState: "none",
+ lastError: message,
+ lastResultPath: null,
+ lastLogPath: fetchLogPath ?? null,
+ });
+ renderer?.fail(
+ `${displayName} requires a browser window, but no display is available.`,
+ );
+ renderer?.detail("Run this command in a desktop terminal.");
+ emit.event({
+ type: "outcome",
+ status: CliOutcomeStatus.LEGACY_AUTH,
+ source: resolution.source,
+ reason: "display_server_unavailable",
+ });
+ return 1;
+ }
+
+ await updateSourceState(resolution.source, {
+ connectorInstalled: true,
+ sessionPresent: fs.existsSync(profilePath),
+ lastError: null,
+ lastLogPath: fetchLogPath ?? null,
+ });
+
+ // --- Phase 4-5: Authentication + Collection ---
+ let finalStatus: CliOutcome["status"] =
+ CliOutcomeStatus.UNEXPECTED_INTERNAL_ERROR;
+ let finalDataState: SourceStatus["dataState"] = "none";
+ let ingestFailureMessage: string | null = null;
+ let resultPath = getSourceResultPath(source);
+ let collectedResult = false;
+ let ingestScopeResults:
+ | Array<{
+ scope: string;
+ status: "stored" | "failed";
+ syncedAt?: string;
+ error?: string;
+ }>
+ | undefined;
+
+ // In IPC mode (--ipc), don’t provide an interactive callback.
+ // The runtime will write a pending-input file and poll for the
+ // response, letting an external agent handle credential collection.
+ const interactiveCallback = options.ipc
+ ? undefined
+ : async (needInput: {
+ message?: string;
+ fields: string[];
+ schema?: { properties?: Record<string, unknown> };
+ responseInputPath: string;
+ }) => {
+ renderer?.pauseForPrompt();
+
+ // Show connector’s prompt message
+ if (renderer) {
+ const promptMessage =
+ needInput.message ?? `${displayName} needs your login.`;
+ process.stderr.write(`\n${promptMessage}\n\n`);
+ }
+
+ const values: Record<string, string> = {};
+ try {
+ for (const field of needInput.fields) {
+ const isPasswordField = field.toLowerCase().includes("password");
+ if (isPasswordField) {
+ values[field] = await password({
+ message: humanizeField(field),
+ ...vanaPromptTheme,
+ });
+ } else {
+ values[field] = await input({
+ message: humanizeField(field),
+ ...vanaPromptTheme,
+ });
+ }
+ }
+ } catch (error) {
+ if (isPromptCancelled(error)) {
+ throw new Error("__vana_prompt_cancelled__");
+ }
+ throw error;
+ }
+ if (renderer) {
+ process.stderr.write("\n");
+ }
+ renderer?.resumeAfterPrompt();
+ return values;
+ };
+
+ for await (const event of runtime.runConnector({
+ connectorPath: resolution.connectorPath,
+ source: resolution.source,
+ noInput: options.ipc ? false : options.noInput,
+ onNeedInput: interactiveCallback,
+ })) {
+ emit.event(event);
+ if (event.logPath) {
+ runLogPath = event.logPath;
+ }
+
+ if (terminalExitCode !== null) {
+ continue;
+ }
+
+ if (event.type === "needs-input") {
+ await updateSourceState(resolution.source, {
+ lastRunAt: new Date().toISOString(),
+ lastRunOutcome: CliOutcomeStatus.NEEDS_INPUT,
+ lastError: event.message ?? "Input required.",
+ lastLogPath: event.logPath,
+ connectionHealth: "needs_reauth",
+ });
+ emit.event({
+ type: "outcome",
+ status: CliOutcomeStatus.NEEDS_INPUT,
+ source: resolution.source,
+ });
+ renderer?.fail(
+ `${displayName} needs credentials. Run without --no-input to authenticate.`,
+ );
+ terminalExitCode = 1;
+ continue;
+ }
+
+ if (event.type === "progress-update") {
+ // Drive the renderer with scope information from the event
+ const scopeName = extractScopeName(event);
+ if (scopeName && renderer) {
+ const isComplete =
+ typeof event.message === "string" &&
+ /^complete\b/i.test(event.message.trim());
+ if (isComplete) {
+ const detail = formatScopeDetail(event);
+ renderer.scopeDone(scopeName, detail);
+ } else {
+ renderer.scopeActive(scopeName);
+ }
+ }
+ continue;
+ }
+
+ if (event.type === "status-update") {
+ // Status updates are silent in the new design
+ continue;
+ }
+
+ if (event.type === "runtime-error") {
+ await updateSourceState(resolution.source, {
+ lastRunAt: new Date().toISOString(),
+ lastRunOutcome: CliOutcomeStatus.RUNTIME_ERROR,
+ lastError: event.message ?? "Connector run failed.",
+ lastLogPath: event.logPath,
+ connectionHealth: "error",
+ });
+ renderer?.fail(`Problem connecting ${displayName}.`);
+ renderer?.detail(event.message ?? "Connector run failed.");
+ renderer?.detail(`Retry: vana connect ${source}`);
+ emit.event({
+ type: "outcome",
+ status: CliOutcomeStatus.RUNTIME_ERROR,
+ source: resolution.source,
+ });
+ terminalExitCode = 1;
+ continue;
+ }
+
+ if (event.type === "headed-required") {
+ // Silent — the browser opens automatically
+ continue;
+ }
+
+ if (event.type === "legacy-auth") {
+ await updateSourceState(resolution.source, {
+ lastRunAt: new Date().toISOString(),
+ lastRunOutcome: CliOutcomeStatus.LEGACY_AUTH,
+ lastError: event.message ?? "Legacy authentication is required.",
+ dataState: "none",
+ lastResultPath: null,
+ lastLogPath: event.logPath,
+ connectionHealth: "needs_reauth",
+ });
+ renderer?.fail(`Manual step required for ${displayName}.`);
+ renderer?.detail(
+ `Complete the browser step locally, then rerun vana connect ${source}.`,
+ );
+ emit.event({
+ type: "outcome",
+ status: CliOutcomeStatus.LEGACY_AUTH,
+ source: resolution.source,
+ });
+ terminalExitCode = 1;
+ continue;
+ }
+
+ if (event.type === "collection-complete" && event.resultPath) {
+ // Check if the result is actually an error object
+ try {
+ const raw = await fsp.readFile(event.resultPath, "utf8");
+ const parsed = JSON.parse(raw);
+ if (
+ parsed &&
+ typeof parsed === "object" &&
+ "error" in parsed &&
+ Object.keys(parsed).length <= 2
+ ) {
+ // Connector returned an error, not real data
+ await updateSourceState(source, {
+ lastRunAt: new Date().toISOString(),
+ lastRunOutcome: CliOutcomeStatus.RUNTIME_ERROR,
+ connectionHealth: "error",
+ lastError:
+ typeof parsed.error === "string"
+ ? parsed.error
+ : "Collection returned an error",
+ lastLogPath: runLogPath ?? fetchLogPath,
+ });
+ renderer?.fail(`Problem connecting ${displayName}.`);
+ renderer?.detail(
+ typeof parsed.error === "string"
+ ? parsed.error
+ : "The connector returned an error instead of data.",
+ );
+ emit.event({
+ type: "outcome",
+ status: CliOutcomeStatus.RUNTIME_ERROR,
+ source,
+ });
+ terminalExitCode = 1;
+ continue;
+ }
+ } catch {
+ // Can't read/parse result — proceed normally, let downstream handle it
+ }
+
+ collectedResult = true;
+ // Copy result to per-source path so multiple sources can coexist
+ const sourceResultPath = getSourceResultPath(source);
+ try {
+ await fsp.mkdir(path.dirname(sourceResultPath), { recursive: true });
+ await fsp.copyFile(event.resultPath, sourceResultPath);
+ resultPath = sourceResultPath;
+ } catch {
+ resultPath = event.resultPath; // fall back to original path
+ }
+ const ingestEvents = await ingestResult(
+ resolution.source,
+ resultPath,
+ target,
+ );
+ for (const ingestEvent of ingestEvents) {
+ emit.event(ingestEvent);
+ }
+
+ const scopeResults = ingestEvents.find(
+ (e) =>
+ e.type === "ingest-complete" ||
+ e.type === "ingest-partial" ||
+ e.type === "ingest-failed",
+ )?.scopeResults;
+
+ const ingestCompleted = ingestEvents.some(
+ (ingestEvent) => ingestEvent.type === "ingest-complete",
+ );
+ const ingestPartial = ingestEvents.some(
+ (ingestEvent) => ingestEvent.type === "ingest-partial",
+ );
+ const ingestFailedEvent = ingestEvents.find(
+ (ingestEvent) => ingestEvent.type === "ingest-failed",
+ );
+ if (ingestCompleted || ingestPartial) {
+ finalStatus = CliOutcomeStatus.CONNECTED_AND_INGESTED;
+ finalDataState = "ingested_personal_server";
+ } else if (ingestFailedEvent?.type === "ingest-failed") {
+ finalStatus = CliOutcomeStatus.INGEST_FAILED;
+ finalDataState = "ingest_failed";
+ ingestFailureMessage =
+ ingestFailedEvent.message ?? "Personal Server sync failed.";
+ } else {
+ finalStatus = CliOutcomeStatus.CONNECTED_LOCAL_ONLY;
+ finalDataState = "collected_local";
+ }
+
+ // Store per-scope results in state
+ ingestScopeResults = scopeResults?.map((r) => ({
+ scope: r.scope,
+ status: r.status,
+ syncedAt:
+ r.status === "stored" ? new Date().toISOString() : undefined,
+ error: r.error,
+ }));
+ }
+ }
+
+ if (terminalExitCode !== null) {
+ return terminalExitCode;
+ }
+
+ if (!collectedResult) {
+ await updateSourceState(resolution.source, {
+ connectorInstalled: true,
+ sessionPresent: fs.existsSync(profilePath),
+ lastRunAt: new Date().toISOString(),
+ lastRunOutcome: CliOutcomeStatus.UNEXPECTED_INTERNAL_ERROR,
+ dataState: "none",
+ lastError: "Connector run ended without a result.",
+ lastResultPath: null,
+ lastLogPath: runLogPath ?? fetchLogPath ?? null,
+ });
+ renderer?.fail(`Problem connecting ${displayName}.`);
+ renderer?.detail("Connector run ended without a result.");
+ emit.event({
+ type: "outcome",
+ status: CliOutcomeStatus.UNEXPECTED_INTERNAL_ERROR,
+ source: resolution.source,
+ reason: "Connector run ended without a result.",
+ });
+ return 1;
+ }
+
+ await updateSourceState(resolution.source, {
+ connectorInstalled: true,
+ connectorVersion: fetched.version,
+ exportFrequency: fetched.exportFrequency,
+ sessionPresent: true,
+ lastRunAt: new Date().toISOString(),
+ lastCollectedAt: new Date().toISOString(),
+ lastRunOutcome: finalStatus,
+ dataState: finalDataState,
+ lastError: ingestFailureMessage,
+ lastResultPath: resultPath,
+ lastLogPath: runLogPath ?? fetchLogPath ?? setupLogPath ?? null,
+ connectionHealth: "healthy",
+ ingestScopes: ingestScopeResults,
+ });
+
+ // Build scope-aware success summary
+ const storedCount =
+ ingestScopeResults?.filter((r) => r.status === "stored").length ?? 0;
+ const failedCount =
+ ingestScopeResults?.filter((r) => r.status === "failed").length ?? 0;
+ const totalScopes = ingestScopeResults?.length ?? 0;
+
+ let successSummary: string;
+ if (
+ finalStatus === CliOutcomeStatus.CONNECTED_AND_INGESTED &&
+ totalScopes > 0
+ ) {
+ if (failedCount === 0) {
+ successSummary = `Collected your ${displayName} data and synced it to your Personal Server.`;
+ } else {
+ successSummary = `Collected your ${displayName} data. ${storedCount}/${totalScopes} scopes synced, ${failedCount} failed.`;
+ }
+ } else if (finalStatus === CliOutcomeStatus.CONNECTED_AND_INGESTED) {
+ successSummary = `Collected your ${displayName} data and synced it to your Personal Server.`;
+ } else {
+ successSummary = `Collected your ${displayName} data and saved it locally.`;
+ }
+
+ // --- Phase 7: Success summary ---
+ renderer?.success(`Connected ${displayName}.`);
+ renderer?.detail(successSummary);
+
+ // Partial sync guidance
+ if (failedCount > 0 && storedCount > 0) {
+ renderer?.detail(`Retry: vana server sync`);
+ }
+
+ // Journey-aware next step
+ const state = await readCliState();
+ const connectedSourceCount = Object.values(state.sources ?? {}).filter(
+ (s) => hasCollectedData((s as SourceStatus)?.dataState),
+ ).length;
+
+ renderer?.detail("");
+ if (connectedSourceCount > 1) {
+ renderer?.next("vana sources");
+ } else {
+ renderer?.next(`vana data show ${source}`);
+ }
+
+ renderer?.bell();
+
+ // Emit for --json consumers (unchanged)
+ emit.event({
+ type: "outcome",
+ status: finalStatus,
+ source: resolution.source,
+ resultPath,
+ });
+ return 0;
+ } catch (error) {
+ if (
+ error instanceof Error &&
+ error.message === "__vana_prompt_cancelled__"
+ ) {
+ await updateSourceState(source, {
+ lastRunAt: new Date().toISOString(),
+ lastRunOutcome: CliOutcomeStatus.NEEDS_INPUT,
+ lastError: "Cancelled before input was completed.",
+ lastLogPath: runLogPath ?? null,
+ });
+ renderer?.fail("Cancelled.");
+ emit.event({
+ type: "outcome",
+ status: CliOutcomeStatus.NEEDS_INPUT,
+ source,
+ reason: "prompt_cancelled",
+ });
+ return 1;
+ }
+ const message =
+ error instanceof Error ? error.message : "Unexpected error.";
+ renderer?.fail(`Problem connecting ${displayName}.`);
+ renderer?.detail(message);
+ renderer?.detail(`Retry: vana connect ${source}`);
+ emit.event({
+ type: "outcome",
+ status: CliOutcomeStatus.UNEXPECTED_INTERNAL_ERROR,
+ source,
+ reason: message,
+ });
+ return 1;
+ } finally {
+ renderer?.cleanup();
+ }
+}
+
+async function runConnectEntry(options: GlobalOptions): Promise<number> {
+ const emit = createEmitter(options);
+ const sources = await loadRegistrySources();
+ const state = await readCliState();
+ const sourceMetadata = createSourceMetadataMap(sources);
+ const statuses = await gatherSourceStatuses(state.sources, sourceMetadata);
+ const statusMap = new Map(statuses.map((source) => [source.source, source]));
+ const enrichedSources = sources.map((source) => {
+ const status = statusMap.get(source.id);
+ return {
+ ...source,
+ dataState: status?.dataState,
+ lastRunOutcome: status?.lastRunOutcome ?? null,
+ sessionPresent: status?.sessionPresent ?? false,
+ };
+ });
+ const suggestedSource =
+ enrichedSources.find(
+ (source) =>
+ source.authMode !== "legacy" && !hasCollectedData(source.dataState),
+ ) ??
+ enrichedSources.find((source) => source.authMode !== "legacy") ??
+ enrichedSources[0];
+ const missingSourceMessage =
+ formatMissingConnectSourceMessage(suggestedSource);
+
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({
+ error: "source_required",
+ message: missingSourceMessage,
+ suggestedSource: suggestedSource
+ ? {
+ id: suggestedSource.id,
+ name: suggestedSource.name,
+ authMode: suggestedSource.authMode,
+ }
+ : null,
+ })}\n`,
+ );
+ return 1;
+ }
+
+ if (options.noInput) {
+ emit.info(missingSourceMessage);
+ return 1;
+ }
+
+ if (!process.stdin.isTTY || !process.stdout.isTTY) {
+ emit.info(missingSourceMessage);
+ return 1;
+ }
+
+ if (enrichedSources.length === 0) {
+ emit.info("No sources are available right now.");
+ emit.info("Run `vana sources` to verify the local connector registry.");
+ return 1;
+ }
+
+ // Build inquirer-compatible choices from enriched sources
+ const choices = enrichedSources.map((item) => {
+ const connected = hasCollectedData(item.dataState);
+ const hint = connected
+ ? "connected"
+ : item.authMode === "legacy"
+ ? "browser login"
+ : undefined;
+ return {
+ value: item.id,
+ name: item.name,
+ description: hint,
+ };
+ });
+
+ try {
+ const source = await select({
+ message: "Choose a source to connect.",
+ choices,
+ default: suggestedSource?.id,
+ ...vanaPromptTheme,
+ });
+
+ return runConnect(source as string, options);
+ } catch (error) {
+ if (isPromptCancelled(error)) {
+ emit.info("Cancelled.");
+ return 1;
+ }
+ throw error;
+ }
+}
+
+async function runList(options: GlobalOptions): Promise<number> {
+ const result = await querySources();
+ const { sources: enrichedSources, recommendedSource } = result;
+
+ if (options.json) {
+ process.stdout.write(`${JSON.stringify(result)}\n`);
+ return 0;
+ }
+
+ const emit = createEmitter(options);
+ emit.title("Available sources");
+ emit.blank();
+
+ if (enrichedSources.length === 0) {
+ emit.info("No sources are available right now.");
+ } else {
+ const connectedSources = enrichedSources.filter((source) =>
+ hasCollectedData(source.dataState),
+ );
+ const unconnectedSources = enrichedSources.filter(
+ (source) => !hasCollectedData(source.dataState),
+ );
+
+ // Connected sources are always shown expanded
+ if (connectedSources.length > 0) {
+ emit.section("Connected");
+ for (const source of connectedSources) {
+ const badges: Array<{ text: string; tone?: RenderTone }> = [];
+ if (source.dataState === "ingested_personal_server") {
+ badges.push({ text: "synced", tone: "success" });
+ } else if (source.dataState === "ingest_failed") {
+ badges.push({ text: "sync failed", tone: "warning" });
+ } else {
+ badges.push({ text: "local", tone: "muted" });
+ }
+ emit.sourceTitle(source.name, badges);
+ emit.detail(
+ `Inspect with ${emit.code(`vana data show ${source.id}`)}.`,
+ );
+ }
+ emit.blank();
+ emit.section("Available");
+ }
+
+ for (const source of unconnectedSources) {
+ const badges: Array<{ text: string; tone?: RenderTone }> = [];
+ if (
+ recommendedSource?.id === source.id &&
+ recommendedSource.authMode !== "legacy"
+ ) {
+ badges.push({ text: "recommended", tone: "accent" });
+ }
+ emit.sourceTitle(source.name, badges);
+ if (source.description) {
+ emit.detail(cleanDescription(source.description));
+ }
+ }
+
+ if (recommendedSource) {
+ emit.blank();
+ emit.next(`vana connect ${recommendedSource.id}`);
+ }
+ }
+ return 0;
+}
+
+async function runStatus(options: GlobalOptions): Promise<number> {
+ const { status, nextSteps } = await queryStatus();
+ const state = await readCliState();
+
+ // Build per-source health map from stored state
+ const sourceHealthMap: Record<
+ string,
+ {
+ connectionHealth?: string;
+ lastCollectedAt?: string;
+ }
+ > = {};
+ for (const [sourceId, stored] of Object.entries(state.sources)) {
+ if (stored) {
+ sourceHealthMap[sourceId] = {
+ connectionHealth: stored.connectionHealth,
+ lastCollectedAt: stored.lastCollectedAt,
+ };
+ }
+ }
+
+ if (options.json) {
+ const compactJson = {
+ runtime: status.runtime,
+ personalServer: status.personalServer,
+ personalServerUrl: status.personalServerUrl,
+ sources: {
+ connected: status.summary?.connectedCount ?? 0,
+ needsAttention: status.summary?.needsAttentionCount ?? 0,
+ },
+ sourceHealth: sourceHealthMap,
+ next: nextSteps[0] ?? null,
+ };
+ process.stdout.write(`${JSON.stringify(compactJson)}\n`);
+ return 0;
+ }
+
+ const emit = createEmitter(options);
+ const registrySources = await loadRegistrySources();
+ const sourceLabels = createSourceLabelMap(registrySources);
+ emit.title("Vana Connect");
+ emit.blank();
+ emit.keyValue("Runtime", status.runtime, toneForRuntime(status.runtime));
+ if (status.personalServer === "available") {
+ emit.keyValue(
+ "Personal Server",
+ status.personalServerUrl ?? "connected",
+ "success",
+ );
+ } else {
+ emit.keyValue("Personal Server", "not connected", "warning");
+ }
+ const connectedCount = status.summary?.connectedCount ?? 0;
+ const attentionCount = status.summary?.needsAttentionCount ?? 0;
+ const sourceParts = [
+ connectedCount > 0 ? `${connectedCount} connected` : "none connected",
+ ...(connectedCount > 0 && attentionCount > 0
+ ? [`${attentionCount} need${attentionCount === 1 ? "s" : ""} attention`]
+ : []),
+ ];
+ emit.keyValue(
+ "Sources",
+ sourceParts.join(", "),
+ attentionCount > 0 && connectedCount > 0
+ ? "warning"
+ : connectedCount > 0
+ ? "success"
+ : "muted",
+ );
+
+ // Show per-source health when sources are connected
+ const connectedSources = Object.entries(state.sources).filter(
+ ([, stored]) =>
+ stored?.connectorInstalled &&
+ (stored.dataState === "collected_local" ||
+ stored.dataState === "ingested_personal_server" ||
+ stored.dataState === "ingest_failed" ||
+ stored.connectionHealth),
+ );
+ if (connectedSources.length > 0) {
+ emit.blank();
+ let needsReauthSource: string | null = null;
+ for (const [sourceId, stored] of connectedSources) {
+ const health = stored?.connectionHealth ?? "healthy";
+ const displayName = displaySource(sourceId, sourceLabels);
+ const healthTone = toneForHealth(health);
+ const healthLabel = health === "needs_reauth" ? "needs login" : health;
+ const collectedAgo = stored?.lastCollectedAt
+ ? `collected ${formatRelativeTime(stored.lastCollectedAt)}`
+ : "";
+ emit.keyValue(
+ ` ${displayName}`,
+ `${healthLabel} ${collectedAgo}`,
+ healthTone,
+ );
+ if (health === "needs_reauth" && !needsReauthSource) {
+ needsReauthSource = sourceId;
+ }
+ }
+ if (needsReauthSource) {
+ emit.blank();
+ emit.next(`vana connect ${needsReauthSource}`);
+ return 0;
+ }
+ }
+
+ if (nextSteps.length > 0) {
+ emit.blank();
+ const command = extractCommand(nextSteps[0]);
+ if (command) {
+ emit.next(command);
+ } else {
+ emit.detail(`Next: ${nextSteps[0]}`);
+ }
+ }
+ return 0;
+}
+
+async function runDoctor(options: GlobalOptions): Promise<number> {
+ const payload = await queryDoctor();
+
+ if (options.json) {
+ process.stdout.write(`${JSON.stringify(payload)}\n`);
+ return 0;
+ }
+
+ const sourceLabels = createSourceLabelMap(await loadRegistrySources());
+ const { recentSources } = payload;
+ const attentionSources = recentSources.filter(
+ (source) => rankSourceStatus(source) <= 4,
+ );
+
+ const emit = createEmitter(options);
+ emit.title("Vana Connect doctor");
+ emit.section("Summary");
+ emit.keyValue("CLI", payload.cliVersion, "muted");
+ emit.keyValue("Channel", payload.channel, "muted");
+ emit.keyValue(
+ "Install",
+ formatInstallMethodLabel(payload.installMethod),
+ "muted",
+ );
+ emit.keyValue("Runtime", payload.runtime, toneForRuntime(payload.runtime));
+ emit.keyValue(
+ "Personal Server",
+ payload.personalServer,
+ payload.personalServer === "available" ? "success" : "warning",
+ );
+ emit.keyValue(
+ "Tracked sources",
+ String(payload.summary.trackedSourceCount),
+ "muted",
+ );
+ emit.keyValue(
+ "Attention",
+ String(payload.summary.attentionCount),
+ payload.summary.attentionCount > 0 ? "warning" : "muted",
+ );
+ emit.keyValue(
+ "Connected",
+ String(payload.summary.connectedCount),
+ payload.summary.connectedCount > 0 ? "success" : "muted",
+ );
+ emit.blank();
+ emit.section("Checks");
+ for (const check of payload.checks) {
+ const tone: RenderTone =
+ check.status === "ok"
+ ? "success"
+ : check.status === "warn"
+ ? "warning"
+ : "error";
+ emit.keyValue(check.label, check.detail, tone);
+ }
+ if (recentSources.length > 0) {
+ emit.blank();
+ emit.section(
+ attentionSources.length > 0
+ ? "Needs attention"
+ : "Recent source activity",
+ );
+ for (const source of attentionSources.length > 0
+ ? attentionSources
+ : recentSources) {
+ const status = getSourceStatusPresentation(source);
+ const badges: Array<{ text: string; tone?: RenderTone }> = [];
+ badges.push({ text: status.label, tone: status.tone });
+ emit.sourceTitle(displaySource(source.source, sourceLabels), badges);
+ const details = formatSourceStatusDetails(source);
+ for (const detail of details) {
+ if (detail.kind === "row") {
+ emit.keyValue(detail.label, detail.value, detail.tone ?? "muted");
+ } else {
+ emit.detail(humanizeIssue(detail.message));
+ }
+ }
+ }
+ }
+ emit.blank();
+ emit.section("Paths");
+ emit.keyValue(
+ "Executable",
+ formatDisplayPath(payload.paths.executable),
+ "muted",
+ );
+ if (payload.paths.appRoot) {
+ emit.keyValue(
+ "App root",
+ formatDisplayPath(payload.paths.appRoot),
+ "muted",
+ );
+ }
+ emit.keyValue(
+ "Data home",
+ formatDisplayPath(payload.paths.dataHome),
+ "muted",
+ );
+ emit.keyValue(
+ "State file",
+ formatDisplayPath(payload.paths.stateFile),
+ "muted",
+ );
+ emit.keyValue(
+ "Connector cache",
+ formatDisplayPath(payload.paths.connectorCache),
+ "muted",
+ );
+ emit.keyValue(
+ "Browser profiles",
+ formatDisplayPath(payload.paths.browserProfiles),
+ "muted",
+ );
+ emit.keyValue("Logs", formatDisplayPath(payload.paths.logs), "muted");
+ emit.blank();
+ emit.section("Lifecycle");
+ emit.keyValue("Upgrade", payload.lifecycle.upgrade, "muted");
+ emit.keyValue("Uninstall", payload.lifecycle.uninstall, "muted");
+ if (payload.nextSteps.length > 0) {
+ emit.blank();
+ const command = extractCommand(payload.nextSteps[0]);
+ if (command) {
+ emit.next(command);
+ } else {
+ emit.detail(`Next: ${payload.nextSteps[0]}`);
+ }
+ }
+
+ return 0;
+}
+
+async function runServerStatus(options: GlobalOptions): Promise<number> {
+ const emit = createEmitter(options);
+ const target = await detectPersonalServerTarget();
+ const state = await readCliState();
+
+ // Count scopes from state
+ let totalScopeCount = 0;
+ for (const stored of Object.values(state.sources)) {
+ if (stored?.ingestScopes) {
+ totalScopeCount += stored.ingestScopes.filter(
+ (s) => s.status === "stored",
+ ).length;
+ }
+ }
+
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({
+ state: target.state,
+ url: target.url,
+ source: target.source,
+ health: target.health,
+ scopeCount: totalScopeCount,
+ })}\n`,
+ );
+ return 0;
+ }
+
+ emit.title("Personal Server");
+ emit.blank();
+
+ if (target.url) {
+ const urlSuffix =
+ target.source === "scan"
+ ? "(auto-detected)"
+ : target.source === "config"
+ ? "(saved)"
+ : target.source === "env"
+ ? "(from VANA_PERSONAL_SERVER_URL)"
+ : `(${target.source ?? "unknown"})`;
+ emit.keyValue("URL", `${target.url} ${urlSuffix}`, "muted");
+ }
+
+ const stateLabel = target.state === "available" ? "healthy" : "not connected";
+ emit.keyValue(
+ "Status",
+ stateLabel,
+ target.state === "available" ? "success" : "warning",
+ );
+
+ if (target.health) {
+ emit.keyValue("Version", target.health.version, "muted");
+ }
+
+ if (totalScopeCount > 0) {
+ emit.keyValue("Scopes", `${totalScopeCount} stored`, "muted");
+ }
+
+ if (target.source && !target.url) {
+ const sourceLabel: Record<string, string> = {
+ config: "Saved config",
+ env: "VANA_PERSONAL_SERVER_URL",
+ scan: "Localhost scan",
+ };
+ emit.keyValue(
+ "Resolved via",
+ sourceLabel[target.source] ?? target.source,
+ "muted",
+ );
+ }
+
+ if (target.health) {
+ emit.keyValue("Uptime", formatUptime(target.health.uptime), "muted");
+ if (target.health.owner) {
+ emit.keyValue("Owner", target.health.owner, "muted");
+ }
+ }
+
+ if (target.source === "scan" && target.url) {
+ emit.blank();
+ emit.detail(`Save with ${emit.code(`vana server set-url ${target.url}`)}.`);
+ }
+
+ if (target.state !== "available") {
+ emit.blank();
+ emit.next("vana server set-url <url>");
+ }
+
+ emit.blank();
+ emit.detail(
+ `More: ${emit.code("vana server sync")} | ${emit.code("vana server data")} | ${emit.code("vana server --help")}`,
+ );
+
+ return 0;
+}
+
+async function runServerSetUrl(
+ url: string,
+ options: GlobalOptions,
+): Promise<number> {
+ const emit = createEmitter(options);
+
+ try {
+ new URL(url);
+ } catch {
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({ ok: false, error: "Invalid URL" })}\n`,
+ );
+ } else {
+ emit.info(`Invalid URL: ${url}`);
+ }
+ return 1;
+ }
+
+ await updateCliConfig({ personalServerUrl: url });
+
+ const target = await detectPersonalServerTarget();
+
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({
+ ok: true,
+ url,
+ reachable: target.state === "available",
+ health: target.health,
+ })}\n`,
+ );
+ return 0;
+ }
+
+ emit.info(`Saved Personal Server URL: ${url}`);
+ if (target.state === "available") {
+ emit.info(
+ `Server is reachable (${target.health?.version ?? "unknown version"}).`,
+ );
+ } else {
+ emit.info("Server is not reachable yet. It will be used when available.");
+ }
+
+ return 0;
+}
+
+async function runServerClearUrl(options: GlobalOptions): Promise<number> {
+ const emit = createEmitter(options);
+ const config = await readCliConfig();
+
+ if (!config.personalServerUrl) {
+ if (options.json) {
+ process.stdout.write(`${JSON.stringify({ ok: true, cleared: false })}\n`);
+ } else {
+ const target = await detectPersonalServerTarget();
+ if (target.source === "scan" && target.url) {
+ emit.info(
+ "No saved URL to clear. Current connection is auto-detected on localhost.",
+ );
+ emit.info(
+ `Run ${emit.code("vana server set-url <url>")} to save a specific URL.`,
+ );
+ } else {
+ emit.info("No saved Personal Server URL to clear.");
+ }
+ }
+ return 0;
+ }
+
+ await updateCliConfig({ personalServerUrl: undefined });
+
+ if (options.json) {
+ process.stdout.write(`${JSON.stringify({ ok: true, cleared: true })}\n`);
+ } else {
+ emit.info("Cleared saved Personal Server URL.");
+ }
+
+ return 0;
+}
+
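+// Illustrative examples for formatUptime, derived from the thresholds below:
+//   formatUptime(45)    -> "45s"
+//   formatUptime(90)    -> "1m"
+//   formatUptime(3700)  -> "1h 1m"
+//   formatUptime(90000) -> "1d 1h"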
+export function formatUptime(seconds: number): string {
+ if (seconds < 60) return `${Math.round(seconds)}s`;
+ if (seconds < 3600) return `${Math.floor(seconds / 60)}m`;
+ if (seconds < 86400) {
+ const hours = Math.floor(seconds / 3600);
+ const minutes = Math.floor((seconds % 3600) / 60);
+ return minutes > 0 ? `${hours}h ${minutes}m` : `${hours}h`;
+ }
+ const days = Math.floor(seconds / 86400);
+ const hours = Math.floor((seconds % 86400) / 3600);
+ return hours > 0 ? `${days}d ${hours}h` : `${days}d`;
+}
+
+async function runSetup(options: GlobalOptions): Promise<number> {
+ const emit = createEmitter(options);
+ const runtime = new ManagedPlaywrightRuntime();
+ const registrySources = await loadRegistrySources();
+ const suggestedSource =
+ registrySources.find((source) => source.authMode !== "legacy") ??
+ registrySources[0];
+
+ emit.title("Vana Connect setup");
+ emit.section("Runtime");
+
+ if (runtime.state === "installed") {
+ emit.info("The local runtime is already installed.");
+ if (runtime.runtimePath) {
+ emit.keyValue("Browser", formatDisplayPath(runtime.runtimePath), "muted");
+ }
+ emit.blank();
+ if (suggestedSource) {
+ emit.next(`vana connect ${suggestedSource.id}`);
+ } else {
+ emit.next("vana connect");
+ }
+ emit.event({ type: "setup-check", runtime: runtime.state });
+ return 0;
+ }
+
+ try {
+ const result = await runtime.ensureInstalled(Boolean(options.yes));
+ emit.success("Runtime ready.");
+ if (result.logPath) {
+ emit.detail(`Setup log: ${formatDisplayPath(result.logPath)}`);
+ }
+ emit.blank();
+ if (suggestedSource) {
+ emit.next(`vana connect ${suggestedSource.id}`);
+ } else {
+ emit.next("vana connect");
+ }
+ emit.event({
+ type: "setup-complete",
+ runtime: result.runtime,
+ logPath: result.logPath,
+ });
+ return 0;
+ } catch (error) {
+ const message =
+ error instanceof Error
+ ? error.message
+ : "Vana Connect could not finish installing the local runtime.";
+ emit.info(message);
+ emit.event({
+ type: "outcome",
+ status: CliOutcomeStatus.RUNTIME_ERROR,
+ reason: message,
+ });
+ return 1;
+ }
+}
+
+async function runDataList(options: GlobalOptions): Promise<number> {
+ const result = await queryDataList();
+
+ if (options.json) {
+ process.stdout.write(`${JSON.stringify(result)}\n`);
+ return 0;
+ }
+
+ const { datasets: datasetRecords } = result;
+ const registrySources = await loadRegistrySources();
+ const emit = createEmitter(options);
+ if (datasetRecords.length === 0) {
+ const suggestedSource =
+ registrySources.find((source) => source.authMode !== "legacy") ??
+ registrySources[0];
+ emit.title("Collected data");
+ emit.blank();
+ emit.info(" No datasets yet.");
+ emit.blank();
+ if (suggestedSource) {
+ emit.next(`vana connect ${suggestedSource.id}`);
+ } else {
+ emit.next("vana connect");
+ }
+ return 0;
+ }
+
+ // The empty-dataset case returned above, so the count is always positive here.
+ emit.title(`Collected data (${datasetRecords.length})`);
+ emit.blank();
+ emit.info(
+ joinOverviewParts([
+ formatCountLabel("dataset", datasetRecords.length),
+ formatCountLabel(
+ "local only",
+ datasetRecords.filter(
+ (dataset) => dataset.dataState !== "ingested_personal_server",
+ ).length,
+ ),
+ formatCountLabel(
+ "synced",
+ datasetRecords.filter(
+ (dataset) => dataset.dataState === "ingested_personal_server",
+ ).length,
+ ),
+ datasetRecords.some((dataset) => dataset.dataState === "ingest_failed")
+ ? formatCountLabel(
+ "sync failed",
+ datasetRecords.filter(
+ (dataset) => dataset.dataState === "ingest_failed",
+ ).length,
+ )
+ : "",
+ ]),
+ );
+ emit.blank();
+ datasetRecords.forEach((dataset, index) => {
+ if (index > 0) {
+ emit.blank();
+ }
+ const badges =
+ dataset.dataState === "ingested_personal_server"
+ ? [{ text: "synced", tone: "success" as const }]
+ : dataset.dataState === "ingest_failed"
+ ? [{ text: "sync failed", tone: "warning" as const }]
+ : [{ text: "local", tone: "muted" as const }];
+ emit.sourceTitle(dataset.name ?? displaySource(dataset.source), badges);
+ if (dataset.summary) {
+ for (const line of dataset.summary.lines) {
+ emit.detail(line);
+ }
+ }
+ if (dataset.dataState === "ingested_personal_server") {
+ emit.keyValue("State", "Synced to Personal Server", "success");
+ } else if (dataset.dataState === "ingest_failed") {
+ emit.keyValue("State", "Saved locally, sync failed", "warning");
+ } else {
+ emit.keyValue("State", "Saved locally", "muted");
+ }
+ if (dataset.lastRunAt) {
+ emit.keyValue("Updated", formatTimestamp(dataset.lastRunAt), "muted");
+ }
+ if (dataset.path) {
+ emit.keyValue("Path", formatDisplayPath(dataset.path), "muted");
+ }
+ });
+ emit.blank();
+ emit.next(`vana data show ${datasetRecords[0].source}`);
+ return 0;
+}
+
+async function runDataShow(
+ source: string,
+ options: GlobalOptions,
+): Promise<number> {
+ const result = await queryDataShow(source);
+
+ if (!result.ok) {
+ if (result.error === "dataset_not_found") {
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({
+ error: result.error,
+ source: result.source,
+ message: result.message,
+ nextSteps: result.nextSteps,
+ })}\n`,
+ );
+ } else {
+ const emit = createEmitter(options);
+ emit.info(result.message);
+ emit.blank();
+ emit.next(`vana connect ${source}`);
+ }
+ return 1;
+ }
+ // dataset_read_failed
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({ error: result.error, source: result.source, path: result.path, message: result.message })}\n`,
+ );
+ } else {
+ createEmitter(options).info(result.message);
+ }
+ return 1;
+ }
+
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({
+ source: result.source,
+ name: result.name,
+ path: result.path,
+ summary: result.summary,
+ lastRunAt: result.lastRunAt,
+ dataState: result.dataState,
+ nextSteps: result.nextSteps,
+ data: result.data,
+ })}\n`,
+ );
+ return 0;
+ }
+
+ const emit = createEmitter(options);
+ const state = await readCliState();
+ const record = state.sources[source];
+ emit.title(`${result.name} data`);
+ emit.blank();
+ if (result.summary) {
+ for (const line of result.summary.lines) {
+ emit.detail(line);
+ }
+ emit.blank();
+ }
+ emit.keyValue("Path", formatDisplayPath(result.path), "muted");
+ if (record?.lastRunAt) {
+ emit.keyValue("Updated", formatTimestamp(record.lastRunAt), "muted");
+ }
+ if (record?.dataState === "ingested_personal_server") {
+ emit.keyValue("State", "Synced to Personal Server", "success");
+ } else if (record?.dataState === "ingest_failed") {
+ emit.keyValue("State", "Saved locally, sync failed", "warning");
+ } else {
+ emit.keyValue("State", "Saved locally", "muted");
+ }
+ emit.blank();
+ if (result.datasetCount > 1) {
+ emit.next("vana data list");
+ } else {
+ emit.next(`vana connect ${source}`);
+ }
+ return 0;
+}
+
+async function runDataPath(
+ source: string,
+ options: GlobalOptions,
+): Promise<number> {
+ const sourceLabels = createSourceLabelMap(await loadRegistrySources());
+ const state = await readCliState();
+ const resultPath = state.sources[source]?.lastResultPath;
+
+ if (!resultPath) {
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({
+ error: "dataset_not_found",
+ source,
+ name: displaySource(source, sourceLabels),
+ message: `No collected dataset found for ${displaySource(source, sourceLabels)}. Run \`vana connect ${source}\` first.`,
+ })}\n`,
+ );
+ } else {
+ createEmitter(options).info(
+ `No collected dataset found for ${displaySource(source, sourceLabels)}. Run \`vana connect ${source}\` first.`,
+ );
+ }
+ return 1;
+ }
+
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({
+ source,
+ name: displaySource(source, sourceLabels),
+ path: resultPath,
+ lastRunAt: state.sources[source]?.lastRunAt ?? null,
+ dataState: state.sources[source]?.dataState ?? null,
+ nextSteps: [
+ `Inspect the dataset with \`vana data show ${source}\`.`,
+ `Reconnect ${displaySource(source, sourceLabels)} with \`vana connect ${source}\`.`,
+ ],
+ })}\n`,
+ );
+ } else {
+ process.stdout.write(`${formatDisplayPath(resultPath)}\n`);
+ }
+ return 0;
+}
+
+async function runLogs(
+ source: string | undefined,
+ options: GlobalOptions,
+): Promise<number> {
+ const sourceLabels = createSourceLabelMap(await loadRegistrySources());
+ const state = await readCliState();
+ const records = Object.entries(state.sources)
+ .filter(([, entry]) => Boolean(entry?.lastLogPath))
+ .map(([sourceId, entry]) => ({
+ source: sourceId,
+ name: displaySource(sourceId, sourceLabels),
+ path: entry?.lastLogPath ?? "",
+ lastRunAt: entry?.lastRunAt ?? null,
+ lastRunOutcome: entry?.lastRunOutcome ?? null,
+ dataState: (entry?.dataState === "collected_local" ||
+ entry?.dataState === "ingested_personal_server" ||
+ entry?.dataState === "ingest_failed"
+ ? entry.dataState
+ : null) as SourceStatus["dataState"] | null,
+ }))
+ .sort(compareLogRecordOrder);
+ const logSummary = {
+ attentionCount: records.filter((record) =>
+ isAttentionLog(record.lastRunOutcome, record.dataState),
+ ).length,
+ successfulCount: records.filter(
+ (record) =>
+ record.dataState === "collected_local" ||
+ record.dataState === "ingested_personal_server",
+ ).length,
+ localCount: records.filter(
+ (record) => record.dataState === "collected_local",
+ ).length,
+ syncedCount: records.filter(
+ (record) => record.dataState === "ingested_personal_server",
+ ).length,
+ };
+
+ if (source) {
+ const match = records.find((record) => record.source === source);
+ if (!match) {
+ const payload = {
+ error: "log_not_found",
+ source,
+ message: `No stored run log found for ${displaySource(source, sourceLabels)}.`,
+ nextSteps: [
+ `Run \`vana connect ${source}\` to create a new log.`,
+ ...(records.length > 0
+ ? ["Run `vana logs` to inspect other logs."]
+ : []),
+ ],
+ };
+ if (options.json) {
+ process.stdout.write(`${JSON.stringify(payload)}\n`);
+ } else {
+ const emit = createEmitter(options);
+ emit.info(payload.message);
+ emit.blank();
+ emit.next(`vana connect ${source}`);
+ }
+ return 1;
+ }
+
+ if (options.json) {
+ process.stdout.write(`${JSON.stringify(match)}\n`);
+ } else {
+ process.stdout.write(`${formatDisplayPath(match.path)}\n`);
+ }
+ return 0;
+ }
+
+ const nextSteps = buildLogsNextSteps(records);
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({
+ count: records.length,
+ latestLog: records[0] ?? null,
+ nextSteps,
+ summary: logSummary,
+ logs: records,
+ })}\n`,
+ );
+ return 0;
+ }
+
+ const emit = createEmitter(options);
+ emit.title(records.length > 0 ? `Run logs (${records.length})` : "Run logs");
+ emit.blank();
+
+ if (records.length === 0) {
+ emit.info("No stored run logs yet.");
+ emit.blank();
+ emit.next("vana connect");
+ return 0;
+ }
+
+ emit.info(
+ joinOverviewParts([
+ logSummary.attentionCount > 0
+ ? formatCountLabel("need attention", logSummary.attentionCount)
+ : "",
+ logSummary.successfulCount > 0
+ ? formatCountLabel("successful", logSummary.successfulCount)
+ : "",
+ logSummary.localCount > 0
+ ? formatCountLabel("local", logSummary.localCount)
+ : "",
+ logSummary.syncedCount > 0
+ ? formatCountLabel("synced", logSummary.syncedCount)
+ : "",
+ ]),
+ );
+ emit.blank();
+
+ const groups = [
+ {
+ title: "Needs attention",
+ items: records.filter((record) =>
+ isAttentionLog(record.lastRunOutcome, record.dataState),
+ ),
+ },
+ {
+ title: "Successful runs",
+ items: records.filter(
+ (record) => !isAttentionLog(record.lastRunOutcome, record.dataState),
+ ),
+ },
+ ].filter((group) => group.items.length > 0);
+
+ groups.forEach((group, groupIndex) => {
+ if (groupIndex > 0) {
+ emit.blank();
+ }
+ emit.section(formatCountLabel(group.title, group.items.length));
+ for (const record of group.items) {
+ emit.sourceTitle(record.name, [
+ {
+ text: formatLogOutcomeLabel(record.lastRunOutcome, record.dataState),
+ tone: toneForLogOutcome(record.lastRunOutcome, record.dataState),
+ },
+ ]);
+ emit.keyValue("Path", formatDisplayPath(record.path), "muted");
+ if (record.lastRunAt) {
+ emit.keyValue("Updated", formatTimestamp(record.lastRunAt), "muted");
+ }
+ }
+ });
+
+ emit.blank();
+ if (nextSteps.length > 0) {
+ const command = extractCommand(nextSteps[0]);
+ if (command) {
+ emit.next(command);
+ } else {
+ emit.detail(`Next: ${nextSteps[0]}`);
+ }
+ }
+ return 0;
+}
+
+async function runSourceDetail(
+ source: string,
+ options: GlobalOptions,
+): Promise<number> {
+ const emit = createEmitter(options);
+ const registrySources = await loadRegistrySources();
+ const state = await readCliState();
+ const match = registrySources.find(
+ (s) => s.id === source || s.name.toLowerCase() === source.toLowerCase(),
+ );
+
+ if (!match) {
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({ error: "unknown_source", source, message: `Unknown source: ${source}. Run \`vana sources\` to see available options.` })}\n`,
+ );
+ } else {
+ emit.info(
+ `Unknown source: ${source}. Run \`vana sources\` to see available options.`,
+ );
+ }
+ return 1;
+ }
+
+ const stored = state.sources[match.id];
+ const metadata = await readCachedConnectorMetadata(
+ match.id,
+ getConnectorCacheDir(),
+ );
+ const scopes = metadata?.scopes ?? [];
+ const sourceStatus = stored
+ ? ({
+ source: match.id,
+ installed: Boolean(stored.connectorInstalled),
+ sessionPresent: stored.sessionPresent ?? false,
+ lastRunOutcome: stored.lastRunOutcome ?? null,
+ dataState: stored.dataState as SourceStatus["dataState"],
+ } as SourceStatus)
+ : undefined;
+ const badge = sourceStatus ? getSourceBadge(sourceStatus) : undefined;
+
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({
+ id: match.id,
+ name: match.name,
+ company: match.company,
+ description: match.description,
+ version: match.version ?? stored?.connectorVersion,
+ exportFrequency: match.exportFrequency ?? stored?.exportFrequency,
+ authMode: match.authMode,
+ scopes,
+ scopeLabels: scopes.map((s) => s.label),
+ connectorVersion: stored?.connectorVersion,
+ lastCollectedAt: stored?.lastCollectedAt,
+ dataState: stored?.dataState,
+ })}\n`,
+ );
+ return 0;
+ }
+
+ const iconPrefix = await renderIconInline(match.id);
+ const badgeList: Array<{ text: string; tone?: RenderTone }> = [];
+ if (badge && badge.label !== "new") {
+ badgeList.push({ text: badge.label, tone: badge.style });
+ }
+ emit.sourceTitle(`${iconPrefix}${match.name}`, badgeList);
+ emit.blank();
+ if (match.description) {
+ emit.info(cleanDescription(match.description));
+ emit.blank();
+ }
+
+ if (scopes.length > 0) {
+ emit.section("Collects");
+ for (const scope of scopes) {
+ if (scope.description) {
+ emit.keyValue(
+ scope.label,
+ cleanDescription(scope.description),
+ "muted",
+ );
+ } else {
+ emit.bullet(scope.label);
+ }
+ }
+ }
+
+ if (
+ stored?.connectorVersion &&
+ match.version &&
+ stored.connectorVersion !== match.version
+ ) {
+ emit.blank();
+ emit.detail(
+ `A newer connector version is available (${match.version}). Reconnect to update.`,
+ );
+ }
+
+ emit.blank();
+ emit.next(`vana connect ${match.id}`);
+ return 0;
+}
+
+async function runCollect(
+ source: string,
+ options: GlobalOptions,
+): Promise<number> {
+ const emit = createEmitter(options);
+ const state = await readCliState();
+ const stored = state.sources[source];
+
+ if (!stored || !stored.connectorInstalled) {
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({
+ error: "not_previously_connected",
+ source,
+ message: `Source "${source}" has not been connected yet. Run \`vana connect ${source}\` first.`,
+ })}\n`,
+ );
+ } else {
+ emit.info(
+ `Source "${source}" has not been connected yet. Run \`vana connect ${source}\` first.`,
+ );
+ }
+ return 1;
+ }
+
+ return runConnect(source, options);
+}
+
+async function runCollectAll(options: GlobalOptions): Promise<number> {
+ const emit = createEmitter(options);
+ const state = await readCliState();
+ const dueSources = Object.entries(state.sources)
+ .filter(
+ ([, stored]) =>
+ stored?.connectorInstalled &&
+ isCollectionDue(stored.exportFrequency, stored.lastCollectedAt),
+ )
+ .map(([id]) => id);
+
+ if (dueSources.length === 0) {
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({ message: "No sources are due for collection.", count: 0 })}\n`,
+ );
+ } else {
+ emit.info("No sources are due for collection.");
+ }
+ return 0;
+ }
+
+ let exitCode = 0;
+ for (const source of dueSources) {
+ const result = await runConnect(source, options);
+ if (result !== 0) {
+ exitCode = result;
+ }
+ }
+ return exitCode;
+}
+
+async function runServerSync(options: GlobalOptions): Promise<number> {
+ const emit = createEmitter(options);
+ const target = await detectPersonalServerTarget();
+
+ if (target.state !== "available") {
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({
+ error: "personal_server_unavailable",
+        message:
+          "Personal Server is not available. Run `vana server set-url <url>` to configure.",
+ );
+ } else {
+      emit.info(
+        "Personal Server is not available. Run `vana server set-url <url>` to configure.",
+      );
+ }
+ return 1;
+ }
+
+ const state = await readCliState();
+ // Find sources that are local-only OR have failed ingest scopes
+ const pendingSources = Object.entries(state.sources).filter(
+ ([, stored]) =>
+ stored?.lastResultPath &&
+ (stored.dataState === "collected_local" ||
+ stored.dataState === "ingest_failed" ||
+ stored?.ingestScopes?.some((s) => s.status === "failed")),
+ );
+
+ if (pendingSources.length === 0) {
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({ message: "No pending datasets to sync.", syncedCount: 0 })}\n`,
+ );
+ } else {
+ emit.info("No pending datasets to sync.");
+ }
+ return 0;
+ }
+
+ let syncedCount = 0;
+ const allScopeResults: Array<{
+ source: string;
+ scopeResults?: Array<{ scope: string; status: string; error?: string }>;
+ }> = [];
+
+ for (const [source, stored] of pendingSources) {
+ if (!stored?.lastResultPath) {
+ continue;
+ }
+ const ingestEvents = await ingestResult(
+ source,
+ stored.lastResultPath,
+ target,
+ );
+
+ const resultEvent = ingestEvents.find(
+ (e) =>
+ e.type === "ingest-complete" ||
+ e.type === "ingest-partial" ||
+ e.type === "ingest-failed",
+ );
+ const scopeResults = resultEvent?.scopeResults;
+
+ const ingestCompleted = ingestEvents.some(
+ (e) => e.type === "ingest-complete",
+ );
+ const ingestPartial = ingestEvents.some((e) => e.type === "ingest-partial");
+
+ if (ingestCompleted || ingestPartial) {
+ syncedCount++;
+      // Both complete and partial ingests mark the dataset as synced.
+      const dataState = "ingested_personal_server";
+ await updateSourceState(source, {
+ dataState,
+ ingestScopes: scopeResults?.map((r) => ({
+ scope: r.scope,
+ status: r.status,
+ syncedAt:
+ r.status === "stored" ? new Date().toISOString() : undefined,
+ error: r.error,
+ })),
+ });
+ }
+
+ allScopeResults.push({ source, scopeResults });
+ for (const event of ingestEvents) {
+ emit.event(event);
+ }
+ }
+
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({ message: `Synced ${syncedCount} dataset(s).`, syncedCount })}\n`,
+ );
+ } else {
+ // Show per-scope results with scope manifest style
+ const renderer = createHumanRenderer();
+ for (const entry of allScopeResults) {
+ if (entry.scopeResults && entry.scopeResults.length > 0) {
+ emit.info(`${entry.source}:`);
+ for (const sr of entry.scopeResults) {
+ if (sr.status === "stored") {
+ emit.info(` ${renderer.theme.success("\u2713")} ${sr.scope}`);
+ } else {
+ const errDetail = sr.error ?? "failed";
+ emit.info(
+ ` ${renderer.theme.error("\u2717")} ${sr.scope} ${renderer.theme.muted(`\u2014 ${errDetail}`)}`,
+ );
+ }
+ }
+ }
+ }
+ emit.blank();
+ const allStored = allScopeResults.every(
+ (entry) =>
+ !entry.scopeResults ||
+ entry.scopeResults.every((sr) => sr.status === "stored"),
+ );
+ emit.success(`Synced ${syncedCount} dataset(s).`);
+ emit.blank();
+ if (allStored) {
+ emit.next("vana data list");
+ } else {
+ emit.next("vana server sync");
+ }
+ }
+ return 0;
+}
+
+async function runServerData(
+ scope: string | undefined,
+ options: GlobalOptions,
+): Promise<number> {
+ const emit = createEmitter(options);
+ const target = await detectPersonalServerTarget();
+ const state = await readCliState();
+
+ // Gather locally-known scopes from state
+ const localScopes: Array<{ scope: string; source: string; status: string }> =
+ [];
+ for (const [src, stored] of Object.entries(state.sources)) {
+ if (stored?.ingestScopes) {
+ for (const is of stored.ingestScopes) {
+ localScopes.push({ scope: is.scope, source: src, status: is.status });
+ }
+ }
+ }
+
+ // If PS is available, try to list remote scopes via client
+ let remoteScopes: Array<{ scope: string; count: number }> = [];
+ if (target.state === "available" && target.url) {
+ try {
+ const { createPersonalServerClient: createClient } =
+ await import("../personal-server/client.js");
+ const client = createClient({ url: target.url });
+ remoteScopes = await client.listScopes(scope);
+ } catch {
+ // Auth required or PS unavailable — fall back to local
+ }
+ }
+
+ // Use remote scopes if available, otherwise fall back to local
+ const scopeList =
+ remoteScopes.length > 0
+ ? remoteScopes.map((s) => ({
+ scope: s.scope,
+ detail: `${s.count} version${s.count !== 1 ? "s" : ""}`,
+ }))
+ : localScopes
+ .filter((s) => s.status === "stored")
+ .filter((s) => !scope || s.scope.startsWith(scope))
+ .map((s) => ({ scope: s.scope, detail: "1 version" }));
+
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({
+ count: scopeList.length,
+ scopes: scopeList,
+ source: remoteScopes.length > 0 ? "remote" : "local",
+ })}\n`,
+ );
+ return 0;
+ }
+
+ if (scopeList.length === 0) {
+ emit.info("No scopes found.");
+ if (target.state !== "available") {
+ emit.detail(
+ "Personal Server is not available. Showing locally-known scopes only.",
+ );
+ }
+ return 0;
+ }
+
+ for (const entry of scopeList) {
+ emit.keyValue(entry.scope, entry.detail, "muted");
+ }
+
+ if (remoteScopes.length === 0 && localScopes.length > 0) {
+ emit.blank();
+ emit.detail(
+ "Showing locally-known scopes. Connect your Personal Server for live data.",
+ );
+ }
+
+ return 0;
+}
+
+function getSourceBadge(source: SourceStatus): {
+ label: string;
+ style: "success" | "warning" | "error" | "muted";
+} {
+ if (
+ source.dataState === "collected_local" ||
+ source.dataState === "ingested_personal_server" ||
+ source.dataState === "ingest_failed"
+ ) {
+ return { label: "connected", style: "success" };
+ }
+
+ if (
+ source.lastRunOutcome === CliOutcomeStatus.NEEDS_INPUT ||
+ source.lastRunOutcome === CliOutcomeStatus.LEGACY_AUTH
+ ) {
+ return { label: "needs login", style: "warning" };
+ }
+
+ if (
+ source.lastRunOutcome === CliOutcomeStatus.RUNTIME_ERROR ||
+ source.lastRunOutcome === CliOutcomeStatus.UNEXPECTED_INTERNAL_ERROR
+ ) {
+ return { label: "error", style: "error" };
+ }
+
+ return { label: "new", style: "muted" };
+}
+
+function isCollectionDue(
+ frequency: string | undefined,
+ lastCollectedAt: string | undefined,
+): boolean {
+ if (!frequency || !lastCollectedAt) {
+ return true;
+ }
+
+ const lastMs = new Date(lastCollectedAt).getTime();
+ if (Number.isNaN(lastMs)) {
+ return true;
+ }
+
+ const now = Date.now();
+ const elapsed = now - lastMs;
+ const intervalMs = parseFrequencyToMs(frequency);
+ return elapsed >= intervalMs;
+}
+
+function parseFrequencyToMs(frequency: string): number {
+ const lower = frequency.toLowerCase().trim();
+ if (lower === "daily") {
+ return 24 * 60 * 60 * 1000;
+ }
+ if (lower === "weekly") {
+ return 7 * 24 * 60 * 60 * 1000;
+ }
+ if (lower === "monthly") {
+ return 30 * 24 * 60 * 60 * 1000;
+ }
+
+ const match = /^(\d+)\s*(h|d|m|w)$/i.exec(lower);
+ if (match) {
+ const value = parseInt(match[1], 10);
+ const unit = match[2].toLowerCase();
+ if (unit === "h") return value * 60 * 60 * 1000;
+ if (unit === "d") return value * 24 * 60 * 60 * 1000;
+ if (unit === "w") return value * 7 * 24 * 60 * 60 * 1000;
+    if (unit === "m") return value * 30 * 24 * 60 * 60 * 1000; // "m" = months, not minutes
+ }
+
+ // Default to daily if unparseable.
+ return 24 * 60 * 60 * 1000;
+}
+
+async function renderIconInline(source: string): Promise<string> {
+ const iconPath = findCachedIconPath(source);
+ if (!iconPath) {
+ return "";
+ }
+ try {
+ // terminal-image is optional — not in package.json dependencies.
+ // The `as string` cast prevents TypeScript from resolving the module at compile time.
+ const terminalImage = (await import("terminal-image" as string)) as {
+ default: {
+ buffer: (
+ input: Buffer,
+ options?: { width?: number; height?: number },
+        ) => Promise<string>;
+ };
+ };
+ const imageBuffer = await fsp.readFile(iconPath);
+ return await terminalImage.default.buffer(imageBuffer, {
+ width: 2,
+ height: 1,
+ });
+ } catch {
+ return "";
+ }
+}
+
+function findCachedIconPath(source: string): string | null {
+ const cacheDir = getConnectorCacheDir();
+ const extensions = [".png", ".svg", ".jpg", ".jpeg", ".webp"];
+ for (const ext of extensions) {
+ const candidate = path.join(cacheDir, `${source}.icon${ext}`);
+ if (fs.existsSync(candidate)) {
+ return candidate;
+ }
+ }
+ return null;
+}
+
+function createEmitter(options: GlobalOptions): Emitter {
+ const renderer = createHumanRenderer();
+
+ return {
+ event(event: CliEvent | CliOutcome) {
+ if (options.json) {
+ process.stdout.write(`${JSON.stringify(event)}\n`);
+ }
+ },
+ info(message: string) {
+ if (options.json || options.quiet) {
+ return;
+ }
+ process.stdout.write(`${message}\n`);
+ },
+ blank() {
+ if (options.json || options.quiet) {
+ return;
+ }
+ process.stdout.write("\n");
+ },
+ title(message: string) {
+ if (options.json || options.quiet) {
+ return;
+ }
+ process.stdout.write(`${renderer.title(message)}\n`);
+ },
+ success(message: string) {
+ if (options.json || options.quiet) {
+ return;
+ }
+ process.stdout.write(`${renderer.success(message)}\n`);
+ },
+ section(message: string) {
+ if (options.json || options.quiet) {
+ return;
+ }
+ process.stdout.write(`${renderer.section(message)}\n`);
+ },
+ keyValue(label: string, value: string, tone: RenderTone = "muted") {
+ if (options.json || options.quiet) {
+ return;
+ }
+ process.stdout.write(`${renderer.keyValue(label, value, tone)}\n`);
+ },
+ detail(message: string) {
+ if (options.json || options.quiet) {
+ return;
+ }
+ process.stdout.write(`${renderer.detail(message)}\n`);
+ },
+ next(command: string) {
+ if (options.json || options.quiet) {
+ return;
+ }
+ process.stdout.write(
+ ` ${renderer.theme.muted("Next:")} ${renderer.theme.code(command)}\n`,
+ );
+ },
+ bullet(message: string) {
+ if (options.json || options.quiet) {
+ return;
+ }
+ process.stdout.write(`${renderer.bullet(message)}\n`);
+ },
+ sourceTitle(
+ name: string,
+ badges: Array<{ text: string; tone?: RenderTone }> = [],
+ ) {
+ if (options.json || options.quiet) {
+ return;
+ }
+ process.stdout.write(
+ `${renderer.sourceTitle(
+ name,
+ badges.map((badge) => renderer.badge(badge.text, badge.tone)),
+ )}\n`,
+ );
+ },
+ badge(text: string, tone: RenderTone = "muted") {
+ return renderer.badge(text, tone);
+ },
+ code(text: string) {
+ return renderer.theme.code(text);
+ },
+ };
+}
+
+export function displaySource(
+ source: string,
+ labels: SourceLabelMap = {},
+): string {
+ return labels[source] ?? source.charAt(0).toUpperCase() + source.slice(1);
+}
+
+function formatCountLabel(label: string, count: number): string {
+ const normalizedLabel = label.charAt(0).toUpperCase() + label.slice(1);
+ return `${normalizedLabel} (${count})`;
+}
+
+function joinOverviewParts(parts: string[]): string {
+ return parts.filter(Boolean).join(" · ");
+}
+
+function humanizeField(value: string): string {
+ return value
+ .replace(/([a-z])([A-Z])/g, "$1 $2")
+ .replace(/[_-]/g, " ")
+ .replace(/^\w/, (match) => match.toUpperCase());
+}
+
+export function humanizeIssue(message: string): string {
+ if (/checksum|mismatch/i.test(message)) {
+ return "Connector is out of date. Will auto-update on next connect.";
+ }
+ return message;
+}
+
+function formatHumanSourceMessage(
+ message: string,
+ source: string,
+ displayName: string,
+): string {
+ if (!message || source === displayName) {
+ return message;
+ }
+
+ return message.replace(
+ new RegExp(`\\b${escapeRegExp(source)}\\b`, "gi"),
+ displayName,
+ );
+}
+
+function escapeRegExp(value: string): string {
+ return value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
+}
+
+function getErrorLogPath(error: unknown): string | null {
+ if (
+ error &&
+ typeof error === "object" &&
+ "logPath" in error &&
+ typeof (error as { logPath?: unknown }).logPath === "string"
+ ) {
+ return (error as { logPath: string }).logPath;
+ }
+
+ return null;
+}
+
+export async function gatherSourceStatuses(
+ storedSources: Record<
+ string,
+    Awaited<ReturnType<typeof readCliState>>["sources"][string]
+ >,
+ metadata: SourceMetadataMap = {},
+): Promise<SourceStatus[]> {
+ const installedFiles = await listInstalledConnectorFiles();
+ const sourceNames = new Set([
+ ...Object.keys(storedSources),
+ ...installedFiles.map((file) => file.source),
+ ]);
+
+ return [...sourceNames]
+ .map((source): SourceStatus => {
+ const stored = storedSources[source] ?? {};
+ const installed = installedFiles.some((file) => file.source === source);
+ const details = metadata[source];
+ const dataState: SourceStatus["dataState"] =
+ stored.dataState === "ingested_personal_server"
+ ? "ingested_personal_server"
+ : stored.dataState === "ingest_failed"
+ ? "ingest_failed"
+ : stored.dataState === "collected_local"
+ ? "collected_local"
+ : "none";
+ const ingestScopes = stored.ingestScopes;
+ const syncedScopeCount =
+ ingestScopes?.filter((s) => s.status === "stored").length ?? 0;
+ const failedScopeCount =
+ ingestScopes?.filter((s) => s.status === "failed").length ?? 0;
+ return {
+ source,
+ name: details?.name,
+ company: details?.company,
+ description: details?.description,
+ authMode:
+ details?.authMode ?? inferInstalledAuthMode(installedFiles, source),
+ connectorVersion: stored.connectorVersion,
+ exportFrequency: stored.exportFrequency,
+ lastCollectedAt: stored.lastCollectedAt,
+ installed,
+ sessionPresent: stored.sessionPresent ?? false,
+ lastRunAt: stored.lastRunAt ?? null,
+ lastRunOutcome: stored.lastRunOutcome ?? null,
+ dataState,
+ lastError: stored.lastError ?? null,
+ lastResultPath: stored.lastResultPath ?? null,
+ lastLogPath: stored.lastLogPath ?? null,
+ ingestScopes,
+ syncedScopeCount: syncedScopeCount > 0 ? syncedScopeCount : undefined,
+ failedScopeCount: failedScopeCount > 0 ? failedScopeCount : undefined,
+ };
+ })
+ .sort(compareSourceStatusOrder);
+}
+
+export async function listInstalledConnectorFiles(): Promise<
+ Array<{ source: string; path: string }>
+> {
+ const connectorsDir = getConnectorCacheDir();
+ try {
+ const results: Array<{ source: string; path: string }> = [];
+ const entries = await fsp.readdir(connectorsDir, { withFileTypes: true });
+ for (const entry of entries) {
+ if (!entry.isDirectory()) {
+ continue;
+ }
+ const companyDir = path.join(connectorsDir, entry.name);
+ const files = await fsp.readdir(companyDir);
+ for (const file of files) {
+ if (!file.endsWith("-playwright.js")) {
+ continue;
+ }
+ results.push({
+ source: file.replace(/-playwright\.js$/, ""),
+ path: path.join(companyDir, file),
+ });
+ }
+ }
+ return results;
+ } catch {
+ return [];
+ }
+}
+
+function formatSourceStatusDetails(source: SourceStatus): SourceStatusDetail[] {
+ const details: SourceStatusDetail[] = [];
+ const displayName = source.name ?? displaySource(source.source);
+
+ if (source.lastRunOutcome === CliOutcomeStatus.NEEDS_INPUT) {
+ details.push(
+ source.lastError
+ ? {
+ kind: "text",
+ message: `${source.lastError}. Run \`vana connect ${source.source}\` interactively.`,
+ }
+ : {
+ kind: "text",
+ message: `Run \`vana connect ${source.source}\` interactively.`,
+ },
+ );
+ }
+
+ if (source.lastRunOutcome === CliOutcomeStatus.LEGACY_AUTH) {
+ details.push({
+ kind: "text",
+ message: `Run \`vana connect ${source.source}\` without \`--no-input\` to complete the manual browser step.`,
+ });
+ }
+
+ if (source.lastRunOutcome === CliOutcomeStatus.RUNTIME_ERROR) {
+ details.push(
+ source.lastError
+ ? {
+ kind: "text",
+ message: formatHumanSourceMessage(
+ source.lastError,
+ source.source,
+ displayName,
+ ),
+ }
+ : {
+ kind: "text",
+ message: "The last connector run failed.",
+ },
+ );
+ }
+
+ if (source.lastRunOutcome === CliOutcomeStatus.CONNECTOR_UNAVAILABLE) {
+ details.push(
+ source.lastError
+ ? {
+ kind: "text",
+ message: formatHumanSourceMessage(
+ source.lastError,
+ source.source,
+ displayName,
+ ),
+ }
+ : {
+ kind: "text",
+ message: "No connector is available for this source.",
+ },
+ );
+ }
+
+ if (!source.lastRunOutcome && source.installed) {
+ details.push({
+ kind: "text",
+ message: `Run \`vana connect ${source.source}\` to collect data.`,
+ });
+ }
+
+ if (
+ source.lastRunOutcome === CliOutcomeStatus.CONNECTED_LOCAL_ONLY &&
+ source.lastResultPath
+ ) {
+ details.push({
+ kind: "text",
+ message: `Inspect the latest local dataset with \`vana data show ${source.source}\`.`,
+ });
+ }
+
+ if (
+ source.sessionPresent &&
+ (source.lastRunOutcome === CliOutcomeStatus.CONNECTED_LOCAL_ONLY ||
+ source.lastRunOutcome === CliOutcomeStatus.CONNECTED_AND_INGESTED ||
+ source.lastRunOutcome === CliOutcomeStatus.INGEST_FAILED)
+ ) {
+ details.push({
+ kind: "row",
+ label: "Session",
+ value: "Session cached.",
+ tone: "muted",
+ });
+ }
+
+ if (source.lastRunOutcome === CliOutcomeStatus.CONNECTED_AND_INGESTED) {
+ details.push({
+ kind: "text",
+ message: `Inspect the latest local dataset with \`vana data show ${source.source}\` or use your Personal Server copy.`,
+ });
+ }
+
+ if (source.lastRunOutcome === CliOutcomeStatus.INGEST_FAILED) {
+ details.push(
+ source.lastError
+ ? {
+ kind: "text",
+ message: `${source.lastError} Inspect the local dataset with \`vana data show ${source.source}\`.`,
+ }
+ : {
+ kind: "text",
+ message: `Personal Server sync failed. Inspect the local dataset with \`vana data show ${source.source}\`.`,
+ },
+ );
+ }
+
+ if (source.dataState === "ingested_personal_server") {
+ details.push({
+ kind: "row",
+ label: "State",
+ value: "Synced to Personal Server",
+ tone: "success",
+ });
+ } else if (source.dataState === "ingest_failed") {
+ details.push({
+ kind: "row",
+ label: "State",
+ value: "Saved locally, sync failed",
+ tone: "warning",
+ });
+ } else if (source.dataState === "collected_local") {
+ details.push({
+ kind: "row",
+ label: "State",
+ value: "Saved locally",
+ tone: "muted",
+ });
+ }
+
+ if (source.lastRunAt) {
+ details.push({
+ kind: "row",
+ label: "Updated",
+ value: `${formatTimestamp(source.lastRunAt)} (${formatRelativeTime(source.lastRunAt)})`,
+ tone: "muted",
+ });
+ }
+
+ if (source.lastResultPath && source.dataState !== "none") {
+ details.push({
+ kind: "row",
+ label: "Path",
+ value: formatDisplayPath(source.lastResultPath),
+ tone: "muted",
+ });
+ }
+
+ if (
+ source.lastLogPath &&
+ source.lastRunOutcome &&
+ source.lastRunOutcome !== CliOutcomeStatus.CONNECTED_LOCAL_ONLY &&
+ source.lastRunOutcome !== CliOutcomeStatus.CONNECTED_AND_INGESTED
+ ) {
+ details.push({
+ kind: "row",
+ label: "Run log",
+ value: formatDisplayPath(source.lastLogPath),
+ tone: "muted",
+ });
+ }
+
+ return details;
+}
+
+export function buildStatusNextSteps(
+ sources: SourceStatus[],
+ sourceLabels: SourceLabelMap = {},
+ runtime: CliStatus["runtime"] = "unhealthy",
+ availableSources: Array<{ id: string; name: string; authMode?: string }> = [],
+): string[] {
+ const nextSteps: string[] = [];
+ const highestPriority = [...sources].sort(compareSourceStatusOrder)[0];
+ const connectedSources = sources.filter(
+ (source) =>
+ source.dataState === "collected_local" ||
+ source.dataState === "ingested_personal_server" ||
+ source.dataState === "ingest_failed",
+ );
+ const needsAttention = sources.some(
+ (source) => rankSourceStatus(source) <= 4,
+ );
+ const highestPriorityLabel = highestPriority
+ ? displaySource(highestPriority.source, sourceLabels)
+ : null;
+ const suggestedSource =
+ availableSources.find((source) => source.authMode !== "legacy") ??
+ availableSources[0];
+
+ if (!highestPriority) {
+ if (runtime === "installed") {
+ if (suggestedSource) {
+ nextSteps.push(
+ `Connect ${suggestedSource.name} with \`vana connect ${suggestedSource.id}\`.`,
+ );
+ } else {
+ nextSteps.push("Connect your first source with `vana connect`.");
+ }
+ } else if (runtime === "missing") {
+ nextSteps.push("Install the local runtime with `vana setup`.");
+ nextSteps.push("Inspect install health with `vana doctor`.");
+ } else if (runtime === "unhealthy") {
+ nextSteps.push("Inspect install health with `vana doctor`.");
+ }
+ } else if (highestPriority.lastRunOutcome === CliOutcomeStatus.NEEDS_INPUT) {
+ nextSteps.push(
+ `Continue ${highestPriorityLabel} with \`vana connect ${highestPriority.source}\`.`,
+ );
+ if (highestPriority.lastLogPath) {
+ nextSteps.push(
+ `Inspect the latest run log with \`vana logs ${highestPriority.source}\`.`,
+ );
+ }
+ } else if (highestPriority.lastRunOutcome === CliOutcomeStatus.LEGACY_AUTH) {
+ nextSteps.push(
+ `Complete the manual browser step for ${highestPriorityLabel} with \`vana connect ${highestPriority.source}\`.`,
+ );
+ if (highestPriority.lastLogPath) {
+ nextSteps.push(
+ `Inspect the latest run log with \`vana logs ${highestPriority.source}\`.`,
+ );
+ }
+ } else if (
+ highestPriority.lastRunOutcome === CliOutcomeStatus.CONNECTOR_UNAVAILABLE
+ ) {
+ nextSteps.push("Browse available sources with `vana sources`.");
+ if (highestPriority.lastLogPath) {
+ nextSteps.push(
+ `Inspect the latest run log with \`vana logs ${highestPriority.source}\`.`,
+ );
+ }
+ } else if (
+ highestPriority.dataState === "collected_local" ||
+ highestPriority.dataState === "ingested_personal_server" ||
+ highestPriority.dataState === "ingest_failed"
+ ) {
+ if (connectedSources.length > 1) {
+ nextSteps.push("Review your collected data with `vana data list`.");
+ } else {
+ nextSteps.push(
+ `Inspect the latest dataset with \`vana data show ${highestPriority.source}\`.`,
+ );
+ }
+ }
+
+ if (connectedSources.length > 0 && needsAttention) {
+ nextSteps.push(
+ connectedSources.length > 1
+ ? "Review the data you already collected with `vana data list`."
+ : `Inspect the data you already collected with \`vana data show ${connectedSources[0].source}\`.`,
+ );
+ }
+
+ if (
+ sources.some((source) => source.installed || source.lastRunOutcome) &&
+ (!needsAttention || connectedSources.length === 0)
+ ) {
+ nextSteps.push("Connect another source with `vana sources`.");
+ }
+
+ if (
+ runtime !== "installed" ||
+ sources.some(
+ (source) =>
+ source.lastRunOutcome === CliOutcomeStatus.RUNTIME_ERROR ||
+ source.lastRunOutcome === CliOutcomeStatus.UNEXPECTED_INTERNAL_ERROR,
+ )
+ ) {
+ nextSteps.push("Inspect install health with `vana doctor`.");
+ }
+
+ return [...new Set(nextSteps)];
+}
+
+export function buildSourcesNextSteps(
+ recommendedSource:
+ | {
+ id: string;
+ name: string;
+ authMode?: "automated" | "interactive" | "legacy";
+ }
+ | null
+ | undefined,
+ connectedCount: number,
+): string[] {
+ const nextSteps: string[] = [];
+
+ if (connectedCount > 0) {
+ nextSteps.push("Inspect what you already collected with `vana data list`.");
+ }
+ if (recommendedSource) {
+ nextSteps.push(
+ `${
+ recommendedSource.authMode === "legacy" ? "Complete" : "Connect"
+ } ${recommendedSource.name} with \`vana connect ${recommendedSource.id}\`.`,
+ );
+ }
+ nextSteps.push("Or browse the guided picker with `vana connect`.");
+
+ return [...new Set(nextSteps)];
+}
+
+export function buildDataListNextSteps(
+ datasetRecords: Array<{
+ source: string;
+ name?: string | null;
+ }>,
+ registrySources: Array<{
+ id: string;
+ authMode?: "automated" | "interactive" | "legacy";
+ }>,
+): string[] {
+ if (datasetRecords.length === 0) {
+ const suggestedSource =
+ registrySources.find((source) => source.authMode !== "legacy") ??
+ registrySources[0];
+
+ return [
+ suggestedSource
+ ? `Collect your first dataset with \`vana connect ${suggestedSource.id}\`.`
+ : "Collect your first dataset with `vana connect`.",
+ "Check overall status with `vana status`.",
+ ];
+ }
+
+ return [
+ `Inspect ${datasetRecords[0].name ?? displaySource(datasetRecords[0].source)} with \`vana data show ${datasetRecords[0].source}\`.`,
+ `Or print its path with \`vana data path ${datasetRecords[0].source}\`.`,
+ "Connect another source with `vana sources`.",
+ ];
+}
+
+export function buildDataShowNextSteps(
+ source: string,
+ datasetCount: number,
+ sourceLabels: SourceLabelMap = {},
+): string[] {
+ return [
+ `Print the path with \`vana data path ${source}\`.`,
+ `Reconnect ${displaySource(source, sourceLabels)} with \`vana connect ${source}\`.`,
+ ...(datasetCount > 1
+ ? ["See all datasets with `vana data list`."]
+ : ["Connect another source with `vana sources`."]),
+ ];
+}
+
+function buildLogsNextSteps(
+ records: Array<{
+ source: string;
+ lastRunOutcome: string | null;
+ dataState: SourceStatus["dataState"] | null;
+ }>,
+): string[] {
+ if (records.length === 0) {
+ return [
+      "Run `vana connect <source>` to create a connector run log.",
+ "Check overall status with `vana status`.",
+ ];
+ }
+
+ const attentionRecord = records.find((record) =>
+ isAttentionLog(record.lastRunOutcome, record.dataState),
+ );
+ const successfulRecord = records.find(
+ (record) => !isAttentionLog(record.lastRunOutcome, record.dataState),
+ );
+ return [
+ attentionRecord
+ ? `Inspect the latest issue log with \`vana logs ${attentionRecord.source}\`.`
+ : `Print the latest log path with \`vana logs ${records[0].source}\`.`,
+ ...(successfulRecord
+ ? [
+ `Inspect a successful run with \`vana logs ${successfulRecord.source}\`.`,
+ ]
+ : []),
+ "Check overall status with `vana status`.",
+ ];
+}
+
+/** Extract a `vana ...` command from a next-step sentence wrapped in backticks. */
+function extractCommand(sentence: string): string | null {
+ const match = sentence.match(/`(vana\s[^`]+)`/);
+ return match ? match[1] : null;
+}
+
+// describeConnectTrust and buildConnectChoices removed — replaced by clack-based picker
+
+function formatMissingConnectSourceMessage(
+ source:
+ | {
+ id: string;
+ name: string;
+ }
+ | undefined,
+): string {
+ if (source) {
+ return `Specify a source. Start with \`vana connect ${source.id}\`, or run \`vana sources\` to see available options.`;
+ }
+
+ return "Specify a source. Run `vana sources` to see available options.";
+}
+
+// formatSourcePickerDescription removed — replaced by clack-based picker with hints
+
+/** Map legacy `vana connect list|status|setup` aliases to their top-level commands. */
+function normalizeArgv(argv: string[]): string[] {
+  if (
+    argv[2] === "connect" &&
+    ["list", "status", "setup"].includes(argv[3] ?? "")
+  ) {
+    const mapping: Record<string, string> = {
+ list: "sources",
+ status: "status",
+ setup: "setup",
+ };
+ return [argv[0], argv[1], mapping[argv[3]], ...argv.slice(4)];
+ }
+
+ return argv;
+}
+
+export function getCliVersion(): string {
+ if (process.env.VANA_APP_ROOT) {
+ try {
+ const packageJson = JSON.parse(
+ fs.readFileSync(
+ path.join(process.env.VANA_APP_ROOT, "package.json"),
+ "utf8",
+ ),
+ ) as { version?: string };
+ if (packageJson.version) {
+ return packageJson.version;
+ }
+ } catch {
+ // Fall through to the repo/dev package metadata.
+ }
+ }
+
+ try {
+ const packageJson = require("../../package.json") as { version?: string };
+ if (packageJson.version) {
+ return packageJson.version;
+ }
+ } catch {
+ // Fall through to the hard default.
+ }
+
+ return "0.0.0";
+}
+
+export function getCliChannel(version = getCliVersion()): "stable" | "canary" {
+ if (version.includes("canary")) {
+ return "canary";
+ }
+
+ const candidates = [process.env.VANA_APP_ROOT ?? "", process.execPath].map(
+ (value) => value.replace(/\\/g, "/").toLowerCase(),
+ );
+
+ return candidates.some((normalizedPath) =>
+ /\/releases\/canary-[^/]+(?:\/app)?$/.test(normalizedPath),
+ )
+ ? "canary"
+ : "stable";
+}
+
+export function getCliInstallMethod(
+ execPath = process.execPath,
+): CliInstallMethod {
+ const candidates = [process.env.VANA_APP_ROOT ?? "", execPath].map((value) =>
+ value.replace(/\\/g, "/").toLowerCase(),
+ );
+
+ for (const normalizedPath of candidates) {
+ if (!normalizedPath) {
+ continue;
+ }
+ if (normalizedPath.includes("/cellar/vana/")) {
+ return "homebrew";
+ }
+ if (
+ normalizedPath.includes("/.local/share/vana/") ||
+ normalizedPath.includes("/appdata/local/vana/") ||
+ normalizedPath.endsWith("/current/app") ||
+ /\/releases\/[^/]+\/app$/.test(normalizedPath)
+ ) {
+ return "installer";
+ }
+ if (
+ normalizedPath.endsWith("/node") ||
+ normalizedPath.endsWith("/node.exe") ||
+ normalizedPath.includes("/.nvm/") ||
+ normalizedPath.includes("/volta/") ||
+ normalizedPath.includes("/pnpm/")
+ ) {
+ return "development";
+ }
+ }
+ return "unknown";
+}
+
+function getCliAppRoot(execPath = process.execPath): string {
+ return process.env.VANA_APP_ROOT ?? path.join(path.dirname(execPath), "app");
+}
+
+export function getDoctorAppRootPath(
+ installMethod: CliInstallMethod,
+ execPath = process.execPath,
+): string | null {
+ if (process.env.VANA_APP_ROOT) {
+ return process.env.VANA_APP_ROOT;
+ }
+ if (installMethod === "homebrew" || installMethod === "installer") {
+ return getCliAppRoot(execPath);
+ }
+ return null;
+}
+
+export function formatInstallMethodLabel(method: CliInstallMethod): string {
+ switch (method) {
+ case "homebrew":
+ return "Homebrew";
+ case "installer":
+ return "Hosted installer";
+ case "development":
+ return "Development checkout";
+ default:
+ return "Unknown";
+ }
+}
+
+export function getLifecycleCommands(
+ installMethod: CliInstallMethod,
+ channel: CliChannel,
+): { upgrade: string; uninstall: string } {
+ switch (installMethod) {
+ case "homebrew":
+ return {
+ upgrade: "brew update && brew upgrade vana",
+ uninstall: "brew uninstall vana",
+ };
+ case "installer":
+ return {
+ upgrade:
+ channel === "canary"
+ ? "curl -fsSL https://raw.githubusercontent.com/vana-com/vana-connect/feat/connect-cli-v1/install/install.sh | sh -s -- --version canary-feat-connect-cli-v1"
+ : "curl -fsSL https://raw.githubusercontent.com/vana-com/vana-connect/main/install/install.sh | sh",
+ uninstall:
+ "rm -f ~/.local/bin/vana && rm -rf ~/.local/share/vana ~/.vana",
+ };
+ case "development":
+ return {
+ upgrade: "git pull && pnpm install && pnpm build",
+ uninstall: "Remove the local checkout and any generated ~/.vana state.",
+ };
+ default:
+ return {
+ upgrade: "Reinstall vana using Homebrew or the hosted installer.",
+ uninstall:
+ "Remove the installed vana binary and any ~/.vana state you no longer need.",
+ };
+ }
+}
+
+function extractGlobalOptions(argv: string[]): GlobalOptions {
+ return {
+ json: argv.includes("--json"),
+ noInput: argv.includes("--no-input"),
+ ipc: argv.includes("--ipc"),
+ yes: argv.includes("--yes"),
+ quiet: argv.includes("--quiet"),
+ detach: argv.includes("--detach"),
+ };
+}
+
+export function createSourceLabelMap(
+ sources: Array<{ id: string; name: string }>,
+): SourceLabelMap {
+ return Object.fromEntries(sources.map((source) => [source.id, source.name]));
+}
+
+export function createSourceMetadataMap(
+ sources: Array<{
+ id: string;
+ name: string;
+ company?: string;
+ description?: string;
+ authMode?: "automated" | "interactive" | "legacy";
+ }>,
+): SourceMetadataMap {
+ return Object.fromEntries(
+ sources.map((source) => [
+ source.id,
+ {
+ name: source.name,
+ company: source.company,
+ description: source.description,
+ authMode: source.authMode,
+ },
+ ]),
+ );
+}
+
+// formatAuthModeBadge removed — replaced by clack-based picker with hints
+
+export function getSourceStatusPresentation(source: SourceStatus): {
+ label: string;
+ tone: RenderTone;
+} {
+ if (!source.installed && !source.lastRunOutcome) {
+ return { label: "not connected", tone: "muted" };
+ }
+
+ if (!source.lastRunOutcome) {
+ return { label: "installed", tone: "success" };
+ }
+
+ if (source.lastRunOutcome === CliOutcomeStatus.NEEDS_INPUT) {
+ return { label: "needs input", tone: "warning" };
+ }
+
+ if (source.lastRunOutcome === CliOutcomeStatus.RUNTIME_ERROR) {
+ return { label: "error", tone: "error" };
+ }
+
+ if (source.lastRunOutcome === CliOutcomeStatus.CONNECTOR_UNAVAILABLE) {
+ return { label: "unavailable", tone: "warning" };
+ }
+
+ if (source.lastRunOutcome === CliOutcomeStatus.LEGACY_AUTH) {
+ return { label: "manual step", tone: "warning" };
+ }
+
+ if (source.dataState === "ingested_personal_server") {
+ // Check per-scope state for more granular badges
+ if (source.ingestScopes && source.ingestScopes.length > 0) {
+ const storedCount = source.ingestScopes.filter(
+ (s) => s.status === "stored",
+ ).length;
+ const failedCount = source.ingestScopes.filter(
+ (s) => s.status === "failed",
+ ).length;
+ if (failedCount > 0 && storedCount > 0) {
+ return { label: "partial sync", tone: "warning" };
+ }
+ if (failedCount > 0 && storedCount === 0) {
+ return { label: "sync failed", tone: "error" };
+ }
+ }
+ return { label: "synced", tone: "success" };
+ }
+
+ if (source.dataState === "collected_local") {
+ return { label: "local", tone: "muted" };
+ }
+
+ if (source.dataState === "ingest_failed") {
+ return { label: "sync failed", tone: "error" };
+ }
+
+ return { label: "connected", tone: "success" };
+}
+
+export function toneForRuntime(runtime: CliStatus["runtime"]): RenderTone {
+ if (runtime === "installed") {
+ return "success";
+ }
+ if (runtime === "missing") {
+ return "warning";
+ }
+ return "muted";
+}
+
+// formatProgressUpdate removed — replaced by ConnectRenderer scope methods
+
+/**
+ * Extract a human-readable scope name from a progress-update event.
+ * The scope name comes from `phase.label` when `phase` is a structured object.
+ */
+function extractScopeName(event: {
+ phase?: unknown;
+ message?: string;
+}): string | null {
+ if (
+ event.phase &&
+ typeof event.phase === "object" &&
+ "label" in event.phase &&
+ typeof (event.phase as { label?: unknown }).label === "string"
+ ) {
+ return (event.phase as { label: string }).label;
+ }
+ return null;
+}
+
+/**
+ * Format detail text for a completed scope (e.g. "8 found").
+ * Extracts count from event.count or parses it from the message.
+ */
+function formatScopeDetail(event: {
+ count?: number;
+ message?: string;
+}): string | undefined {
+ if (typeof event.count === "number") {
+ return `${event.count} found`;
+ }
+ // Try to extract a count from the completion message (e.g. "Complete! 8 repositories collected.")
+ if (typeof event.message === "string") {
+ const match = event.message.match(/(\d+)\s+\w+/);
+ if (match) {
+ return match[0];
+ }
+ }
+ return undefined;
+}
+
+// shouldRenderStatusUpdate removed — status updates are silent in the new design
+
+function inferInstalledAuthMode(
+ installedFiles: Array<{ source: string; path: string }>,
+ source: string,
+): "automated" | "interactive" | "legacy" | undefined {
+ const match = installedFiles.find((file) => file.source === source);
+ if (!match) {
+ return undefined;
+ }
+
+ try {
+ const script = fs.readFileSync(match.path, "utf8");
+ if (/page\.requestInput\(/.test(script)) {
+ return "interactive";
+ }
+ if (/page\.(showBrowser|promptUser)\(/.test(script)) {
+ return "legacy";
+ }
+ return "automated";
+ } catch {
+ return undefined;
+ }
+}
+
+export async function loadRegistrySources() {
+ try {
+ return (
+ (await listAvailableSources(findDataConnectorsDir() ?? undefined)) ?? []
+ ).sort(compareRegistrySourceOrder);
+ } catch {
+ return [];
+ }
+}
+
+function compareRegistrySourceOrder(
+ left: AvailableSource,
+ right: AvailableSource,
+): number {
+ return (
+ rankAuthMode(left.authMode) - rankAuthMode(right.authMode) ||
+ left.name.localeCompare(right.name, undefined, { sensitivity: "base" })
+ );
+}
+
+export function compareSourceStatusOrder(
+ left: SourceStatus,
+ right: SourceStatus,
+): number {
+ return (
+ rankSourceStatus(left) - rankSourceStatus(right) ||
+ compareRegistrySourceOrder(
+ {
+ id: left.source,
+ name: left.name ?? displaySource(left.source),
+ authMode: left.authMode,
+ },
+ {
+ id: right.source,
+ name: right.name ?? displaySource(right.source),
+ authMode: right.authMode,
+ },
+ )
+ );
+}
+
+export function rankSourceStatus(source: SourceStatus): number {
+ if (source.lastRunOutcome === CliOutcomeStatus.NEEDS_INPUT) {
+ return 0;
+ }
+ if (source.lastRunOutcome === CliOutcomeStatus.LEGACY_AUTH) {
+ return 1;
+ }
+ if (source.lastRunOutcome === CliOutcomeStatus.INGEST_FAILED) {
+ return 2;
+ }
+ if (source.lastRunOutcome === CliOutcomeStatus.RUNTIME_ERROR) {
+ return 3;
+ }
+ if (source.lastRunOutcome === CliOutcomeStatus.CONNECTOR_UNAVAILABLE) {
+ return 4;
+ }
+ if (source.dataState === "ingested_personal_server") {
+ return 5;
+ }
+ if (source.dataState === "collected_local") {
+ return 6;
+ }
+ if (source.installed) {
+ return 7;
+ }
+ return 8;
+}
+
+function rankAuthMode(authMode: AvailableSource["authMode"]): number {
+ if (authMode === "interactive") {
+ return 0;
+ }
+ if (authMode === "automated") {
+ return 1;
+ }
+ if (authMode === "legacy") {
+ return 2;
+ }
+ return 3;
+}
+
+export async function readResultSummary(
+ resultPath: string,
+): Promise<{ lines: string[] } | null> {
+ try {
+ const raw = await fsp.readFile(resultPath, "utf8");
+    return summarizeResultData(JSON.parse(raw) as Record<string, unknown>);
+ } catch {
+ return null;
+ }
+}
+
+/** Build short, human-readable summary lines from a collected result payload. */
+export function summarizeResultData(
+  data: Record<string, unknown>,
+): { lines: string[] } | null {
+ const lines: string[] = [];
+ const exportSummary =
+ typeof data.exportSummary === "object" && data.exportSummary
+      ? (data.exportSummary as Record<string, unknown>)
+ : null;
+ const profile =
+ typeof data.profile === "object" && data.profile
+      ? (data.profile as Record<string, unknown>)
+ : null;
+
+ if (profile?.username && typeof profile.username === "string") {
+ lines.push(`Profile: ${profile.username}`);
+ }
+
+ if (Array.isArray(data.repositories)) {
+ lines.push(`Repositories: ${data.repositories.length}`);
+ const preview = summarizeNamedItems(data.repositories, "Latest repos");
+ if (preview) {
+ lines.push(preview);
+ }
+ }
+
+ if (Array.isArray(data.starred)) {
+ lines.push(`Starred: ${data.starred.length}`);
+ }
+
+ if (Array.isArray(data.orders)) {
+ lines.push(`Orders: ${data.orders.length}`);
+ }
+
+ if (Array.isArray(data.playlists)) {
+ lines.push(`Playlists: ${data.playlists.length}`);
+ const preview = summarizeNamedItems(data.playlists, "Playlists");
+ if (preview) {
+ lines.push(preview);
+ }
+ }
+
+ if (
+ exportSummary?.details &&
+ typeof exportSummary.details === "string" &&
+ !lines.includes(exportSummary.details) &&
+ !Array.isArray(data.repositories) &&
+ !Array.isArray(data.starred) &&
+ !Array.isArray(data.orders) &&
+ !Array.isArray(data.playlists)
+ ) {
+ lines.push(exportSummary.details);
+ }
+
+ return lines.length > 0 ? { lines } : null;
+}
+
+function summarizeNamedItems(
+ items: unknown[],
+ label: string,
+ maxItems = 2,
+): string | null {
+ const names = items
+ .map((item) => {
+ if (
+ typeof item === "object" &&
+ item &&
+ "name" in item &&
+ typeof (item as { name?: unknown }).name === "string"
+ ) {
+ return (item as { name: string }).name;
+ }
+ return null;
+ })
+ .filter((value): value is string => Boolean(value))
+ .slice(0, maxItems);
+
+ if (names.length === 0) {
+ return null;
+ }
+
+ return `${label}: ${names.join(", ")}`;
+}
+
+export function formatTimestamp(value: string): string {
+ const date = new Date(value);
+ if (Number.isNaN(date.getTime())) {
+ return value;
+ }
+
+ return new Intl.DateTimeFormat(undefined, {
+ dateStyle: "medium",
+ timeStyle: "short",
+ }).format(date);
+}
+
+export function compareDatasetOrder(
+ left: {
+ lastRunAt: string | null;
+ name: string | undefined;
+ source: string;
+ },
+ right: {
+ lastRunAt: string | null;
+ name: string | undefined;
+ source: string;
+ },
+): number {
+ const leftTime = left.lastRunAt ? Date.parse(left.lastRunAt) : 0;
+ const rightTime = right.lastRunAt ? Date.parse(right.lastRunAt) : 0;
+ return (
+ rightTime - leftTime ||
+ (left.name ?? left.source).localeCompare(
+ right.name ?? right.source,
+ undefined,
+ {
+ sensitivity: "base",
+ },
+ )
+ );
+}
+
+function compareLogRecordOrder(
+ left: {
+ source: string;
+ lastRunAt: string | null;
+ },
+ right: {
+ source: string;
+ lastRunAt: string | null;
+ },
+): number {
+ const leftTimestamp = left.lastRunAt ? Date.parse(left.lastRunAt) : 0;
+ const rightTimestamp = right.lastRunAt ? Date.parse(right.lastRunAt) : 0;
+ return (
+ rightTimestamp - leftTimestamp ||
+ left.source.localeCompare(right.source, undefined, {
+ sensitivity: "base",
+ })
+ );
+}
+
+export function hasCollectedData(
+ dataState: SourceStatus["dataState"] | null | undefined,
+): boolean {
+ return (
+ dataState === "collected_local" ||
+ dataState === "ingested_personal_server" ||
+ dataState === "ingest_failed"
+ );
+}
+
+function formatLogOutcomeLabel(
+ lastRunOutcome: string | null,
+ dataState: SourceStatus["dataState"] | null,
+): string {
+ if (lastRunOutcome === CliOutcomeStatus.CONNECTOR_UNAVAILABLE) {
+ return "unavailable";
+ }
+ if (lastRunOutcome === CliOutcomeStatus.LEGACY_AUTH) {
+ return "manual step";
+ }
+ if (lastRunOutcome === CliOutcomeStatus.RUNTIME_ERROR) {
+ return "error";
+ }
+ if (lastRunOutcome === CliOutcomeStatus.NEEDS_INPUT) {
+ return "needs input";
+ }
+ if (dataState === "ingested_personal_server") {
+ return "synced";
+ }
+ if (dataState === "ingest_failed") {
+ return "sync failed";
+ }
+ if (dataState === "collected_local") {
+ return "local";
+ }
+ return "recent";
+}
+
+function isAttentionLog(
+ lastRunOutcome: string | null,
+ dataState: SourceStatus["dataState"] | null,
+): boolean {
+ return !(
+ dataState === "collected_local" ||
+ dataState === "ingested_personal_server" ||
+ lastRunOutcome === CliOutcomeStatus.CONNECTED_LOCAL_ONLY ||
+ lastRunOutcome === CliOutcomeStatus.CONNECTED_AND_INGESTED
+ );
+}
+
+function toneForLogOutcome(
+ lastRunOutcome: string | null,
+ dataState: SourceStatus["dataState"] | null,
+): RenderTone {
+ if (lastRunOutcome === CliOutcomeStatus.RUNTIME_ERROR) {
+ return "error";
+ }
+ if (
+ lastRunOutcome === CliOutcomeStatus.CONNECTOR_UNAVAILABLE ||
+ lastRunOutcome === CliOutcomeStatus.LEGACY_AUTH ||
+ lastRunOutcome === CliOutcomeStatus.NEEDS_INPUT ||
+ dataState === "ingest_failed"
+ ) {
+ return "warning";
+ }
+ if (dataState === "ingested_personal_server") {
+ return "success";
+ }
+ if (dataState === "collected_local") {
+ return "muted";
+ }
+ return "muted";
+}
+
+function toneForHealth(health: string): RenderTone {
+ switch (health) {
+ case "healthy":
+ return "accent";
+ case "needs_reauth":
+ return "warning";
+ case "error":
+ return "error";
+ case "stale":
+ return "muted";
+ default:
+ return "muted";
+ }
+}
+
+// ---------------------------------------------------------------------------
+// Detach (background process)
+// ---------------------------------------------------------------------------
+
+async function runDetached(
+ command: string,
+ source: string,
+ options: GlobalOptions,
+): Promise<number> {
+ const emit = createEmitter(options);
+ const registrySources = await loadRegistrySources();
+ const sourceLabels = createSourceLabelMap(registrySources);
+ const displayName = displaySource(source, sourceLabels);
+
+ // Check if source has been previously connected (has a session to reuse).
+ // Detach is for re-collection with existing sessions, not first-time auth.
+ const state = await readCliState();
+ const sourceState = state.sources[source];
+ if (!sourceState?.lastResultPath && !sourceState?.sessionPresent) {
+ emit.info(
+ `Run ${emit.code(`vana connect ${source}`)} first to authenticate.`,
+ );
+ emit.detail(
+ "Use --detach for background re-collection after the first connect.",
+ );
+ return 1;
+ }
+
+ const sessionsDir = getSessionsDir();
+ const logsDir = getLogsDir();
+ await fsp.mkdir(sessionsDir, { recursive: true });
+ await fsp.mkdir(logsDir, { recursive: true });
+
+ const logPath = path.join(logsDir, `${source}-detach.log`);
+ const sessionPath = path.join(sessionsDir, `${source}.json`);
+
+ const logFd = fs.openSync(logPath, "a");
+
+ // --no-input: if auth is needed, fail fast and record needs_reauth.
+ // Don't use --ipc: nobody is watching a detached process.
+ const childArgs = [
+ process.argv[1],
+ command,
+ source,
+ "--json",
+ "--quiet",
+ "--no-input",
+ ];
+ const child = spawn(process.execPath, childArgs, {
+ detached: true,
+ stdio: ["ignore", logFd, logFd],
+ env: { ...process.env, VANA_DETACHED: "1" },
+ });
+ child.unref();
+ fs.closeSync(logFd);
+
+ // Write session file
+ const session = {
+ source,
+ command,
+ pid: child.pid,
+ startedAt: new Date().toISOString(),
+ status: "running",
+ logPath,
+ };
+ await fsp.writeFile(sessionPath, `${JSON.stringify(session, null, 2)}\n`);
+
+ if (options.json) {
+ process.stdout.write(`${JSON.stringify(session)}\n`);
+ return 0;
+ }
+
+ const verb = command === "connect" ? "Connecting" : "Collecting";
+ emit.info(`${verb} ${displayName} in the background.`);
+ emit.detail(`Check progress: ${emit.code("vana status")}`);
+ return 0;
+}
+
+// ---------------------------------------------------------------------------
+// Schedule commands
+// ---------------------------------------------------------------------------
+
+const LAUNCHD_LABEL = "com.vana.collect";
+const LAUNCHD_PLIST_PATH = path.join(
+ os.homedir(),
+ "Library",
+ "LaunchAgents",
+ `${LAUNCHD_LABEL}.plist`,
+);
+const CRONTAB_MARKER = "# vana-scheduled-collection";
+
+/**
+ * Parse a human-friendly interval ("12h", "2d", "1w", "3m", "daily",
+ * "weekly") into seconds. "m" means months, approximated as 30 days.
+ */
+function parseIntervalSeconds(interval: string): number {
+  const lower = interval.toLowerCase().trim();
+  const match = /^(\d+)\s*(h|d|m|w)$/i.exec(lower);
+  if (match) {
+    const value = parseInt(match[1], 10);
+    const unit = match[2].toLowerCase();
+    if (unit === "h") return value * 3600;
+    if (unit === "d") return value * 86400;
+    if (unit === "w") return value * 7 * 86400;
+    if (unit === "m") return value * 30 * 86400;
+  }
+  if (lower === "daily") return 86400;
+  if (lower === "weekly") return 7 * 86400;
+  // Default to 24h for unrecognized input.
+  return 86400;
+}
+
+function formatIntervalHuman(seconds: number): string {
+ if (seconds < 3600) return `${Math.round(seconds / 60)}m`;
+ if (seconds < 86400) return `${Math.round(seconds / 3600)}h`;
+ return `${Math.round(seconds / 86400)}d`;
+}
+
+function resolveVanaBinaryPath(): string {
+ // For SEA binaries, process.execPath is the binary itself
+ const installMethod = getCliInstallMethod();
+ if (installMethod === "homebrew" || installMethod === "installer") {
+ return process.execPath;
+ }
+ // For development, try to find the vana binary via which
+ try {
+ return execSync("which vana", { encoding: "utf8" }).trim();
+ } catch {
+ // Fall back to process.execPath + argv[1]
+ return `${process.execPath} ${process.argv[1]}`;
+ }
+}
+
+function generateLaunchdPlist(
+ vanaBinary: string,
+ intervalSeconds: number,
+): string {
+ const logsPath = path.join(getLogsDir(), "schedule.log");
+ // Handle the case where vanaBinary might contain a space (node + script)
+  const programArgs = vanaBinary.includes(" ")
+    ? vanaBinary
+        .split(" ")
+        .map((arg) => `    <string>${arg}</string>`)
+        .join("\n")
+    : `    <string>${vanaBinary}</string>`;
+
+  return `<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
+<plist version="1.0">
+<dict>
+  <key>Label</key>
+  <string>${LAUNCHD_LABEL}</string>
+  <key>ProgramArguments</key>
+  <array>
+${programArgs}
+    <string>collect</string>
+    <string>--all</string>
+    <string>--quiet</string>
+  </array>
+  <key>StartInterval</key>
+  <integer>${intervalSeconds}</integer>
+  <key>StandardOutPath</key>
+  <string>${logsPath}</string>
+  <key>StandardErrorPath</key>
+  <string>${logsPath}</string>
+  <key>RunAtLoad</key>
+  <false/>
+</dict>
+</plist>
+`;
+}
+
+function generateCrontabEntry(
+ vanaBinary: string,
+ intervalHours: number,
+): string {
+ const logsPath = path.join(getLogsDir(), "schedule.log");
+  // Cron fields: minute hour day-of-month month day-of-week.
+  const minuteExpr = "0";
+  const hourExpr =
+    intervalHours >= 24 ? "0" : intervalHours >= 1 ? `*/${intervalHours}` : "*";
+  return `${minuteExpr} ${hourExpr} * * * ${vanaBinary} collect --all --quiet >> ${logsPath} 2>&1 ${CRONTAB_MARKER}`;
+}
+
+async function runScheduleAdd(
+ interval: string,
+ options: GlobalOptions,
+): Promise<number> {
+ const emit = createEmitter(options);
+ const intervalSeconds = parseIntervalSeconds(interval);
+ const intervalLabel = formatIntervalHuman(intervalSeconds);
+ const vanaBinary = resolveVanaBinaryPath();
+
+ await fsp.mkdir(getLogsDir(), { recursive: true });
+
+ if (process.platform === "darwin") {
+ // macOS: launchd
+ const plist = generateLaunchdPlist(vanaBinary, intervalSeconds);
+ const plistDir = path.dirname(LAUNCHD_PLIST_PATH);
+ await fsp.mkdir(plistDir, { recursive: true });
+
+ // Unload existing if present
+ try {
+ execSync(`launchctl unload "${LAUNCHD_PLIST_PATH}" 2>/dev/null`, {
+ stdio: "ignore",
+ });
+ } catch {
+ // Not loaded, that's fine
+ }
+
+ await fsp.writeFile(LAUNCHD_PLIST_PATH, plist);
+
+ try {
+ execSync(`launchctl load "${LAUNCHD_PLIST_PATH}"`, { stdio: "ignore" });
+ } catch {
+ emit.info("Could not load the launchd plist. Load it manually:");
+ emit.detail(`launchctl load "${LAUNCHD_PLIST_PATH}"`);
+ return 1;
+ }
+
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({ ok: true, interval: intervalLabel, mechanism: "launchd", plistPath: LAUNCHD_PLIST_PATH })}\n`,
+ );
+ return 0;
+ }
+
+ emit.info(
+ `Added ${intervalLabel === "1d" ? "daily" : `every ${intervalLabel}`} collection schedule.`,
+ );
+ emit.detail(`Runs: ${emit.code("vana collect --all --quiet")}`);
+ emit.detail(`Managed by: launchd`);
+ return 0;
+ }
+
+ if (process.platform === "linux") {
+ // Linux: crontab
+ const intervalHours = Math.max(1, Math.round(intervalSeconds / 3600));
+ const entry = generateCrontabEntry(vanaBinary, intervalHours);
+
+ try {
+ // Read existing crontab, filter out old vana entries, add new one
+ let existing = "";
+ try {
+ existing = execSync("crontab -l 2>/dev/null", {
+ encoding: "utf8",
+ });
+ } catch {
+ // No existing crontab
+ }
+ const filtered = existing
+ .split("\n")
+ .filter((line) => !line.includes(CRONTAB_MARKER))
+ .join("\n");
+ const newCrontab = `${filtered.trimEnd()}\n${entry}\n`;
+ execSync("crontab -", {
+ input: newCrontab,
+ encoding: "utf8",
+ });
+ } catch {
+ emit.info("Could not update crontab. Add this entry manually:");
+ emit.detail(entry);
+ return 1;
+ }
+
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({ ok: true, interval: intervalLabel, mechanism: "cron" })}\n`,
+ );
+ return 0;
+ }
+
+ emit.info(
+ `Added ${intervalLabel === "1d" ? "daily" : `every ${intervalLabel}`} collection schedule.`,
+ );
+ emit.detail(`Runs: ${emit.code("vana collect --all --quiet")}`);
+ emit.detail(`Managed by: cron`);
+ return 0;
+ }
+
+ // Unsupported platform
+ emit.info(
+ "Scheduled collection requires launchd (macOS) or cron (Linux). Run `vana collect --all` manually or set up a cron job.",
+ );
+ return 1;
+}
+
+async function runScheduleList(options: GlobalOptions): Promise<number> {
+ const emit = createEmitter(options);
+
+ if (process.platform === "darwin") {
+ // Check launchd plist
+ try {
+ await fsp.access(LAUNCHD_PLIST_PATH);
+ const content = await fsp.readFile(LAUNCHD_PLIST_PATH, "utf8");
+ const intervalMatch = content.match(
+        /<key>StartInterval<\/key>\s*<integer>(\d+)<\/integer>/,
+ );
+ const intervalSeconds = intervalMatch
+ ? parseInt(intervalMatch[1], 10)
+ : 86400;
+ const intervalLabel = formatIntervalHuman(intervalSeconds);
+ const nextInSeconds = intervalSeconds; // Approximate; launchd doesn't expose exact next-run
+ const nextLabel = `~${formatIntervalHuman(nextInSeconds)}`;
+
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({
+ scheduled: true,
+ interval: intervalLabel,
+ intervalSeconds,
+ mechanism: "launchd",
+ plistPath: LAUNCHD_PLIST_PATH,
+ })}\n`,
+ );
+ return 0;
+ }
+
+ emit.keyValue(
+ "Daily collection",
+ `every ${intervalLabel} next: ${nextLabel}`,
+ "muted",
+ );
+ emit.detail(`Managed by: ${LAUNCHD_PLIST_PATH}`);
+ return 0;
+ } catch {
+ // No plist found
+ }
+ }
+
+ if (process.platform === "linux") {
+ // Check crontab
+ try {
+ const crontab = execSync("crontab -l 2>/dev/null", {
+ encoding: "utf8",
+ });
+ const vanaLine = crontab
+ .split("\n")
+ .find((line) => line.includes(CRONTAB_MARKER));
+ if (vanaLine) {
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({
+ scheduled: true,
+ mechanism: "cron",
+ entry: vanaLine,
+ })}\n`,
+ );
+ return 0;
+ }
+
+ emit.keyValue("Daily collection", "cron", "muted");
+ emit.detail(`Entry: ${vanaLine.replace(CRONTAB_MARKER, "").trim()}`);
+ return 0;
+ }
+ } catch {
+ // No crontab available
+ }
+ }
+
+ if (options.json) {
+ process.stdout.write(`${JSON.stringify({ scheduled: false })}\n`);
+ return 0;
+ }
+
+ emit.info("No scheduled collection found.");
+ emit.detail(`Add one with ${emit.code("vana schedule add")}.`);
+ return 0;
+}
+
+async function runScheduleRemove(options: GlobalOptions): Promise<number> {
+ const emit = createEmitter(options);
+
+ if (process.platform === "darwin") {
+ try {
+ await fsp.access(LAUNCHD_PLIST_PATH);
+ try {
+ execSync(`launchctl unload "${LAUNCHD_PLIST_PATH}"`, {
+ stdio: "ignore",
+ });
+ } catch {
+ // Already unloaded
+ }
+ await fsp.unlink(LAUNCHD_PLIST_PATH);
+
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({ ok: true, removed: true })}\n`,
+ );
+ return 0;
+ }
+
+ emit.info("Removed daily collection schedule.");
+ return 0;
+ } catch {
+ // No plist found, fall through
+ }
+ }
+
+ if (process.platform === "linux") {
+ try {
+ const existing = execSync("crontab -l 2>/dev/null", {
+ encoding: "utf8",
+ });
+ if (existing.includes(CRONTAB_MARKER)) {
+ const filtered = existing
+ .split("\n")
+ .filter((line) => !line.includes(CRONTAB_MARKER))
+ .join("\n");
+ execSync("crontab -", {
+ input: `${filtered.trimEnd()}\n`,
+ encoding: "utf8",
+ });
+
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({ ok: true, removed: true })}\n`,
+ );
+ return 0;
+ }
+
+ emit.info("Removed daily collection schedule.");
+ return 0;
+ }
+ } catch {
+ // No crontab available
+ }
+ }
+
+ if (options.json) {
+ process.stdout.write(`${JSON.stringify({ ok: true, removed: false })}\n`);
+ return 0;
+ }
+
+ emit.info("No scheduled collection found to remove.");
+ return 0;
+}
+
+function isPromptCancelled(error: unknown): boolean {
+ return (
+ error instanceof Error &&
+ (error.name === "ExitPromptError" || error.message.includes("SIGINT"))
+ );
+}
+
+// ---------------------------------------------------------------------------
+// Skill commands
+// ---------------------------------------------------------------------------
+
+async function runSkillList(options: GlobalOptions): Promise<number> {
+ const emit = createEmitter(options);
+
+ try {
+ const skills = await listAvailableSkills();
+ const installed = await readInstalledSkills();
+ const installedIds = new Set(installed.map((s) => s.id));
+
+ const enriched = skills.map((skill) => ({
+ ...skill,
+ installed: installedIds.has(skill.id),
+ }));
+
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({ count: enriched.length, skills: enriched })}\n`,
+ );
+ return 0;
+ }
+
+ emit.title("Available skills");
+ emit.blank();
+
+ if (enriched.length === 0) {
+ emit.info("No skills are available right now.");
+ return 0;
+ }
+
+ for (const skill of enriched) {
+ const tag = skill.installed
+ ? ` ${emit.badge("installed", "accent")}`
+ : "";
+ emit.info(` ${skill.id}${tag}`);
+ }
+
+ const uninstalled = enriched.find((s) => !s.installed);
+ if (uninstalled) {
+ emit.blank();
+ emit.next(`vana skills install ${uninstalled.id}`);
+ }
+
+ return 0;
+ } catch (error) {
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({ error: error instanceof Error ? error.message : String(error) })}\n`,
+ );
+ } else {
+ emit.info(error instanceof Error ? error.message : String(error));
+ }
+ return 1;
+ }
+}
+
+async function runSkillInstall(
+ name: string,
+ options: GlobalOptions,
+): Promise<number> {
+ const emit = createEmitter(options);
+
+ try {
+ const { installedPath } = await installSkill(name);
+
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({ ok: true, id: name, installedPath })}\n`,
+ );
+ return 0;
+ }
+
+ emit.success(`Installed ${name}.`);
+ emit.blank();
+ const skills = await listAvailableSkills();
+ const installed = await readInstalledSkills();
+ const installedIds = new Set([...installed.map((s) => s.id), name]);
+ const nextSkill = skills.find((s) => !installedIds.has(s.id));
+ emit.next(
+ nextSkill ? `vana skills install ${nextSkill.id}` : "vana skills list",
+ );
+
+ return 0;
+ } catch (error) {
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({ ok: false, error: error instanceof Error ? error.message : String(error) })}\n`,
+ );
+ } else {
+ emit.info(error instanceof Error ? error.message : String(error));
+ }
+ return 1;
+ }
+}
+
+async function runSkillShow(
+ name: string,
+ options: GlobalOptions,
+): Promise<number> {
+ const emit = createEmitter(options);
+
+ try {
+ const skills = await listAvailableSkills();
+ const match = skills.find((s) => s.id.toLowerCase() === name.toLowerCase());
+
+ if (!match) {
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({ error: `No skill found with id "${name}".` })}\n`,
+ );
+ } else {
+ emit.info(`No skill found with id "${name}".`);
+ emit.blank();
+ emit.next("vana skills list");
+ }
+ return 1;
+ }
+
+ const installed = await readInstalledSkills();
+ const isInstalled = installed.some((s) => s.id === match.id);
+
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({ ...match, installed: isInstalled })}\n`,
+ );
+ return 0;
+ }
+
+ const badges: Array<{ text: string; tone?: RenderTone }> = [];
+ if (isInstalled) {
+ badges.push({ text: "installed", tone: "success" });
+ }
+ emit.sourceTitle(match.name, badges);
+ emit.detail(match.description);
+ emit.keyValue("Version", match.version);
+
+ if (!isInstalled) {
+ emit.blank();
+ emit.next(`vana skills install ${match.id}`);
+ }
+
+ return 0;
+ } catch (error) {
+ if (options.json) {
+ process.stdout.write(
+ `${JSON.stringify({ error: error instanceof Error ? error.message : String(error) })}\n`,
+ );
+ } else {
+ emit.info(error instanceof Error ? error.message : String(error));
+ }
+ return 1;
+ }
+}
diff --git a/src/cli/main.ts b/src/cli/main.ts
new file mode 100644
index 00000000..1c849fbe
--- /dev/null
+++ b/src/cli/main.ts
@@ -0,0 +1,10 @@
+for (const stream of [process.stdout, process.stderr]) {
+ stream.on("error", (error: NodeJS.ErrnoException) => {
+ if (error.code === "EPIPE") {
+ process.exit(0);
+ }
+ throw error;
+ });
+}
+
+export { runCli } from "./index.js";
diff --git a/src/cli/mcp-server.ts b/src/cli/mcp-server.ts
new file mode 100644
index 00000000..d37fac13
--- /dev/null
+++ b/src/cli/mcp-server.ts
@@ -0,0 +1,298 @@
+/**
+ * MCP (Model Context Protocol) server for agent integration.
+ *
+ * Exposes high-level tools over stdio so any MCP-compatible agent
+ * (Claude Code, Cursor, etc.) can discover and call them.
+ *
+ * CRITICAL: All logging/output goes to stderr. stdout is the JSON-RPC transport.
+ */
+
+import { spawn } from "node:child_process";
+
+import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
+import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
+import { z } from "zod";
+
+import { getCliVersion } from "./index.js";
+import {
+ queryStatus,
+ querySources,
+ queryDataShow,
+ queryDoctor,
+} from "./queries.js";
+
+/**
+ * Start the MCP server on stdio.
+ *
+ * Returns a promise that resolves when the transport disconnects.
+ */
+export async function startMcpServer(): Promise<void> {
+ const version = getCliVersion();
+
+ const server = new McpServer({
+ name: "vana",
+ version,
+ });
+
+ // ── Tool: check_status ───────────────────────────────────────────────
+
+ server.tool(
+ "check_status",
+ "Check system health: runtime state, Personal Server connection, and connected source status",
+ async () => {
+ const result = await queryStatus();
+ return {
+ content: [{ type: "text", text: JSON.stringify(result, null, 2) }],
+ };
+ },
+ );
+
+ // ── Tool: list_sources ───────────────────────────────────────────────
+
+ server.tool(
+ "list_sources",
+ "List available data sources that can be connected for personal data collection",
+ { filter: z.string().optional().describe("Filter sources by name") },
+ async ({ filter }) => {
+ const result = await querySources();
+ if (filter) {
+ const lowerFilter = filter.toLowerCase();
+ result.sources = result.sources.filter(
+ (s) =>
+ s.id.toLowerCase().includes(lowerFilter) ||
+ s.name.toLowerCase().includes(lowerFilter),
+ );
+ result.count = result.sources.length;
+ }
+ return {
+ content: [{ type: "text", text: JSON.stringify(result, null, 2) }],
+ };
+ },
+ );
+
+ // ── Tool: show_data ──────────────────────────────────────────────────
+
+ server.tool(
+ "show_data",
+ "Inspect collected data for a connected source. Shows data summary, sync status, and file paths",
+ { source: z.string().describe("Source identifier (e.g. github, twitter)") },
+ async ({ source }) => {
+ const result = await queryDataShow(source);
+ return {
+ content: [{ type: "text", text: JSON.stringify(result, null, 2) }],
+ };
+ },
+ );
+
+ // ── Tool: connect_source ─────────────────────────────────────────────
+
+ server.tool(
+ "connect_source",
+ "Connect a platform and collect personal data. Runs the full connect flow: setup, authentication, data collection, and sync",
+ { source: z.string().describe("Source identifier (e.g. github, twitter)") },
+ async ({ source }) => {
+ // Check auth mode before spawning — legacy sources need a headed
+ // browser and cannot be connected by an agent.
+ const sourcesResult = await querySources();
+ const sourceInfo = sourcesResult.sources?.find(
+ (s) =>
+ s.id === source || s.name?.toLowerCase() === source.toLowerCase(),
+ );
+
+ if (sourceInfo?.authMode === "legacy") {
+ return {
+ content: [
+ {
+ type: "text" as const,
+ text: `${sourceInfo.name ?? source} requires browser login. The user must run this in their own terminal:\n\nvana connect ${source}\n\nThis source cannot be connected by an agent.`,
+ },
+ ],
+ };
+ }
+
+ return await runConnectAsChild(source);
+ },
+ );
+
+ // ── Tool: run_diagnostics ────────────────────────────────────────────
+
+ server.tool(
+ "run_diagnostics",
+ "Run detailed system diagnostics: CLI version, runtime paths, browser state, connector cache, and source-level issues",
+ async () => {
+ const result = await queryDoctor();
+ return {
+ content: [{ type: "text", text: JSON.stringify(result, null, 2) }],
+ };
+ },
+ );
+
+ // ── Tool: generate_context (placeholder) ─────────────────────────────
+
+ server.tool(
+ "generate_context",
+ "Generate prioritized suggestions for the next agent prompt based on connected personal data (coming soon)",
+ async () => {
+ return {
+ content: [
+ {
+ type: "text",
+ text: "Not yet implemented. Install with: vana skill install next-prompt",
+ },
+ ],
+ };
+ },
+ );
+
+ // ── Connect transport and run ────────────────────────────────────────
+
+ const transport = new StdioServerTransport();
+ await server.connect(transport);
+
+ // Wait until the transport closes
+  return new Promise<void>((resolve) => {
+ transport.onclose = () => resolve();
+ });
+}
+
+// ── Child process runner for connect_source ──────────────────────────
+
+/**
+ * Run `vana connect --json --ipc` as a child process.
+ *
+ * The MCP server's stdout is the JSON-RPC transport, so the connect flow
+ * must run in a separate process. We collect the child's stdout (JSONL events)
+ * and stderr, parse the final outcome, and return a structured summary.
+ *
+ * Uses --ipc instead of --no-input so the connector can pause for
+ * credential input via file-based IPC rather than failing immediately.
+ */
+async function runConnectAsChild(source: string) {
+ return new Promise<{
+ content: Array<{ type: "text"; text: string }>;
+ isError?: boolean;
+ }>((resolve) => {
+ const child = spawn(
+ process.execPath,
+ [process.argv[1], "connect", source, "--json", "--ipc"],
+ {
+ stdio: ["ignore", "pipe", "pipe"],
+ env: { ...process.env },
+ },
+ );
+
+ const stdoutChunks: Buffer[] = [];
+ const stderrChunks: Buffer[] = [];
+
+ child.stdout.on("data", (chunk: Buffer) => stdoutChunks.push(chunk));
+ child.stderr.on("data", (chunk: Buffer) => stderrChunks.push(chunk));
+
+ child.on("error", (err) => {
+ resolve({
+ content: [
+ {
+ type: "text",
+ text: `Failed to start connect process: ${err.message}`,
+ },
+ ],
+ isError: true,
+ });
+ });
+
+ child.on("close", (code) => {
+ const stdout = Buffer.concat(stdoutChunks).toString("utf8");
+ const stderr = Buffer.concat(stderrChunks).toString("utf8");
+
+ // Parse JSONL events from stdout
+      const events: Record<string, unknown>[] = [];
+ for (const line of stdout.split("\n")) {
+ const trimmed = line.trim();
+ if (!trimmed) continue;
+ try {
+          events.push(JSON.parse(trimmed) as Record<string, unknown>);
+ } catch {
+ // Skip non-JSON lines
+ }
+ }
+
+      // Find the final outcome event (last event with an "outcome" or "event" field)
+ const outcomeEvent = [...events]
+ .reverse()
+ .find((e) => "outcome" in e || "event" in e);
+
+ const summary = buildConnectSummary(source, code, events, outcomeEvent);
+
+ // If the outcome indicates interactive input is needed, explain
+ if (
+ outcomeEvent &&
+ (outcomeEvent.outcome === "needs_input" ||
+ outcomeEvent.event === "needs_input")
+ ) {
+ summary.push(
+ "",
+ `This source requires interactive authentication. Run \`vana connect ${source}\` in a terminal to complete the flow.`,
+ );
+ }
+
+ if (stderr.trim()) {
+ summary.push("", "Stderr:", stderr.trim());
+ }
+
+ resolve({
+ content: [{ type: "text", text: summary.join("\n") }],
+ isError: code !== 0,
+ });
+ });
+ });
+}
+
+/**
+ * Build a human-readable summary from connect child process results.
+ */
+function buildConnectSummary(
+ source: string,
+ exitCode: number | null,
+  events: Record<string, unknown>[],
+  outcomeEvent: Record<string, unknown> | undefined,
+): string[] {
+ const lines: string[] = [];
+
+ if (exitCode === 0) {
+ lines.push(`Connected ${source} successfully.`);
+ } else {
+ lines.push(`Connect ${source} exited with code ${exitCode ?? "unknown"}.`);
+ }
+
+ // Summarize collected data from events
+ const scopeEvents = events.filter(
+ (e) => e.event === "scope_complete" || e.event === "scope_collected",
+ );
+ if (scopeEvents.length > 0) {
+ lines.push(
+ "",
+ "Collected data:",
+ ...scopeEvents.map((e) => {
+ const scope = (e.scope as string) ?? (e.name as string) ?? "unknown";
+ const count = e.count ?? e.itemCount;
+ return count != null ? ` ${scope} (${count} items)` : ` ${scope}`;
+ }),
+ );
+ }
+
+ if (outcomeEvent) {
+ const outcome =
+ (outcomeEvent.outcome as string) ?? (outcomeEvent.event as string);
+ if (
+ outcome &&
+ outcome !== "scope_complete" &&
+ outcome !== "scope_collected"
+ ) {
+ lines.push("", `Outcome: ${outcome}`);
+ }
+ if (outcomeEvent.message) {
+ lines.push(`Detail: ${outcomeEvent.message as string}`);
+ }
+ }
+
+ return lines;
+}
diff --git a/src/cli/queries.ts b/src/cli/queries.ts
new file mode 100644
index 00000000..9ef13f5e
--- /dev/null
+++ b/src/cli/queries.ts
@@ -0,0 +1,656 @@
+/**
+ * Pure data-gathering query functions for CLI commands.
+ *
+ * Each function returns the structured data that the corresponding `run*`
+ * handler needs for both `--json` serialization and human rendering.
+ * No stdout/stderr writes happen here.
+ */
+
+import fs from "node:fs";
+import fsp from "node:fs/promises";
+
+import {
+ getCliStatePath,
+ getBrowserProfilesDir,
+ getConnectorCacheDir,
+ getVanaHome,
+ getLogsDir,
+ readCliState,
+} from "../core/index.js";
+import type {
+ CliDoctor,
+ CliDoctorCheck,
+ CliStatus,
+ SourceStatus,
+} from "../core/cli-types.js";
+import type { AvailableSource } from "../connectors/registry.js";
+import { detectPersonalServerTarget } from "../personal-server/index.js";
+import { ManagedPlaywrightRuntime } from "../runtime/index.js";
+import { formatDisplayPath } from "./render/index.js";
+
+// ── Re-used internal helpers ──────────────────────────────────────────
+// Exported from index.ts for reuse by query functions. These are internal
+// helpers (not part of the public SDK API) shared between the CLI command
+// handlers and the query layer.
+
+import {
+ loadRegistrySources,
+ createSourceLabelMap,
+ createSourceMetadataMap,
+ gatherSourceStatuses,
+ listInstalledConnectorFiles,
+ hasCollectedData,
+ rankSourceStatus,
+ compareSourceStatusOrder,
+ readResultSummary,
+ summarizeResultData,
+ getCliVersion,
+ getCliChannel,
+ getCliInstallMethod,
+ getLifecycleCommands,
+ getDoctorAppRootPath,
+ buildStatusNextSteps,
+ buildSourcesNextSteps,
+ buildDataListNextSteps,
+ buildDataShowNextSteps,
+ compareDatasetOrder,
+ displaySource,
+ getSourceStatusPresentation,
+ humanizeIssue,
+} from "./index.js";
+
+// ── Return types ──────────────────────────────────────────────────────
+
+/** Result of `queryStatus()`. Contains everything both JSON and human paths need. */
+export interface StatusQueryResult {
+ status: CliStatus;
+ nextSteps: string[];
+}
+
+/** Result of `querySources()`. Matches the `--json` output shape exactly. */
+export interface SourcesQueryResult {
+ count: number;
+ recommendedSource:
+ | (AvailableSource & {
+ installed: boolean;
+ dataState?: SourceStatus["dataState"];
+ lastRunOutcome?: string | null;
+ sessionPresent?: boolean;
+ })
+ | null;
+ nextSteps: string[];
+ summary: {
+ connectedCount: number;
+ readyCount: number;
+ manualCount: number;
+ installedCount: number;
+ };
+ sources: Array<
+ AvailableSource & {
+ installed: boolean;
+ dataState?: SourceStatus["dataState"];
+ lastRunOutcome?: string | null;
+ sessionPresent?: boolean;
+ }
+ >;
+}
+
+/** A single dataset record as returned by data-list and data-show queries. */
+export interface DatasetRecord {
+ source: string;
+ name: string | null;
+ authMode: "automated" | "interactive" | "legacy" | null;
+ dataState?: SourceStatus["dataState"];
+ lastRunAt: string | null;
+ path: string | null;
+ summary: { lines: string[] } | null;
+}
+
+/** Result of `queryDataList()`. Matches the `--json` output shape exactly. */
+export interface DataListQueryResult {
+ count: number;
+ latestDataset: DatasetRecord | null;
+ nextSteps: string[];
+ summary: {
+ localCount: number;
+ syncedCount: number;
+ syncFailedCount: number;
+ };
+ datasets: DatasetRecord[];
+}
+
+/** Successful result of `queryDataShow()`. */
+export interface DataShowSuccess {
+ ok: true;
+ source: string;
+ name: string;
+ path: string;
+ summary: { lines: string[] } | null;
+ lastRunAt: string | null;
+ dataState: SourceStatus["dataState"] | null;
+ nextSteps: string[];
+  data: Record<string, unknown>;
+ datasetCount: number;
+}
+
+/** Not-found result of `queryDataShow()`. */
+export interface DataShowNotFound {
+ ok: false;
+ error: "dataset_not_found";
+ source: string;
+ message: string;
+ nextSteps: string[];
+ datasetCount: number;
+}
+
+/** Read-failure result of `queryDataShow()`. */
+export interface DataShowReadFailed {
+ ok: false;
+ error: "dataset_read_failed";
+ source: string;
+ path: string;
+ message: string;
+}
+
+export type DataShowQueryResult =
+ | DataShowSuccess
+ | DataShowNotFound
+ | DataShowReadFailed;
+
+/** Result of `queryDoctor()`. Matches the `CliDoctor` type exactly. */
+export type DoctorQueryResult = CliDoctor;
+
+// ── Query functions ───────────────────────────────────────────────────
+
+/**
+ * Gather status data for `vana status`.
+ *
+ * Returns the full `CliStatus` plus computed `nextSteps`.
+ * The `--json` handler selects the compact subset it needs.
+ */
+export async function queryStatus(): Promise<StatusQueryResult> {
+ const runtime = new ManagedPlaywrightRuntime();
+ const personalServer = await detectPersonalServerTarget();
+ const state = await readCliState();
+ const registrySources = await loadRegistrySources();
+ const sourceLabels = createSourceLabelMap(registrySources);
+ const sourceMetadata = createSourceMetadataMap(registrySources);
+ const sources = await gatherSourceStatuses(state.sources, sourceMetadata);
+
+ const pendingSyncCount = sources.filter(
+ (source) => source.dataState === "collected_local",
+ ).length;
+
+ // Count stored scopes across all sources
+ let totalStoredScopes = 0;
+ for (const stored of Object.values(state.sources)) {
+ if (stored?.ingestScopes) {
+ totalStoredScopes += stored.ingestScopes.filter(
+ (s) => s.status === "stored",
+ ).length;
+ }
+ }
+
+ const status: CliStatus = {
+ cliVersion: getCliVersion(),
+ channel: getCliChannel(),
+ installMethod: getCliInstallMethod(),
+ runtime: runtime.state,
+ runtimePath: runtime.runtimePath,
+ personalServer: personalServer.state,
+ personalServerUrl: personalServer.url,
+ personalServerSource: personalServer.source,
+ personalServerInfo: {
+ url: personalServer.url,
+ status: personalServer.state,
+ scopeCount: totalStoredScopes,
+ },
+ pendingSyncCount,
+ summary: {
+ sourceCount: sources.length,
+ needsAttentionCount: sources.filter(
+ (source) => rankSourceStatus(source) <= 4,
+ ).length,
+ connectedCount: sources.filter(
+ (source) =>
+ source.dataState === "ingested_personal_server" ||
+ source.dataState === "collected_local" ||
+ source.dataState === "ingest_failed",
+ ).length,
+ installedCount: sources.filter((source) => source.installed).length,
+ localCount: sources.filter(
+ (source) => source.dataState === "collected_local",
+ ).length,
+ syncedCount: sources.filter(
+ (source) => source.dataState === "ingested_personal_server",
+ ).length,
+ syncFailedCount: sources.filter(
+ (source) => source.dataState === "ingest_failed",
+ ).length,
+ },
+ sources,
+ };
+
+ const nextSteps = buildStatusNextSteps(
+ status.sources,
+ sourceLabels,
+ status.runtime,
+ registrySources,
+ );
+
+ // Check for version updates.
+ for (const source of status.sources) {
+ const registrySource = registrySources.find((s) => s.id === source.source);
+ if (
+ registrySource?.version &&
+ source.connectorVersion &&
+ registrySource.version !== source.connectorVersion
+ ) {
+ nextSteps.push(
+ `Update ${displaySource(source.source, sourceLabels)} connector (${source.connectorVersion} -> ${registrySource.version}) with \`vana connect ${source.source}\`.`,
+ );
+ }
+ }
+
+ if (pendingSyncCount > 0) {
+ nextSteps.push(
+ `Sync ${pendingSyncCount} pending dataset(s) with \`vana server sync\`.`,
+ );
+ }
+
+ return { status, nextSteps };
+}
+
+/**
+ * Gather sources data for `vana sources`.
+ *
+ * Returns the enriched source list with counts and recommendations.
+ */
+export async function querySources(): Promise<SourcesQueryResult> {
+ const sources = await loadRegistrySources();
+ const state = await readCliState();
+ const sourceMetadata = createSourceMetadataMap(sources);
+ const statuses = await gatherSourceStatuses(state.sources, sourceMetadata);
+ const statusMap = new Map(statuses.map((source) => [source.source, source]));
+ const installedSourceIds = new Set(
+ (await listInstalledConnectorFiles()).map((source) => source.source),
+ );
+ const enrichedSources = sources.map((source) => {
+ const status = statusMap.get(source.id);
+ return {
+ ...source,
+ installed: installedSourceIds.has(source.id),
+ dataState: status?.dataState,
+ lastRunOutcome: status?.lastRunOutcome ?? null,
+ sessionPresent: status?.sessionPresent ?? false,
+ };
+ });
+ const readyCount = enrichedSources.filter(
+ (source) =>
+ source.authMode !== "legacy" && !hasCollectedData(source.dataState),
+ ).length;
+ const manualCount = enrichedSources.filter(
+ (source) =>
+ source.authMode === "legacy" && !hasCollectedData(source.dataState),
+ ).length;
+ const connectedCount = enrichedSources.filter(
+ (source) =>
+ source.dataState === "collected_local" ||
+ source.dataState === "ingested_personal_server" ||
+ source.dataState === "ingest_failed",
+ ).length;
+ const recommendedSource =
+ enrichedSources.find(
+ (source) =>
+ source.authMode !== "legacy" &&
+ source.dataState !== "collected_local" &&
+ source.dataState !== "ingested_personal_server" &&
+ source.dataState !== "ingest_failed",
+ ) ??
+ enrichedSources.find(
+ (source) =>
+ source.dataState !== "collected_local" &&
+ source.dataState !== "ingested_personal_server" &&
+ source.dataState !== "ingest_failed",
+ ) ??
+ null;
+ const nextSteps = buildSourcesNextSteps(recommendedSource, connectedCount);
+
+ return {
+ count: enrichedSources.length,
+ recommendedSource,
+ nextSteps,
+ summary: {
+ connectedCount,
+ readyCount,
+ manualCount,
+ installedCount: enrichedSources.filter((source) => source.installed)
+ .length,
+ },
+ sources: enrichedSources,
+ };
+}
+
+/**
+ * Gather dataset list for `vana data list`.
+ *
+ * Returns all collected datasets with summaries and counts.
+ */
+export async function queryDataList(): Promise<DataListQueryResult> {
+ const state = await readCliState();
+ const registrySources = await loadRegistrySources();
+ const sources = await gatherSourceStatuses(
+ state.sources,
+ createSourceMetadataMap(registrySources),
+ );
+ const datasetRecords: DatasetRecord[] = await Promise.all(
+ sources
+ .filter((source) => Boolean(source.lastResultPath))
+ .map(async (source) => ({
+ source: source.source,
+ name: source.name ?? null,
+ authMode: source.authMode ?? null,
+ dataState: source.dataState,
+ lastRunAt: source.lastRunAt ?? null,
+ path: source.lastResultPath ?? null,
+ summary: source.lastResultPath
+ ? await readResultSummary(source.lastResultPath)
+ : null,
+ })),
+ );
+ // eslint-disable-next-line @typescript-eslint/no-explicit-any
+ datasetRecords.sort(compareDatasetOrder as any);
+ const nextSteps = buildDataListNextSteps(datasetRecords, registrySources);
+
+ return {
+ count: datasetRecords.length,
+ latestDataset: datasetRecords[0] ?? null,
+ nextSteps,
+ summary: {
+ localCount: datasetRecords.filter(
+ (dataset) => dataset.dataState !== "ingested_personal_server",
+ ).length,
+ syncedCount: datasetRecords.filter(
+ (dataset) => dataset.dataState === "ingested_personal_server",
+ ).length,
+ syncFailedCount: datasetRecords.filter(
+ (dataset) => dataset.dataState === "ingest_failed",
+ ).length,
+ },
+ datasets: datasetRecords,
+ };
+}
+
+/**
+ * Gather data for `vana data show `.
+ *
+ * Returns the dataset contents and metadata, or an error descriptor.
+ */
+export async function queryDataShow(
+ source: string,
+): Promise<DataShowQueryResult> {
+ const sourceLabels = createSourceLabelMap(await loadRegistrySources());
+ const state = await readCliState();
+ const record = state.sources[source];
+ const resultPath = record?.lastResultPath;
+ const datasetCount = Object.values(state.sources).filter((entry) =>
+ Boolean(entry?.lastResultPath),
+ ).length;
+
+ if (!resultPath) {
+ return {
+ ok: false,
+ error: "dataset_not_found",
+ source,
+ message: `No collected dataset found for ${displaySource(source, sourceLabels)}. Run \`vana connect ${source}\` first.`,
+ nextSteps: [
+ `Run \`vana connect ${source}\` to collect data.`,
+ ...(datasetCount > 0
+ ? ["Run `vana data list` to inspect other datasets."]
+ : []),
+ ],
+ datasetCount,
+ };
+ }
+
+ try {
+ const raw = await fsp.readFile(resultPath, "utf8");
+    const data = JSON.parse(raw) as Record<string, unknown>;
+ const summary = summarizeResultData(data);
+ const nextSteps = buildDataShowNextSteps(
+ source,
+ datasetCount,
+ sourceLabels,
+ );
+ return {
+ ok: true,
+ source,
+ name: displaySource(source, sourceLabels),
+ path: resultPath,
+ summary,
+ lastRunAt: record?.lastRunAt ?? null,
+      dataState: (record?.dataState ?? null) as
+        | SourceStatus["dataState"]
+        | null,
+ nextSteps,
+ data,
+ datasetCount,
+ };
+ } catch (error) {
+ const message =
+ error instanceof Error ? error.message : `Could not read ${resultPath}.`;
+ return {
+ ok: false,
+ error: "dataset_read_failed",
+ source,
+ path: resultPath,
+ message,
+ };
+ }
+}
+
+/**
+ * Gather diagnostic data for `vana doctor`.
+ *
+ * Returns the full `CliDoctor` payload.
+ */
+export async function queryDoctor(): Promise<DoctorQueryResult> {
+ const runtime = new ManagedPlaywrightRuntime();
+ const personalServer = await detectPersonalServerTarget();
+ const state = await readCliState();
+ const registrySources = await loadRegistrySources();
+ const sourceMetadata = createSourceMetadataMap(registrySources);
+ const sourceLabels = createSourceLabelMap(registrySources);
+ const sources = await gatherSourceStatuses(state.sources, sourceMetadata);
+ const cliVersion = getCliVersion();
+ const cliChannel = getCliChannel(cliVersion);
+ const installMethod = getCliInstallMethod();
+ const lifecycle = getLifecycleCommands(installMethod, cliChannel);
+ const appRootPath = getDoctorAppRootPath(installMethod);
+ const recentSources = [...sources]
+ .filter((source) => Boolean(source.lastRunAt))
+ .sort(compareSourceStatusOrder)
+ .slice(0, 3);
+ const attentionSources = recentSources.filter(
+ (source) => rankSourceStatus(source) <= 4,
+ );
+ const connectedCount = sources.filter(
+ (source) =>
+ source.dataState === "collected_local" ||
+ source.dataState === "ingested_personal_server" ||
+ source.dataState === "ingest_failed",
+ ).length;
+ const attentionCount = sources.filter(
+ (source) => rankSourceStatus(source) <= 4,
+ ).length;
+
+ const directories = [
+ {
+ key: "executable",
+ label: "Executable",
+ path: process.execPath,
+ present: fs.existsSync(process.execPath),
+ },
+ ...(appRootPath
+ ? [
+ {
+ key: "appRoot",
+ label: "App root",
+ path: appRootPath,
+ present: fs.existsSync(appRootPath),
+ },
+ ]
+ : []),
+ {
+ key: "dataHome",
+ label: "Data home",
+ path: getVanaHome(),
+ present: fs.existsSync(getVanaHome()),
+ },
+ {
+ key: "stateFile",
+ label: "State file",
+ path: getCliStatePath(),
+ present: fs.existsSync(getCliStatePath()),
+ },
+ {
+ key: "connectorCache",
+ label: "Connector cache",
+ path: getConnectorCacheDir(),
+ present: fs.existsSync(getConnectorCacheDir()),
+ },
+ {
+ key: "browserProfiles",
+ label: "Browser profiles",
+ path: getBrowserProfilesDir(),
+ present: fs.existsSync(getBrowserProfilesDir()),
+ },
+ {
+ key: "logs",
+ label: "Logs",
+ path: getLogsDir(),
+ present: fs.existsSync(getLogsDir()),
+ },
+ ];
+
+ const checks: CliDoctorCheck[] = [
+ {
+ key: "cli",
+ label: "CLI",
+ status: "ok",
+ detail: `Version ${cliVersion}`,
+ },
+ {
+ key: "runtime",
+ label: "Runtime",
+ status: runtime.state === "installed" ? "ok" : "warn",
+ detail:
+ runtime.state === "installed"
+ ? `Browser available at ${formatDisplayPath(runtime.runtimePath ?? "unknown")}`
+ : "Run `vana setup` to install the local browser runtime.",
+ },
+ {
+ key: "personalServer",
+ label: "Personal Server",
+ status: personalServer.state === "available" ? "ok" : "warn",
+ detail:
+ personalServer.state === "available"
+ ? (personalServer.url ?? "Available")
+ : "Unavailable. Connects will stay local until a Personal Server is reachable.",
+ },
+ ...directories.map