Merged
Changes from 54 commits
96 commits
302d7c5
auto-claude: subtask-0a-1 - Install Vercel AI SDK v6 core + all provi…
AndyMik90 Feb 18, 2026
74d115d
auto-claude: subtask-0b-1 - Create provider types and config interfaces
AndyMik90 Feb 18, 2026
fb2f912
auto-claude: subtask-0b-2 - Create provider factory: createProvider(c…
AndyMik90 Feb 18, 2026
d7bf293
auto-claude: subtask-0b-3 - Create provider registry using createProv…
AndyMik90 Feb 18, 2026
4b207ce
auto-claude: subtask-0b-4 - Create per-provider transforms layer
AndyMik90 Feb 18, 2026
a53bac0
auto-claude: subtask-0c-1 - Port command-parser.ts from Python securi…
AndyMik90 Feb 19, 2026
eec8058
auto-claude: subtask-0c-2 - Port bash-validator.ts from Python securi…
AndyMik90 Feb 19, 2026
d4c76ac
auto-claude: subtask-0c-3 - Create path-containment.ts for filesystem…
AndyMik90 Feb 19, 2026
83f0279
auto-claude: subtask-0c-4 - Write comprehensive Vitest tests for the …
AndyMik90 Feb 19, 2026
0cdf864
auto-claude: subtask-0d-1 - Create tool types and Tool.define() wrapper
AndyMik90 Feb 19, 2026
3d50a20
auto-claude: subtask-0d-2 - Create 4 filesystem tools (Read, Write, E…
AndyMik90 Feb 19, 2026
d42afa0
auto-claude: subtask-0d-3 - Create Bash, Grep, WebFetch, WebSearch tools
AndyMik90 Feb 19, 2026
62e89ab
auto-claude: subtask-0d-4 - Create ToolRegistry class with agent conf…
AndyMik90 Feb 19, 2026
555489c
auto-claude: subtask-0e-1 - Port AGENT_CONFIGS from models.py to agen…
AndyMik90 Feb 19, 2026
5de9d3c
auto-claude: subtask-0e-2 - Port phase-config.ts from phase_config.py
AndyMik90 Feb 19, 2026
8b20a60
auto-claude: subtask-0e-3 - Create auth resolver with multi-stage fal…
AndyMik90 Feb 19, 2026
dd0f3d5
auto-claude: subtask-0e-4 - Create MCP client and registry
AndyMik90 Feb 19, 2026
c1c1293
auto-claude: subtask-0f-1 - Unit tests for provider factory, registry…
AndyMik90 Feb 19, 2026
df00aa4
auto-claude: subtask-0f-2 - Unit tests for agent configs, phase confi…
AndyMik90 Feb 19, 2026
204e633
auto-claude: subtask-1-1 - Create session types and client factory
AndyMik90 Feb 19, 2026
8a8285f
auto-claude: subtask-1-1 - Fix unused imports in client factory
AndyMik90 Feb 19, 2026
3b0e01c
auto-claude: subtask-1-2 - Create stream handler and error classifier
AndyMik90 Feb 19, 2026
9083e7d
auto-claude: subtask-1-3 - Create progress-tracker.ts for phase detec…
AndyMik90 Feb 19, 2026
288ceb6
auto-claude: subtask-1-4 - Create the core session runner: runAgentSe…
AndyMik90 Feb 19, 2026
dd6092e
auto-claude: subtask-1-5 - Write unit tests for session runtime
AndyMik90 Feb 19, 2026
7b5b15e
auto-claude: subtask-2-1 - Create AgentExecutor, worker thread, and w…
AndyMik90 Feb 19, 2026
f377388
auto-claude: subtask-2-2 - Add worker thread execution to AgentProces…
AndyMik90 Feb 19, 2026
20de994
auto-claude: subtask-2-3 - Add structured progress event handling to …
AndyMik90 Feb 19, 2026
115a6b3
auto-claude: subtask-2-4 - Write tests for worker thread integration
AndyMik90 Feb 19, 2026
0ac4ddd
auto-claude: subtask-3-1 - Create build-orchestrator.ts and subtask-i…
AndyMik90 Feb 19, 2026
f446da1
auto-claude: subtask-3-2 - Create spec-orchestrator.ts and qa-loop.ts
AndyMik90 Feb 19, 2026
04f13fb
auto-claude: subtask-3-3 - Create parallel-executor.ts and recovery-m…
AndyMik90 Feb 19, 2026
a4e16b9
auto-claude: subtask-4-1 - Port utility runners (insights, ideation, …
AndyMik90 Feb 19, 2026
7182428
auto-claude: subtask-4-2 - Port roadmap, merge-resolver, insight-extr…
AndyMik90 Feb 19, 2026
5869e9f
auto-claude: subtask-4-3 - Replace Python subprocess spawning with TS…
AndyMik90 Feb 19, 2026
522389b
auto-claude: subtask-5-1 - Port GitHub PR review engine and triage en…
AndyMik90 Feb 19, 2026
19eb6d6
auto-claude: subtask-5-2 - Port parallel PR orchestrator, followup re…
AndyMik90 Feb 19, 2026
4717f39
auto-claude: subtask-6-1 - Add provider settings translation keys to …
AndyMik90 Feb 19, 2026
4b0cc64
auto-claude: subtask-6-2 - Create Provider Settings UI component
AndyMik90 Feb 19, 2026
985f464
auto-claude: subtask-7-1 - Remove claude-agent-sdk pip dependency
AndyMik90 Feb 19, 2026
921ab3a
auto-claude: subtask-7-2 - Update CLAUDE.md to reflect the new TypeSc…
AndyMik90 Feb 19, 2026
7ea66a4
auto-claude: subtask-7-3 - Run full verification suite
AndyMik90 Feb 19, 2026
cbe800d
fix: use inputSchema instead of parameters, fix platform/worker patte…
AndyMik90 Feb 19, 2026
a2c22ef
TS logic working on kanban tasks
AndyMik90 Feb 20, 2026
a9b4d21
fix: log phase formatting and task completion state transition
AndyMik90 Feb 20, 2026
dee32ff
feat: add TypeScript worktree manager for task isolation
AndyMik90 Feb 20, 2026
927afa3
fix: normalize plan schema fields for subtask tracking
AndyMik90 Feb 21, 2026
bd1f328
fix: wire TypeScript runners to IPC handlers, resolve all tsc errors
AndyMik90 Feb 21, 2026
b80f66f
fix: wire TypeScript Vercel AI SDK changelog runner to IPC handler
AndyMik90 Feb 21, 2026
7b93267
all python logic over to TS
AndyMik90 Feb 21, 2026
01b8455
temp_memory_docs
AndyMik90 Feb 22, 2026
5ce17ab
feat: implement Memory System core engine (Steps 1-7)
AndyMik90 Feb 22, 2026
c29fc25
feat: wire Memory System UI to libSQL backend (Step 8)
AndyMik90 Feb 22, 2026
b0f89ef
fix: resolve __dirname ESM error in memory db.ts, clean up V5 naming
AndyMik90 Feb 22, 2026
1a73e92
merge: resolve conflicts with develop branch
AndyMik90 Feb 22, 2026
3494837
refactor: remove Python backend, rename apps/frontend → apps/desktop
AndyMik90 Feb 22, 2026
1f3c93f
refactor: delete entire apps/backend, clean all references
AndyMik90 Feb 22, 2026
a181728
memory system
AndyMik90 Feb 23, 2026
375ea49
new provider ui
AndyMik90 Feb 23, 2026
fded668
new provider auth and ui
AndyMik90 Feb 23, 2026
cd378d3
feat: global priority queue with cross-provider fallback and multi-pr…
AndyMik90 Feb 23, 2026
8072829
feat: enhance provider account management with Codex support
AndyMik90 Feb 23, 2026
b0e0efc
provider settings changes
AndyMik90 Feb 23, 2026
c1ebe39
multi-provider ui
AndyMik90 Feb 24, 2026
f119ded
feat: concrete per-provider presets and cross-provider tab
AndyMik90 Feb 24, 2026
c969d12
fix: pre-PR validation fixes — xhigh thinking level, state management…
AndyMik90 Feb 24, 2026
1f88235
refactor: move Claude Code badge from sidebar to terminal toolbar
AndyMik90 Feb 24, 2026
1710cce
fix: Codex API integration — instructions, store, model routing, XSta…
AndyMik90 Feb 24, 2026
bf2e320
fix: pipeline validation fixes + denylist security model
AndyMik90 Feb 24, 2026
3f21860
fix: Codex pipeline halt + UI model display for non-Anthropic providers
AndyMik90 Feb 25, 2026
f8ca624
task logs
AndyMik90 Feb 26, 2026
227de79
structured output for all providers with zod validation
AndyMik90 Feb 26, 2026
912909a
codex usage monitoring
AndyMik90 Feb 26, 2026
2eb73bb
fix: pre-PR validation fixes for Vercel AI SDK migration
AndyMik90 Feb 26, 2026
aff98f8
provider and auth improvements
AndyMik90 Feb 27, 2026
6ef9c61
harness changes
AndyMik90 Feb 27, 2026
1b4aaaf
updates to provider features
AndyMik90 Feb 27, 2026
22aafc6
pr update
AndyMik90 Mar 3, 2026
77ea89d
websearch/browser
AndyMik90 Mar 3, 2026
ec556e2
z-ai and account settings
AndyMik90 Mar 4, 2026
468fa40
upgrading model usage with cross provider
AndyMik90 Mar 4, 2026
256455f
usageindication
AndyMik90 Mar 8, 2026
1937fc3
Optimize usage monitoring: reduce API calls, fix false needs-reauth
AndyMik90 Mar 9, 2026
363049d
usage+worktree+harness
AndyMik90 Mar 9, 2026
a2b1466
oauth+structuredoutput
AndyMik90 Mar 9, 2026
fac0c4a
husky fixes
AndyMik90 Mar 9, 2026
06a0dd2
onboarding and memorycleanup
AndyMik90 Mar 9, 2026
dd55f37
memorycleanup
AndyMik90 Mar 9, 2026
7830be4
new spec system
AndyMik90 Mar 11, 2026
5ef3d7c
fixes
AndyMik90 Mar 11, 2026
3cf8705
Merge branch 'develop' into auto-claude/237-migrate-claude-agent-sdk-…
AndyMik90 Mar 11, 2026
fd497f5
fix: resolve CodeQL high and medium security alerts
AndyMik90 Mar 11, 2026
570dc36
fix: resolve remaining 7 CodeQL high-severity TOCTOU race conditions
AndyMik90 Mar 11, 2026
1454613
chore: trigger CodeQL re-evaluation
AndyMik90 Mar 11, 2026
690509c
fix: eliminate TOCTOU by using fd-based file operations throughout
AndyMik90 Mar 11, 2026
5c61a29
fix: resolve remaining TOCTOU alerts in roadmap, test, and bump-version
AndyMik90 Mar 11, 2026
662 changes: 662 additions & 0 deletions AUTH_RESEARCH.md

Large diffs are not rendered by default.

111 changes: 71 additions & 40 deletions CLAUDE.md
@@ -2,7 +2,7 @@

This file provides guidance to Claude Code when working with this repository.

Auto Claude is an autonomous multi-agent coding framework that plans, builds, and validates software for you. It's a monorepo with a Python backend (CLI + agent logic) and an Electron/React frontend (desktop UI).
Auto Claude is an autonomous multi-agent coding framework that plans, builds, and validates software for you. It's a monorepo with an Electron/React frontend (desktop UI + TypeScript AI agent layer) and a Python backend (CLI utilities + Graphiti memory sidecar).

> **Deep-dive reference:** [ARCHITECTURE.md](shared_docs/ARCHITECTURE.md) | **Frontend contributing:** [apps/frontend/CONTRIBUTING.md](apps/frontend/CONTRIBUTING.md)

@@ -30,11 +30,11 @@ Auto Claude is a desktop application (+ CLI) where users describe a goal and AI

## Critical Rules

**Claude Agent SDK only** — All AI interactions use `claude-agent-sdk`. NEVER use `anthropic.Anthropic()` directly. Always use `create_client()` from `core.client`.
**Vercel AI SDK only** — All AI interactions use the Vercel AI SDK v6 (`ai` package) via the TypeScript agent layer in `apps/frontend/src/main/ai/`. NEVER use `@anthropic-ai/sdk` or `anthropic.Anthropic()` directly. Use `createProvider()` from `ai/providers/factory.ts` and `streamText()`/`generateText()` from the `ai` package. Provider-specific adapters (e.g., `@ai-sdk/anthropic`, `@ai-sdk/openai`) are managed through the provider registry.

**i18n required** — All frontend user-facing text MUST use `react-i18next` translation keys. Never hardcode strings in JSX/TSX. Add keys to both `en/*.json` and `fr/*.json`.

**Platform abstraction** — Never use `process.platform` directly. Import from `apps/frontend/src/main/platform/` or `apps/backend/core/platform/`. CI tests all three platforms.
**Platform abstraction** — Never use `process.platform` directly. Import from `apps/frontend/src/main/platform/`. CI tests all three platforms.

**No time estimates** — Never provide duration predictions. Use priority-based ordering instead.

@@ -68,29 +68,31 @@ To fully clear all PR review data so reviews run fresh, delete/reset these three
```
autonomous-coding/
├── apps/
│ ├── backend/ # Python backend/CLI — ALL agent logic
│ │ ├── core/ # client.py, auth.py, worktree.py, platform/
│ │ ├── security/ # Command allowlisting, validators, hooks
│ │ ├── agents/ # planner, coder, session management
│ │ ├── qa/ # reviewer, fixer, loop, criteria
│ │ ├── spec/ # Spec creation pipeline
│ │ ├── cli/ # CLI commands (spec, build, workspace, QA)
│ │ ├── context/ # Task context building, semantic search
│ │ ├── runners/ # Standalone runners (spec, roadmap, insights, github)
│ │ ├── services/ # Background services, recovery orchestration
│ │ ├── integrations/ # graphiti/, linear, github
│ │ ├── project/ # Project analysis, security profiles
│ │ ├── merge/ # Intent-aware semantic merge for parallel agents
│ ├── backend/ # Python backend — Graphiti memory sidecar + CLI utilities
│ │ ├── core/ # worktree.py, platform/
│ │ ├── integrations/ # graphiti/ (MCP sidecar)
│ │ └── prompts/ # Agent system prompts (.md)
│ └── frontend/ # Electron desktop UI
│ └── src/
│ ├── main/ # Electron main process
│ │ ├── ai/ # TypeScript AI agent layer (Vercel AI SDK v6)
│ │ │ ├── providers/ # Multi-provider registry + factory (9+ providers)
│ │ │ ├── tools/ # Builtin tools (Read, Write, Edit, Bash, Glob, Grep, etc.)
│ │ │ ├── security/ # Bash validator, command parser, path containment
│ │ │ ├── config/ # Agent configs (25+ types), phase config, model resolution
│ │ │ ├── session/ # streamText() agent loop, error classification, progress
│ │ │ ├── agent/ # Worker thread executor + bridge
│ │ │ ├── orchestration/ # Build pipeline (planner → coder → QA)
│ │ │ ├── runners/ # Utility runners (insights, roadmap, PR review, etc.)
│ │ │ ├── mcp/ # MCP client integration
│ │ │ ├── client/ # Client factory convenience constructors
│ │ │ └── auth/ # Token resolution (reuses claude-profile/)
│ │ ├── agent/ # Agent queue, process, state, events
│ │ ├── claude-profile/ # Multi-profile credentials, token refresh, usage
│ │ ├── terminal/ # PTY daemon, lifecycle, Claude integration
│ │ ├── platform/ # Cross-platform abstraction
│ │ ├── ipc-handlers/# 40+ handler modules by domain
│ │ ├── services/ # SDK session recovery, profile service
│ │ ├── services/ # Session recovery, profile service
│ │ └── changelog/ # Changelog generation and formatting
│ ├── preload/ # Electron preload scripts (electronAPI bridge)
│ ├── renderer/ # React UI
@@ -117,18 +119,15 @@ autonomous-coding/
```bash
npm run install:all # Install all dependencies from root
# Or separately:
cd apps/backend && uv venv && uv pip install -r requirements.txt
cd apps/frontend && npm install
```

### Testing

| Stack | Command | Tool |
|-------|---------|------|
| Backend | `apps/backend/.venv/bin/pytest tests/ -v` | pytest |
| Frontend unit | `cd apps/frontend && npm test` | Vitest |
| Frontend E2E | `cd apps/frontend && npm run test:e2e` | Playwright |
| All backend | `npm run test:backend` (from root) | pytest |

### Releases
```bash
@@ -138,13 +137,51 @@ git push && gh pr create --base main # PR to main triggers release

See [RELEASE.md](RELEASE.md) for full release process.

## Backend Development

### Claude Agent SDK Usage

Client: `apps/backend/core/client.py` — `create_client()` returns a configured `ClaudeSDKClient` with security hooks, tool permissions, and MCP server integration.

Model and thinking level are user-configurable (via the Electron UI settings or CLI override). Use `phase_config.py` helpers to resolve the correct values
## AI Agent Layer (`apps/frontend/src/main/ai/`)

All AI agent logic lives in TypeScript using the Vercel AI SDK v6. This replaces the previous Python `claude-agent-sdk` integration.

### Architecture Overview

- **Provider Layer** (`providers/`) — Multi-provider support via `createProviderRegistry()`. Supports Anthropic, OpenAI, Google, Bedrock, Azure, Mistral, Groq, xAI, and Ollama. Provider-specific transforms handle thinking token normalization and prompt caching.
- **Session Runtime** (`session/`) — `runAgentSession()` uses `streamText()` with `stopWhen: stepCountIs(N)` for agentic tool-use loops. Includes error classification (429/401/400) and progress tracking.
- **Worker Threads** (`agent/`) — Agent sessions run in `worker_threads` to avoid blocking the Electron main process. The `WorkerBridge` relays `postMessage()` events to the existing `AgentManagerEvents` interface.
- **Build Orchestration** (`orchestration/`) — Full planner → coder → QA pipeline. Parallel subagent execution via `Promise.allSettled()`.
- **Tools** (`tools/`) — 8 builtin tools (Read, Write, Edit, Bash, Glob, Grep, WebFetch, WebSearch) defined with Zod schemas via AI SDK `tool()`.
- **Security** (`security/`) — Bash validator, command parser, and path containment ported from Python with identical allowlist behavior.
- **Config** (`config/`) — `AGENT_CONFIGS` registry (25+ agent types), phase-aware model resolution, thinking budgets.
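
The error classification mentioned for the session runtime (429/401/400) could be sketched roughly as follows. The category names, the `classifyError` function, and the retry policy are assumptions made for illustration; the actual classifier in `session/` may be structured differently.

```typescript
// Hypothetical sketch: names and retry policy are illustrative,
// not taken from the repository.
type ErrorCategory = 'rate_limit' | 'auth' | 'bad_request' | 'unknown';

interface ClassifiedError {
  category: ErrorCategory;
  retryable: boolean;
}

// Map an HTTP status code from a provider response to a coarse category.
function classifyError(status: number | undefined): ClassifiedError {
  switch (status) {
    case 429:
      // Rate limited: usually worth retrying after a delay.
      return { category: 'rate_limit', retryable: true };
    case 401:
      // Credentials rejected: retrying with the same token will not help.
      return { category: 'auth', retryable: false };
    case 400:
      // Malformed request: the payload has to change before a retry.
      return { category: 'bad_request', retryable: false };
    default:
      return { category: 'unknown', retryable: false };
  }
}
```

Under these assumptions, a caller would back off and retry only when `retryable` is true, and surface the other categories to the user.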

### Key Patterns

```typescript
// Agent session using streamText()
import { streamText, stepCountIs } from 'ai';

const result = streamText({
  model: provider,
  system: systemPrompt,
  messages: conversationHistory,
  tools: toolRegistry.getToolsForAgent(agentType),
  stopWhen: stepCountIs(1000),
  onStepFinish: ({ toolCalls, text, usage }) => {
    progressTracker.update(toolCalls, text);
  },
});

// Tool definition with Zod schema
import { tool } from 'ai';
import { z } from 'zod';

const readTool = tool({
  description: 'Read a file from the filesystem',
  inputSchema: z.object({
    file_path: z.string(),
    offset: z.number().optional(),
    limit: z.number().optional(),
  }),
  execute: async ({ file_path, offset, limit }) => { /* ... */ },
});
```
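
A filesystem tool such as `readTool` would typically check path containment before touching disk. The sketch below shows one way to do that using only Node's `path` module. The `isPathContained` name and signature are hypothetical, and the check is purely lexical: it does not resolve symlinks, which a real implementation in `security/` would likely need to handle.

```typescript
import * as path from 'node:path';

// Hypothetical helper: returns true when `target` resolves inside `root`.
// Purely lexical; symlinks are NOT resolved in this sketch.
function isPathContained(root: string, target: string): boolean {
  const resolvedRoot = path.resolve(root);
  const resolvedTarget = path.resolve(resolvedRoot, target);
  const relative = path.relative(resolvedRoot, resolvedTarget);

  // The target escapes the root when the relative path climbs upward
  // ("..", or ".." followed by a separator) or is absolute (for example
  // a different drive on Windows). An empty string means the root itself.
  const escapes =
    relative === '..' ||
    relative.startsWith(`..${path.sep}`) ||
    path.isAbsolute(relative);

  return !escapes;
}
```

Comparing against `..` plus the separator, rather than a bare `startsWith('..')`, avoids rejecting legitimate names such as `..cache/` inside the project.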

### Agent Prompts (`apps/backend/prompts/`)

@@ -162,13 +199,13 @@ Each spec in `.auto-claude/specs/XXX-name/` contains: `spec.md`, `requirements.j

### Memory System (Graphiti)

Graph-based semantic memory in `integrations/graphiti/`. Configured through the Electron app's onboarding/settings UI (CLI users can alternatively set `GRAPHITI_ENABLED=true` in `.env`). See [ARCHITECTURE.md](shared_docs/ARCHITECTURE.md#memory-system) for details.
Graph-based semantic memory accessed via MCP sidecar (`integrations/graphiti/`). The Python Graphiti sidecar remains; the AI layer connects to it via `createMCPClient` from `@ai-sdk/mcp`. Configured through the Electron app's onboarding/settings UI. See [ARCHITECTURE.md](shared_docs/ARCHITECTURE.md#memory-system) for details.

## Frontend Development

### Tech Stack

React 19, TypeScript (strict), Electron 39, Zustand 5, Tailwind CSS v4, Radix UI, xterm.js 6, Vite 7, Vitest 4, Biome 2, Motion (Framer Motion)
React 19, TypeScript (strict), Electron 39, Vercel AI SDK v6, Zustand 5, Tailwind CSS v4, Radix UI, xterm.js 6, Vite 7, Vitest 4, Biome 2, Motion (Framer Motion)

### Path Aliases (tsconfig.json)

@@ -214,9 +251,9 @@ Main ↔ Renderer communication via Electron IPC:

The frontend manages agent lifecycle end-to-end:
- **`agent-queue.ts`** — Queue routing, prioritization, spec number locking
- **`agent-process.ts`** — Spawns and manages agent subprocess communication
- **`agent-process.ts`** — Spawns worker threads via `WorkerBridge` for agent execution
- **`agent-state.ts`** — Tracks running agent state and status
- **`agent-events.ts`** — Agent lifecycle events and state transitions
- **`agent-events.ts`** — Agent lifecycle events and state transitions (structured events from worker threads)
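
Relaying structured worker messages onto an event interface can be sketched as a small translation function. The message shapes, event names, and the `relayWorkerMessage` function are assumptions for illustration and do not reflect the actual `WorkerBridge` or `AgentManagerEvents` definitions.

```typescript
import { EventEmitter } from 'node:events';

// Hypothetical message shapes; the real worker protocol may differ.
type WorkerMessage =
  | { type: 'progress'; taskId: string; phase: string }
  | { type: 'log'; taskId: string; line: string }
  | { type: 'exit'; taskId: string; code: number };

// Translate one structured worker message into an event on the emitter.
// Event names here are illustrative placeholders.
function relayWorkerMessage(emitter: EventEmitter, message: WorkerMessage): void {
  switch (message.type) {
    case 'progress':
      emitter.emit('execution-progress', message.taskId, message.phase);
      break;
    case 'log':
      emitter.emit('log', message.taskId, message.line);
      break;
    case 'exit':
      emitter.emit('exit', message.taskId, message.code);
      break;
  }
}
```

In a real bridge this function would be attached to a `Worker` from `node:worker_threads`, for example `worker.on('message', (m) => relayWorkerMessage(emitter, m))`, so existing listeners receive the same events regardless of where the agent runs.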

### Claude Profile System (`src/main/claude-profile/`)

@@ -242,9 +279,6 @@ Full PTY-based terminal integration:
- **Pre-commit:** Husky + lint-staged runs Biome on staged `.ts/.tsx/.js/.jsx/.json`
- **Testing:** Vitest + React Testing Library + jsdom

### Backend
- **Linting:** Ruff
- **Testing:** pytest (`apps/backend/.venv/bin/pytest tests/ -v`)

## i18n Guidelines

@@ -269,7 +303,7 @@ When adding new UI text: add keys to ALL language files, use `namespace:section.

Supports Windows, macOS, Linux. CI tests all three.

**Platform modules:** `apps/frontend/src/main/platform/` and `apps/backend/core/platform/`
**Platform modules:** `apps/frontend/src/main/platform/`

| Function | Purpose |
|----------|---------|
@@ -285,17 +319,14 @@ Never hardcode paths. Use `findExecutable()` and `joinPaths()`. See [ARCHITECTUR
QA agents can interact with the running Electron app via Chrome DevTools Protocol:

1. Start app: `npm run dev:debug` (debug mode for AI self-validation via Electron MCP)
2. Set `ELECTRON_MCP_ENABLED=true` in `apps/backend/.env`
3. Run QA: `python run.py --spec 001 --qa`
2. Enable Electron MCP in settings
3. QA runs automatically through the TypeScript agent pipeline

Tools: `take_screenshot`, `click_by_text`, `fill_input`, `get_page_structure`, `send_keyboard_shortcut`, `eval`. See [ARCHITECTURE.md](shared_docs/ARCHITECTURE.md#end-to-end-testing) for full capabilities.

## Running the Application

```bash
# CLI only
cd apps/backend && python run.py --spec 001

# Desktop app
npm start # Production build + run
npm run dev # Development mode with HMR