The flagship product of Autonomi: a multi-agent autonomous startup system for Claude Code, OpenAI Codex CLI, and Google Gemini CLI. It takes a PRD to a fully deployed product with minimal human intervention.
# Launch Claude Code with autonomous permissions
claude --dangerously-skip-permissions
# Then invoke:
# "Loki Mode" or "Loki Mode with PRD at path/to/prd"
SKILL.md # Slim core skill (~266 lines) - progressive disclosure
providers/ # Multi-provider support (v5.0.0)
  claude.sh # Claude Code - full features
  codex.sh # OpenAI Codex CLI - degraded mode
  gemini.sh # Google Gemini CLI - degraded mode
  loader.sh # Provider loader utility
memory/ # Complete memory system (v5.15.0)
  engine.py # Core memory engine
  schemas.py # Pydantic schemas
  storage.py # Storage backend
  retrieval.py # Task-aware retrieval
  consolidation.py # Episodic-to-semantic pipeline
  token_economics.py # Token usage tracking
  embeddings.py # Vector embeddings (optional)
  vector_index.py # Vector search index
  layers/ # Progressive disclosure implementation
skills/ # On-demand skill modules (v3.0 architecture)
  00-index.md # Module selection rules and routing
  model-selection.md # Task tool, parallelization, thinking modes
  providers.md # Multi-provider documentation
  quality-gates.md # 10-gate system, velocity-quality balance
  healing.md # Legacy system healing (Amazon AGI Lab patterns)
  testing.md # Playwright, E2E, property-based testing
  production.md # HN patterns, CI/CD, context management
  troubleshooting.md # Common issues, red flags, fallbacks
  agents.md # 41 agent types, structured prompting
  artifacts.md # Generation, code transformation
  patterns-advanced.md # OptiMind, k8s-valkey, Constitutional AI
  parallel-workflows.md # Git worktrees, parallel streams, auto-merge
  github-integration.md # GitHub issue import, PR creation, notifications
references/ # Detailed documentation (21 files)
  legacy-healing-patterns.md # Amazon AGI Lab: friction, adapters, archaeology
  openai-patterns.md # OpenAI Agents SDK: guardrails, tripwires, handoffs
  lab-research-patterns.md # DeepMind + Anthropic: Constitutional AI, debate
  production-patterns.md # HN 2025: What actually works in production
  advanced-patterns.md # 2025 research patterns (MAR, Iter-VF, GoalAct)
  tool-orchestration.md # ToolOrchestra-inspired efficiency & rewards
  memory-system.md # Episodic/semantic memory architecture
  quality-control.md # Code review, anti-sycophancy, guardrails
  agent-types.md # 41 specialized agent definitions
  sdlc-phases.md # Full SDLC workflow
  task-queue.md # Queue system, circuit breakers
  core-workflow.md # RARV cycle, autonomy rules
  deployment.md # Cloud deployment instructions
  business-ops.md # Business operation workflows
  mcp-integration.md # MCP server capabilities
  competitive-analysis.md # Auto-Claude, MemOS, Dexter comparison
  confidence-routing.md # Model selection by confidence
  cursor-learnings.md # Cursor scaling patterns
  prompt-repetition.md # Haiku prompt optimization
  agents.md # Agent dispatch patterns
events/ # Unified Event Bus (v5.17.0)
  bus.py # Python event bus
  bus.ts # TypeScript event bus
  emit.sh # Bash helper for emitting events
docs/ # Architecture documentation
  SYNERGY-ROADMAP.md # 5-pillar tool integration architecture
autonomy/ # Runtime and autonomous execution
  context-tracker.py # Context window usage tracking
  notification-checker.py # Notification trigger evaluation
templates/ # 21 PRD templates (saas, cli, discord-bot, etc.)
benchmarks/ # SWE-bench and HumanEval benchmarks
Every iteration follows: Reason -> Act -> Reflect -> Verify
- Opus: Planning and architecture ONLY (system design, high-level decisions)
- Sonnet: Development and functional testing (implementation, integration tests)
- Haiku: Unit tests, monitoring, and simple tasks - use extensively for parallelization
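The tier rules above can be read as a lookup from task category to model. The sketch below is illustrative only: the category names, `TIER_BY_CATEGORY`, and `select_tier` are hypothetical and are not the project's `get_rarv_tier()` implementation, and the Sonnet fallback for unknown categories is an assumption.

```python
from enum import Enum


class Tier(str, Enum):
    OPUS = "opus"
    SONNET = "sonnet"
    HAIKU = "haiku"


# Hypothetical task categories, derived from the tier descriptions above.
TIER_BY_CATEGORY = {
    "planning": Tier.OPUS,
    "architecture": Tier.OPUS,
    "implementation": Tier.SONNET,
    "integration_test": Tier.SONNET,
    "unit_test": Tier.HAIKU,
    "monitoring": Tier.HAIKU,
    "simple": Tier.HAIKU,
}


def select_tier(category: str) -> Tier:
    """Return the model tier for a task category.

    Unknown categories fall back to Sonnet; that default is an assumption
    of this sketch, not documented behavior.
    """
    return TIER_BY_CATEGORY.get(category, Tier.SONNET)
```

For example, `select_tier("unit_test")` returns `Tier.HAIKU`, which matches the guidance to push unit tests to Haiku for parallelization.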
- Claude Code: Full features (subagents, parallel, Task tool, MCP)
- OpenAI Codex CLI: Degraded mode (sequential only, no Task tool)
- Google Gemini CLI: Degraded mode (sequential only, no Task tool)
# Provider selection
./autonomy/run.sh --provider codex ./prd.md
loki start --provider gemini ./prd.md
LOKI_PROVIDER=codex loki start ./prd.md
- Static analysis (CodeQL, ESLint)
- 3-reviewer parallel system (blind review)
- Anti-sycophancy checks (devil's advocate on unanimous approval)
- Severity-based blocking (Critical/High/Medium = BLOCK)
- Test coverage gates (>80% unit, 100% pass)
- Backward compatibility gate (healing mode - behavioral preservation, v6.67.0)
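Three of the gates above (severity-based blocking, the anti-sycophancy check, and the coverage gate) reduce to simple predicates. The following is a minimal sketch under stated assumptions: the `Review` record and all function names are hypothetical, and the real checks live in the project's shell and skill files, not in this code.

```python
from __future__ import annotations

from dataclasses import dataclass

# Per the list above, Critical, High, and Medium findings all block.
BLOCKING_SEVERITIES = {"critical", "high", "medium"}


@dataclass
class Review:
    reviewer: str
    approved: bool
    severities: list[str]  # severities of the findings this reviewer raised


def has_blocking_finding(reviews: list[Review]) -> bool:
    """True if any reviewer raised a Critical, High, or Medium finding."""
    return any(
        s.lower() in BLOCKING_SEVERITIES for r in reviews for s in r.severities
    )


def needs_devils_advocate(reviews: list[Review]) -> bool:
    """Unanimous approval triggers an extra devil's-advocate pass."""
    return bool(reviews) and all(r.approved for r in reviews)


def coverage_gate_passes(unit_coverage_pct: float, passed: int, total: int) -> bool:
    """Require >80% unit coverage and a 100% pass rate."""
    return unit_coverage_pct > 80.0 and total > 0 and passed == total
```

Note that exactly 80% coverage fails the gate in this sketch, because the rule is written as strictly greater than 80%.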
- Inspired by: Amazon AGI Lab's "How Agentic AI Helps Heal Systems We Can't Replace"
- CLI: loki heal <path> [--phase archaeology|stabilize|isolate|modernize|validate]
- Principles: Friction-as-semantics, failure-first learning, universal adapters, incremental healing, institutional knowledge preservation
- Artifacts: .loki/healing/ (friction-map.json, failure-modes.json, institutional-knowledge.md)
- Review: legacy-healing-auditor specialist added to code review pool
- Gate: Gate 10 backward compatibility check (blocks removal of unclassified friction)
- Hooks: hook_pre_healing_modify(), hook_post_healing_modify(), hook_healing_phase_gate()
- Memory: FrictionPoint and FailureMode schemas for healing-specific memory entries
- Skill: skills/healing.md | Reference: references/legacy-healing-patterns.md
- Episodic: Specific interaction traces (.loki/memory/episodic/)
- Semantic: Generalized patterns (.loki/memory/semantic/)
- Procedural: Learned skills (.loki/memory/skills/)
- Progressive Disclosure: 3-layer loading (index, timeline, full details)
- Token Economics: Discovery vs read token tracking
- Vector Search: Optional embedding-based similarity (sentence-transformers)
- CLI: loki memory index|timeline|consolidate|economics|retrieve|episode|pattern|skill|vectors
- API: REST endpoints at /api/memory/*
- Implementation: memory/ Python package with RARV integration
- Efficiency: Task cost tracking (.loki/metrics/efficiency/)
- Rewards: Outcome/efficiency/preference signals (.loki/metrics/rewards/)
| File | Lines | Role |
|---|---|---|
| autonomy/loki | 10,820 | CLI (74 cmd_ functions, dispatch at line 7400) |
| autonomy/run.sh | 8,766 | Orchestration engine (RARV loop) |
| autonomy/completion-council.sh | 1,403 | Completion detection (council voting) |
| dashboard/server.py | 4,482 | FastAPI (100+ endpoints, WebSocket) |
| memory/retrieval.py | 1,565 | Task-aware memory retrieval |
| memory/storage.py | 1,396 | File-based memory backend |
| memory/engine.py | 1,297 | Memory orchestrator |
| memory/consolidation.py | 951 | Episodic-to-semantic pipeline |
| mcp/server.py | 1,439 | MCP server (15 tools) |
| providers/loader.sh | 184 | Provider loader |
| Function | Location | Purpose |
|---|---|---|
| cmd_start() | loki:485 | Start autonomous execution |
| main() (CLI) | loki:7400 | CLI dispatch |
| main() (runner) | run.sh:8234 | Runner entry point |
| run_autonomous() | run.sh:7233 | Main iteration loop |
| build_prompt() | run.sh:6899 | Prompt construction |
| save_state() | run.sh:6787 | Persist state |
| council_should_stop() | completion-council.sh:1283 | Completion decision |
| run_code_review() | run.sh:4935 | 3-reviewer code review |
| create_checkpoint() | run.sh:5483 | Snapshot state |
| store_episode_trace() | run.sh:6626 | Memory storage bridge |
| check_human_intervention() | run.sh:7897 | PAUSE/STOP/INPUT signals |
| detect_complexity() | run.sh:1182 | Auto-detect project complexity |
| get_rarv_tier() | run.sh:1311 | Map iteration to model tier |
| check_budget_limit() | run.sh:6125 | Budget circuit breaker |
| is_rate_limited() | run.sh:5940 | Rate limit detection |
| cmd_heal() | loki:8603 | Legacy system healing |
| hook_pre_healing_modify() | migration-hooks.sh:280 | Friction safety gate |
| hook_post_healing_modify() | migration-hooks.sh:320 | Characterization test verification |
| hook_healing_phase_gate() | migration-hooks.sh:375 | Healing phase transition gate |
A PRD enters via loki start (line 485), which execs run.sh. The run_autonomous() loop (line 7233) builds prompts via build_prompt() (line 6899) injecting RARV instructions, SDLC phases, memory context, queue tasks, and checklist status. The provider is invoked (Claude via -p flag, Codex via exec --full-auto with CODEX_MODEL_REASONING_EFFORT env var, Gemini via positional prompt with --approval-mode=yolo). Post-iteration, the system runs checklist verification, app runner management, playwright smoke tests, and code review. Completion is determined by a council vote (council_should_stop at completion-council.sh:1283), completion promise text, or max iterations. All components communicate through .loki/ filesystem state files.
See .claude/projects/-Users-lokesh-git-loki-mode/memory/CODEBASE-KNOWLEDGE-GRAPH.md for complete reference.
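The three stop conditions named in the flow above (council vote, completion promise text, max iterations) can be sketched as a single decision function. This is a hedged illustration, not the logic of `council_should_stop()` or `run_autonomous()`: the `IterationResult` type, the `stop_reason` name, and the order in which conditions are checked are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IterationResult:
    iteration: int            # 1-based iteration counter
    output: str               # provider output for this iteration
    council_votes_stop: bool  # outcome of the completion council vote


def stop_reason(
    result: IterationResult,
    max_iterations: int,
    completion_promise: Optional[str] = None,
) -> Optional[str]:
    """Return why the loop should stop, or None to keep iterating.

    The check order (council, promise, max iterations) is an assumption.
    """
    if result.council_votes_stop:
        return "council"
    if completion_promise and completion_promise in result.output:
        return "promise"
    if result.iteration >= max_iterations:
        return "max_iterations"
    return None
```

A runner would call this once per iteration and continue only while it returns `None`.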
Before documenting ANY feature, installation method, or capability:
- Verify it exists - Check files, run commands, test endpoints
- Run feedback loop - Use Task tool with Opus to review claims for accuracy
- Be factual only - Never document features that don't work yet
- Mark planned features - Use "Coming Soon" or "Planned" labels for unimplemented features
Example verification:
# Before documenting "npm install -g loki-mode"
npm view loki-mode # Does package exist on registry?
# Before documenting a CLI command
which loki && loki --help # Does command exist?
# Before documenting a file path
ls -la path/to/file # Does file exist?
Feedback loop pattern:
Task tool -> subagent_type: "general-purpose" or model: "opus"
Prompt: "Review the following claims for factual accuracy.
Verify each statement is true and working.
Flag anything that cannot be verified."
Before reporting ANY task as done, run ALL cleanup steps below. No exceptions.
- Kill spawned processes (dashboard servers, test runners, etc.):
  lsof -ti:57374 | xargs kill -9 2>/dev/null || true
  pkill -f "loki-run-" 2>/dev/null || true
- Remove temp files:
  rm -rf /tmp/loki-* /tmp/test-* /tmp/package /tmp/*.tgz 2>/dev/null || true
- Verify cleanup (MUST run, not optional):
  ps -ef | grep -E "(loki|test)" | grep -v grep || echo "Clean"
  ls /tmp/loki-* /tmp/test-* 2>&1 | grep -v "No such file" || echo "Clean"
- Report cleanup status to user in task completion message
When user says "commit" or "commit and push", follow this exact sequence:
- Run git diff --stat to show changed files
- List each file with a 1-line description of the change
- Suggest commit message in a code block
- STOP and WAIT for user approval before executing git commit
- Stage files individually by name (never git add -A or git add .)
- Only after user confirms, commit and push if requested
- Keep under 500 lines (currently ~266)
- Reference detailed docs in references/ instead of inlining
- Update version in header AND footer
- Update CHANGELOG.md with new version entry
Follows semantic versioning: MAJOR.MINOR.PATCH
- Current: v6.80.1
- MAJOR bump for architecture changes (v6.0.0 = dual-mode architecture, loki run)
- MINOR bump for new features (v5.23.0 = Dashboard File-Based API)
- PATCH bump for fixes (v5.22.1 = session.json phantom state)
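The bump rules above follow standard semantic versioning and can be sketched as a small helper. `bump_version` is a hypothetical name used for illustration; the project does not document such a function.

```python
def bump_version(version: str, part: str) -> str:
    """Return the next MAJOR.MINOR.PATCH version for the given bump type.

    Accepts an optional leading "v" and returns the bare X.Y.Z form.
    """
    major, minor, patch = (int(x) for x in version.lstrip("v").split("."))
    if part == "major":
        # Architecture change: reset minor and patch.
        return f"{major + 1}.0.0"
    if part == "minor":
        # New feature: reset patch.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Fix only.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump type: {part!r}")
```

With the current version, `bump_version("v6.80.1", "patch")` gives `6.80.2` and a minor bump gives `6.81.0`.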
- CRITICAL: NEVER use emojis - Not in code, documentation, commit messages, README, or any output
- No emoji exceptions - This includes website content, markdown files, and all text
- If you see emojis anywhere in the codebase, remove them immediately
- Clear, concise comments only when necessary
- Follow existing patterns in codebase
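The no-emoji rule above can be checked mechanically. The sketch below is an approximation under stated assumptions: `EMOJI_RE` covers common emoji code point ranges only, it also flags dingbats such as check marks, and `find_emoji_lines` is a hypothetical helper, not part of the codebase.

```python
import re

# Approximate emoji ranges (an assumption, not an exhaustive Unicode list):
# pictographs and symbols, regional indicators, misc symbols and dingbats,
# and the emoji variation selector.
EMOJI_RE = re.compile(
    "["
    "\U0001F300-\U0001FAFF"
    "\U0001F1E6-\U0001F1FF"
    "\u2600-\u27BF"
    "\uFE0F"
    "]"
)


def find_emoji_lines(text: str) -> list:
    """Return (line_number, line) pairs for lines containing an emoji."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if EMOJI_RE.search(line)
    ]


def strip_emojis(text: str) -> str:
    """Remove every matched emoji character from the text."""
    return EMOJI_RE.sub("", text)
```

Running `find_emoji_lines` over a file's contents before committing gives the line numbers that need cleaning.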
When releasing a new version, follow ALL steps below. Nothing should be skipped.
Update the version string in every file listed below. Search for the old version and replace with the new one.
Core version files (MUST update):
VERSION # Single line: X.Y.Z
package.json # "version": "X.Y.Z"
SKILL.md # Header (line ~6) AND footer (last line)
Dockerfile # LABEL version="X.Y.Z"
Dockerfile.sandbox # LABEL version="X.Y.Z"
vscode-extension/package.json # "version": "X.Y.Z"
CLAUDE.md # Version Numbering section (Current: vX.Y.Z)
Module version files (MUST update):
dashboard/__init__.py # __version__ = "X.Y.Z"
mcp/__init__.py # __version__ = "X.Y.Z"
Documentation (MUST update):
CHANGELOG.md # Add new version entry at top
docs/INSTALLATION.md # Version header (line ~5)
wiki/Home.md # Current Version line
wiki/_Sidebar.md # Version line
wiki/API-Reference.md # Example version in responses
Docker image tags in docs (update on MAJOR/MINOR bumps):
README.md # Docker example tags (lines ~81, ~380)
docs/INSTALLATION.md # Docker image tags (7+ occurrences)
docker-compose.yml # Version comment (line 1)
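Because the version string lives in many files, a quick scan for files that still lack the new version can catch a missed update. This is a minimal sketch: `find_stale_files` is a hypothetical helper, and it uses a plain substring match, so a file containing `6.80.10` would also satisfy a search for `6.80.1`.

```python
from pathlib import Path


def find_stale_files(root: Path, new_version: str, relative_paths: list) -> list:
    """Return the paths that are missing or do not mention new_version.

    Uses a substring check only; it does not parse each file's format.
    """
    stale = []
    for rel in relative_paths:
        path = root / rel
        if not path.is_file() or new_version not in path.read_text(encoding="utf-8"):
            stale.append(rel)
    return stale
```

A release script could call it with the repository root and the file names listed above (for example `VERSION`, `package.json`, `SKILL.md`) and refuse to continue while the returned list is non-empty.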
The dashboard frontend MUST be rebuilt before any release. The build script writes directly to both dashboard-ui/dist/ and dashboard/static/ -- no manual copy needed.
cd dashboard-ui && npm ci && npm run build:all && cd ..
Verify the built file exists and is reasonably sized (>100KB):
ls -la dashboard/static/index.html
Note: npm publish also runs prepublishOnly which triggers this build automatically. The CI workflows build it explicitly as well. The build-standalone.js script writes to both locations in a single step.
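The size check on the built dashboard file can be automated instead of read off `ls` output. A minimal sketch, assuming ">100KB" means more than 100 * 1024 bytes; `bundle_looks_built` and `MIN_BUNDLE_BYTES` are hypothetical names.

```python
from pathlib import Path

# Assumption: ">100KB" is interpreted as more than 100 * 1024 bytes.
MIN_BUNDLE_BYTES = 100 * 1024


def bundle_looks_built(path: Path, min_bytes: int = MIN_BUNDLE_BYTES) -> bool:
    """True if the file exists and is larger than the minimum size."""
    return path.is_file() and path.stat().st_size > min_bytes
```

For the release flow this would be called as `bundle_looks_built(Path("dashboard/static/index.html"))` after the build step.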
# Shell script validation
bash -n autonomy/run.sh
bash -n autonomy/loki
# Python syntax validation
python3 -c "import ast, os; [ast.parse(open(f'dashboard/{f}').read()) for f in os.listdir('dashboard') if f.endswith('.py')]"
# JSON validation
python3 -c "import json; json.load(open('package.json')); json.load(open('vscode-extension/package.json')); print('JSON OK')"
# E2E dashboard tests (requires dashboard running on port 57374)
cd dashboard-ui && npx playwright test && cd ..
This step prevents broken releases. Every single release MUST pass these checks BEFORE committing.
# 1. Verify npm tarball contains expected files
# If web-app/dist/ or dashboard/static/ are missing, the release is broken.
npm pack --dry-run 2>&1 | grep -E "web-app/dist|dashboard/static" || echo "FAIL: expected files missing from tarball"
# 2. Verify built artifacts exist in git (not just locally)
git ls-files web-app/dist/index.html | grep -q . || echo "FAIL: web-app/dist/ not tracked in git"
git ls-files dashboard/static/index.html | grep -q . || echo "FAIL: dashboard/static/ not tracked in git"
# 3. Local install test -- install from tarball like a real user
npm pack && npm install -g ./loki-mode-*.tgz
loki --version # should show new version
loki web --no-open & # should start without "Web app not built" error
sleep 3
curl -s http://127.0.0.1:57374/ | grep -q "Loki" && echo "PASS: web app serves" || echo "FAIL: web app broken"
curl -s http://127.0.0.1:57374/api/status | python3 -c "import json,sys; json.load(sys.stdin); print('PASS: API responds')" 2>/dev/null || echo "FAIL: API broken"
loki web stop
npm install -g loki-mode # restore previous version
rm -f loki-mode-*.tgz
# 4. If ANY check above fails, DO NOT release. Fix the root cause first.
Why this exists: v6.25.0-v6.26.5 shipped 6 broken patches because we tested locally from the repo but never verified the npm tarball or a fresh global install. .gitignore excluded web-app/dist/ so CI never had the files. This checklist catches that class of bug before it reaches users.
git add -A
git commit -m "release: vX.Y.Z - description"
git push origin main
IMPORTANT: Do NOT manually create tags. The GitHub Actions workflow automatically:
- Creates the git tag
- Creates the GitHub Release with artifacts
- Publishes to npm (includes dashboard/static/index.html)
- Builds and pushes Docker image (includes dashboard/ with deps)
- Updates Homebrew tap
- Publishes VSCode extension (includes dashboard IIFE bundle)
# Watch workflow progress
gh run list --limit 1
gh run watch <run-id>
# npm - verify dashboard is included
npm view loki-mode version
npm pack loki-mode --dry-run 2>&1 | grep dashboard/static
# Docker - verify dashboard works
docker pull asklokesh/loki-mode:X.Y.Z
docker run --rm asklokesh/loki-mode:X.Y.Z loki version
# Homebrew
brew update && brew info loki-mode
# VSCode extension
# Check marketplace or: code --list-extensions --show-versions | grep loki
# GitHub Release
gh release view vX.Y.Z
Every release MUST include these artifacts across ALL channels:
| Channel | Dashboard API (server.py) | Dashboard Frontend (static/) | Memory System | Skills/References |
|---|---|---|---|---|
| npm | dashboard/*.py | dashboard/static/index.html | memory/ | skills/, references/ |
| Docker | COPY dashboard/ | Built in Dockerfile or committed | memory/ | skills/, references/ |
| Homebrew | Full tarball | Full tarball | Full tarball | Full tarball |
| VSCode | N/A (connects to API) | media/loki-dashboard.js (IIFE bundle) | N/A | N/A |
| Release | Skill-only zip | N/A | N/A | references/ |
All credentials are stored as GitHub repository secrets and used by the workflow:
- NPM_TOKEN: npm publish token
- DOCKERHUB_USERNAME / DOCKERHUB_TOKEN: Docker Hub credentials
- HOMEBREW_TAP_TOKEN: PAT for homebrew-tap updates
# Run benchmarks
./benchmarks/run-benchmarks.sh humaneval --execute --loki
./benchmarks/run-benchmarks.sh swebench --execute --loki
Built on 2025 research from three major AI labs:
OpenAI:
- Agents SDK (guardrails, tripwires, handoffs, tracing)
- AGENTS.md / Agentic AI Foundation (AAIF) standards
Google DeepMind:
- SIMA 2 (self-improvement, hierarchical reasoning)
- Gemini Robotics (VLA models, planning)
- Dreamer 4 (world model training)
- Scalable Oversight via Debate
Anthropic:
- Constitutional AI (principles-based self-critique)
- Alignment Faking Detection (sleeper agent probes)
- Claude Code Best Practices (Explore-Plan-Code)
Academic:
- CONSENSAGENT (anti-sycophancy)
- GoalAct (hierarchical planning)
- A-Mem/MIRIX (memory systems)
- Multi-Agent Reflexion (MAR)
- NVIDIA ToolOrchestra (efficiency metrics)
See references/openai-patterns.md, references/lab-research-patterns.md, and references/advanced-patterns.md.