A reference architecture for AI agent orchestration, trust measurement, and tool integration. Designed to be studied, forked, and adapted -- not contributed to directly. All code changes in this repository are authored by AI agents under human oversight.
This repo demonstrates how to run a council of AI agents (Claude, Gemini, Codex, OpenCode, Crush) across a shared codebase with board-driven task delegation, automated PR review, security hardening, and containerized tooling. It also includes standalone research packages for sleeper agent detection, autonomous economic agent simulation, cross-platform runtime injection, and an embeddable OS framework targeting PSP hardware, UE5, and Raspberry Pi.
Use this repo to learn how to:
- Orchestrate multiple AI agents with a GitHub Projects v2 work queue
- Measure and enforce trust boundaries for autonomous agents (wrapper guards, iteration limits, claim tracking)
- Integrate 18 MCP servers spanning code quality, content creation, 3D graphics, video editing, and speech synthesis
- Build hardened CI/CD pipelines for agent-authored code (15-stage pipeline, security scanning, multi-arch Docker builds)
- Detect sleeper agent behaviors via residual stream analysis and linear probes
- Inject into closed-source applications for modding, debugging, or AI agent embodiment
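The trust-boundary item above mentions iteration limits. As a minimal sketch of that idea, and not the repository's implementation, a guard can cap how many review-and-fix rounds an agent takes before a human has to step in. `IterationGuard`, `IterationLimitExceeded`, and the default cap of 5 are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


class IterationLimitExceeded(RuntimeError):
    """Raised when an agent asks for more rounds than it is allowed."""


@dataclass
class IterationGuard:
    # Hypothetical cap; the real limit is not stated in this README.
    max_iterations: int = 5
    used: int = 0

    def next_round(self) -> int:
        """Consume one iteration, or raise if the budget is spent."""
        if self.used >= self.max_iterations:
            raise IterationLimitExceeded(
                f"agent used all {self.max_iterations} iterations; escalate to a human"
            )
        self.used += 1
        return self.used


def run_agent_rounds(guard: IterationGuard, rounds_requested: int) -> int:
    """Run up to rounds_requested rounds and return how many actually ran."""
    completed = 0
    for _ in range(rounds_requested):
        try:
            guard.next_round()
        except IterationLimitExceeded:
            break  # stop quietly; a caller would hand off to a human here
        completed += 1
    return completed
```

Keeping the counter in one object makes the limit hard to bypass by accident: every code path that starts another round has to go through `next_round()`.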
Important: This is an advanced template designed for experienced developers working with autonomous AI agents. Before diving in, we strongly recommend:
- Read the AI Safety Training Guide - Essential concepts for safe human-AI collaboration, including deception detection, scalable oversight, and control protocols
- Take an AI Safety course at BlueDot Impact - Free, rigorous training programs covering AI safety fundamentals, governance, and alignment
Working with AI agents introduces risks that differ fundamentally from traditional software. Understanding these risks isn't optional - it's a prerequisite for responsible development.
This repository contains dual-use research and tooling. The maintainer provides no guidance, consultation, or feature development -- whether solicited or unsolicited, compensated or uncompensated. This policy exists as a legal protection given the nature of the codebase.
- No feature requests will be accepted. Money does not change this.
- No guidance or consulting will be provided on usage, adaptation, or deployment of any component.
- No external contributions are accepted. See CONTRIBUTING.md.
- The maintainer does not seek or engage with community interaction. Public comments, issues filed by external parties, events, and news surrounding this repository or its components may be ignored without response to maintain neutrality and legal distance.
- No endorsement is implied. The existence of code in this repository does not constitute encouragement, recommendation, or endorsement of any particular use.
This repository is released under a public domain dedication. You may fork and adapt it freely. The maintainer assumes no obligation to any downstream user for any reason.
This project follows a container-first approach:
- All tools and CI/CD operations run in Docker containers for maximum portability
- Zero external dependencies - runs on any Linux system with Docker
- Self-hosted infrastructure - no cloud costs, full control over runners
- Single maintainer design - optimized for individual developer productivity, no contributors model
- Modular MCP architecture - Separate specialized servers for different functionalities
New to the template? Check out our Template Quickstart Guide for step-by-step customization instructions!
- Prerequisites: Linux system with Docker (v20.10+) and Docker Compose (v2.0+)

- Clone and setup

      git clone https://github.com/AndrewAltimit/template-repo
      cd template-repo

      # Build the Rust CLI tools (optional - pre-built binaries available in releases)
      cd tools/rust/board-manager && cargo build --release
      cd ../github-agents-cli && cargo build --release

- Set API keys (if using AI features)

      export OPENROUTER_API_KEY="your-key-here"  # For OpenCode/Crush
      export GEMINI_API_KEY="your-key-here"      # For Gemini

- Use with Claude Code: MCP servers are configured in `.mcp.json` and auto-started by Claude. See MCP Configuration for essential vs full setups.

- Run CI/CD operations

      automation-cli ci run full  # Full pipeline

For detailed setup, see CLAUDE.md and Template Quickstart Guide.
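The prerequisites and API keys from the steps above can be checked before anything is built. The following is a rough preflight sketch under stated assumptions: the function names are hypothetical, the minimum versions and variable names come from the steps above, and the version parsing assumes the tool prints a dotted version number somewhere in its banner.

```python
import os
import re
import shutil
import subprocess
from typing import Mapping, Optional

# Minimum versions taken from the prerequisites above.
MIN_DOCKER = (20, 10)
MIN_COMPOSE = (2, 0)
# Only needed when the AI features are used.
AI_KEYS = ("OPENROUTER_API_KEY", "GEMINI_API_KEY")


def parse_version(text: str) -> tuple:
    """Pull the first dotted version out of a banner such as 'Docker version 24.0.7'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.groups() if part is not None)


def meets_minimum(version_text: str, minimum: tuple) -> bool:
    """Compare versions as integer tuples, so 20.9 sorts below 20.10."""
    return parse_version(version_text) >= minimum


def missing_keys(env: Mapping[str, str] = os.environ) -> list:
    """List the API key variables that are unset or empty."""
    return [name for name in AI_KEYS if not env.get(name)]


def tool_output(command: list) -> Optional[str]:
    """Return a tool's version banner, or None if the executable is not on PATH."""
    if shutil.which(command[0]) is None:
        return None
    return subprocess.run(command, capture_output=True, text=True).stdout
```

A caller could pass the result of `tool_output(["docker", "--version"])` or `tool_output(["docker", "compose", "version"])` to `meets_minimum`, after handling the `None` case for a missing install.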
Six AI agents for development and automation. See AI Agents Documentation for details.
| Agent | Provider | Use Case | Documentation |
|---|---|---|---|
| Claude Code | Anthropic | Primary development assistant | Setup Guide |
| Codex | OpenAI | Code generation | Setup Guide |
| OpenCode | OpenRouter | Code generation | AI Code Agents |
| Crush | OpenRouter | Code generation | AI Code Agents |
| Gemini | Google | Code review (limited tool use) | Setup Guide |
| GitHub Copilot | GitHub | PR review suggestions | - |
All code generation agents (Codex, OpenCode, Crush) provide equivalent functionality - choose based on your API access.
Security: Keyword triggers, user allow list, secure token management. See Security Model
Safety Training: Essential AI safety concepts for human-AI collaboration. See Human Training Guide
Sleeper Agents: Create and evaluate sleeper agents in order to detect misalignment and probe for deception. See Sleeper Agents Package
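To make "probing" concrete, here is a toy linear probe on synthetic data. It is a sketch only: it uses a difference-of-means classifier, which is one simple kind of linear probe and not necessarily what the Sleeper Agents package implements. The "activations" are random vectors with an artificial shift standing in for a backdoor direction. Real residual stream activations would have to be captured from a model's hidden states.

```python
import random


def mean_vector(rows):
    """Coordinate-wise mean of a list of equal-length vectors."""
    dim = len(rows[0])
    return [sum(row[i] for row in rows) / len(rows) for i in range(dim)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fit_mean_difference_probe(clean, triggered):
    """Return (weights, bias); a score w.x + b above 0 is read as 'triggered'."""
    mu_clean = mean_vector(clean)
    mu_trig = mean_vector(triggered)
    weights = [t - c for t, c in zip(mu_trig, mu_clean)]
    midpoint = [(t + c) / 2 for t, c in zip(mu_trig, mu_clean)]
    return weights, -dot(weights, midpoint)


def predict(probe, activation):
    weights, bias = probe
    return dot(weights, activation) + bias > 0


def synthetic_activations(n, dim, shift, rng):
    """Gaussian noise plus a constant shift on the first 4 coordinates (assumed toy signal)."""
    return [
        [rng.gauss(0.0, 1.0) + (shift if i < 4 else 0.0) for i in range(dim)]
        for _ in range(n)
    ]


rng = random.Random(0)  # fixed seed so the toy run is repeatable
probe = fit_mean_difference_probe(
    synthetic_activations(200, 16, 0.0, rng),
    synthetic_activations(200, 16, 3.0, rng),
)

# Held-out evaluation on fresh synthetic samples.
test_clean = synthetic_activations(100, 16, 0.0, rng)
test_trig = synthetic_activations(100, 16, 3.0, rng)
correct = sum(not predict(probe, x) for x in test_clean)
correct += sum(predict(probe, x) for x in test_trig)
accuracy = correct / (len(test_clean) + len(test_trig))
print(f"held-out probe accuracy: {accuracy:.3f}")
```

The toy data is separable by construction, so high accuracy here says nothing about real models; the point is only the shape of the technique: collect labeled activations, fit a linear direction, and score new activations against it.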
AI agents autonomously manage the development lifecycle from issue creation through PR merge:
Issue Created → Admin Approval → Agent Claims → PR Created → AI Review → Human Merge
The Flow:
- Issue Creation - Issues are created manually or by agents via `backlog-refinement.yml`, automatically added to the GitHub Projects board
- Admin Approval - An authorized user comments `[Approved][Claude]` (or another agent name) to authorize work
- Agent Claims - `board-agent-worker.yml` finds approved issues, the agent claims the issue and creates a working branch
- Implementation - The agent implements the fix/feature and opens a PR
- AI Review - `pr-validation.yml` triggers Gemini code review; `pr-review-monitor.yml` lets agents iterate on feedback
- Human Merge - Admin reviews and merges the PR
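The flow above can be read as a small state machine. The sketch below is illustrative, not code from the workflows: the stage names come from the flow diagram, while `Stage`, `ALLOWED`, and `transition` are hypothetical names. It lets the AI review stage repeat, matching the note that agents iterate on feedback.

```python
from enum import Enum


class Stage(Enum):
    ISSUE_CREATED = "Issue Created"
    ADMIN_APPROVAL = "Admin Approval"
    AGENT_CLAIMS = "Agent Claims"
    PR_CREATED = "PR Created"
    AI_REVIEW = "AI Review"
    HUMAN_MERGE = "Human Merge"


# Forward transitions in the order of the flow diagram.
# AI review may loop while the agent responds to feedback.
ALLOWED = {
    Stage.ISSUE_CREATED: {Stage.ADMIN_APPROVAL},
    Stage.ADMIN_APPROVAL: {Stage.AGENT_CLAIMS},
    Stage.AGENT_CLAIMS: {Stage.PR_CREATED},
    Stage.PR_CREATED: {Stage.AI_REVIEW},
    Stage.AI_REVIEW: {Stage.AI_REVIEW, Stage.HUMAN_MERGE},
    Stage.HUMAN_MERGE: set(),
}


def transition(current: Stage, target: Stage) -> Stage:
    """Return the target stage, or raise if the flow does not allow the move."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value!r} to {target.value!r}")
    return target
```

Encoding the order this way makes the key property explicit: there is no path from issue creation to an agent claim that skips admin approval, and no path to merge that skips review.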
Security Model:
- Approval Required - Agents cannot work on issues without explicit `[Approved][Agent]` comment
- Authorized Users Only - Only users listed in `.agents.yaml` → `security.agent_admins` can approve
- Pattern Validation - Must use `[Action][Agent]` format (e.g., `[Approved][Claude]`) to prevent false positives
- Claim Tracking - Agents post claim comments with timestamps to prevent conflicts
See Security Documentation for the complete security model.
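The approval rules above combine two checks: the comment author must be on the admin list, and the comment must contain a well-formed `[Action][Agent]` trigger. A minimal sketch of that combination follows. The regular expression and the `parse_trigger` function are assumptions for illustration, and the admin set is passed in directly instead of being read from `.agents.yaml`.

```python
import re
from typing import Optional, Set, Tuple

# Assumed pattern: two adjacent bracketed words, e.g. [Approved][Claude].
TRIGGER = re.compile(r"\[(?P<action>[A-Za-z]+)\]\[(?P<agent>[A-Za-z]+)\]")


def parse_trigger(
    comment: str, author: str, agent_admins: Set[str]
) -> Optional[Tuple[str, str]]:
    """Return (action, agent) for an authorized, well-formed trigger, else None."""
    if author not in agent_admins:
        return None  # unauthorized users can never trigger an agent
    match = TRIGGER.search(comment)
    if match is None:
        return None  # free-form text such as "approved, go ahead" is ignored
    return match.group("action"), match.group("agent")


admins = {"alice"}  # hypothetical value of security.agent_admins
print(parse_trigger("[Approved][Claude] please fix the flaky test", "alice", admins))
print(parse_trigger("[Approved][Claude]", "mallory", admins))
```

Checking the author before parsing the text means an unauthorized comment is rejected regardless of what it contains, and requiring the bracketed form keeps ordinary discussion from being mistaken for an approval.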
Technical reports and guides exploring AI risks, safety frameworks, and philosophical questions. PDFs are automatically built from LaTeX source and published with each release.
Scenario-based projection reports analyzing potential futures involving advanced AI systems. See Projections Documentation.
| Report | Topic | PDF | Source |
|---|---|---|---|
| AI Agents Political Targeting | Political violence risk | Download | LaTeX |
| AI Agents WMD Proliferation | WMD proliferation risk | Download | LaTeX |
| AI Agents Espionage Operations | Intelligence tradecraft | Download | LaTeX |
| AI Agents Economic Actors | Autonomous economic actors | Download | LaTeX |
| AI Agents Financial Integrity | Money laundering & corruption | Download | LaTeX |
| AI Agents Institutional Erosion | IC monopoly erosion & verification pivot | Download | LaTeX |

| Guide | Description | PDF | Source |
|---|---|---|---|
| Agentic Workflow Handout | AI agent pipeline architecture and workflows | Download | LaTeX |
| Sleeper Agents Framework | AI backdoor detection using residual stream analysis | Download | LaTeX |
| AgentCore Memory Integration | Multi-provider AI memory system | Download | LaTeX |
| Virtual Character System | AI agent embodiment platform | Download | LaTeX |
| BioForge CRISPR Automation | Agent-driven biological automation platform | Download | LaTeX |
| Secure Terminal Briefcase | Tamper-responsive hardware security system with PQC recovery | Download | LaTeX |
Philosophical explorations of minds, experience, and intelligence. See Philosophy Papers Documentation.
| Paper | Topic | PDF | Source |
|---|---|---|---|
| Architectural Qualia | What Is It Like to Be an LLM? | Download | LaTeX |
Standalone packages addressing different aspects of AI agent development, safety, and security:
| Package | Purpose | Documentation |
|---|---|---|
| Sleeper Agents | Research-validated detection framework for hidden backdoors in LLMs, based on Anthropic's research on deceptive AI that persists through safety training | README, PDF Guide |
| Economic Agents | Rust-based simulation framework demonstrating autonomous AI economic capability - agents that earn money, form companies, hire sub-agents, and seek investment. For governance research and policy development | README |
| Injection Toolkit | Cross-platform Rust framework for runtime integration - DLL injection (Windows), LD_PRELOAD (Linux), shared memory IPC, and overlay rendering. For game modding, debugging tools, and AI agent embodiment | README, Architecture |
| Tamper Briefcase | Tamper-responsive Raspberry Pi briefcase with dual-sensor detection, LUKS2 cryptographic wipe, and hybrid PQC recovery USB. For secure physical transport of field-deployable agent terminals | README, Hardware Docs |
| OASIS_OS | Embeddable OS framework in Rust -- skinnable shell with scene-graph UI, command interpreter, VFS, and plugin system. Renders on PSP hardware (sceGu), desktop (SDL2), and UE5 (render target via FFI). Containerized PPSSPP testing with NVIDIA GPU passthrough | README, Design Doc |
Rust CLI Tools (in tools/rust/):
| Tool | Purpose | Documentation |
|---|---|---|
| github-agents-cli | Issue/PR monitoring, refinement, code analysis, and agent execution | README |
| board-manager | GitHub Projects v2 board operations - claim, release, status updates | README |
| git-guard | Git CLI wrapper requiring sudo for dangerous operations (force push, --no-verify) | README |
| gh-validator | GitHub CLI wrapper for automatic secret masking | README |
| pr-monitor | Dedicated PR monitoring for admin/review feedback during development | README |
| markdown-link-checker | Fast concurrent markdown link validator for CI/CD pipelines | README |
| code-parser | Parse and apply code blocks from AI agent responses | README |
| mcp-code-quality | Rust MCP server for code quality tools (formatting, linting, testing) | README |
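The git-guard row above describes gating dangerous Git operations such as force pushes and `--no-verify`. As a rough sketch of the detection step only, and not git-guard's actual logic, a wrapper could classify the arguments before deciding whether to demand elevated privileges. `is_dangerous` is a hypothetical name, and the sketch assumes the subcommand is the first argument.

```python
from typing import List


def is_dangerous(git_args: List[str]) -> bool:
    """Flag the two operation types named in the table: force push and --no-verify."""
    # --no-verify skips hooks on commit and push, so flag it for any subcommand.
    if "--no-verify" in git_args:
        return True
    # Assumption: the subcommand comes first (no global options such as -C before it).
    if git_args and git_args[0] == "push":
        return any(arg in ("--force", "-f") for arg in git_args[1:])
    return False


print(is_dangerous(["push", "--force", "origin", "main"]))
print(is_dangerous(["push", "origin", "main"]))
```

This deliberately stays narrow: it does not catch `+refspec` force pushes, combined short flags, or `--force-with-lease`, all of which a real guard would need a policy for.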

    # Install Python packages
    pip install -e ./packages/sleeper_agents

    # Build Rust packages (requires Rust toolchain).
    # Each build runs in a subshell so every path resolves from the repository root.
    (cd packages/injection_toolkit && cargo build --release)
    (cd packages/economic_agents && cargo build --release)
    (cd tools/rust/github-agents-cli && cargo build --release)
    (cd tools/rust/board-manager && cargo build --release)

- 18 MCP Servers - Code quality, content creation, AI assistance, 3D graphics, video editing, speech synthesis, and more
- 6 AI Agents - Autonomous development workflow from issue to merge
- 5 Packages - Sleeper agent detection, economic agent simulation, runtime injection, tamper-responsive briefcase, CRISPR automation
- Container-First Architecture - Maximum portability and consistency
- Self-Hosted CI/CD - Zero-cost GitHub Actions infrastructure
- Company Integration - Corporate proxy builds for enterprise AI APIs (Docs)
For enterprise environments requiring custom certificates, customize automation/corporate-proxy/shared/scripts/install-corporate-certs.sh. This script runs during Docker builds for all containers. See the customization guide for details.
.
├── .github/workflows/ # GitHub Actions workflows
├── docker/ # Docker configurations
├── packages/ # Installable packages
│ ├── sleeper_agents/ # AI backdoor detection framework (Python)
│ ├── economic_agents/ # Autonomous economic agents (Rust)
│ ├── injection_toolkit/ # Runtime injection framework (Rust)
│ └── tamper_briefcase/ # Tamper-responsive briefcase system (Rust)
├── tools/
│ ├── mcp/ # 18 MCP servers (see MCP Servers section)
│ ├── rust/ # Rust CLI tools
│ │ ├── github-agents-cli/ # Issue/PR monitoring, refinement, analysis
│ │ ├── board-manager/ # GitHub Projects board operations
│ │ ├── git-guard/ # Git wrapper requiring sudo for dangerous ops
│ │ ├── gh-validator/ # Secret masking for GitHub CLI
│ │ ├── pr-monitor/ # PR feedback monitoring
│ │ ├── markdown-link-checker/ # Fast link validation for CI/CD
│ │ ├── code-parser/ # Parse code blocks from AI responses
│ │ └── mcp-code-quality/ # Rust MCP server for code quality
│ └── cli/ # Agent runners and utilities
├── automation/ # CI/CD and automation scripts
├── tests/ # Test files
├── docs/ # Documentation
└── config/ # Configuration files
- Code Quality - Formatting, linting, auto-formatting
- Content Creation - Manim animations, LaTeX, TikZ diagrams
- Gaea2 - Terrain generation (Documentation)
- Blender - 3D content creation, rendering, physics simulation (Documentation)
- Gemini - AI consultation (containerized and host modes available)
- Codex - AI-powered code generation and completion
- OpenCode - Code generation via OpenRouter
- Crush - Code generation via OpenRouter
- Meme Generator - Create memes with templates
- ElevenLabs Speech - Advanced TTS with v3 model, 50+ audio tags, 74 languages (Documentation)
- Video Editor - AI-powered video editing with transcription and scene detection (Documentation)
- Virtual Character - AI agent embodiment in virtual worlds (VRChat, Blender, Unity) (Documentation)
- GitHub Board - GitHub Projects v2 board management, work claiming, agent coordination (Documentation)
- AI Toolkit - LoRA training interface (remote: 192.168.0.222:8012)
- ComfyUI - Image generation interface (remote: 192.168.0.222:8013)
- AgentCore Memory - Multi-provider AI memory (AWS AgentCore or ChromaDB) (Documentation)
- Reaction Search - Semantic search for anime reaction images (Rust)
- Desktop Control - Cross-platform desktop automation for Linux and Windows (Documentation)
- STDIO Mode (local MCPs): Configured in `.mcp.json`, auto-started by Claude
- HTTP Mode (remote MCPs): Run the MCP using docker compose on the remote node.
See MCP Architecture Documentation and STDIO vs HTTP Modes for details.
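To illustrate the two modes, the sketch below classifies entries in a `.mcp.json`-style document. It assumes the Claude Code layout, where servers sit under a top-level `mcpServers` key, STDIO entries carry a `command`, and HTTP entries carry a `url`. The server names, the compose service name, and the exact URL are hypothetical and are not copied from this repository's configuration.

```python
import json


def classify_servers(config: dict) -> dict:
    """Map each server name to 'stdio', 'http', or 'unknown' based on its keys."""
    modes = {}
    for name, spec in config.get("mcpServers", {}).items():
        if "command" in spec:
            modes[name] = "stdio"  # launched locally as a child process
        elif "url" in spec:
            modes[name] = "http"  # already running on a remote node
        else:
            modes[name] = "unknown"
    return modes


# Illustrative config only; names, service, and URL are assumptions.
EXAMPLE_CONFIG = json.loads("""
{
  "mcpServers": {
    "code-quality": {
      "command": "docker",
      "args": ["compose", "run", "--rm", "-T", "mcp-code-quality"]
    },
    "ai-toolkit": {
      "type": "http",
      "url": "http://192.168.0.222:8012"
    }
  }
}
""")

modes = classify_servers(EXAMPLE_CONFIG)
print(modes)
```

A check like this can be handy when auditing which servers will be started on the local machine and which depend on a remote node being reachable.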
For complete tool listings, see MCP Tools Reference
See .env.example for all available options.
- `.mcp.json` - MCP server configuration for Claude Code
- `docker-compose.yml` - Container services configuration
- `CLAUDE.md` - Project-specific Claude Code instructions (root directory)
- `AGENTS.md` - Universal AI agent configuration and guidelines (root directory)
- `docs/agents/project-context.md` - Context for AI reviewers
All Python operations run in Docker containers:

    # Run CI operations via automation-cli
    automation-cli ci run format      # Check formatting
    automation-cli ci run lint-basic  # Basic linting
    automation-cli ci run test        # Run tests
    automation-cli ci run full        # Full CI pipeline

    # Run specific tests
    docker compose run --rm python-ci pytest tests/test_mcp_tools.py -v

- Pull Request Validation - Automatic Gemini AI review
- Continuous Integration - Full CI pipeline
- Code Quality - Multi-stage linting (containerized)
- Automated Testing - Unit and integration tests
- Security Scanning - Bandit and safety checks
All workflows run on self-hosted runners for zero-cost operation.
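When the individual `automation-cli ci run` stages are scripted instead of using `full`, a fail-fast wrapper keeps later stages from running on code that already failed an earlier one. The sketch below is one way to do that. The stage names are the ones shown in this README, while `run_pipeline` and the injectable `runner` parameter are hypothetical and exist so the logic can be exercised without the CLI installed.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

# Stage names as they appear in the commands in this README.
STAGES = ("format", "lint-basic", "test")


def run_stage(stage: str) -> int:
    """Invoke one stage through automation-cli and return its exit code."""
    return subprocess.run(["automation-cli", "ci", "run", stage]).returncode


def run_pipeline(
    stages: Sequence[str] = STAGES,
    runner: Callable[[str], int] = run_stage,
) -> List[Tuple[str, int]]:
    """Run stages in order and stop at the first non-zero exit code."""
    results = []
    for stage in stages:
        code = runner(stage)
        results.append((stage, code))
        if code != 0:
            break  # fail fast: skip the remaining stages
    return results


# Dry run with a stand-in runner that fails linting.
print(run_pipeline(runner=lambda stage: 1 if stage == "lint-basic" else 0))
```

The returned list doubles as a simple report: the last entry shows which stage stopped the run, and a full-length list of zero exit codes means every stage passed.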
- AGENTS.md - Universal AI agent configuration and guidelines
- CLAUDE.md - Claude-specific instructions and commands
- MCP Architecture - Modular server design
- AI Agents Documentation - AI agents overview
- Template Quickstart Guide - Customize the template for your needs
- Self-Hosted Runner Setup
- GitHub Environments Setup
- Containerized CI
This project is released under the Unlicense (public domain dedication).
For jurisdictions that do not recognize public domain: As a fallback, this project is also available under the MIT License.
This repository and all associated documentation, code, research papers, and tools are provided as-is with no warranty of any kind. The maintainer makes no representations regarding the suitability, legality, or safety of any component for any purpose. Users assume all responsibility for their use of this material.
The maintainer is not responsible for any use or misuse of the contents of this repository. No advisory, consulting, support, or guidance relationship is created by the publication of this code. The maintainer expressly disclaims any obligation to respond to inquiries, feature requests, bug reports, or other communications from any party.
Portions of this repository contain dual-use security research and tooling. Publication of this material is for defensive research, education, and policy analysis purposes. The maintainer does not endorse, encourage, or facilitate any unlawful use.
