Generated: 2026-03-10
Commit: See git log -1
Branch: main
This root AGENTS.md covers project-wide conventions. Module-specific guides:
- `control-plane/state-engine/AGENTS.md` - Event sourcing state kernel
- `control-plane/fractal-gateway/AGENTS.md` - eBPF security gateway
- `control-plane/fractal-gateway-ebpf/AGENTS.md` - XDP packet filtering 🆕
- `control-plane/formal-verifier/AGENTS.md` - TLA+ formal verification 🆕
- `control-plane/teardown-ctrl/AGENTS.md` - Cascading cleanup controller 🆕
- `orchestration/manager/AGENTS.md` - DAG topological execution
- `orchestration/scheduler/AGENTS.md` - Worker dispatch and warm pool
- `orchestration/evaluator/AGENTS.md` - Output validation and rollback
- `memory-bus/ingestion/AGENTS.md` - SLM-powered intent extraction
- `memory-bus/vector-kv/AGENTS.md` - Vector + KV storage with compression
- `execution-layer/sandbox-daemon/AGENTS.md` - Firecracker MicroVM lifecycle
- `execution-layer/stateful-repl/AGENTS.md` - Persistent terminal sessions
- `plugins/AGENTS.md` - Plugin architecture (core + marketplace)
- `security-audit/AGENTS.md` - Automated security scanning
- `observability-ui/web-dashboard/AGENTS.md` - Next.js observability UI
- Go: 1.25+
- Rust: 1.75+
- Node.js: 20+
- Docker Desktop with WSL2 (Windows)
# Rust (Control Plane & Execution Layer)
cd control-plane && cargo build --release
cd control-plane/state-engine && cargo build
# Go (Memory Bus & Orchestration)
cd memory-bus && go build -o bin/ingestion ./ingestion
cd orchestration && go build -o bin/manager ./manager
# Frontend (Observability UI)
cd observability-ui/web-dashboard && npm install && npm run build

# Run all tests
.\test-all.ps1 # Windows PowerShell
./test-all.sh # Linux/macOS
# Rust tests
cd control-plane && cargo test
cd control-plane/state-engine && cargo test -- --nocapture
# Go tests with coverage
cd memory-bus && go test -v -coverprofile=coverage.out ./...
go tool cover -html=coverage.out -o coverage.html
# Frontend tests
cd observability-ui/web-dashboard && npm run lint

# Rust
cargo clippy --all-targets --all-features -- -D warnings
# Go
golangci-lint run ./...
# Frontend
npm run lint

- Zero-compromise security: Never bypass eBPF sandbox or IAM policies
- Event sourcing: All state changes must be append-only with version tracking
- Performance + isolation: Balance nanosecond performance with strict isolation
- Bilingual docs: Chinese for explanations, English for code/comments
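
The event-sourcing rule above (append-only, with version tracking) can be illustrated with a small in-memory log. This is a minimal sketch, not the state engine's implementation: `EventLog`, `Event`, `Append`, `Replay`, and `ErrVersionConflict` are hypothetical names, and the real engine persists to Redis and PostgreSQL rather than a slice.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrVersionConflict (hypothetical) signals that the caller's view of the log is stale.
var ErrVersionConflict = errors.New("version conflict")

// Event is an illustrative state change; the field names are assumptions.
type Event struct {
	Version uint64 // assigned by the log, starting at 1
	Kind    string
	Payload string
}

// EventLog is an in-memory append-only log. Existing entries are never
// modified or removed; state is derived by replaying them in order.
type EventLog struct {
	mu     sync.Mutex
	events []Event
}

// Append adds an event only if expectedVersion matches the log's current
// version, and returns the version assigned to the new event.
func (l *EventLog) Append(expectedVersion uint64, kind, payload string) (uint64, error) {
	l.mu.Lock()
	defer l.mu.Unlock()

	current := uint64(len(l.events))
	if expectedVersion != current {
		return current, fmt.Errorf("%w: expected %d, log is at %d",
			ErrVersionConflict, expectedVersion, current)
	}
	next := current + 1
	l.events = append(l.events, Event{Version: next, Kind: kind, Payload: payload})
	return next, nil
}

// Replay folds every event, oldest first, into a key/value view of the state.
func (l *EventLog) Replay() map[string]string {
	l.mu.Lock()
	defer l.mu.Unlock()

	state := make(map[string]string)
	for _, e := range l.events {
		state[e.Kind] = e.Payload
	}
	return state
}

func main() {
	el := &EventLog{}
	v, err := el.Append(0, "task.status", "pending")
	if err != nil {
		fmt.Println("append failed:", err)
		return
	}
	// A second writer that still believes the log is empty is rejected.
	if _, err := el.Append(0, "task.status", "running"); err != nil {
		fmt.Println("stale writer rejected:", err)
	}
	fmt.Println("version:", v, "state:", el.Replay())
}
```

Rejecting stale writers instead of overwriting keeps the log strictly ordered, which is what makes replay deterministic.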
// Standard library first
use std::sync::Arc;
// External crates
use redis::AsyncCommands;
use sqlx::{PgPool, Row};
use thiserror::Error;
// Internal modules
use crate::models::{Snapshot, StateEvent};

- Use `thiserror` for custom error types
- Never use `.unwrap()` or `.expect()` in production code
- Prefer `Result<T, EngineError>` with explicit error variants
- Use the `?` operator for error propagation
#[derive(Error, Debug)]
pub enum EngineError {
    #[error("Redis error: {0}")]
    Redis(#[from] redis::RedisError),
    #[error("PostgreSQL error: {0}")]
    Postgres(#[from] sqlx::Error),
}

- Structs/Enums: PascalCase (`StateEngine`, `StateEvent`)
- Functions/Methods: snake_case (`append_event`, `get_latest_snapshot`)
- Constants: UPPER_SNAKE_CASE (`REDIS_CACHE_TTL_SECS`)
- Files: snake_case (`engine.rs`, `state_event.rs`)
- Use English for all comments and documentation
- Include doc comments (`///`) for public APIs
- Explain why, not just what
import (
	// Standard library first
	"encoding/json"
	"log"
	"sync"

	// External packages
	"github.com/google/uuid"

	// Internal packages
	"sma-os/memory-bus/models"
)

- Always check errors explicitly
- Use descriptive error messages with context
- Never ignore errors with `_`
if err != nil {
	log.Printf("[Component] Failed to process: %v", err)
	return err
}

- Structs/Types: PascalCase (`TaskNode`, `DAGManager`)
- Functions: PascalCase for exported, camelCase for unexported
- Constants: ALL_CAPS with underscores
- Files: snake_case (`ingestion_test.go`)
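
The Go error-handling rules above (check every error, add context, never discard with `_`) can be combined with standard-library error wrapping. The sketch below is only an illustration: `ParseTaskNode`, `ErrEmptyIntent`, and the `TaskNode` fields are hypothetical and are not taken from the SMA-OS codebase.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ErrEmptyIntent is a hypothetical sentinel error used for illustration.
var ErrEmptyIntent = errors.New("empty intent")

// TaskNode follows the PascalCase rule for types; its fields are assumptions.
type TaskNode struct {
	ID     string `json:"id"`
	Intent string `json:"intent"`
}

// ParseTaskNode decodes raw JSON and wraps every failure with context, so the
// caller can see where it happened and still inspect the underlying cause.
func ParseTaskNode(raw []byte) (TaskNode, error) {
	var node TaskNode
	if err := json.Unmarshal(raw, &node); err != nil {
		return TaskNode{}, fmt.Errorf("parse task node: %w", err)
	}
	if node.Intent == "" {
		return TaskNode{}, fmt.Errorf("parse task node %q: %w", node.ID, ErrEmptyIntent)
	}
	return node, nil
}

func main() {
	node, err := ParseTaskNode([]byte(`{"id":"t1"}`))
	if errors.Is(err, ErrEmptyIntent) {
		// The wrapped sentinel is still detectable after context was added.
		fmt.Println("rejected:", err)
		return
	}
	if err != nil {
		fmt.Println("unexpected failure:", err)
		return
	}
	fmt.Println("parsed:", node.ID)
}
```

Wrapping with `%w` keeps the message descriptive while letting callers branch on the cause with `errors.Is`, which plain string formatting would not allow.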
// React and Next.js first
import { useState, useCallback } from "react";
// Third-party libraries
import { motion } from "framer-motion";
import ReactFlow from "reactflow";
import "reactflow/dist/style.css";
// Internal components
import { DagNode } from "@/components/DagNode";

"use client";

import { useCallback } from "react";
import { useNodesState, type Node, type Edge } from "reactflow";

export interface DagViewerProps {
  initialNodes: Node[];
  initialEdges: Edge[];
}

export default function DagViewer({ initialNodes }: DagViewerProps) {
  // Hooks first
  const [nodes, setNodes] = useNodesState(initialNodes);
  // Event handlers
  const onNodeClick = useCallback((node: Node) => {
    // Handler logic
  }, []);
  // Render
  return <div>{/* JSX */}</div>;
}

- Components: PascalCase (`DagViewer`, `StateNode`)
- Functions/Variables: camelCase (`onNodeClick`, `isLoading`)
- Constants: UPPER_SNAKE_CASE (`MAX_RETRY_COUNT`)
- CSS: kebab-case (`.dag-viewer`, `.state-node`)
SMA-OS/
├── control-plane/ # Rust: State kernel, eBPF, formal verification
│ ├── state-engine/ # Event sourcing with Redis/PostgreSQL
│ ├── fractal-gateway/ # Resource isolation and auth
│ └── teardown-ctrl/ # Cascading cleanup controller
├── orchestration/ # Go: DAG orchestration and scheduling
│ ├── manager/ # Topological task execution
│ ├── scheduler/ # Worker dispatch
│ └── evaluator/ # Output validation
├── execution-layer/ # Rust: Firecracker MicroVM management
│ ├── sandbox-daemon/ # VM lifecycle management
│ └── stateful-repl/ # Persistent terminals
├── memory-bus/ # Go: Structured memory with LLM fallback
│ ├── ingestion/ # SLM-powered intent extraction
│ └── vector-kv/ # Vector + KV storage
└── observability-ui/ # Next.js: Real-time DAG visualization
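
The tree describes `orchestration/manager` as topological task execution. As a rough illustration of that idea, and not the module's actual code, the sketch below applies Kahn's algorithm: compute in-degrees, start every task with in-degree zero, and release dependents as their dependencies finish. The `Execute` signature, `ErrCycle`, and the task names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrCycle is returned when some tasks can never become ready.
var ErrCycle = errors.New("dependency cycle detected")

// Execute runs each task once all of its dependencies have finished.
// deps maps a task ID to the IDs it depends on; every dependency must also be
// a key in deps. Tasks that are ready at the same time run concurrently.
// It returns the order in which tasks completed.
func Execute(deps map[string][]string, run func(id string)) ([]string, error) {
	inDegree := make(map[string]int, len(deps))
	dependents := make(map[string][]string)
	for id, ds := range deps {
		inDegree[id] = len(ds)
		for _, d := range ds {
			if _, ok := deps[d]; !ok {
				return nil, fmt.Errorf("task %q depends on unknown task %q", id, d)
			}
			dependents[d] = append(dependents[d], id)
		}
	}

	done := make(chan string)
	running := 0
	dispatch := func(id string) {
		running++
		go func() {
			run(id)
			done <- id
		}()
	}

	// Start with every task that has no dependencies.
	for id, n := range inDegree {
		if n == 0 {
			dispatch(id)
		}
	}

	// Only this goroutine touches inDegree and order, so no lock is needed.
	order := make([]string, 0, len(deps))
	for running > 0 {
		id := <-done
		running--
		order = append(order, id)
		for _, next := range dependents[id] {
			inDegree[next]--
			if inDegree[next] == 0 {
				dispatch(next)
			}
		}
	}

	if len(order) != len(deps) {
		return order, ErrCycle
	}
	return order, nil
}

func main() {
	deps := map[string][]string{
		"fetch":    nil,
		"parse":    {"fetch"},
		"embed":    {"parse"},
		"validate": {"parse"},
		"publish":  {"embed", "validate"},
	}
	order, err := Execute(deps, func(id string) { fmt.Println("running", id) })
	if err != nil {
		fmt.Println("execution failed:", err)
		return
	}
	fmt.Println("completed:", order)
}
```

Keeping all bookkeeping in the coordinating goroutine means workers only run the task and report completion, so the in-degree map needs no mutex.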
// All state changes are events appended to a log
pub async fn append_event(&self, event: StateEvent) -> Result<(), EngineError> {
    // 1. Write to Redis for fast recovery
    // 2. Persist to PostgreSQL for durability
    // 3. Trigger snapshot every 1000 events
}

// Topological sort with concurrent worker dispatch
func (dm *DAGManager) Execute() error {
	// 1. Compute in-degrees
	// 2. Enqueue zero in-degree nodes
	// 3. Dispatch workers concurrently
	// 4. Decrement in-degrees on completion
}

#[cfg(test)]
mod tests {
    #[tokio::test]
    async fn test_append_and_replay() {
        // Use testcontainers for isolated DB tests
        // Test event append + state recovery
    }
}

func TestProcessInput_ValidInput(t *testing.T) {
	// Mock external APIs with httptest.Server
	// Test both success and error paths
}

- Check `REDIS_URL` environment variable
- Ensure Docker container is running: `docker ps | grep redis`
- Verify `DATABASE_URL` is correct
- Run migrations manually: `sqlx migrate run --database-url <url>`
- Clear cache: `cargo clean && cargo update`
- Check for conflicting tokio features
- Clear module cache: `go clean -modcache && go mod tidy`
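
The environment-variable checks in the list above can be automated at service startup. This is a minimal sketch under stated assumptions: it assumes `REDIS_URL` and `DATABASE_URL` hold URL-style values with the common `redis`/`rediss` and `postgres`/`postgresql` schemes, and `ValidateServiceURL` and `CheckURLVar` are hypothetical helpers rather than existing project functions.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// ValidateServiceURL checks that raw is non-empty, parses as a URL, and uses
// one of the allowed schemes. name is only used in error messages.
func ValidateServiceURL(name, raw string, schemes ...string) error {
	if raw == "" {
		return fmt.Errorf("%s is not set", name)
	}
	u, err := url.Parse(raw)
	if err != nil {
		// The parse error is deliberately not wrapped: it echoes the raw
		// value, which may contain credentials.
		return fmt.Errorf("%s is not a valid URL", name)
	}
	for _, s := range schemes {
		if u.Scheme == s {
			return nil
		}
	}
	return fmt.Errorf("%s has scheme %q, want one of %v", name, u.Scheme, schemes)
}

// CheckURLVar validates the environment variable called name.
func CheckURLVar(name string, schemes ...string) error {
	return ValidateServiceURL(name, os.Getenv(name), schemes...)
}

func main() {
	checks := []struct {
		name    string
		schemes []string
	}{
		{"REDIS_URL", []string{"redis", "rediss"}},
		{"DATABASE_URL", []string{"postgres", "postgresql"}},
	}

	failed := false
	for _, c := range checks {
		if err := CheckURLVar(c.name, c.schemes...); err != nil {
			fmt.Println("config problem:", err)
			failed = true
		}
	}
	if failed {
		os.Exit(1)
	}
	fmt.Println("configuration looks usable")
}
```

A check like this only confirms that the values are well formed; whether the services are reachable still has to be verified with the Docker and migration steps listed above.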
# Start all dependencies
docker-compose up -d postgres redis clickhouse weaviate jaeger prometheus
# Run services
cargo run --bin state-engine
go run ./memory-bus/ingestion
npm run dev --prefix observability-ui/web-dashboard

- Use Kubernetes Helm charts for Enterprise mode
- Configure resource limits and network policies
- Enable OpenTelemetry tracing to Jaeger
- Never commit `.env` files - Use `.env.example` as template
- API keys in environment variables only - DeepSeek, database URLs, etc.
- eBPF sandbox is mandatory - No bypassing for "convenience"
- Audit all external dependencies - Use `cargo audit` and `npm audit`
- Architecture questions: See `AI_DEVELOPER_GUIDE.md`
- API documentation: Check inline doc comments
- Debugging: Enable tracing with `RUST_LOG=debug`
- Performance: Check Jaeger traces for latency analysis
| Module | Purpose | Language | Complexity |
|---|---|---|---|
| `control-plane/state-engine` | Event sourcing with Redis/PostgreSQL | Rust | High |
| `control-plane/fractal-gateway` | Resource isolation and auth | Rust | Medium |
| `control-plane/fractal-gateway-ebpf` | XDP packet filtering | Rust | High |
| `control-plane/formal-verifier` | TLA+ formal verification | TLA+ | High |
| `control-plane/teardown-ctrl` | Cascading cleanup controller | Rust | Medium |
| `orchestration/manager` | Topological task execution | Go | Medium |
| `orchestration/scheduler` | Worker dispatch | Go | Low |
| `orchestration/evaluator` | Output validation | Go | Low |
| `memory-bus/ingestion` | SLM-powered intent extraction | Go | Medium |
| `memory-bus/vector-kv` | Vector + KV storage | Go | Low |
| `execution-layer/sandbox-daemon` | VM lifecycle management | Rust | Medium |
| `execution-layer/stateful-repl` | Persistent terminals | Rust | Low |
| `observability-ui/web-dashboard` | Real-time DAG visualization | TypeScript | Medium |
| `chaos-tests` | Chaos engineering framework | Rust | High |
| `benchmarks` | Performance benchmarking suite | Rust/Go | Medium |
| `sma-proto` | gRPC Protocol definitions | Protobuf | Low |
| `plugins` | Plugin architecture and marketplace | Rust | Medium |
| `security-audit` | Automated security scanning | Rust | Low |
- Event sourcing state engine with Redis/PostgreSQL
- eBPF security gateway
- DAG orchestration layer
- Observability UI with real-time DAG visualization
- Firecracker MicroVM integration
- eBPF probe deployment
- Chaos engineering tests
- Performance benchmarks (P99 latency < 10ms)
- Documentation completeness (>90%)
- Horizontal scaling (1000+ concurrent agents)
- Redis cluster with failover
- Connection pooling & rate limiting
- Multi-region deployment
- Automated failover and recovery
- Advanced monitoring and alerting
- Security audit and penetration testing
- Plugin architecture for custom executors
- Marketplace for pre-built agent templates
- Community-driven module registry
- Enterprise support and SLA
| Quarter | Focus | Milestone |
|---|---|---|
| Q2 2026 | Core completion | Firecracker + eBPF production ready |
| Q3 2026 | Performance | 1000+ agents, P99 < 10ms |
| Q4 2026 | Enterprise | Multi-region, HA, security audit |
| Q1 2027 | Ecosystem | Plugin system, marketplace launch |
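
The roadmap's latency target is expressed as P99 < 10ms. For reference, one common way to compute a percentile from raw samples is the nearest-rank method, sketched below. The `Percentile` helper and the sample values are hypothetical; the numbers are illustrative and are not measurements from SMA-OS, and the project's benchmark suite may use a different definition.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Percentile returns the p-th percentile (0 < p <= 100) of samples using the
// nearest-rank method: the value at rank ceil(p/100 * n) in sorted order.
// It reports false when samples is empty or p is out of range.
func Percentile(samples []float64, p float64) (float64, bool) {
	n := len(samples)
	if n == 0 || p <= 0 || p > 100 {
		return 0, false
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := make([]float64, n)
	copy(sorted, samples)
	sort.Float64s(sorted)

	rank := int(math.Ceil(p * float64(n) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > n {
		rank = n
	}
	return sorted[rank-1], true
}

func main() {
	const targetMs = 10.0

	// Illustrative latencies in milliseconds, not real measurements.
	latencies := []float64{2.1, 3.4, 1.9, 8.7, 2.5, 12.3, 4.0, 3.3, 2.8, 5.6}

	p99, ok := Percentile(latencies, 99)
	if !ok {
		fmt.Println("no samples to evaluate")
		return
	}
	fmt.Printf("P99 = %.1f ms, within %.0f ms target: %t\n", p99, targetMs, p99 < targetMs)
}
```

With few samples the P99 is simply the slowest request, so a target like this is only meaningful over a large number of measurements.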