Skip to content

Latest commit

 

History

History
218 lines (159 loc) · 7.63 KB

File metadata and controls

218 lines (159 loc) · 7.63 KB

Safety Policy

Aar's safety system is a layered defense that controls what the agent can read, write, and execute. It operates at three levels: denied-by-default patterns, policy decisions per tool call, and human approval gates.

How it works

Every tool call passes through the SafetyPolicy engine before execution:

Tool call → SafetyPolicy.check_tool() → ALLOW / DENY / ASK
                                              ↓
                                        if ASK → ApprovalCallback → APPROVED / DENIED

The policy evaluates rules in this order:

  1. Read-only mode — if enabled, all write and execute side-effects are denied immediately
  2. Approval requirements — if require_approval_for_writes or require_approval_for_execute is set, matching tools return ASK
  3. Path rules — explicit PathRule entries (first match wins)
  4. Denied paths — glob patterns that block file access
  5. Allowed paths — if set, only matching paths are permitted (whitelist)
  6. Command rules — explicit CommandRule entries for shell commands (first match wins)
  7. Denied commands — substring patterns that block dangerous shell commands
  8. If nothing matches, the tool call is ALLOWED

Built-in defaults

Denied paths (SafetyConfig)

These glob patterns are always blocked for file read/write tools, regardless of transport or flags:

Category Patterns
Unix system files /etc/shadow, /etc/passwd, /etc/sudoers, /etc/sudoers.d/**
Environment files **/.env, **/.env.*
Credentials **/credentials, **/credentials.*, **/secrets, **/secrets.*
Key material **/*.pem, **/*.key, **/*.p12, **/*.pfx
SSH **/.ssh/**, **/id_rsa, **/id_dsa, **/id_ecdsa, **/id_ed25519
Cloud providers **/.aws/**, **/.azure/**, **/.config/gcloud/**
Package manager tokens **/.netrc, **/.npmrc, **/.pypirc

Denied commands (PolicyConfig)

These substring patterns block dangerous shell commands:

Category Patterns
Filesystem destruction rm -rf /, rm -rf /*, rm -rf ~, mkfs, dd if=, > /dev/sda
System control shutdown, reboot, halt, poweroff, init 0, init 6
Fork bomb :(){:|:&};:
Permission escalation chmod 777, chmod -R 777
Remote code execution curl|sh, curl | sh, curl|bash, curl | bash, wget|sh, wget | sh, wget|bash, wget | bash
Reverse shell nc -e, ncat -e
History wipe history -c

Both lists can be extended via configuration (CLI flags, config files, or programmatic API) but not reduced below the defaults through CLI flags alone.

Per-transport defaults

aar chat and aar tui (interactive)

Workspace sandbox ON by default:

Setting Default Effect
--require-approval on Prompts before every write or shell command
--restrict-to-cwd on File tools can only access cwd/**

This creates a two-layer defense:

  • File tools are restricted to the current directory by allowed_paths
  • Bash bypasses allowed_paths (it runs arbitrary commands), so it requires human approval instead

Disable the sandbox for trusted workflows:

aar chat --no-require-approval --no-restrict-to-cwd

aar run (automation)

Permissive by default:

Setting Default Effect
--require-approval off No approval prompts
--restrict-to-cwd off File tools can access any non-denied path

Opt in to the sandbox for untrusted tasks:

aar run "task" --require-approval --restrict-to-cwd

aar serve (web API)

The web server accepts an optional safety field in the request body to override safety settings per request:

{
  "prompt": "do something",
  "safety": {
    "read_only": true,
    "require_approval_for_writes": true
  }
}

By default, the web transport has no approval callback — any tool call that returns ASK is auto-denied (no human present). To add a human gate, implement an external approval flow via SSE events.

The --read-only flag is available on aar serve for global read-only mode.

CLI flags reference

Flag Available on Description
--read-only chat, run, tui, serve Block all write and execute tools
--require-approval / --no-require-approval chat, run, tui Prompt before write/execute tools
--restrict-to-cwd / --no-restrict-to-cwd chat, run, tui Restrict file tools to cwd/**
--denied-paths TEXT chat, run, tui Comma-separated globs appended to defaults
--allowed-paths TEXT chat, run, tui Comma-separated globs (overrides --restrict-to-cwd)
--config PATH chat, run, tui, serve Load full AgentConfig from a JSON file

Configuration file

Create a JSON file matching the AgentConfig schema:

{
  "provider": {
    "name": "anthropic",
    "model": "claude-sonnet-4-6"
  },
  "safety": {
    "read_only": false,
    "require_approval_for_writes": true,
    "require_approval_for_execute": true,
    "denied_paths": ["**/.env", "**/secrets/**"],
    "allowed_paths": ["/home/user/project/**"],
    "sandbox": "subprocess"
  },
  "max_steps": 30,
  "timeout": 120.0
}

Load with:

aar chat --config aar.json

Or programmatically:

from agent.core.config import load_config
from pathlib import Path

config = load_config(Path("aar.json"))

Precedence (lowest to highest): PolicyConfig defaults -> config file (--config) -> explicit CLI flags

Approval callback

When a policy decision is ASK, the ApprovalCallback is invoked. The callback receives the tool spec and the tool call, and returns one of:

Result Meaning
APPROVED Allow this specific call
DENIED Block this specific call
APPROVED_ALWAYS Allow this call and all future calls to the same tool

The CLI and TUI transports use a terminal prompt:

+----------------------------------+
| Approval Required                |
|   bash                           |
|     command: rm -rf build/       |
+----------------------------------+
Allow? [y]es / [n]o / [a]lways:

Custom approval callback

from agent.safety.permissions import ApprovalResult

async def my_callback(spec, tool_call) -> ApprovalResult:
    # Your logic — Slack notification, web UI, auto-approve known tools, etc.
    return ApprovalResult.APPROVED

agent = Agent(config=config, approval_callback=my_callback)

Sandbox modes

Mode Description
local Direct subprocess execution (default)
subprocess Isolated execution with ulimit resource limits and restricted environment variables

Set via SafetyConfig(sandbox="subprocess") or in a config file.

The subprocess sandbox applies:

  • Memory limit (sandbox_max_memory_mb, default 512 MB)
  • Restricted environment variables (strips sensitive vars)
  • Command timeout (ToolConfig.command_timeout, default 30s)

Architecture

The safety system has three components:

  • agent/safety/policy.pySafetyPolicy evaluates tool calls against PolicyConfig rules, returning ALLOW/DENY/ASK
  • agent/safety/permissions.pyPermissionManager handles ASK decisions by calling the approval callback and caching APPROVED_ALWAYS results
  • agent/safety/sandbox.pyLocalSandbox and SubprocessSandbox control how shell commands are actually executed

These are composed by ToolExecutor (agent/tools/execution.py), which is the single entry point for all tool execution in the agent loop.