Merged
Changes from 1 commit
1 change: 1 addition & 0 deletions docs/README.md
@@ -5,6 +5,7 @@ Documentation for the Siren prompt injection research tool.
## Getting Started

- **[Configuration Guide](configuration.md)** - Basic usage and configuration
- **[Docker Backends](docker_backends.md)** - Running with Local Docker
- **[Usage Limits](usage_limits.md)** - Resource limits and cost controls
- **[Plugins](plugins/README.md)** - Adding custom agents, attacks, environments

161 changes: 161 additions & 0 deletions docs/docker_backends.md
@@ -0,0 +1,161 @@
# Docker Execution Backends

The Prompt Siren Workbench supports Docker execution backends for running containerized tasks.

## Overview

Docker execution backends handle the creation, management, and execution of Docker containers for sandbox environments.

## Local Docker Backend

The local Docker backend executes containers directly on your machine using the Docker daemon.

### Requirements

- Docker installed and running on your machine
- Docker socket accessible (typically `/var/run/docker.sock`)
- `DOCKER_HOST` environment variable set in your `.env` file

### Setup

1. **Confirm Docker is installed** (install it first if this command fails):
```bash
docker --version
```

2. **Configure environment** in `.env`:
```bash
DOCKER_HOST="unix:///var/run/docker.sock"
```

3. **Verify Docker is running**:
```bash
docker ps
```
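
The same checks can be scripted before a long run. The following is a minimal sketch, not part of Prompt Siren: `unix_socket_path` and `docker_socket_ready` are hypothetical helper names, and the code assumes `DOCKER_HOST` uses the `unix://` scheme shown in step 2. It only inspects the socket file; it does not talk to the Docker daemon.

```python
import os
import stat
from typing import Optional
from urllib.parse import urlparse


def unix_socket_path(docker_host: str) -> Optional[str]:
    """Return the filesystem path for a unix:// DOCKER_HOST value, else None."""
    parsed = urlparse(docker_host)
    if parsed.scheme != "unix":
        return None
    return parsed.path or None


def docker_socket_ready(docker_host: str) -> bool:
    """Check that the path exists, is a socket, and is readable and writable."""
    path = unix_socket_path(docker_host)
    if path is None:
        return False
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing file or no permission to stat it.
        return False
    return stat.S_ISSOCK(mode) and os.access(path, os.R_OK | os.W_OK)


if __name__ == "__main__":
    # Falls back to the default path used in this guide if DOCKER_HOST is unset.
    host = os.environ.get("DOCKER_HOST", "unix:///var/run/docker.sock")
    print(f"{host}: ready={docker_socket_ready(host)}")
```

A `False` result for an existing socket usually points at the permission problem described under Troubleshooting.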

### Usage

Local Docker is the default backend, so no backend-specific overrides are needed:

```bash
# Run with local Docker (default)
uv run --env-file .env prompt-siren run benign +dataset=swebench

# Or explicitly specify
uv run --env-file .env prompt-siren run benign +dataset=swebench \
    sandbox_manager.config.docker_client=local
```

### Advantages

- **Fast**: No network latency, containers run locally
- **Easy debugging**: Direct access to containers via `docker` CLI
- **No quotas**: Limited only by your machine's resources

### Limitations

- Requires a running Docker daemon, so it cannot be used on machines without Docker
- Limited by local machine resources

## Hydra Configuration

You can set the Docker backend in your Hydra configuration file or via command-line overrides.

### In Configuration File

Create or modify `config.yaml`:

```yaml
defaults:
  - _self_
  - dataset: swebench

sandbox_manager:
  type: local-docker
  config:
    docker_client: local
```

Then run without command-line overrides:

```bash
uv run --env-file .env prompt-siren run benign --config-dir=./config
```

### Via Command-Line Overrides

Override the backend at runtime:

```bash
# Explicitly use local Docker
uv run --env-file .env prompt-siren run benign +dataset=swebench \
    sandbox_manager.config.docker_client=local
```
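
To make the mapping between a dotted override and the YAML structure concrete, here is a simplified illustration in Python. It is not Hydra's implementation: `apply_override` is a hypothetical helper that treats every value as a string, whereas real Hydra overrides also handle value typing, lists, and the `+` prefix used in `+dataset=swebench`.

```python
from typing import Any


def apply_override(config: dict[str, Any], override: str) -> dict[str, Any]:
    """Apply one `a.b.c=value` override to a nested dict, in place."""
    dotted_key, sep, value = override.partition("=")
    if not sep:
        raise ValueError(f"Expected key=value, got {override!r}")
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        # Walk down the tree, creating intermediate levels when missing.
        node = node.setdefault(name, {})
    node[leaf] = value
    return config


# Mirrors the sandbox_manager section of the config file above.
config = {"sandbox_manager": {"type": "local-docker", "config": {}}}
apply_override(config, "sandbox_manager.config.docker_client=local")
print(config["sandbox_manager"]["config"]["docker_client"])  # local
```

Each dot in the override key corresponds to one level of nesting in `config.yaml`.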

## End-to-End Example

Here's a complete example of running SWE-bench with Docker:

```bash
# 1. Set up environment
cat > .env <<EOF
DOCKER_HOST="unix:///var/run/docker.sock"
AZURE_OPENAI_ENDPOINT="https://your-endpoint.azure-api.net"
AZURE_OPENAI_API_KEY="your-key"
OPENAI_API_VERSION="2025-04-01-preview"
EOF

# 2. Verify Docker is running
docker ps

# 3. Run with local Docker
uv run --env-file .env prompt-siren run benign +dataset=swebench \
    agent.config.model=azure:gpt-4o \
    'dataset.config.instance_ids=["django__django-11179"]'
```

## Troubleshooting

### Local Docker Issues

**Problem**: `Cannot connect to Docker daemon`
```bash
# Solution: Verify Docker is running
docker ps

# Check DOCKER_HOST is set correctly, both in the current shell and in .env
echo "$DOCKER_HOST"
grep DOCKER_HOST .env
```

**Problem**: `Permission denied` accessing Docker socket
```bash
# Solution: Check socket permissions
ls -l /var/run/docker.sock

# May need to add your user to docker group (requires logout/login)
sudo usermod -aG docker "$USER"
```

## Plugin System

Local Docker is implemented as a plugin using the Docker client registry system. You can create custom Docker client plugins by:

1. Implementing the `AbstractDockerClient` protocol
2. Creating a factory function
3. Registering via entry points

See [custom_components.md](custom_components.md) for details on creating custom plugins.

## Performance Considerations

### Local Docker
- **Startup time**: Fast (~1-2 seconds per container)
- **Execution**: No network latency
- **Best for**: Development, testing, small datasets

## See Also

- [Sandbox Manager Documentation](sandbox_manager.md) - General sandbox manager concepts
- [Configuration Guide](configuration.md) - Hydra configuration details
- [Custom Components](custom_components.md) - Creating custom plugins
3 changes: 3 additions & 0 deletions pyproject.toml
@@ -65,6 +65,9 @@ local-docker = "prompt_siren.sandbox_managers.docker:create_docker_sandbox_manag
bedrock = "prompt_siren.providers.bedrock:BedrockProvider"
llama = "prompt_siren.providers.llama:LlamaProvider"

[project.entry-points."prompt_siren.docker_clients"]
local = "prompt_siren.sandbox_managers.docker.local_client:create_local_docker_client"

[project.optional-dependencies]
agentdojo = ["agentdojo>=0.1.35"]
swebench = ["swebench", "jinja2>=3.1.6"]
8 changes: 5 additions & 3 deletions src/prompt_siren/registry_base.py
@@ -212,13 +212,15 @@ def create_component(
component_type: str,
config: BaseModel | None,
context: ContextT | None = None,
**kwargs: Any,
) -> ComponentT:
"""Create a component instance for a given component type and config.

Args:
component_type: String identifier for the component type
config: Configuration object for the component, or None
context: Optional context object passed to factory (e.g., sandbox_manager for datasets)
**kwargs: Additional keyword arguments passed to the factory function.

Returns:
An instance of the component type
@@ -243,14 +245,14 @@ def create_component(

config_class, factory = self._registry[component_type]

- # Component doesn't use config - call factory with no args
+ # Component doesn't use config - call factory with only kwargs
if config_class is None:
if config is not None:
raise ValueError(
f"{self._component_name.title()} type '{component_type}' doesn't accept config, "
f"but config was provided"
)
- return factory()
+ return factory(**kwargs)

# Component uses config - validate and call with config and context
if config is None:
@@ -260,7 +262,7 @@ def create_component(
f"Config must be an instance of {config_class.__name__}, got {type(config).__name__}"
)

- return factory(config, context)
+ return factory(config, context, **kwargs)

def get_registered_components(self) -> list[str]:
"""Get a list of all registered component types.
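
The effect of forwarding `**kwargs` in `create_component` can be seen in a toy version of the registry. This is a sketch under stated assumptions, not the code in `registry_base.py`: `MiniRegistry`, `register`, and `make_client` are hypothetical names, config classes are plain Python classes rather than Pydantic models, and the error raised when a required config is missing is a guess because that part of the hunk is not shown.

```python
from typing import Any, Callable, Optional


class MiniRegistry:
    """Toy registry mapping a type name to (config_class, factory)."""

    def __init__(self) -> None:
        self._registry: dict[str, tuple[Optional[type], Callable[..., Any]]] = {}

    def register(
        self, name: str, config_class: Optional[type], factory: Callable[..., Any]
    ) -> None:
        self._registry[name] = (config_class, factory)

    def create_component(
        self, component_type: str, config: Any = None, context: Any = None, **kwargs: Any
    ) -> Any:
        config_class, factory = self._registry[component_type]
        if config_class is None:
            # Config-free component: only the extra keyword arguments are passed.
            if config is not None:
                raise ValueError(f"'{component_type}' doesn't accept config")
            return factory(**kwargs)
        if config is None:
            # Assumption: the real code also rejects a missing config here.
            raise ValueError(f"'{component_type}' requires config")
        if not isinstance(config, config_class):
            raise TypeError(f"Config must be an instance of {config_class.__name__}")
        return factory(config, context, **kwargs)


def make_client(*, base_url: str = "unix:///var/run/docker.sock") -> dict[str, str]:
    """Hypothetical config-free factory that accepts one keyword argument."""
    return {"base_url": base_url}


registry = MiniRegistry()
registry.register("local", None, make_client)
client = registry.create_component("local", None, base_url="unix:///tmp/test.sock")
print(client)  # {'base_url': 'unix:///tmp/test.sock'}
```

Before this change, a config-free factory was called with no arguments at all, so a caller had no way to pass a value such as `base_url` through the registry.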
7 changes: 6 additions & 1 deletion src/prompt_siren/sandbox_managers/docker/__init__.py
@@ -7,4 +7,9 @@
DockerSandboxManager,
)

- __all__ = ["DockerSandboxConfig", "DockerSandboxManager", "create_docker_sandbox_manager"]
+ __all__ = [
+     # Sandbox manager
+     "DockerSandboxConfig",
+     "DockerSandboxManager",
+     "create_docker_sandbox_manager",
+ ]