Add sandlock as a lightweight local container backend #99

@congwang-mk

ARES currently supports Docker and Daytona as container backends via the Container protocol. Both involve significant per-environment overhead -- Docker needs a daemon and ~200ms startup, Daytona needs a remote API and ~90ms.

For RL rollouts at scale (100K+ episodes), this startup cost adds up to hours when it is paid once per episode. sandlock could serve as a third backend with ~5ms startup, running entirely locally with kernel-level isolation (Landlock + seccomp). No daemon, no remote API, no root.

It fits the existing Container protocol cleanly:

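import asyncio
import dataclasses
import shlex

from sandlock import Policy, Sandbox  # assumed import path
# `containers` is the ARES module that defines Container, Resources, and ExecResult.


@dataclasses.dataclass(kw_only=True)
class SandlockContainer(containers.Container):
    image: str | None = None  # unused: sandlock runs on the host filesystem
    resources: containers.Resources | None = None
    default_workdir: str | None = None

    async def start(self, env: dict[str, str] | None = None) -> None:
        res = self.resources
        self._policy = Policy(
            fs_readable=["/usr", "/lib", "/lib64", "/etc"],
            fs_writable=[self.default_workdir or "/tmp/ares"],
            max_memory=f"{res.memory}M" if res and res.memory else "512M",
            # Guard against resources=None before reading cpu.
            max_processes=res.cpu if res and res.cpu else 10,
            clean_env=True,
            env=env or {},
        )

    async def stop(self) -> None:
        pass  # no daemon to stop

    async def exec_run(self, command, *, workdir=None, env=None, timeout_s=None) -> containers.ExecResult:
        cwd = workdir or self.default_workdir
        if cwd:
            command = f"cd {shlex.quote(cwd)} && {command}"
        # Sandbox.run blocks, so run it in a worker thread to keep the event loop free.
        # Per-call `env` is not applied in this sketch.
        result = await asyncio.to_thread(
            Sandbox(self._policy).run,
            ["bash", "-lc", command],
            timeout=timeout_s,
        )
        return containers.ExecResult(output=result.stdout.decode(), exit_code=result.exit_code)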

Comparison at 100K episodes:

| Backend | Startup | Overhead at 100K episodes | Requires |
| --- | --- | --- | --- |
| Docker | ~200ms | 5.6 hours | Docker daemon |
| Daytona | ~90ms | 2.5 hours | Remote API + account |
| sandlock | ~5ms | 8 minutes | Linux 6.12+, pip install sandlock |
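
For reference, the overhead column follows directly from the startup figures. A minimal sketch of the arithmetic, assuming one container start per episode and strictly serial execution (the function and variable names are illustrative, not part of ARES or sandlock):

```python
def total_startup_overhead_s(startup_ms: float, episodes: int) -> float:
    """Total seconds spent on startup, assuming one start per episode, run serially."""
    return startup_ms * episodes / 1000.0


EPISODES = 100_000
# Approximate startup latencies quoted in this issue, in milliseconds.
STARTUP_MS = {"Docker": 200, "Daytona": 90, "sandlock": 5}

overhead_s = {
    name: total_startup_overhead_s(ms, EPISODES) for name, ms in STARTUP_MS.items()
}

for name, seconds in overhead_s.items():
    print(f"{name}: {seconds / 3600:.2f} h ({seconds / 60:.1f} min)")
```

With parallel rollouts the wall-clock cost shrinks by the degree of parallelism, but the ratio between backends stays the same.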

Tradeoff: sandlock provides process-level isolation (not a full container or VM), so it won't work for environments that need custom OS images. But for code execution rollouts where the host already has the right dependencies, it's a drop-in replacement with 18-40x lower startup latency.
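
One way this tradeoff could be encoded is a small selection helper that falls back to Docker whenever sandlock's constraints are not met. This is only a sketch: `choose_backend`, `parse_kernel_release`, and the string backend names are hypothetical and do not exist in ARES, and the 6.12 minimum is taken from the table above.

```python
import platform
import re

MIN_KERNEL = (6, 12)  # minimum Linux version listed for sandlock in this issue


def parse_kernel_release(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '6.12.3-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def choose_backend(needs_custom_image: bool, system: str, kernel_release: str) -> str:
    """Pick 'sandlock' only when the host can satisfy its constraints."""
    if needs_custom_image:
        return "docker"  # sandlock cannot provide a custom OS image
    if system != "Linux":
        return "docker"  # Landlock and seccomp are Linux-only
    if parse_kernel_release(kernel_release) < MIN_KERNEL:
        return "docker"  # kernel too old for the stated requirement
    return "sandlock"


print(choose_backend(
    needs_custom_image=False,
    system=platform.system(),
    kernel_release=platform.release(),
))
```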

Happy to submit a PR implementing SandlockContainer if there's interest.
