Consolidate the experimental agent and rollout controller configuration into areal.api.cli_args so the trainer, examples, and tests share one configuration surface. This also wires RolloutControllerV2 into the v2 rollout path and updates examples and integrations to use the new controller APIs.
Key changes:
- move agent and rollout controller configs into areal.api.cli_args and remove duplicated controller config modules
- rename and rewire experimental controllers around AgentController and RolloutControllerV2
- update examples and experimental tests for rollout v2 config, initialization, and versioning
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Enable AgentController to manage inference-backed sessions and export trajectories for agent-service workflows.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Move the Claude agent-service example into the experimental examples package so it can be imported from the new nested module path.
Key changes:
- add experimental package markers
- add claude worker implementation
- add claude interactive launcher
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Retire the old examples/agent_service layout and point the worker CLI at the new experimental Claude module path.
Key changes:
- update worker example import path
- remove legacy claude example files
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Describe the new AgentController runtime APIs and document the relocated experimental Claude and tau2 examples.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Introduce the tau2 agent and workflow modules that bridge the agent service with inference-backed customer-service episodes.
Key changes:
- add tau2 AgentRunnable implementation
- add tau2 workflow session orchestration
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Add a runnable tau2 rollout entrypoint and config for validating the experimental agent service against inference-backed trajectories.
Key changes:
- add tau2 rollout config
- add direct rollout driver
- support async batch gather validation control
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Add focused workflow coverage for exported tau2 trajectories and reward reporting in the experimental agent-service example.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Code Review
This pull request refactors the experimental agent and inference services by renaming controllers to AgentController and RolloutControllerV2, consolidating configurations into areal/api/cli_args.py, and introducing new runtime APIs for RL workflows. It also adds a Tau2 benchmark example and updates documentation and tests accordingly. Feedback includes recommendations to reuse aiohttp sessions for better HTTP performance, remove redundant deepcopy calls in the inference controller, and correct the documentation so that the env parameter is shown as optional rather than required.
async with aiohttp.ClientSession(timeout=client_timeout) as session:
    async with session.post(url, json=payload, headers=headers) as resp:
Creating a new aiohttp.ClientSession for every request is inefficient as it prevents connection pooling. It is recommended to reuse a single session across multiple requests, especially since step() is called for every turn in a conversation. Note that aiohttp.ClientSession should be created within an async context and properly closed during cleanup.
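The session reuse suggested above could look roughly like the following. This is a minimal sketch, not the PR's implementation: `PooledHttpClient`, `post_json`, the `session_factory` parameter, and the 60-second timeout are all hypothetical names and values chosen for illustration. The session is created lazily so that construction happens inside a running event loop, and `close()` is meant to be called once during cleanup.

```python
from typing import Any, Callable, Optional


def _default_session_factory() -> Any:
    # Imported lazily so the sketch can also be exercised with a stub factory.
    import aiohttp

    # Assumption: a 60 s total timeout; the real code passes its own client_timeout.
    return aiohttp.ClientSession(timeout=aiohttp.ClientTimeout(total=60))


class PooledHttpClient:
    """Hypothetical helper that shares one session across all requests."""

    def __init__(self, session_factory: Callable[[], Any] = _default_session_factory):
        self._session_factory = session_factory
        self._session: Optional[Any] = None

    async def _get_session(self) -> Any:
        # Create the session on first use (inside the running loop) and
        # recreate it only if it has been closed in the meantime.
        if self._session is None or self._session.closed:
            self._session = self._session_factory()
        return self._session

    async def post_json(self, url: str, payload: dict, headers: Optional[dict] = None) -> Any:
        # Every call goes through the same session, so connections are pooled.
        session = await self._get_session()
        async with session.post(url, json=payload, headers=headers) as resp:
            resp.raise_for_status()
            return await resp.json()

    async def close(self) -> None:
        # Call once during cleanup, e.g. when the owning session or workflow ends.
        if self._session is not None and not self._session.closed:
            await self._session.close()
        self._session = None
```

With a helper of this shape, a per-turn `step()` would call `post_json(...)` on a long-lived client instead of opening a new `aiohttp.ClientSession` each time.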
sglang_config = copy.deepcopy(sglang_config)
for k, v in server_args.items():
The deepcopy call is redundant here because sglang_config was just instantiated as a fresh object on line 364. You can iterate over server_args directly on the new instance.
Suggested change:
- sglang_config = copy.deepcopy(sglang_config)
- for k, v in server_args.items():
+ for k, v in server_args.items():
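To illustrate why the copy is unnecessary, here is a small sketch under stated assumptions: `ServerConfigSketch` and `apply_server_args` are hypothetical stand-ins for the real SGLang/vLLM config classes, and the loop body is assumed to set attributes, since the PR excerpt shows only the loop header. A freshly constructed object has no other references, so mutating it directly cannot affect any other caller.

```python
from dataclasses import dataclass


@dataclass
class ServerConfigSketch:
    """Hypothetical stand-in for the backend server config."""

    model_path: str = ""
    mem_fraction_static: float = 0.9


def apply_server_args(server_args: dict) -> ServerConfigSketch:
    # Fresh instance: nothing else holds a reference to it, so no deepcopy is needed.
    config = ServerConfigSketch()
    for k, v in server_args.items():
        # Assumption: overrides are applied as attribute assignments.
        if not hasattr(config, k):
            raise AttributeError(f"unknown server argument: {k}")
        setattr(config, k, v)
    return config
```

Each call builds its own instance, so overrides from one call never leak into another.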
vllm_config = copy.deepcopy(vllm_config)
for k, v in server_args.items():
| `health_poll_interval` | float | `5.0` | Seconds between pair health polls; 0 disables health monitoring. |
| `drain_timeout` | float | `30.0` | Seconds to wait for active sessions to drain before force-killing a pair. |
| `log_level` | string | `"info"` | Log level for spawned agent-service micro-services. |
| `env` | `dict` | **Required** | Extra environment variables passed to all forked child processes. |
The env parameter is incorrectly marked as **Required** in the documentation. The implementation in areal/api/cli_args.py (line 2006) provides a default_factory=dict, making it optional.
Suggested change:
- | `env` | `dict` | **Required** | Extra environment variables passed to all forked child processes. |
+ | `env` | `dict` | `{}` | Extra environment variables passed to all forked child processes. |
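For context, a dataclass field declared with `default_factory=dict` is optional and defaults to a new empty dict per instance, which is why `{}` is the accurate default to document. The sketch below is illustrative only: `AgentServiceConfigSketch` is a hypothetical class that mirrors the four documented fields, not the actual definition in areal/api/cli_args.py.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class AgentServiceConfigSketch:
    """Hypothetical config mirroring the documented defaults."""

    health_poll_interval: float = 5.0
    drain_timeout: float = 30.0
    log_level: str = "info"
    # default_factory gives every instance its own empty dict,
    # so the field can be omitted by the caller.
    env: Dict[str, str] = field(default_factory=dict)
```

Constructing the config without `env` works, and mutating one instance's `env` does not affect another's.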
Normalize wrapped lines in the agent service README and controller so pre-commit leaves the service module clean. Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Apply mdformat and ruff formatting to the experimental agent service examples so the example package stays pre-commit clean. Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Reflow the tau2 workflow test to match Ruff formatting after the repository-wide pre-commit run. Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Description
Add runtime-oriented agent service APIs plus experimental Claude and tau2 examples, including an async tau2 rollout driver validated against a local SGLang-served Qwen3-0.6B model.
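As a rough illustration of the async batch pattern the rollout driver relies on, the sketch below runs several episodes concurrently with `asyncio.gather`. It is not the driver's code: `run_episode`, `run_batch`, the fake trajectory dict, and the negative-id failure are hypothetical placeholders for a real tau2 episode.

```python
import asyncio
from typing import List, Tuple


async def run_episode(task_id: int) -> dict:
    """Hypothetical stand-in for one episode; returns a fake trajectory."""
    await asyncio.sleep(0)  # yield control, as a real rollout would while awaiting I/O
    if task_id < 0:
        raise ValueError(f"invalid task id: {task_id}")
    return {"task_id": task_id, "reward": 1.0}


async def run_batch(task_ids: List[int]) -> Tuple[List[dict], List[BaseException]]:
    # return_exceptions=True keeps one failed episode from cancelling the batch;
    # results come back in the same order as the submitted tasks.
    results = await asyncio.gather(
        *(run_episode(t) for t in task_ids), return_exceptions=True
    )
    ok = [r for r in results if not isinstance(r, BaseException)]
    failed = [r for r in results if isinstance(r, BaseException)]
    return ok, failed
```

Calling `asyncio.run(run_batch([0, 1, -1]))` would yield two trajectories and one captured exception, which the caller can log or retry.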
Related Issue
Fixes #(issue)
Type of Change
Checklist
- [...] (`pre-commit run --all-files`)
- [...] (`./docs/build_all.sh`)
- [...] `main`
- [...] `/review-pr` command
- [...] `/create-pr`
Breaking Change Details (if applicable):
N/A
Additional Context
Summary
- [...] `AgentController` runtime session APIs and align agent/inference service controller configs
- [...] `asyncio.gather`
Validation
- `uv run pytest tests/experimental/agent_service/test_controller.py tests/experimental/agent_service/test_tau2_workflow.py`
- `PYTHONPATH=/AReaL-dev TAU2_DATA_DIR=/root/tau2-bench/data AREAL_MAX_BATCHES=3 CUDA_VISIBLE_DEVICES=0 python examples/experimental/agent_service/tau2/run_rollout.py --config examples/experimental/agent_service/tau2/config.yaml cluster.n_gpus_per_node=1 cluster.fileroot=/tmp/areal_result_tau2_rollout_gather cluster.name_resolve.nfs_record_root=/tmp/areal_name_resolve_tau2_gather rollout.backend=sglang:d1 train_dataset.batch_size=2 econfig.user_llm_base_url=http://127.0.0.1:8000/v1/ model_path=Qwen/Qwen3-0.6B`
- `CUDA_VISIBLE_DEVICES=1 python -m sglang.launch_server --model-path Qwen/Qwen3-0.6B --host 0.0.0.0 --port 8000 --tool-call-parser qwen25 --reasoning-parser qwen3 --mem-fraction-static 0.7 --dp-size 1`
Stacking note
This draft PR depends on #1265 and should be merged only after #1265 lands.