feat(service): add runtime agent service APIs and tau2 examples by nuzant · Pull Request #1266 · areal-project/AReaL

nuzant · 2026-04-26T03:25:13Z

Description

Add runtime-oriented agent service APIs plus experimental Claude and tau2 examples, including an async tau2 rollout driver validated against a local SGLang-served Qwen3-0.6B model.

Related Issue

Fixes #(issue)

Type of Change

Checklist

I have read the Contributing Guide
Pre-commit hooks pass (pre-commit run --all-files)
Relevant tests pass; new tests added for new functionality
Documentation updated (if applicable; built with ./docs/build_all.sh)
Branch is up to date with main
Self-reviewed via /review-pr command
This PR was created by a coding agent via /create-pr
This PR is a breaking change

Breaking Change Details (if applicable):

N/A

Additional Context

Summary

add AgentController runtime session APIs and align agent/inference service controller configs
add experimental Claude and tau2 agent-service examples, docs, and tau2 workflow coverage
update the tau2 rollout driver to process each batch concurrently with asyncio.gather
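
The last item, running one batch concurrently with asyncio.gather, can be sketched roughly as follows. This is an illustrative sketch, not the driver from this PR: `run_episode`, `run_batch`, and `drive` are hypothetical names, and the sleep stands in for the real agent and inference round trips.

```python
import asyncio
from typing import Any


async def run_episode(task: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for one rollout episode."""
    await asyncio.sleep(0.01)  # placeholder for agent and inference calls
    return {"task_id": task["id"], "reward": 0.0}


async def run_batch(batch: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Run all episodes of one batch concurrently.

    asyncio.gather returns results in the order the awaitables were passed,
    so results line up with the input tasks.
    """
    return await asyncio.gather(*(run_episode(task) for task in batch))


async def drive(batches: list[list[dict[str, Any]]]) -> list[list[dict[str, Any]]]:
    """Process batches one after another; episodes inside a batch overlap."""
    results = []
    for batch in batches:
        results.append(await run_batch(batch))
    return results


if __name__ == "__main__":
    print(asyncio.run(drive([[{"id": 1}, {"id": 2}], [{"id": 3}]])))
```

By default, gather propagates the first exception raised by any episode; passing `return_exceptions=True` would instead return exceptions as results so one failed episode does not discard the rest of the batch.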

Validation

uv run pytest tests/experimental/agent_service/test_controller.py tests/experimental/agent_service/test_tau2_workflow.py
PYTHONPATH=/AReaL-dev TAU2_DATA_DIR=/root/tau2-bench/data AREAL_MAX_BATCHES=3 CUDA_VISIBLE_DEVICES=0 python examples/experimental/agent_service/tau2/run_rollout.py --config examples/experimental/agent_service/tau2/config.yaml cluster.n_gpus_per_node=1 cluster.fileroot=/tmp/areal_result_tau2_rollout_gather cluster.name_resolve.nfs_record_root=/tmp/areal_name_resolve_tau2_gather rollout.backend=sglang:d1 train_dataset.batch_size=2 econfig.user_llm_base_url=http://127.0.0.1:8000/v1/ model_path=Qwen/Qwen3-0.6B
user LLM served separately with CUDA_VISIBLE_DEVICES=1 python -m sglang.launch_server --model-path Qwen/Qwen3-0.6B --host 0.0.0.0 --port 8000 --tool-call-parser qwen25 --reasoning-parser qwen3 --mem-fraction-static 0.7 --dp-size 1

Stacking note

This draft PR depends on #1265 and should be merged only after #1265 lands.

Consolidate the experimental agent and rollout controller configuration into areal.api.cli_args so the trainer, examples, and tests share one configuration surface. This also wires RolloutControllerV2 into the v2 rollout path and updates examples and integrations to use the new controller APIs.
Key changes:
- move agent and rollout controller configs into areal.api.cli_args and remove duplicated controller config modules
- rename and rewire experimental controllers around AgentController and RolloutControllerV2
- update examples and experimental tests for rollout v2 config, initialization, and versioning
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Enable AgentController to manage inference-backed sessions and export trajectories for agent-service workflows.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Move the Claude agent-service example into the experimental examples package so it can be imported from the new nested module path.
Key changes:
- add experimental package markers
- add claude worker implementation
- add claude interactive launcher
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Retire the old examples/agent_service layout and point the worker CLI at the new experimental Claude module path.
Key changes:
- update worker example import path
- remove legacy claude example files
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Describe the new AgentController runtime APIs and document the relocated experimental Claude and tau2 examples.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Introduce the tau2 agent and workflow modules that bridge the agent service with inference-backed customer-service episodes.
Key changes:
- add tau2 AgentRunnable implementation
- add tau2 workflow session orchestration
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Add a runnable tau2 rollout entrypoint and config for validating the experimental agent service against inference-backed trajectories.
Key changes:
- add tau2 rollout config
- add direct rollout driver
- support async batch gather validation control
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Add focused workflow coverage for exported tau2 trajectories and reward reporting in the experimental agent-service example.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

gemini-code-assist

Code Review

This pull request refactors the experimental agent and inference services by renaming controllers to AgentController and RolloutControllerV2, consolidating configurations into areal/api/cli_args.py, and introducing new runtime APIs for RL workflows. It also adds a Tau2 benchmark example and updates documentation and tests accordingly. Feedback includes recommendations to optimize HTTP performance by reusing aiohttp sessions, removing redundant deepcopy calls in the inference controller, and correcting the documentation to show that the env parameter is optional rather than required.

gemini-code-assist · 2026-04-26T03:26:44Z

+        async with aiohttp.ClientSession(timeout=client_timeout) as session:
+            async with session.post(url, json=payload, headers=headers) as resp:


Creating a new aiohttp.ClientSession for every request is inefficient as it prevents connection pooling. It is recommended to reuse a single session across multiple requests, especially since step() is called for every turn in a conversation. Note that aiohttp.ClientSession should be created within an async context and properly closed during cleanup.
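
One way to follow this advice is a small holder that creates the session lazily on first use and closes it during cleanup. The sketch below is an assumption about how that could look, not code from this PR: `SessionHolder` and its methods are hypothetical names, and the factory is injected so the pattern does not depend on how the caller builds its session. It relies only on the session exposing a `closed` attribute and an awaitable `close()`, which `aiohttp.ClientSession` does.

```python
from typing import Any, Callable, Optional


class SessionHolder:
    """Lazily create one client session and reuse it for every request."""

    def __init__(self, session_factory: Callable[[], Any]) -> None:
        self._factory = session_factory
        self._session: Optional[Any] = None

    async def get(self) -> Any:
        # Called from a coroutine, so the session is created inside a
        # running event loop, and only when none is currently open.
        if self._session is None or self._session.closed:
            self._session = self._factory()
        return self._session

    async def close(self) -> None:
        # Release pooled connections during cleanup.
        if self._session is not None and not self._session.closed:
            await self._session.close()
        self._session = None


# Possible wiring with aiohttp (not executed here):
#   holder = SessionHolder(
#       lambda: aiohttp.ClientSession(timeout=aiohttp.ClientTimeout(total=30))
#   )
#   session = await holder.get()
#   async with session.post(url, json=payload, headers=headers) as resp:
#       ...
#   await holder.close()  # once, at shutdown
```

With this shape, each step() call would reuse the same connection pool, and the owner of the holder is responsible for calling `close()` exactly once at shutdown.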

gemini-code-assist · 2026-04-26T03:26:44Z

+                    sglang_config = copy.deepcopy(sglang_config)
                    for k, v in server_args.items():


The deepcopy call is redundant here because sglang_config was just instantiated as a fresh object on line 364. You can iterate over server_args directly on the new instance.

Suggested change

-                    sglang_config = copy.deepcopy(sglang_config)
                     for k, v in server_args.items():
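
The reasoning behind this suggestion can be illustrated with a small sketch. `ServerConfig` and `build_config` are hypothetical stand-ins, not the classes in the inference controller; the point is only that an object constructed inside the function is not shared with any caller, so copying it before applying overrides changes nothing.

```python
from dataclasses import dataclass, field


@dataclass
class ServerConfig:
    """Hypothetical stand-in for a backend server config."""

    model_path: str = ""
    mem_fraction_static: float = 0.9
    extra: dict = field(default_factory=dict)


def build_config(server_args: dict) -> ServerConfig:
    # A fresh instance: nothing else holds a reference to it yet,
    # so there is no shared state that a deepcopy would protect.
    config = ServerConfig()
    for key, value in server_args.items():
        setattr(config, key, value)  # apply overrides directly
    return config


if __name__ == "__main__":
    cfg = build_config({"model_path": "Qwen/Qwen3-0.6B", "mem_fraction_static": 0.7})
    print(cfg)
    print(ServerConfig())  # defaults are unaffected by the overrides above
```

A copy would matter only if the config object being mutated had been passed in from outside or stored somewhere shared before the overrides were applied.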

gemini-code-assist · 2026-04-26T03:26:44Z

+                    vllm_config = copy.deepcopy(vllm_config)
+                    for k, v in server_args.items():


The deepcopy call is redundant here because vllm_config was just instantiated as a fresh object on line 399.

Suggested change

-                    vllm_config = copy.deepcopy(vllm_config)
                     for k, v in server_args.items():

gemini-code-assist · 2026-04-26T03:26:44Z

+| `health_poll_interval` | float   | `5.0`                 | Seconds between pair health polls; 0 disables health monitoring.          |
+| `drain_timeout`        | float   | `30.0`                | Seconds to wait for active sessions to drain before force-killing a pair. |
+| `log_level`            | string  | `"info"`              | Log level for spawned agent-service micro-services.                       |
+| `env`                  | `dict`  | **Required**          | Extra environment variables passed to all forked child processes.         |


The env parameter is incorrectly marked as **Required** in the documentation. The implementation in areal/api/cli_args.py (line 2006) provides a default_factory=dict, making it optional.

Suggested change

-| `env` | `dict` | **Required** | Extra environment variables passed to all forked child processes. |
+| `env` | `dict` | `{}` | Extra environment variables passed to all forked child processes. |
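
To see why a `default_factory=dict` field counts as optional, consider the following sketch. `AgentServiceConfigSketch` is a hypothetical dataclass that borrows the field names and defaults from the table above; it is not the definition in areal/api/cli_args.py. The `required_fields` helper is likewise illustrative and uses only the standard `dataclasses` API.

```python
from dataclasses import MISSING, dataclass, field, fields


@dataclass
class AgentServiceConfigSketch:
    """Hypothetical config mirroring the documented defaults."""

    health_poll_interval: float = 5.0
    drain_timeout: float = 30.0
    log_level: str = "info"
    # default_factory gives each instance its own empty dict,
    # so callers may omit `env` entirely.
    env: dict = field(default_factory=dict)


def required_fields(cls) -> list:
    """Names of fields with neither a default nor a default_factory."""
    return [
        f.name
        for f in fields(cls)
        if f.default is MISSING and f.default_factory is MISSING
    ]


if __name__ == "__main__":
    cfg = AgentServiceConfigSketch()  # no arguments needed
    print(cfg.env)  # an empty dict
    print(required_fields(AgentServiceConfigSketch))  # no required fields
```

A check like `required_fields` could be used to keep a documentation table's "Required" column in line with the dataclass definition.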

gemini-code-assist · 2026-04-26T03:26:44Z

+| `health_poll_interval` | float   | `5.0`                 | Seconds between pair health polls; 0 disables health monitoring.          |
+| `drain_timeout`        | float   | `30.0`                | Seconds to wait for active sessions to drain before force-killing a pair. |
+| `log_level`            | string  | `"info"`              | Log level for spawned agent-service micro-services.                       |
+| `env`                  | `dict`  | **Required**          | Extra environment variables passed to all forked child processes.         |


The env parameter is incorrectly marked as **Required** in the documentation. The implementation in areal/api/cli_args.py (line 2006) provides a default_factory=dict, making it optional.

Suggested change

-| `env` | `dict` | **Required** | Extra environment variables passed to all forked child processes. |
+| `env` | `dict` | `{}` | Extra environment variables passed to all forked child processes. |

nuzant and others added 9 commits April 25, 2026 17:45

chore: format

bab951a

nuzant mentioned this pull request Apr 26, 2026

feat(service): add tau2 agent+inference service rollout example #1226

Closed

16 tasks

gemini-code-assist Bot reviewed Apr 26, 2026

nuzant and others added 3 commits April 26, 2026 03:29

style(service): format agent service docs and controller

c550f95

Normalize wrapped lines in the agent service README and controller so pre-commit leaves the service module clean. Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

style(examples): format experimental agent service examples

f64719f

Apply mdformat and ruff formatting to the experimental agent service examples so the example package stays pre-commit clean. Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

style: format tau2 workflow test

858c2ae

Reflow the tau2 workflow test to match Ruff formatting after the repository-wide pre-commit run. Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

nuzant wants to merge 12 commits into main from mzy/iase
