feat(service): add runtime agent service APIs and tau2 examples #1266

Draft
nuzant wants to merge 12 commits into main from mzy/iase

@nuzant nuzant commented Apr 26, 2026

Description

Add runtime-oriented agent service APIs plus experimental Claude and tau2 examples, including an async tau2 rollout driver validated against a local SGLang-served Qwen3-0.6B model.

Related Issue

Fixes #(issue)

Type of Change

  • 🐛 Bug fix
  • ✨ New feature
  • 💥 Breaking change
  • 📝 Documentation update
  • ♻️ Refactoring
  • ⚡ Performance improvement
  • ✅ Test coverage improvement

Checklist

  • I have read the Contributing Guide
  • Pre-commit hooks pass (pre-commit run --all-files)
  • Relevant tests pass; new tests added for new functionality
  • Documentation updated (if applicable; built with ./docs/build_all.sh)
  • Branch is up to date with main
  • Self-reviewed via /review-pr command
  • This PR was created by a coding agent via /create-pr
  • This PR is a breaking change

Breaking Change Details (if applicable):

N/A

Additional Context

Summary

  • add AgentController runtime session APIs and align agent/inference service controller configs
  • add experimental Claude and tau2 agent-service examples, docs, and tau2 workflow coverage
  • update the tau2 rollout driver to process each batch concurrently with asyncio.gather
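
The last bullet refers to running each batch concurrently with asyncio.gather. A minimal sketch of that pattern follows, assuming a hypothetical `run_episode` coroutine and a plain list-of-lists batch layout; neither name is taken from the rollout driver in this PR.

```python
import asyncio


async def run_episode(task_id: int) -> dict:
    # Hypothetical stand-in for one tau2 episode; a real driver would
    # talk to the agent service here instead of sleeping.
    await asyncio.sleep(0.01)
    return {"task_id": task_id, "reward": 0.0}


async def run_batches(batches: list[list[int]]) -> list[list[dict]]:
    results = []
    for batch in batches:
        # Episodes inside one batch run concurrently; batches run in order.
        # gather returns results in the same order as its inputs.
        results.append(await asyncio.gather(*(run_episode(t) for t in batch)))
    return results


if __name__ == "__main__":
    out = asyncio.run(run_batches([[0, 1], [2, 3]]))
    print([[r["task_id"] for r in batch] for batch in out])
```

With the default `return_exceptions=False`, the first exception raised by an episode propagates to the caller; passing `return_exceptions=True` returns exceptions in the result list instead. Which behavior the driver uses is not stated in this description.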

Validation

  • uv run pytest tests/experimental/agent_service/test_controller.py tests/experimental/agent_service/test_tau2_workflow.py
  • PYTHONPATH=/AReaL-dev TAU2_DATA_DIR=/root/tau2-bench/data AREAL_MAX_BATCHES=3 CUDA_VISIBLE_DEVICES=0 python examples/experimental/agent_service/tau2/run_rollout.py --config examples/experimental/agent_service/tau2/config.yaml cluster.n_gpus_per_node=1 cluster.fileroot=/tmp/areal_result_tau2_rollout_gather cluster.name_resolve.nfs_record_root=/tmp/areal_name_resolve_tau2_gather rollout.backend=sglang:d1 train_dataset.batch_size=2 econfig.user_llm_base_url=http://127.0.0.1:8000/v1/ model_path=Qwen/Qwen3-0.6B
  • user LLM served separately with CUDA_VISIBLE_DEVICES=1 python -m sglang.launch_server --model-path Qwen/Qwen3-0.6B --host 0.0.0.0 --port 8000 --tool-call-parser qwen25 --reasoning-parser qwen3 --mem-fraction-static 0.7 --dp-size 1

Stacking note

This draft PR depends on #1265 and should be merged only after #1265 lands.

nuzant and others added 9 commits April 25, 2026 17:45
Consolidate the experimental agent and rollout controller configuration into areal.api.cli_args so the trainer, examples, and tests share one configuration surface. This also wires RolloutControllerV2 into the v2 rollout path and updates examples and integrations to use the new controller APIs.

Key changes:
- move agent and rollout controller configs into areal.api.cli_args and remove duplicated controller config modules
- rename and rewire experimental controllers around AgentController and RolloutControllerV2
- update examples and experimental tests for rollout v2 config, initialization, and versioning

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Enable AgentController to manage inference-backed sessions and export trajectories for agent-service workflows.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Move the Claude agent-service example into the experimental examples package so it can be imported from the new nested module path.

Key changes:
- add experimental package markers
- add claude worker implementation
- add claude interactive launcher

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Retire the old examples/agent_service layout and point the worker CLI at the new experimental Claude module path.

Key changes:
- update worker example import path
- remove legacy claude example files

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Describe the new AgentController runtime APIs and document the relocated experimental Claude and tau2 examples.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Introduce the tau2 agent and workflow modules that bridge the agent service with inference-backed customer-service episodes.

Key changes:
- add tau2 AgentRunnable implementation
- add tau2 workflow session orchestration

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Add a runnable tau2 rollout entrypoint and config for validating the experimental agent service against inference-backed trajectories.

Key changes:
- add tau2 rollout config
- add direct rollout driver
- support async batch gather validation control

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Add focused workflow coverage for exported tau2 trajectories and reward reporting in the experimental agent-service example.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request refactors the experimental agent and inference services by renaming controllers to AgentController and RolloutControllerV2, consolidating configurations into areal/api/cli_args.py, and introducing new runtime APIs for RL workflows. It also adds a Tau2 benchmark example and updates documentation and tests accordingly. Feedback includes recommendations to optimize HTTP performance by reusing aiohttp sessions, removing redundant deepcopy calls in the inference controller, and correcting the documentation to show that the env parameter is optional rather than required.

Comment on lines +410 to +411
async with aiohttp.ClientSession(timeout=client_timeout) as session:
    async with session.post(url, json=payload, headers=headers) as resp:

medium

Creating a new aiohttp.ClientSession for every request is inefficient as it prevents connection pooling. It is recommended to reuse a single session across multiple requests, especially since step() is called for every turn in a conversation. Note that aiohttp.ClientSession should be created within an async context and properly closed during cleanup.
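
One way to follow this suggestion is sketched below, under assumptions: the class name `StepClient`, its `session_factory` hook, and the 60-second timeout are hypothetical and not taken from the PR. The session is created lazily inside a coroutine, reused by every `step()` call, and closed explicitly during cleanup.

```python
from typing import Any, Callable, Optional


class StepClient:
    """Hypothetical client that reuses one HTTP session across step() calls."""

    def __init__(self, session_factory: Optional[Callable[[], Any]] = None):
        # session_factory is an illustrative hook so the session type can be
        # swapped out (for example in tests); by default aiohttp is used.
        self._session_factory = session_factory
        self._session = None

    async def _get_session(self):
        # Create the session lazily, from inside a running coroutine.
        if self._session is None or self._session.closed:
            if self._session_factory is None:
                import aiohttp  # third-party; imported only when needed

                self._session = aiohttp.ClientSession(
                    timeout=aiohttp.ClientTimeout(total=60)  # assumed value
                )
            else:
                self._session = self._session_factory()
        return self._session

    async def step(self, url: str, payload: dict, headers: Optional[dict] = None):
        session = await self._get_session()
        # Reusing the session lets the connection pool serve later turns.
        async with session.post(url, json=payload, headers=headers) as resp:
            resp.raise_for_status()
            return await resp.json()

    async def aclose(self) -> None:
        # Call once during cleanup so pooled connections are released.
        if self._session is not None and not self._session.closed:
            await self._session.close()
        self._session = None
```

The trade-off is that the owner of the client now has to call `aclose()` on shutdown, which the per-request `async with` form handled automatically.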

Comment on lines +366 to 367
sglang_config = copy.deepcopy(sglang_config)
for k, v in server_args.items():

medium

The deepcopy call is redundant here because sglang_config was just instantiated as a fresh object on line 364. You can iterate over server_args directly on the new instance.

Suggested change
- sglang_config = copy.deepcopy(sglang_config)
- for k, v in server_args.items():
+ for k, v in server_args.items():
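
The reasoning behind the suggestion can be illustrated with a small sketch. `ServerConfig` and `build_config` are hypothetical stand-ins, and applying overrides with `setattr` is an assumption about how the loop body works; the point is only that a freshly constructed object has no other references, so mutating it directly cannot leak into another instance.

```python
from dataclasses import dataclass, field


@dataclass
class ServerConfig:
    # Hypothetical stand-in for the real SGLang/vLLM config classes.
    model_path: str = ""
    mem_fraction_static: float = 0.9
    extra: dict = field(default_factory=dict)


def build_config(server_args: dict) -> ServerConfig:
    config = ServerConfig()  # fresh object; nothing else references it
    for k, v in server_args.items():
        setattr(config, k, v)  # assumed override mechanism
    return config


a = build_config({"mem_fraction_static": 0.7})
b = build_config({})
print(a.mem_fraction_static, b.mem_fraction_static)
```

A copy would still matter if the object being modified came from a shared source, such as a config passed in by the caller.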

Comment on lines +401 to +402
vllm_config = copy.deepcopy(vllm_config)
for k, v in server_args.items():

medium

The deepcopy call is redundant here because vllm_config was just instantiated as a fresh object on line 399.

Suggested change
- vllm_config = copy.deepcopy(vllm_config)
- for k, v in server_args.items():
+ for k, v in server_args.items():

Comment thread docs/en/cli_reference.md
| `health_poll_interval` | float | `5.0` | Seconds between pair health polls; 0 disables health monitoring. |
| `drain_timeout` | float | `30.0` | Seconds to wait for active sessions to drain before force-killing a pair. |
| `log_level` | string | `"info"` | Log level for spawned agent-service micro-services. |
| `env` | `dict` | **Required** | Extra environment variables passed to all forked child processes. |

medium

The env parameter is incorrectly marked as **Required** in the documentation. The implementation in areal/api/cli_args.py (line 2006) provides a default_factory=dict, making it optional.

Suggested change
- | `env` | `dict` | **Required** | Extra environment variables passed to all forked child processes. |
+ | `env` | `dict` | `{}` | Extra environment variables passed to all forked child processes. |
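
As a quick illustration of why `default_factory=dict` makes the field optional, here is a sketch using a hypothetical `AgentServiceConfig` dataclass; only the `env` field mirrors what the comment describes, and the class name and other field are assumptions.

```python
from dataclasses import MISSING, dataclass, field, fields


@dataclass
class AgentServiceConfig:
    # Hypothetical subset of fields; only `env` mirrors the review comment.
    log_level: str = "info"
    env: dict = field(default_factory=dict)


cfg = AgentServiceConfig()  # constructs without passing env
env_field = next(f for f in fields(AgentServiceConfig) if f.name == "env")
has_default = env_field.default_factory is not MISSING
print(cfg.env, has_default)
```

A documentation generator that inspects only `field.default` would see `MISSING` for such a field and might label it required, which would be one plausible source of the mismatch; that is a guess, not something confirmed here.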

Comment thread docs/zh/cli_reference.md
| `health_poll_interval` | float | `5.0` | Seconds between pair health polls; 0 disables health monitoring. |
| `drain_timeout` | float | `30.0` | Seconds to wait for active sessions to drain before force-killing a pair. |
| `log_level` | string | `"info"` | Log level for spawned agent-service micro-services. |
| `env` | `dict` | **Required** | Extra environment variables passed to all forked child processes. |

medium

The env parameter is incorrectly marked as **Required** in the documentation. The implementation in areal/api/cli_args.py (line 2006) provides a default_factory=dict, making it optional.

Suggested change
- | `env` | `dict` | **Required** | Extra environment variables passed to all forked child processes. |
+ | `env` | `dict` | `{}` | Extra environment variables passed to all forked child processes. |

nuzant and others added 3 commits April 26, 2026 03:29
Normalize wrapped lines in the agent service README and controller so pre-commit leaves the service module clean.

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Apply mdformat and ruff formatting to the experimental agent service examples so the example package stays pre-commit clean.

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Reflow the tau2 workflow test to match Ruff formatting after the repository-wide pre-commit run.

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>