diff --git a/README.md b/README.md index 0d7fc9658..eeb65f09f 100644 --- a/README.md +++ b/README.md @@ -1,286 +1,210 @@
-# AWorld: Rich Environments, Intelligent Agents, Continuous Evolution +# AWorld: Agentic Craft for Your World

-*"Self-awareness: the hardest problem isn't solving within limits, it's discovering one's own limitations"* +*"The Next Frontier for AI is Your Expertise"* [![Twitter Follow][twitter-image]][twitter-url] [![WeChat QR Code][wechat-image]][wechat-url] [![Discord][discord-image]][discord-url] [![License: MIT][license-image]][license-url] [![DeepWiki][deepwiki-image]][deepwiki-url] -[![arXiv][arxiv-image]][arxiv-url] [![Tutorial][tutorial-image]][tutorial-url] -[![Playground][playground-image]][playground-url] + +

[中文版](./README_zh.md) | -[Installation](#installation) | -[Environments](#online-access-to-complex-environments) | -[Agent](#efficient-agent-construction) | -[Experience](#experience-to-samples) | -[Training](#training) | -[Architecture](#architecture-design-principles) | +[Automation](#your-journey-with-aworld-cli) | +[Manual](#total-control-manually-crafting-agent-systems) | [Evolution](#evolution) | [Contributing](#contributing) | +

--- -**AWorld (Agent World)** builds intelligent agents and rich environments where they operate, pushing the frontiers of AI capabilities and enabling continuous evolution. This project provides the fundamental recipe for agentic learning: [Environment Access](#online-access-to-complex-environments), [Agent Construction](#efficient-agent-construction), [Experience Retrieval](#experience-to-samples), and [Model Training](#training). What makes AWorld powerful is that agents can use these same components to automatically improve themselves. +

+For all its power, general AI hits a wall of context. It's a wall built from the nuanced workflows, domain-specific data, and hard-won intuition that define your world. From scientific research and financial analysis to complex engineering, generic models can't climb this wall. They can't speak your language. + +The AWorld Thesis is that the true scaling of AI is achieved by enabling experts like you to build a gate in that wall. + +AWorld, with its CLI mode, is the platform designed for this. We provide the fundamental recipe for you, the expert, to infuse your knowledge and unique insights into fleets of autonomous agents. This is how we move beyond generic promises to specific, robust applications that navigate your world with precision. +

-![](./readme_assets/aworld_loop.png) > 💡 Visit our [homepage](https://www.aworldagents.com/) for more details, or try our online [environments](https://www.aworldagents.com/environments) and [agents](https://playground.aworldagents.com/). -# Installation -> [!TIP] -> Python>=3.11 +# Your Journey with AWorld-CLI +The journey from an idea to an evolved, autonomous agent begins at your fingertips. + + +## Install and Activate + +Create a .env file in the AWorld/aworld-cli to configure the base model for both the AWorld Agent and any agents it creates. Add the following content: +```bash +LLM_MODEL_NAME="your_model_name, Claude-Sonnet-4 or above suggested" +LLM_PROVIDER="openai" +LLM_API_KEY="your_model_api_key" +LLM_BASE_URL="your_base_url" +``` + +**Install and Enter AWorld-CLI** ```bash git clone https://github.com/inclusionAI/AWorld && cd AWorld -pip install -e . +conda create -n aworld_env python=3.11 -y && conda activate aworld_env + +pip install -e . && cd aworld-cli && pip install -e . + +aworld-cli ``` -# Online Access to Complex Environments -Provisioning rich environments is hard—packages conflict, APIs need keys, concurrency must scale. We make it painless with three access modes: -1. Use our default hosted setup (tooling with usage costs includes a limited free tier). -2. Bring your own API keys for unrestricted access (coming soon). -3. Pull our Docker images and run everything on your own infrastructure (coming soon). 
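Before launching `aworld-cli`, it can help to confirm that the `.env` file defines all four variables listed above. The following is a minimal sketch using only the Python standard library; the helper names (`parse_env_text`, `missing_keys`) and the relative path `aworld-cli/.env` are assumptions made for illustration and are not part of AWorld.

```python
from pathlib import Path

# The four variables named in the .env example above.
REQUIRED_KEYS = ("LLM_MODEL_NAME", "LLM_PROVIDER", "LLM_API_KEY", "LLM_BASE_URL")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY="value" lines; blank lines and # comments are skipped."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes so KEY="" counts as empty.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty, in declaration order."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    # Hypothetical location: run from the root of the AWorld checkout.
    env_path = Path("aworld-cli/.env")
    if env_path.exists():
        print("missing:", missing_keys(parse_env_text(env_path.read_text(encoding="utf-8"))))
    else:
        print(f"{env_path} not found")
```

An empty `missing` list suggests the file is complete. The check only covers presence, not whether the key or base URL is actually valid for your provider.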
- -```python -import os -import asyncio -from aworld.sandbox import Sandbox - -INVITATION_CODE = os.environ.get("INVITATION_CODE", "") - -mcp_config = { - "mcpServers": { - "gaia_server": { - "type": "streamable-http", - "url": "https://playground.aworldagents.com/environments/mcp", - "timeout": 600, - "sse_read_timeout": 600, - "headers": { - "ENV_CODE": "gaia", - "Authorization": f"Bearer {INVITATION_CODE}", - } - } - } -} -async def _list_tools(): - sand_box = Sandbox(mcp_config=mcp_config, mcp_servers=["gaia_server"]) - return await sand_box.mcpservers.list_tools() +## Create Your Agent +

+Instantly scaffold an agent from a natural-language description of your task, such as "create an agent that can generate an HTML report". AWorld-CLI handles the boilerplate, so you can focus on the logic. -if __name__ == "__main__": - tools = asyncio.run(_list_tools()) - print(tools) -``` +

-![](./readme_assets/how_to_access_env.gif) - -# Efficient Agent Construction -In Aworld, an agent is simply a model enhanced with tools. To spin one up, you only need: -1. a model endpoint (for training, a vLLM service works great) -2. an online environment to call (use our hosted options or plug in your own MCP toolchain) -That’s it—no heavyweight scaffolding required. - -```python -from aworld.agents.llm_agent import Agent -from aworld.runner import Runners - -# refer the section above for details -mcp_config = {...} - -searcher = Agent( - name="Search Agent", - system_prompt="You specialize at searching.", - mcp_config=mcp_config -) - -if __name__ == "__main__": - result = Runners.sync_run( - input="Use google search tool to answer the question: the news about AI today.", - agent=searcher - ) - print(f"answer: {result.answer}") -``` -Remember to plug in your LLM credentials first. -```bash -# Set LLM credentials -export LLM_MODEL_NAME="gpt-4" -export LLM_API_KEY="your-api-key-here" -export LLM_BASE_URL="https://api.openai.com/v1" -``` +***Let AWorld Agent make an agent for you*** +![](./readme_assets/aworld_cli_demo_step1.gif) -## Complex Agent System Construction +

+This step generates a fully operational agent file, built on our carefully curated Verified Skills and a global configuration, ready for immediate execution. -Real-world problems often need more than a single agent. AWorld gives you flexible build paths: -1. design automated workflows end to end [Docs](https://inclusionai.github.io/AWorld/Quickstart/workflow_construction/) -2. spin up MCP-enabled agents [Docs](https://inclusionai.github.io/AWorld/Quickstart/agent_construction/) -3. orchestrate multi-agent systems (MAS) [Docs](https://inclusionai.github.io/AWorld/Quickstart/multi-agent_system_construction/) +Once generated, your agent is a permanent, reusable tool in your ~/.agents folder. +

-Want to see it live? Load a pre-built DeepResearch team in the AWorld [Playground](https://playground.aworldagents.com/), inspect the source, and run it end to end. -![](./readme_assets/playground_gaiateam.gif) -# Experience to samples -Our runtime captures every step across offline and online runs. Each task yields a complete trajectory—every LLM call, action, and reward—so you can synthesize training samples, audit performance, and iterate with confidence. - -## Complete Task Trajectories -Tasks unfold over many LLM calls. The framework captures every step, giving you a full trajectory. - -```python -import asyncio -from aworld.runner import Runners -from aworld.core.task import Task -from aworld.logs.util import logger -import json - -# refer the section above for agent constrution -searcher = Agent(...) - -if __name__ == "__main__": - async def test_complete_trajectory(): - task = Task( - input="Use google search tool to answer the question: the news about AI today.", - agent=searcher - ) - - responses = await Runners.run_task(task) - resp = responses[task.id] - logger.info(f"task answer: {resp.answer}") - logger.info(f"task trajectory: {json.dumps(resp.trajectory, ensure_ascii=False)}") - asyncio.run(test_complete_trajectory()) -``` +### Verified Skills: The DNA for Automated Agent Creation +
+Our library of Verified Skills is more than a collection of blueprints; it's a gene pool of expert capabilities. +
-## Single-Step Introspection -Need finer control? Call `step()` to inspect one action/response pair at a time. This lets you inject intermediate rewards during training, enabling richer, more flexible learning signals. - -```python -import os -import asyncio -from aworld.runner import Runners -from aworld.core.task import Task -from aworld.logs.util import logger -import json -from aworld.config import TaskConfig, TaskRunMode - -# refer the section above for agent constrution -searcher = Agent(...) - -if __name__ == "__main__": - async def test_single_step_introspection(): - task = Task( - input="Use google search tool to answer the question: the news about AI today.", - agent=searcher, - conf=TaskConfig( - resp_carry_context=True, - run_mode=TaskRunMode.INTERACTIVE - ) - ) - - trajectory_log = os.path.join(os.path.dirname(__file__), "trajectory_log.txt") - is_finished = False - step = 1 - while not is_finished: - with open(trajectory_log, "a", encoding="utf-8") as traj_file: - is_finished, observation, response = await Runners.step(task) - traj_file.write(f"Step {step}\n") - traj_file.write(json.dumps(response.trajectory, ensure_ascii=False, indent=2)) - traj_file.write("\n\n") - step += 1 - asyncio.run(test_single_step_introspection()) -``` +
-# Training -Once agents can roam across environments, AWorld closes the loop with two complementary training modes that drive continuous improvement. +

+When you automate the creation of a new agent, AWorld-CLI doesn't start from scratch. It references these battle-tested Skills for robustness and, at the same time, learns from your custom skills in the ~/agents folder. This dual inheritance ensures every agent is not only reliable from the start but also adapted to your requirements. +

-## Model Training -Plug any mainstream LLM trainer—AReal, Swift, Verl, Slime, etc.—into the runtime to update model parameters directly. Adapters are lightweight, so you can reuse the same environment and agent code across trainers. + + + + + + + + + + + + + + + + + + + + + +
SkillsDescription
🚀 PPT AgentCreates polished presentations from documents, outlines, or data.
🧠 DeepSearch AgentConducts comprehensive, multi-source research on a topic and synthesizes a structured report.
-```python -from datasets import load_dataset -from aworld.agents.llm_agent import Agent -from aworld.config import AgentConfig -from train.trainer.agent_trainer import AgentTrainer -from train.examples.train_gaia_with_aworld_verl.metrics.gaia_reward_function import gaia_reward_func +## Run Your Agent +

+Prompt the AWorld Agent to run your newly created agent on a task, such as "Let the html agent generate an html report introducing Beckham", and watch it work. Every call, action, and observation is captured in a detailed trajectory log saved to your local directory. +
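To make the idea of a trajectory log concrete, here is a minimal sketch of one way such a log could be stored and read back as JSON Lines. The record fields (`step`, `kind`, `content`), the labels, and the file name are assumptions chosen for illustration; the actual format written by AWorld-CLI may differ.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class TrajectoryStep:
    """One logged event. Field names are illustrative, not AWorld's schema."""
    step: int
    kind: str      # e.g. "llm_call", "action", or "observation"
    content: str


def append_step(log_path: Path, record: TrajectoryStep) -> None:
    """Append one record as a single JSON line."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(record), ensure_ascii=False) + "\n")


def read_trajectory(log_path: Path) -> list[TrajectoryStep]:
    """Load every non-empty line back into TrajectoryStep objects, in file order."""
    with log_path.open(encoding="utf-8") as fh:
        return [TrajectoryStep(**json.loads(line)) for line in fh if line.strip()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "trajectory.jsonl"  # hypothetical file name
        append_step(log, TrajectoryStep(1, "llm_call", "plan the report outline"))
        append_step(log, TrajectoryStep(2, "action", "write report.html"))
        append_step(log, TrajectoryStep(3, "observation", "file written"))
        for item in read_trajectory(log):
            print(item.step, item.kind, item.content)
```

An append-only, line-per-event layout keeps each step independently parseable, which is convenient when auditing a run or replaying it step by step.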

-# refer the section above for details -mcp_config = {...} +***Let the created agent do your job*** +![](./readme_assets/aworld_cli_demo_step2.gif) + -# Configure agent to use Verl as the model service (adapts inference format automatically) -agent_config = AgentConfig( - llm_provider="verl" -) -searcher = Agent( - name="Search Agent", - system_prompt="You specialize at searching.", - mcp_config=mcp_config, - conf=agent_config -) +## Evolve Your Agent +

+If the agent's performance isn't perfect in your opinion, you have a spectrum of powerful options for refinement. -train_dataset = load_dataset("", split="train") -test_dataset = load_dataset("", split="test") +**Manual Evolution** +

+You are the expert. Open the generated Python file and fine-tune the prompts, logic, or tool usage directly. You have full control. +

-trainer = AgentTrainer( - agent=agent, - config=custom_train_config, - reward_func=gaia_reward_func, - train_dataset=train_dataset, - test_dataset=test_dataset -) +**Exciting: AI-Assisted Evolution** +

+This is where AWorld truly shines! Prompt AWorld with your expertise and desired changes, such as "help me optimize the html agent so that it can browse web, download and insert image into the html". It then tasks our built-in Optimizer Agent—a specialized code agent—to act as your AI pair programmer. Because all agents you create extend from a unified AWorld base class, the Optimizer Agent has a global understanding of the agent's structure. This allows it to reason about and precisely modify the agent's code to implement your logic, evolving its capabilities far beyond simple prompt tuning. +

-trainer.train() -``` -> 💡 Check the [real case](./train/examples/train_gaia_with_aworld_verl/main.py) which includes the full training config to run agentic training. -## Meta-Learning -Beyond weights, you can meta-learn whole agent systems. Spin up role-specific agents that critique, rewrite prompts, refine workflow, or adjust strategies for a target agent, then iterate the team (e.g., our Gaia demo). +***AI evolves your agent to make it more professional*** +![](./readme_assets/aworld_cli_demo_step3.gif) + + +**Our Vision: Self-Evolution** +

+This is the future. Instead of you providing explicit prompts, the system automatically detects sub-optimal performance based on a reward signal (e.g., failed validation, deviation from a verified Skill). It then triggers an autonomous optimization loop, evolving the agent on its own. This is evaluation-driven evolution, where the agent gains true self-awareness and improves without constant human intervention. +
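The evaluation-driven loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not AWorld's implementation: `evaluate`, `optimize`, the reward threshold, and the dictionary standing in for an agent are all hypothetical.

```python
from typing import Callable, TypeVar

AgentT = TypeVar("AgentT")


def evolve(
    agent: AgentT,
    evaluate: Callable[[AgentT], float],
    optimize: Callable[[AgentT, float], AgentT],
    threshold: float = 0.9,
    max_rounds: int = 5,
) -> tuple[AgentT, list[float]]:
    """Evaluate the agent; while its reward is below threshold, optimize and retry."""
    history: list[float] = []
    for _ in range(max_rounds):
        reward = evaluate(agent)
        history.append(reward)
        if reward >= threshold:
            break  # good enough: stop evolving
        agent = optimize(agent, reward)
    return agent, history


if __name__ == "__main__":
    # Toy stand-ins: the "agent" is a dict and its reward is a stored quality score.
    def toy_evaluate(agent: dict) -> float:
        return agent["quality"]

    def toy_optimize(agent: dict, reward: float) -> dict:
        return {"quality": agent["quality"] + 0.25, "revision": agent["revision"] + 1}

    final_agent, rewards = evolve({"quality": 0.5, "revision": 0}, toy_evaluate, toy_optimize)
    print(final_agent, rewards)
```

The `max_rounds` cap is a deliberate design choice in this sketch: an autonomous loop needs a stopping condition other than success, otherwise an agent that never reaches the threshold would be optimized indefinitely.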

+ +Once you're satisfied with your optimized agent, it remains a permanent, reusable tool in your ~/.agents folder. +

+ + +# Total Control: Manually Crafting Agent Systems +

+In AWorld, an agent is a model enhanced with tools. But real-world problems often demand more than a single agent. To solve this, AWorld gives you full control with flexible build paths, allowing you to manually craft complex, multi-agent systems for collaboration. +

+ +1. design automated workflows end to end [Docs](https://inclusionai.github.io/AWorld/Quickstart/workflow_construction/) + +2. spin up MCP-enabled agents [Docs](https://inclusionai.github.io/AWorld/Quickstart/agent_construction/) + +3. orchestrate multi-agent systems (MAS) [Docs](https://inclusionai.github.io/AWorld/Quickstart/multi-agent_system_construction/) + + +# Playground: See a Multi-Agent System in Action +Launch our official DeepResearch team in the AWorld [Playground](https://playground.aworldagents.com/) to see AI collaboration live. Inspect its source, run it end-to-end, and get inspired. + +![](./readme_assets/playground_gaiateam.gif) + +**From User to Creator: Get Your Agent Featured!** -![](./readme_assets/mas_meta_learning.png) +Ready to build your own? Use the aworld-cli to forge an agent with your unique expertise, captured in its skill.md file. -# Architecture Design Principles -This framework is engineered to be highly adaptable, enabling researchers and developers to explore and innovate across multiple domains, thereby advancing the capabilities and applications of multi-agent systems. +To get your creation featured, simply submit a Pull Request with your skill.md to: +AWorld/examples/Custom_Skills/ -## Concepts & Framework -| Concepts | Description | -| :-------------------------------------- | ------------ | -| [`agent`](./aworld/core/agent/base.py) | Define the foundational classes, descriptions, output parsing, and multi-agent collaboration (swarm) logic for defining, managing, and orchestrating agents in the AWorld system. | -| [`runner`](./aworld/runners) | Contains runner classes that manage the execution loop for agents in environments, handling episode rollouts and parallel training/evaluation workflows. | -| [`task`](./aworld/core/task.py) | Define the base Task class that encapsulates environment objectives, necessary tools, and termination conditions for agent interactions. 
| -| [`swarm`](./aworld/core/agent/swarm.py) | Implement the SwarmAgent class managing multi-agent coordination and emergent group behaviors through decentralized policies. | -| [`sandbox`](./aworld/sandbox) | Provide a controlled runtime with configurable scenarios for rapid prototyping and validation of agent behaviors. | -| [`tools`](./aworld/tools) | Offer a flexible framework for defining, adapting, and executing tools for agent-environment interaction in the AWorld system. | -| [`context`](./aworld/core/context) | Feature a comprehensive context management system for AWorld agents, enabling complete state tracking, configuration management, prompt optimization, multi-task state handling, and dynamic prompt templating throughout the agent lifecycle. | -| [`memory`](./aworld/memory) | Implement an extensible memory system for agents, supporting short-term and long-term memory, summarization, retrieval, embeddings, and integration.| -| [`trace`](./aworld/trace) | Feature an observable tracing framework for AWorld, enabling distributed tracing, context propagation, span management, and integration with popular frameworks and protocols to monitor and analyze agent, tool, and task execution.| +

+We'll showcase the best community agents here in the Playground. Let your expertise evolve into a professional agent, gain recognition, and empower the entire community to experience the amazing tools you've built. +

+ # Evolution -Our mission: AWorld handles the complexity, you focus on innovation. This section showcases cutting-edge multi-agent systems built with AWorld, advancing toward AGI. +

+AWorld's mission is to handle the complexity so you can focus on innovation. This section showcases cutting-edge multi-agent systems built with AWorld, advancing toward AGI. + #### Agent Benchmarking @@ -457,11 +381,17 @@ Our mission: AWorld handles the complexity, you focus on innovation. This sectio *Kaiwen He, Zhiwei Wang, Chenyi Zhuang, Jinjie Gu* +

+ # Contributing -We warmly welcome developers to join us in building and improving AWorld! Whether you're interested in enhancing the framework, fixing bugs, or adding new features, your contributions are valuable to us. +

+Our roadmap includes expanding our AI for Science & Business initiative, deepening our self-evolution capabilities, and growing our library of community-contributed Skills. + +We warmly welcome developers, researchers, and domain experts to join us. Whether you're enhancing the framework or contributing a Skill from your field of expertise, your work is valuable. For academic citations, or if you wish to contact us, please use the following BibTeX entry: +

```bibtex @misc{yu2025aworldorchestratingtrainingrecipe, @@ -475,8 +405,8 @@ For academic citations or wish to contact us, please use the following BibTeX en } ``` -# Star History -![](https://api.star-history.com/svg?repos=inclusionAI/AWorld&type=Date) + diff --git a/README_zh.md b/README_zh.md index f4af343d4..ba42b91bf 100644 --- a/README_zh.md +++ b/README_zh.md @@ -1,294 +1,242 @@
-# AWorld:丰富的环境、高效的智能体、持续的进化 +# AWorld:为你的领域打造智能体

-*“自我意识:最难的问题不在于在局限内求解,而在于发现自身的局限”* +*「AI 的下一站,是你的专业能力」* [![Twitter Follow][twitter-image]][twitter-url] [![WeChat QR Code][wechat-image]][wechat-url] [![Discord][discord-image]][discord-url] [![License: MIT][license-image]][license-url] [![DeepWiki][deepwiki-image]][deepwiki-url] -[![arXiv][arxiv-image]][arxiv-url] [![Tutorial][tutorial-image]][tutorial-url] -[![Playground][playground-image]][playground-url] + +

[English](./README.md) | -[安装](#安装) | -[环境](#复杂环境在线访问) | -[智能体](#高效的智能体构建) | -[经验](#经验到样本) | -[训练](#训练) | -[架构](#架构设计原则) | -[演进](#演进) | -[贡献](#贡献) | +[自动化](#your-journey-with-aworld-cli) | +[手动构建](#total-control-manually-crafting-agent-systems) | +[演进](#evolution) | +[参与贡献](#contributing) | + +

-**AWorld (Agent World)** 构建智能体(Agent)及其运行的丰富环境,旨在拓展 AI 能力的前沿并实现持续进化。本项目提供了 Agentic Learning(智能体学习)的基础配方:[环境访问](#复杂环境在线访问)、[智能体构建](#高效的智能体构建)、[经验获取](#经验到样本) 和 [模型训练](#训练)。AWorld 的强大之处在于,智能体可以利用这些相同的组件来自动提升自己。 +--- -![](./readme_assets/aworld_loop.png) +

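+However powerful general AI becomes, it hits a "wall of context": a wall built from fine-grained workflows, domain data, and long-accumulated intuition that together make up your professional world. From scientific research and finance to complex engineering, generic models cannot climb this wall, and they cannot speak your jargon. -> 💡 Visit our [homepage](https://www.aworldagents.com/) for more details, or try our online [environments](https://www.aworldagents.com/environments) and [agents](https://playground.aworldagents.com/). +The AWorld thesis is that AI truly scales when experts like you are able to open a gate in that wall. +AWorld-CLI is the platform designed for this. We provide a basic "recipe" that lets you infuse your knowledge and insight into fleets of autonomous agents, moving from generic promises to applications that work precisely in your domain. +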

-# 安装 -> [!TIP] -> Python>=3.11 -```bash -git clone https://github.com/inclusionAI/AWorld && cd AWorld + -pip install -e . -``` +> 💡 更多信息请访问[官网](https://www.aworldagents.com/),或体验在线[环境](https://www.aworldagents.com/environments)与[智能体](https://playground.aworldagents.com/)。 -# 复杂环境在线访问 -配置丰富的环境并非易事——依赖包冲突、API 需要密钥、并发需要扩展、网路配置等。我们通过三种访问模式让这一切变得轻松无痛: -1. 使用我们默认的托管设置(针对有使用成本的工具,我们提供有限免费额度)。 -2. 自带 API 密钥以获得无限制次数工具使用(即将推出)。 -3. 拉取我们的 Docker 镜像并在您自己的基础设施上部署运行(即将推出)。 - -```python -import os -import asyncio -from aworld.sandbox import Sandbox - -INVITATION_CODE = os.environ.get("INVITATION_CODE", "") - -mcp_config = { - "mcpServers": { - "gaia_server": { - "type": "streamable-http", - "url": "https://playground.aworldagents.com/environments/mcp", - "timeout": 600, - "sse_read_timeout": 600, - "headers": { - "ENV_CODE": "gaia", - "Authorization": f"Bearer {INVITATION_CODE}", - } - } - } -} -async def _list_tools(): - sand_box = Sandbox(mcp_config=mcp_config, mcp_servers=["gaia_server"]) - return await sand_box.mcpservers.list_tools() + +# 开启你的 AWorld-CLI 之旅 +从深思熟虑到可进化的自主智能体,从你指尖开始。 -if __name__ == "__main__": - tools = asyncio.run(_list_tools()) - print(tools) -``` -![](./readme_assets/how_to_access_env.gif) - -# 高效的智能体构建 -在 AWorld 中,智能体被简洁的定义成一个工具增强的模型。要启动一个智能体,您只需要: -1. 一个模型服务(对于训练,vLLM/SGLang服务效果就很好) -2. 
一个可调用的在线环境(使用我们的托管选项或接入您自己的 MCP 工具链) -就是这样——无需繁重的脚手架代码。 - -```python -from aworld.agents.llm_agent import Agent -from aworld.runner import Runners - -# 详情请参阅上一节 -mcp_config = {...} - -searcher = Agent( - name="Search Agent", - system_prompt="You specialize at searching.", - mcp_config=mcp_config -) - -if __name__ == "__main__": - result = Runners.sync_run( - input="Use google search tool to answer the question: the news about AI today.", - agent=searcher - ) - print(f"answer: {result.answer}") +## 安装与激活 + +在 AWorld/aworld-cli 下创建 .env,配置 AWorld Agent 及其所创建智能体的基础模型,例如: +```bash +LLM_MODEL_NAME="your_model_name, Claude-Sonnet-4 or above suggested" +LLM_PROVIDER="openai" +LLM_API_KEY="your_model_api_key" +LLM_BASE_URL="your_model_base_url" ``` -记得先配置您的 LLM 凭证。 +**安装并进入 AWorld-CLI:** ```bash -# 设置 LLM 凭证 -export LLM_MODEL_NAME="gpt-4" -export LLM_API_KEY="your-api-key-here" -export LLM_BASE_URL="https://api.openai.com/v1" +git clone https://github.com/inclusionAI/AWorld && cd AWorld + +conda create -n aworld_env python=3.11 -y && conda activate aworld_env + +pip install -e . && cd aworld-cli && pip install -e . + +aworld-cli ``` -## 复杂智能体系统构建 -现实世界的问题通常需要构建复杂的智能体系统。AWorld 为您提供了灵活的构建模式: -1. 设计端到端的自动化工作流 [文档](https://inclusionai.github.io/AWorld/Quickstart/workflow_construction/) -2. 构建支持 MCP 的智能体 [文档](https://inclusionai.github.io/AWorld/Quickstart/agent_construction/) -3. 编排多智能体系统 (MAS) [文档](https://inclusionai.github.io/AWorld/Quickstart/multi-agent_system_construction/) +## 创建智能体 +

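+Describe your task in natural language, for example "create an agent that can generate an HTML report", and an agent is quickly scaffolded for it. AWorld-CLI handles all the boilerplate so you can focus on the core logic. -Want to see it in action? Load our pre-built DeepResearch agent system in the AWorld [Playground](https://playground.aworldagents.com/), inspect the source code, and run it end to end. -![](./readme_assets/playground_gaiateam.gif) +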

-# 经验到样本 -我们的运行时(Runtime)会捕获离线和在线运行中的每一个步骤。每个任务都会产生一条完整的轨迹——包含每一次 LLM 调用、动作和奖励——因此您可以用于样本合成、性能评估、并高置信地进行迭代。 + +***让 AWorld Agent 为你构建智能体*** +![](./readme_assets/aworld_cli_demo_step1.gif) -## 完整的任务轨迹 -任务是通过许多次 LLM 调用展开的。框架会捕获每一步,为您提供完整的轨迹。 +

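+This step generates a ready-to-run agent file built on our curated Verified Skills, with a global configuration attached, so it can be executed as soon as it is generated. -```python -import asyncio -from aworld.runner import Runners -from aworld.core.task import Task -from aworld.logs.util import logger -import json +Once generated, the agent is saved permanently in the ~/.agents directory and can be reused. +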

-# 智能体构建请参考上一节 -searcher = Agent(...) -if __name__ == "__main__": - async def test_complete_trajectory(): - task = Task( - input="Use google search tool to answer the question: the news about AI today.", - agent=searcher - ) +### Verified Skills:自动化创建智能体的「基因库」 +
+Verified Skills is more than a collection of templates; it is a pool of validated expert capabilities. +
- responses = await Runners.run_task(task) - resp = responses[task.id] - logger.info(f"task answer: {resp.answer}") - logger.info(f"task trajectory: {json.dumps(resp.trajectory, ensure_ascii=False)}") - asyncio.run(test_complete_trajectory()) -``` +
-## 单步内省 (Single-Step Introspection) -需要更精细的控制?调用 `step()` 来逐次检查动作/响应数据对。这允许您在训练期间注入中间奖励,从而实现更丰富、更灵活的学习信号。 - -```python -import os -import asyncio -from aworld.runner import Runners -from aworld.core.task import Task -from aworld.logs.util import logger -import json -from aworld.config import TaskConfig, TaskRunMode - -# 智能体构建请参考上一节 -searcher = Agent(...) - -if __name__ == "__main__": - async def test_single_step_introspection(): - task = Task( - input="Use google search tool to answer the question: the news about AI today.", - agent=searcher, - conf=TaskConfig( - resp_carry_context=True, - run_mode=TaskRunMode.INTERACTIVE - ) - ) - - trajectory_log = os.path.join(os.path.dirname(__file__), "trajectory_log.txt") - is_finished = False - step = 1 - while not is_finished: - with open(trajectory_log, "a", encoding="utf-8") as traj_file: - is_finished, observation, response = await Runners.step(task) - traj_file.write(f"Step {step}\n") - traj_file.write(json.dumps(response.trajectory, ensure_ascii=False, indent=2)) - traj_file.write("\n\n") - step += 1 - asyncio.run(test_single_step_introspection()) -``` +

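+When you automate the creation of a new agent, AWorld-CLI does not start from scratch. It references these battle-tested Skills (see Evolution) for robustness, and it also learns from your custom Skills in the ~/agents folder. This dual inheritance ensures every agent is not only reliable from the start but also adapted to your specific requirements. +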

+ + + + + + + + + + + + + + + + + + + + + + +
技能描述
🚀 PPT 智能体根据文档、大纲或数据,创建精美的演示文稿。
🧠 DeepSearch 智能体对指定主题进行全面、多源的研究,并整合生成一份结构化的报告。
-# 训练 -一旦智能体能够在环境中探索,AWorld 能通过两种互补的训练模式形成进化的闭环,推动持续改进。 +## 运行智能体 +

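+Instruct the AWorld Agent to run a task with the agent you just created, for example "Let the html agent generate a report introducing Beckham". Every call, action, and observation is written to a detailed trajectory log saved in a local directory. +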

-## 模型训练 -将任何主流 LLM 训练框架——AReal、Swift、Verl、Slime 等——接入运行时,直接更新模型参数。适配器非常轻量,因此您可以在不同的训练器之间复用相同的环境和智能体代码。 -```python -from datasets import load_dataset -from aworld.agents.llm_agent import Agent -from aworld.config import AgentConfig + +***让新创建的智能体为你工作*** +![](./readme_assets/aworld_cli_demo_step2.gif) -from train.trainer.agent_trainer import AgentTrainer -from train.examples.train_gaia_with_aworld_verl.metrics.gaia_reward_function import gaia_reward_func +## 进化智能体 +

+If the agent's performance falls short of expectations, you can iterate on it in several ways. +**Manual Evolution** +

+You are the expert. Open the generated agent's Python file and adjust the prompts, logic, or tool usage as needed. You have full control. +

-# See the previous section for details -mcp_config = {...} +**Exciting: AI-Assisted Evolution** +

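+This is where AWorld is at its strongest! Prompt AWorld with the changes you want, such as "help me optimize the HTML agent so that it can browse the web, download images, and insert them into the HTML". AWorld then hands the task to our built-in Optimizer Agent, a specialized code agent that acts as your AI pair programmer. Because every agent you create inherits from a unified AWorld base class, the Optimizer Agent understands the agent's structure as a whole. It can therefore reason about the agent's code and modify it precisely to implement your logic, evolving its capabilities well beyond simple prompt tuning. +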

-# 配置智能体使用 Verl 作为模型服务(自动适配推理格式) -agent_config = AgentConfig( - llm_provider="verl" -) -searcher = Agent( - name="Search Agent", - system_prompt="You specialize at searching.", - mcp_config=mcp_config, - conf=agent_config -) + -train_dataset = load_dataset("", split="train") -test_dataset = load_dataset("", split="test") -trainer = AgentTrainer( - agent=agent, - config=custom_train_config, - reward_func=gaia_reward_func, - train_dataset=train_dataset, - test_dataset=test_dataset -) +***AI 优化你的智能体并让它更专业*** +![](./readme_assets/aworld_cli_demo_step3.gif) -trainer.train() -``` -> 💡 查看 [真实案例](./train/examples/train_gaia_with_aworld_verl/main.py),其中包含运行智能体训练所需的完整训练配置。 -## 元学习 (Meta-Learning) -除了更新模型权重之外,您还可以对整个智能体系统进行元学习。启动特定角色的智能体,让它们针对目标智能体进行更新、重写提示词、优化工作流或调整策略,然后迭代团队(如下图所示)。 + + +**愿景:自进化** +

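+This is the future form: without you writing explicit prompts, the system uses a reward signal (for example, a failed validation or a deviation from a Verified Skill) to detect sub-optimal performance automatically. It then triggers an autonomous optimization loop, so the agent evolves on its own, driven by evaluation and with less ongoing human intervention. +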

+ +Once you are satisfied with the optimized agent, it is saved permanently in ~/.agents and can be reused. +

+ -![](./readme_assets/mas_meta_learning.png) + +# 完全掌控:手动构建智能体系统 +

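+In AWorld, an agent is "model + tools". Real-world scenarios, however, often require several agents working together. For that, AWorld provides flexible build paths that let you manually assemble complex multi-agent systems. +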

-# 架构设计原则 -本框架旨在具有高度适应性,使研究人员和开发人员能够跨多个领域进行探索和创新,从而提升多智能体系统的能力和应用。 +1. 端到端设计自动化工作流 [文档](https://inclusionai.github.io/AWorld/Quickstart/workflow_construction/) -## 概念与框架 -| 概念 | 描述 | -| :-------------------------------------- | ------------ | -| [`agent`](./aworld/core/agent/base.py) | 定义基础类、描述、输出解析以及多智能体协作(swarm)逻辑,用于定义、管理和编排 AWorld 系统中的智能体。 | -| [`runner`](./aworld/runners) | 包含管理智能体在环境中的执行循环的运行器类,处理剧集回放(episode rollouts)和并行训练/评估工作流。 | -| [`task`](./aworld/core/task.py) | 定义基础任务类,封装了智能体交互所需的环境目标、必要工具和终止条件。 | -| [`swarm`](./aworld/core/agent/swarm.py) | 实现 SwarmAgent 类,通过去中心化策略管理多智能体协调和涌现的群体行为。 | -| [`sandbox`](./aworld/sandbox) | 提供带有可配置场景的受控运行时,用于快速原型设计和验证智能体行为。 | -| [`tools`](./aworld/tools) | 提供灵活的框架,用于定义、适配和执行 AWorld 系统中的智能体-环境交互工具。 | -| [`context`](./aworld/core/context) | 为 AWorld 智能体提供全面的上下文管理系统,实现完整的状态跟踪、配置管理、提示词优化、多任务状态处理以及贯穿智能体生命周期的动态提示词模板。 | -| [`memory`](./aworld/memory) | 为智能体实现可扩展的记忆系统,支持短期和长期记忆、摘要、检索、嵌入(embeddings)和集成。| -| [`trace`](./aworld/trace) | 为 AWorld 提供可观测的追踪框架,支持分布式追踪、上下文传播、Span 管理,并与流行框架和协议集成,以监控和分析智能体、工具及任务的执行。| +2. 启动支持 MCP 的智能体 [文档](https://inclusionai.github.io/AWorld/Quickstart/agent_construction/) +3. 
编排多智能体系统 (MAS) [文档](https://inclusionai.github.io/AWorld/Quickstart/multi-agent_system_construction/) + + +想直接体验?在 AWorld [Playground](https://playground.aworldagents.com/) 加载预置 DeepResearch 团队,查看源码并端到端运行。 + +# MAS演练场: 即刻运行,亲眼见证 + +在 AWorld [Playground](https://playground.aworldagents.com/) 启动官方 DeepResearch 团队,实时观摩 AI 协作。你可以检视其源码、运行全过程,并从中获取灵感。 + +![](./readme_assets/playground_gaiateam.gif) -## 特性 -| 智能体构建 | 拓扑编排 | 环境 | -|:------------------------------|:-----------------------------------------------------------------------------------------|:-------------------------------| -| ✅ 集成 MCP 服务 | ✅ 封装的运行时 | ✅ 运行时状态管理 | -| ✅ 支持多模型提供商 | ✅ 灵活的 MAS 模式 | ✅ 高并发支持 | -| ✅ 高度自定义构建 | ✅ 清晰的状态追踪 | ✅ 分布式训练 | -| ✅ [支持智能体技能](https://github.com/inclusionAI/AWorld/tree/main/examples/skill_agent) | ✅ [支持交互式终端](https://github.com/inclusionAI/AWorld/tree/main/examples/aworld_cli_demo) 🚀 | | +**从用户到创造者:让你的智能体登上舞台!** +准备好构建你自己的智能体了吗?使用 aworld-cli 将你的专业知识铸造成一个强大的智能体,并将其核心能力定义在 skill.md 文件中。 +想让你的作品登上这个舞台?只需提交一个 Pull Request,将你的 skill.md 添加至: +AWorld/examples/Custom_Skills/ + +我们会在这里展示最出色的社区智能体,让你的杰作大放异彩,赋能整个社区! + + + + + + # 演进 -我们的使命:把复杂繁琐的任务留给 AWorld,您来负责创新。本节展示了利用 AWorld 开发的几个创新项目,以证明框架本身的有效性。 +

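+AWorld's goal is to absorb the complexity so you can focus on innovation. This section showcases cutting-edge multi-agent results built with AWorld, advancing toward AGI. +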

+ -#### 智能体打榜 +#### 智能体评测 - + @@ -296,14 +244,14 @@ trainer.train() - - + - + - + - + - + - - +
类别成就成果 表现 关键创新 日期
🤖 智能体 + 🤖 Agent
Try Online
- GAIA Benchmark
卓越表现
+ GAIA Benchmark
Excellence

GAIA @@ -312,13 +260,13 @@ trainer.train()
Pass@1: 67.89
Pass@3: 83.49 -
(109 任务) +
(109 tasks) Code
- 多智能体系统
稳定性与编排 + Multi-agent system
stability & orchestration
Paper @@ -327,68 +275,68 @@ trainer.train()
2025/08/06
🧠 推理🧠 Reasoning - IMO 2025
解题
+ IMO 2025
Problem Solving

IMO
- 6小时内解决
5/6 道题 + 5/6 problems
solved in 6 hours
Code
多智能体协作
优于单个模型
Multi-agent collaboration
beats solo models
2025/07/25
🖼️ 多模态🖼️ Multi-Modal - OSWorld
排名第一
+ OSWorld
Rank 1st

OSWorld
- 58.0%
成功率 + 58.0%
Success Rate
Code
工具越多越好吗?The more tools the better? 2025/09/18
🖼️ 多模态🖼️ Multi-Modal - VisualWebArena 九月排名第一 + VisualWebArena Rank 1st in September
VWA
- 36.5%
成功率 + 36.5%
Success Rate
Code
自动化工具生成
+
Automated tool generation
Paper
2025/09/25
🔍 深度搜索🔍 Deep-Search - Xbench 卓越表现 + Xbench Excellence
xbench @@ -402,62 +350,73 @@ trainer.train()
- AWorld 拥有自己的上下文引擎:Amni。 + AWorld has its own context engine: Amni. 2025/10/23
-#### 数据合成 (Data Synthesis) +#### 数据合成 + +1. **FunReason-MT Technical Report: Overcoming the Complexity Barrier in Multi-Turn Function Calling** arxiv, 2025. [paper](https://arxiv.org/abs/2510.24645), [code](https://github.com/inclusionAI/AWorld-RL), [model](https://huggingface.co/Bingguang/FunReason-MT), [dataset](https://huggingface.co/datasets/Bingguang/FunReason-MT) -1. **FunReason-MT Technical Report: Overcoming the Complexity Barrier in Multi-Turn Function Calling** arxiv, 2025. [论文](https://arxiv.org/abs/2510.24645), [代码](https://github.com/inclusionAI/AWorld-RL), [模型](https://huggingface.co/Bingguang/FunReason-MT), [数据集](https://huggingface.co/datasets/Bingguang/FunReason-MT) + *Zengzhuang Xu, Bingguang Hao, Zechuan Wang, Yuntao Wen, Maolin Wang, etc.* + +2. **From Failure to Mastery: Generating Hard Samples for Tool-use Agents** arxiv, 2026. [paper](https://arxiv.org/abs/2601.01498), [code](https://github.com/inclusionAI/AWorld-RL), [model](https://huggingface.co/Bingguang/FunReason-MT), [dataset](https://huggingface.co/datasets/Bingguang/FunReason-MT) - *Zengzhuang Xu, Bingguang Hao, Zechuan Wang, Yuntao Wen, Maolin Wang, 等* + *Bingguang Hao, Zengzhuang Xu, Yuntao Wen, Xinyi Xu, Yang Liu, etc.* -#### 模型训练 (Model Training) +#### 模型训练 -1. **AWorld: Orchestrating the Training Recipe for Agentic AI.** arxiv, 2025. [论文](https://arxiv.org/abs/2508.20404), [代码](https://github.com/inclusionAI/AWorld/tree/main/train), [模型](https://huggingface.co/inclusionAI/Qwen3-32B-AWorld) +1. **AWorld: Orchestrating the Training Recipe for Agentic AI.** arxiv, 2025. [paper](https://arxiv.org/abs/2508.20404), [code](https://github.com/inclusionAI/AWorld/tree/main/train), [model](https://huggingface.co/inclusionAI/Qwen3-32B-AWorld) - *Chengyue Yu, Siyuan Lu, Chenyi Zhuang, Dong Wang, Qintong Wu, 等* + *Chengyue Yu, Siyuan Lu, Chenyi Zhuang, Dong Wang, Qintong Wu, etc.* -2. 
**FunReason: Enhancing Large Language Models' Function Calling via Self-Refinement Multiscale Loss and Automated Data Refinement.** arxiv, 2025. [论文](https://arxiv.org/abs/2505.20192), [模型](https://huggingface.co/Bingguang/FunReason) +2. **FunReason: Enhancing Large Language Models' Function Calling via Self-Refinement Multiscale Loss and Automated Data Refinement.** arxiv, 2025. [paper](https://arxiv.org/abs/2505.20192), [model](https://huggingface.co/Bingguang/FunReason) - *Bingguang Hao, Maolin Wang, Zengzhuang Xu, Cunyin Peng, 等* + *Bingguang Hao, Maolin Wang, Zengzhuang Xu, Cunyin Peng, etc.* -3. **Exploring Superior Function Calls via Reinforcement Learning.** arxiv, 2025. [论文](https://arxiv.org/abs/2508.05118), [代码](https://github.com/BingguangHao/RLFC) +3. **Exploring Superior Function Calls via Reinforcement Learning.** arxiv, 2025. [paper](https://arxiv.org/abs/2508.05118), [code](https://github.com/BingguangHao/RLFC) - *Bingguang Hao, Maolin Wang, Zengzhuang Xu, Yicheng Chen, 等* + *Bingguang Hao, Maolin Wang, Zengzhuang Xu, Yicheng Chen, etc.* -4. **RAG-R1 : Incentivize the Search and Reasoning Capabilities of LLMs through Multi-query Parallelism.** arxiv, 2025. [论文](https://arxiv.org/abs/2507.02962), [代码](https://github.com/inclusionAI/AgenticLearning), [模型](https://huggingface.co/collections/endertzw/rag-r1-68481d7694b3fca8b809aa29) +4. **RAG-R1 : Incentivize the Search and Reasoning Capabilities of LLMs through Multi-query Parallelism.** arxiv, 2025. [paper](https://arxiv.org/abs/2507.02962), [code](https://github.com/inclusionAI/AgenticLearning), [model](https://huggingface.co/collections/endertzw/rag-r1-68481d7694b3fca8b809aa29) *Zhiwen Tan, Jiaming Huang, Qintong Wu, Hongxuan Zhang, Chenyi Zhuang, Jinjie Gu* -5. **V2P: From Background Suppression to Center Peaking for Robust GUI Grounding Task.** arxiv, 2025. [论文](https://arxiv.org/abs/2508.13634), [代码](https://github.com/inclusionAI/AgenticLearning/tree/main/V2P) +5. 
**V2P: From Background Suppression to Center Peaking for Robust GUI Grounding Task.** arxiv, 2025. [paper](https://arxiv.org/abs/2508.13634), [code](https://github.com/inclusionAI/AgenticLearning/tree/main/V2P) *Jikai Chen, Long Chen, Dong Wang, Leilei Gan, Chenyi Zhuang, Jinjie Gu* -6. **Don’t Just Fine-tune the Agent, Tune the Environment** arxiv, 2025. [论文](https://arxiv.org/abs/2510.10197) +6. **Don't Just Fine-tune the Agent, Tune the Environment** arxiv, 2025. [paper](https://arxiv.org/abs/2510.10197) - *Siyuan Lu, Zechuan Wang, Hongxuan Zhang, Qintong Wu, Leilei Gan, Chenyi Zhuang, 等* + *Siyuan Lu, Zechuan Wang, Hongxuan Zhang, Qintong Wu, Leilei Gan, Chenyi Zhuang, etc.* -#### 元学习 (Meta Learning) +#### 元学习 -1. **Profile-Aware Maneuvering: A Dynamic Multi-Agent System for Robust GAIA Problem Solving by AWorld.** arxiv, 2025. [论文](https://arxiv.org/abs/2508.09889), [代码](https://github.com/inclusionAI/AWorld/blob/main/examples/gaia/README_GUARD.md) +1. **Profile-Aware Maneuvering: A Dynamic Multi-Agent System for Robust GAIA Problem Solving by AWorld.** arxiv, 2025. [paper](https://arxiv.org/abs/2508.09889), [code](https://github.com/inclusionAI/AWorld/blob/main/examples/gaia/README_GUARD.md) *Zhitian Xie, Qintong Wu, Chengyue Yu, Chenyi Zhuang, Jinjie Gu* -2. **Recon-Act: A Self-Evolving Multi-Agent Browser-Use System via Web Reconnaissance, Tool Generation, and Task Execution.** arxiv, 2025. [论文](https://arxiv.org/pdf/2509.21072), [代码](https://github.com/inclusionAI/AWorld/tree/main/examples/visualwebarena) +2. **Recon-Act: A Self-Evolving Multi-Agent Browser-Use System via Web Reconnaissance, Tool Generation, and Task Execution.** arxiv, 2025. [paper](https://arxiv.org/pdf/2509.21072), [code](https://github.com/inclusionAI/AWorld/tree/main/examples/visualwebarena) *Kaiwen He, Zhiwei Wang, Chenyi Zhuang, Jinjie Gu* +

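-# Contributions
-We warmly welcome developers to join us in building and improving AWorld! Whether you want to enhance the framework, fix bugs, or add new features, your contributions are valuable to us.
-For academic citations or to contact us, please use the following BibTeX entry:
+
+# Contributing
+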

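+Our vision includes expanding AI for Science & Business, deepening self-evolution capabilities, and growing the library of community-contributed Skills.
+
+We welcome developers, researchers, and domain experts. Whether you improve the framework or contribute a Skill from your own field, every contribution is valuable.
+
+For academic citations or to contact us, please use the following BibTeX entry:
+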

```bibtex @misc{yu2025aworldorchestratingtrainingrecipe, @@ -471,9 +430,6 @@ trainer.train() } ``` -# Star History -![](https://api.star-history.com/svg?repos=inclusionAI/AWorld&type=Date) - [arxiv-image]: https://img.shields.io/badge/Paper-arXiv-B31B1B?style=for-the-badge&logo=arxiv&logoColor=white @@ -493,9 +449,9 @@ trainer.train() [deepwiki-url]: https://deepwiki.com/inclusionAI/AWorld [discord-url]: https://discord.gg/b4Asj2ynMw [license-url]: https://opensource.org/licenses/MIT -[twitter-url]: https://x.com/InclusionAI666 +[twitter-url]: https://x.com/AWorldAgents [wechat-url]: https://raw.githubusercontent.com/inclusionAI/AWorld/main/readme_assets/aworld_wechat.png -[arxiv-url]: https://arxiv.org/abs/2508. +[arxiv-url]: https://arxiv.org/abs/2508.20404 [tutorial-url]: https://inclusionai.github.io/AWorld/ [playground-url]: https://playground.aworldagents.com/ @@ -503,8 +459,6 @@ trainer.train() [funreason-code-url]: https://github.com/BingguangHao/FunReason [funreason-model-url]: https://huggingface.co/Bingguang/FunReason [funreason-paper-url]: https://arxiv.org/pdf/2505.20192 - - [deepsearch-code-url]: https://github.com/inclusionAI/AgenticLearning @@ -527,4 +481,4 @@ trainer.train() [Paper]: https://img.shields.io/badge/Paper-4ECDC4 - \ No newline at end of file + diff --git a/aworld-cli/README.md b/aworld-cli/README.md index d52d7c8c7..55dd7ea30 100644 --- a/aworld-cli/README.md +++ b/aworld-cli/README.md @@ -1,6 +1,16 @@ # AWorld CLI -Command-line interface for interacting with AWorld agents. +AWorld CLI is a command-line tool for interacting with AWorld agents. 
+ +## Features + +- **Interactive CLI**: Rich terminal interface for agent interaction +- **Agent Discovery**: Automatic discovery of agents using `@agent` decorator +- **Built-in Agents**: Automatically loads built-in agents from `inner_plugins/*/agents` directories (no configuration required) +- **Multiple Sources**: Support for local and remote agents +- **Streaming Output**: Real-time streaming of agent responses +- **Agent Priority**: Built-in agents → Local agents → Remote agents + ## Installation @@ -60,6 +70,50 @@ aworld-cli --agent-dir ./my_agents --task "Your task" --agent MyAgent aworld-cli --remote-backend http://localhost:8000 list ``` + +## Command-Line Interface + +### Interactive Mode + +```bash +# Start interactive mode (automatically loads built-in Aworld agent) +aworld-cli +``` + +### List Agents + +```bash +# List all available agents (including built-in agents) +aworld-cli list + +# Example output: +# 📦 Loading built-in agents from: .../inner_plugins/smllc/agents +# 📚 Loaded 2 global skill(s): text2agent, optimizer +# +# Available Agents +#╭────────┬─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬─────────╮ +#│ Name │ Description │ Address │ +#├────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────┤ +#│ Aworld │ Aworld is a versatile AI assistant that can execute tasks directly or delegate to specialized agent teams. 
Use when you need: │ list │ +#│ │ (1) General-purpose task execution, (2) Complex multi-step problem solving, (3) Coordination of specialized agent teams, (4) │ │ +#│ │ Adaptive task handling that switches between direct execution and team delegation │ │ +#╰────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴─────────╯ +``` + +### Direct Run Mode + +```bash +# Run a task with built-in Aworld agent +aworld-cli --task "Your task here" --agent Aworld --max-runs 5 + +# Use custom agents alongside built-in agents +aworld-cli --agent-dir ./my_agents --task "Your task" --agent MyAgent + +# Use remote agents +aworld-cli --remote-backend http://localhost:8000 --task "Your task" --agent RemoteAgent +``` + + ## Create Custom Agent Use the `@agent` decorator to register an agent: @@ -80,13 +134,38 @@ def build_my_swarm() -> Swarm: Place the file in the directory specified by `LOCAL_AGENTS_DIR` or use `--agent-dir` parameter. + + +## Agent Loading Priority + +1. 📦 **Built-in Agents** (`inner_plugins/*/agents`) - Always loaded first (no configuration required) + - Only loads `agents` directories from each plugin + - Skills are managed separately by `skill_registry` +2. 📂 **Local Agents** (`LOCAL_AGENTS_DIR` or `--agent-dir`) - User-configured local agents +3. 
🌐 **Remote Agents** (`REMOTE_AGENTS_BACKEND` or `--remote-backend`) - Remote backend agents + +**Built-in Agents:** +- **Aworld**: A versatile AI assistant that can execute tasks directly or delegate to specialized agent teams + - Location: `inner_plugins/smllc/agents/` + - Supports direct execution with MCP tools and skills + - Can delegate complex tasks to agent teams + - Includes agent creation skills + + ## Environment Variables -- `LOCAL_AGENTS_DIR`: Local agent directories (semicolon-separated) -- `REMOTE_AGENTS_BACKEND`: Remote backend URLs (semicolon-separated) -- `SKILLS_PATH`: Skill source paths (local directories or GitHub URLs, semicolon-separated) +- `LOCAL_AGENTS_DIR`: Semicolon-separated list of local agent directories (in addition to built-in agents) +- `REMOTE_AGENTS_BACKEND`: Semicolon-separated list of remote backend URLs +- `SKILLS_PATH`: Semicolon-separated list of skill sources (local directories or GitHub URLs) + - Example: `SKILLS_PATH=./skills;https://github.com/user/repo;../custom-skills` +- `SKILLS_DIR`: Single skills directory (legacy, for backward compatibility) +- `SKILLS_CACHE_DIR`: Custom cache directory for GitHub skill repositories (default: ~/.aworld/skills) - `AWORLD_DISABLE_CONSOLE_LOG`: Disable console logging (set to 'true') +**Note:** Built-in agents from `inner_plugins/*/agents` directories are always loaded automatically, regardless of environment variable configuration. Only the `agents` subdirectories are scanned to avoid loading unnecessary files. + + + ## More Help ```bash diff --git a/aworld-cli/src/aworld_cli/README.md b/aworld-cli/src/aworld_cli/README.md deleted file mode 100644 index 448548a80..000000000 --- a/aworld-cli/src/aworld_cli/README.md +++ /dev/null @@ -1,79 +0,0 @@ -# AWorld CLI - -AWorld CLI is a command-line tool for interacting with AWorld agents. 
- -## Features - -- **Interactive CLI**: Rich terminal interface for agent interaction -- **Agent Discovery**: Automatic discovery of agents using `@agent` decorator -- **Built-in Agents**: Automatically loads built-in agents from `inner_plugins/*/agents` directories (no configuration required) -- **Multiple Sources**: Support for local and remote agents -- **Streaming Output**: Real-time streaming of agent responses -- **Agent Priority**: Built-in agents → Local agents → Remote agents - -## Command-Line Interface - -### Interactive Mode - -```bash -# Start interactive mode (automatically loads built-in Aworld agent) -aworld-cli -``` - -### List Agents - -```bash -# List all available agents (including built-in agents) -aworld-cli list - -# Example output: -# 📦 Loading built-in agents from: .../inner_plugins/smllc/agents -# 📚 Loaded 1 global skill(s): agent-creator -# -# Available Agents -# ┌─────────┬───────────────────────────────────┬────────────┬──────────────────────────┐ -# │ Name │ Description │ SourceType │ Address │ -# ├─────────┼───────────────────────────────────┼────────────┼──────────────────────────┤ -# │ Aworld │ Aworld - A versatile AI assistant │ LOCAL │ .../inner_plugins/smllc..│ -# └─────────┴───────────────────────────────────┴────────────┴──────────────────────────┘ -``` - -### Direct Run Mode - -```bash -# Run a task with built-in Aworld agent -aworld-cli --task "Your task here" --agent Aworld --max-runs 5 - -# Use custom agents alongside built-in agents -aworld-cli --agent-dir ./my_agents --task "Your task" --agent MyAgent - -# Use remote agents -aworld-cli --remote-backend http://localhost:8000 --task "Your task" --agent RemoteAgent -``` - -## Agent Loading Priority - -1. 📦 **Built-in Agents** (`inner_plugins/*/agents`) - Always loaded first (no configuration required) - - Only loads `agents` directories from each plugin - - Skills are managed separately by `skill_registry` -2. 
📂 **Local Agents** (`LOCAL_AGENTS_DIR` or `--agent-dir`) - User-configured local agents -3. 🌐 **Remote Agents** (`REMOTE_AGENTS_BACKEND` or `--remote-backend`) - Remote backend agents - -**Built-in Agents:** -- **Aworld**: A versatile AI assistant that can execute tasks directly or delegate to specialized agent teams - - Location: `inner_plugins/smllc/agents/` - - Supports direct execution with MCP tools and skills - - Can delegate complex tasks to agent teams - - Includes agent creation skills - -## Environment Variables - -- `LOCAL_AGENTS_DIR`: Semicolon-separated list of local agent directories (in addition to built-in agents) -- `REMOTE_AGENTS_BACKEND`: Semicolon-separated list of remote backend URLs -- `SKILLS_PATH`: Semicolon-separated list of skill sources (local directories or GitHub URLs) - - Example: `SKILLS_PATH=./skills;https://github.com/user/repo;../custom-skills` -- `SKILLS_DIR`: Single skills directory (legacy, for backward compatibility) -- `SKILLS_CACHE_DIR`: Custom cache directory for GitHub skill repositories (default: ~/.aworld/skills) -- `AWORLD_DISABLE_CONSOLE_LOG`: Disable console logging (set to 'true') - -**Note:** Built-in agents from `inner_plugins/*/agents` directories are always loaded automatically, regardless of environment variable configuration. Only the `agents` subdirectories are scanned to avoid loading unnecessary files. 
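The README text in this diff describes two behaviours that are easy to misread: `LOCAL_AGENTS_DIR` and `REMOTE_AGENTS_BACKEND` are semicolon-separated lists, and sources are consulted in the order built-in → local → remote. The sketch below illustrates that reading only; it is not the actual `aworld-cli` loader. The helper names `split_semicolon_list` and `resolve_agent_sources` are hypothetical, and trimming whitespace and skipping empty entries are assumptions the README does not state.

```python
from typing import List, Mapping, Optional, Tuple


def split_semicolon_list(value: Optional[str]) -> List[str]:
    """Split a semicolon-separated value into its entries.

    Assumption: surrounding whitespace is trimmed and empty entries are dropped.
    """
    if not value:
        return []
    return [part.strip() for part in value.split(";") if part.strip()]


def resolve_agent_sources(
    env: Mapping[str, str],
    builtin_dirs: List[str],
) -> List[Tuple[str, str]]:
    """Return (kind, location) pairs in the documented order: built-in, local, remote."""
    # Built-in agent directories come first and need no configuration.
    sources = [("builtin", d) for d in builtin_dirs]
    # Local directories come from LOCAL_AGENTS_DIR (hypothetical parsing).
    sources += [("local", d) for d in split_semicolon_list(env.get("LOCAL_AGENTS_DIR"))]
    # Remote backends come last, from REMOTE_AGENTS_BACKEND.
    sources += [("remote", u) for u in split_semicolon_list(env.get("REMOTE_AGENTS_BACKEND"))]
    return sources


if __name__ == "__main__":
    example_env = {
        "LOCAL_AGENTS_DIR": "./my_agents; ./team_agents",
        "REMOTE_AGENTS_BACKEND": "http://localhost:8000",
    }
    for kind, location in resolve_agent_sources(example_env, ["inner_plugins/smllc/agents"]):
        print(f"{kind:8s} {location}")
```

Passing the environment as a plain mapping rather than reading `os.environ` directly keeps the ordering logic testable without touching process state.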
diff --git a/aworld-cli/src/aworld_cli/console.py b/aworld-cli/src/aworld_cli/console.py index da8f439b8..6e8eb60a9 100644 --- a/aworld-cli/src/aworld_cli/console.py +++ b/aworld-cli/src/aworld_cli/console.py @@ -6,23 +6,27 @@ from prompt_toolkit.completion import NestedCompleter from prompt_toolkit.formatted_text import HTML from rich import box +from rich.color import Color from rich.panel import Panel -from rich.prompt import Prompt, Confirm +from rich.prompt import Prompt from rich.table import Table +from rich.text import Text -from .models import AgentInfo +from aworld.logs.util import logger from ._globals import console +from .core.skill_registry import get_skill_registry +from .models import AgentInfo +from .user_input import UserInputHandler -# ... existing imports ... - -from rich.text import Text -from rich.color import Color # ... existing imports ... class AWorldCLI: def __init__(self): self.console = console + self.user_input = UserInputHandler(console) + # Track whether this is the first session startup, used to control the display of motivational messages + self._is_first_session = True def _get_gradient_text(self, text: str, start_color: str, end_color: str) -> Text: """Create a Text object with a horizontal gradient.""" @@ -87,24 +91,14 @@ def display_agents(self, agents: List[AgentInfo], source_type: str = "LOCAL", so table = Table(title="Available Agents", box=box.ROUNDED) table.add_column("Name", style="magenta") table.add_column("Description", style="green") - table.add_column("SourceType", style="cyan") table.add_column("Address", style="blue") for agent in agents: desc = getattr(agent, "desc", "No description") or "No description" - # Always use agent's own source_type and source_location if they exist and are valid - # Fallback to provided parameters only if agent doesn't have these attributes - agent_source_type = getattr(agent, "source_type", None) + # Always use agent's own source_location if it exists and is valid + # Fallback to 
provided parameters only if agent doesn't have this attribute agent_source_location = getattr(agent, "source_location", None) - - # Use agent's source_type if it exists and is valid, otherwise use fallback - if agent_source_type and agent_source_type != "UNKNOWN" and agent_source_type.strip() != "": - # Use agent's own source_type - pass - else: - # Use fallback - agent_source_type = source_type - + # Use agent's source_location if it exists and is valid, otherwise use fallback if agent_source_location and agent_source_location.strip() != "": # Use agent's own source_location @@ -113,7 +107,7 @@ def display_agents(self, agents: List[AgentInfo], source_type: str = "LOCAL", so # Use fallback agent_source_location = source_location - table.add_row(agent.name, desc, agent_source_type, agent_source_location) + table.add_row(agent.name, desc, agent_source_location) self.console.print(table) @@ -138,7 +132,7 @@ def select_agent(self, agents: List[AgentInfo], source_type: str = "LOCAL", sour table.add_column("Name", style="magenta") table.add_column("Description", style="green") table.add_column("SourceType", style="cyan") - table.add_column("Address", style="blue") + table.add_column("Address", style="blue", overflow="wrap") for idx, agent in enumerate(agents, 1): desc = getattr(agent, "desc", "No description") or "No description" @@ -178,12 +172,12 @@ def select_agent(self, agents: List[AgentInfo], source_type: str = "LOCAL", sour # Fallback for non-terminal environments self.console.print("Select an agent number [default: 1]: ", end="") choice = input().strip() or "1" - + # Check for exit command if choice.lower() in ("exit", "quit", "q"): self.console.print("[yellow]Selection cancelled.[/yellow]") return None - + try: idx = int(choice) - 1 if 0 <= idx < len(agents): @@ -202,6 +196,253 @@ def select_team(self, teams: List[AgentInfo], source_type: str = "LOCAL", source """ return self.select_agent(teams, source_type, source_location) + def _visualize_team(self, 
executor_instance: Any): + """Visualize the structure of the current team in a full-width split-screen layout.""" + from rich.columns import Columns + try: + from rich.console import Group + except ImportError: + try: + from rich import Group + except ImportError: + # Fallback for older Rich versions + class Group: + """Fallback Group class for older Rich versions.""" + def __init__(self, *renderables): + self.renderables = renderables + + def __rich_console__(self, console, options): + for renderable in self.renderables: + yield renderable + from rich.layout import Layout + from rich.panel import Panel + from rich.align import Align + from rich import box + + # 1. Get swarm from executor + swarm = getattr(executor_instance, "swarm", None) + if not swarm: + self.console.print("[yellow]Current agent does not support visualization (not a swarm).[/yellow]") + return + + # 2. Get agent graph + graph = getattr(swarm, "agent_graph", None) + if not graph: + self.console.print("[yellow]No agent graph found in swarm.[/yellow]") + return + + # --- Gather Data --- + + # Goal + goal_text = "Run task" + if hasattr(executor_instance, "task"): + task = executor_instance.task + if hasattr(task, "input") and task.input: + goal_text = str(task.input) + elif hasattr(task, "name") and task.name: + goal_text = task.name + + if goal_text == "Run task" and hasattr(swarm, "task") and swarm.task: + goal_text = str(swarm.task) + + if len(goal_text) > 100: + goal_text = goal_text[:97] + "..." 
+ + # Active skills + active_skill_names = set() + if hasattr(executor_instance, 'get_skill_status'): + try: + status = executor_instance.get_skill_status() + active_names = status.get('active_names', []) + if active_names: + active_skill_names = set(active_names) + except: + pass + + # Build Agent Panels + agent_panels = [] + if graph and graph.agents: + for agent in graph.agents.values(): + agent_skills = set() + if hasattr(agent, "skill_configs") and agent.skill_configs: + for skill_name in agent.skill_configs.keys(): + agent_skills.add(skill_name) + + agent_tools = set() + mcp_tools = set() + if hasattr(agent, "tools") and agent.tools: + for tool in agent.tools: + if isinstance(tool, dict) and "function" in tool: + agent_tools.add(tool["function"].get("name", "unknown")) + elif hasattr(tool, "name"): + agent_tools.add(tool.name) + + if hasattr(agent, "mcp_servers") and agent.mcp_servers: + for s in agent.mcp_servers: + mcp_tools.add(s) + + content_parts = [] + if agent_skills: + skills_list = [] + for s in list(agent_skills)[:5]: + if s in active_skill_names: + skills_list.append(f"[bold green]• {s}[/bold green]") + else: + skills_list.append(f"• {s}") + if len(agent_skills) > 5: + skills_list.append(f"[dim]...({len(agent_skills)-5})[/dim]") + content_parts.append(f"[bold cyan]Skills:[/bold cyan]\n" + "\n".join(skills_list)) + + tools_list = [] + built_in_list = ["Read", "Write", "Bash", "Grep"] + has_builtins = any(t in agent_tools for t in built_in_list) + if has_builtins: + tools_list.append("Built-in") + if mcp_tools: + tools_list.append(f"MCP: {len(mcp_tools)}") + custom_tools = [t for t in agent_tools if t not in built_in_list] + if custom_tools: + tools_list.append(f"Custom: {len(custom_tools)}") + + if tools_list: + content_parts.append(f"[bold yellow]Tools:[/bold yellow]\n" + ", ".join(tools_list)) + + agent_content = "\n".join(content_parts) if content_parts else "[dim]-[/dim]" + + agent_panel = Panel( + agent_content, + 
title=f"[bold]{agent.name()}[/bold]", + box=box.ROUNDED, + border_style="blue", + padding=(0, 1), + expand=True + ) + agent_panels.append(agent_panel) + + # --- Build Layout --- + + layout = Layout() + layout.split_row( + Layout(name="process", ratio=1), + Layout(name="team", ratio=1) + ) + + # --- Process Column (Left) --- + + goal_panel = Panel( + Align.center(f"[bold]GOAL[/bold]\n\"{goal_text}\""), + box=box.ROUNDED, + style="white", + border_style="green" + ) + + loop_panel = Panel( + Align.center("[bold]AGENT LOOP[/bold]\n[dim]observe → think → act → learn → repeat[/dim]"), + box=box.ROUNDED, + style="white" + ) + + hooks_panel = Panel( + Align.center("[bold]HOOKS[/bold]\n[dim]guard rails, logging, human-in-the-loop[/dim]"), + box=box.ROUNDED, + style="white" + ) + + output_panel = Panel( + Align.center("[bold]STRUCTURED OUTPUT[/bold]\n[dim]validated JSON matching your schema[/dim]"), + box=box.ROUNDED, + style="white", + border_style="green" + ) + + arrow_down = Align.center("│\n▼") + + process_content = Group( + goal_panel, + arrow_down, + loop_panel, + arrow_down, + hooks_panel, + arrow_down, + output_panel + ) + + layout["process"].update(Panel(process_content, title="Process Flow", box=box.ROUNDED)) + + # --- Team Column (Right) --- + + swarm_label = Align.center("[bold]SWARM[/bold]") + + # Use Columns for agents if there are many, or Stack if few + # Using Columns with expand=True to fill width + if len(agent_panels) > 1: + agents_display = Columns(agent_panels, expand=True, equal=True) + else: + agents_display = Group(*[Align.center(p) for p in agent_panels]) + + team_content = Group( + swarm_label, + Align.center("│\n▼"), + agents_display + ) + + layout["team"].update(Panel(team_content, title="Team Structure", box=box.ROUNDED)) + + # Print layout full width + self.console.print(layout) + + async def _esc_key_listener(self): + """ + Background listener for Esc key to interrupt currently executing tasks. 
+ This function runs in the background, continuously listening for keyboard input. + """ + try: + from prompt_toolkit import Application + from prompt_toolkit.key_binding import KeyBindings + from prompt_toolkit.layout import Layout + from prompt_toolkit.layout.containers import Window + from prompt_toolkit.layout.controls import FormattedTextControl + from prompt_toolkit.formatted_text import FormattedText + + # Create a hidden window to capture Esc key + kb = KeyBindings() + + # Store reference to currently executing task + if not hasattr(self, '_current_executor_task'): + self._current_executor_task = None + + def handle_esc(event): + """Handle Esc key press""" + if hasattr(self, '_current_executor_task') and self._current_executor_task: + if not self._current_executor_task.done(): + self._current_executor_task.cancel() + self.console.print("\n[yellow]⚠️ Task interrupted by Esc key[/yellow]") + + kb.add("escape")(handle_esc) + + # Create an invisible control + control = FormattedTextControl( + text=FormattedText([("", "")]), + focusable=True + ) + + window = Window(content=control, height=0) + layout = Layout(window) + + # Create a hidden application to listen for keyboard + app = Application( + layout=layout, + key_bindings=kb, + full_screen=False, + mouse_support=False + ) + + # Run application in background + await asyncio.to_thread(app.run) + except Exception: + # If prompt_toolkit is not available or error occurs, fail silently + pass + async def run_chat_session(self, agent_name: str, executor: Callable[[str], Any], available_agents: List[AgentInfo] = None, executor_instance: Any = None) -> Union[bool, str]: """ Run an interactive chat session with an agent. 
@@ -274,10 +515,13 @@ async def run_chat_session(self, agent_name: str, executor: Callable[[str], Any] f"Type '/switch [agent_name]' to switch agent.\n" f"Type '/new' to create a new session.\n" f"Type '/restore' or '/latest' to restore to the latest session.\n" + f"Type '/skills' to list all available skills.\n" + f"Type '/agents' to list all available agents.\n" + f"Type '/test' to test user input functionality.\n" f"Use @filename to include images or text files (e.g., @photo.jpg or @document.txt)." ) self.console.print(Panel(help_text, style="blue")) - + # Check if we're in a real terminal (not IDE debugger or redirected input) is_terminal = sys.stdin.isatty() @@ -291,10 +535,14 @@ async def run_chat_session(self, agent_name: str, executor: Callable[[str], Any] '/new': None, '/restore': None, '/latest': None, + '/skills': None, + '/agents': None, + '/test': None, '/exit': None, '/quit': None, 'exit': None, 'quit': None + }) session = PromptSession(completer=completer) @@ -304,9 +552,21 @@ async def run_chat_session(self, agent_name: str, executor: Callable[[str], Any] if is_terminal and session: # Use prompt_toolkit for input with completion # We use HTML for basic coloring of the prompt - user_input = await asyncio.to_thread(session.prompt, HTML("You: ")) + # Show inspirational message only on first session + if self._is_first_session: + prompt_text = "✨ Create an agent to do anything!\nYou: " + # Mark as no longer first session after showing the message + self._is_first_session = False + else: + prompt_text = "You: " + + user_input = await asyncio.to_thread(session.prompt, HTML(prompt_text)) else: # Fallback to plain input() for non-terminal environments + if self._is_first_session: + self.console.print("[cyan]✨ Create an agent to do anything![/cyan]") + # Mark as no longer first session after showing the message + self._is_first_session = False self.console.print("[cyan]You[/cyan]: ", end="") user_input = await asyncio.to_thread(input) @@ -359,6 +619,365 @@ 
async def run_chat_session(self, agent_name: str, executor: Callable[[str], Any] continue else: return True # Return True to switch agent (show list) + + # Handle skills command + if user_input.lower() in ("/skills", "skills"): + try: + # First, load skills from plugin directories + from .runtime.cli import CliRuntime + from pathlib import Path + + runtime = CliRuntime() + runtime.cli = self # Set cli reference for console output + loaded_skills = await runtime._load_skills() + + # Display loading results from plugins + if loaded_skills: + total_loaded = sum(loaded_skills.values()) + if total_loaded > 0: + logger.info(f"[green]✅ Loaded {total_loaded} skill(s) from {len([k for k, v in loaded_skills.items() if v > 0])} plugin(s)[/green]") + else: + logger.info("[dim]No new skills loaded from plugins.[/dim]") + + # Get all skills from registry (including newly loaded ones) + registry = get_skill_registry() + all_skills = registry.get_all_skills() + + if not all_skills: + logger.info("[yellow]No skills available.[/yellow]") + continue + + # Separate skills into plugin and user skills + plugin_skills = {} + user_skills = {} + + for skill_name, skill_data in all_skills.items(): + skill_path = skill_data.get("skill_path", "") + # Determine if skill is from plugin or user + # Plugin skills: from inner_plugins or .aworld directories + if skill_path and ("inner_plugins" in skill_path or ".aworld" in skill_path): + plugin_skills[skill_name] = skill_data + else: + user_skills[skill_name] = skill_data + + # Helper function to create and display a skills table + def display_skills_table(skills_dict, title): + if not skills_dict: + return + + table = Table(title=title, box=box.ROUNDED) + table.add_column("Name", style="magenta") + table.add_column("Description", style="green") + + for skill_name, skill_data in sorted(skills_dict.items()): + desc = skill_data.get("description") or skill_data.get("desc") or "No description" + # Truncate description if too long + if len(desc) > 60: 
+ desc = desc[:57] + "..." + + table.add_row(skill_name, desc) + + self.console.print(table) + self.console.print(f"[dim]Total: {len(skills_dict)} skill(s)[/dim]") + + # Display User skills first + if user_skills: + display_skills_table(user_skills, "User Skills") + self.console.print() # Add spacing between tables + + # Display Plugin skills + if plugin_skills: + display_skills_table(plugin_skills, "Plugin Skills") + + # Display overall total + if plugin_skills and user_skills: + self.console.print(f"[dim]Overall Total: {len(all_skills)} skill(s)[/dim]") + except Exception as e: + self.console.print(f"[red]Error loading skills: {e}[/red]") + continue + + # Handle test command + if user_input.lower() in ("/test", "test"): + try: + self.console.print("[bold cyan]🧪 User Input Test Function[/bold cyan]") + self.console.print() + + # Test options + test_options = [ + "1. Test text input", + "2. Test multi-select input", + "3. Test confirmation input", + "4. Test composite menu", + "5. Test single-select list", + "6. 
Exit test" + ] + + self.console.print("[bold]Please select a function to test:[/bold]") + for option in test_options: + self.console.print(f" {option}") + self.console.print() + + test_choice = await asyncio.to_thread( + Prompt.ask, + "[cyan]Please enter option number (1-6)[/cyan]", + default="1", + console=self.console + ) + + test_choice = test_choice.strip() + + if test_choice == "1": + # Test text input + self.console.print() + self.console.print("[bold green]📝 Test Text Input[/bold green]") + self.console.print("[dim]Please enter some text for testing...[/dim]") + text_input = await asyncio.to_thread( + self.user_input.text_input, + "[cyan]Please enter text[/cyan]" + ) + self.console.print(f"[green]✅ Your input text is: {text_input}[/green]") + + elif test_choice == "2": + # Test multi-select input + self.console.print() + self.console.print("[bold green]☑️ Test Multi-select Input[/bold green]") + test_items = ["Apple", "Banana", "Orange", "Grape", "Strawberry"] + selected_indices = await asyncio.to_thread( + self.user_input.select_multiple, + options=test_items, + title="Please select your favorite fruits (multiple selection)", + prompt="Enter option numbers (comma-separated, e.g., 1,3,5)" + ) + if selected_indices: + selected_items = [test_items[i] for i in selected_indices] + self.console.print(f"[green]✅ You selected: {', '.join(selected_items)}[/green]") + else: + self.console.print("[yellow]⚠️ No options selected[/yellow]") + + elif test_choice == "3": + # Test confirmation input + self.console.print() + self.console.print("[bold green]❓ Test Confirmation Input[/bold green]") + from rich.prompt import Confirm + confirmed = await asyncio.to_thread( + Confirm.ask, + "[cyan]Are you sure you want to continue?[/cyan]", + default=True, + console=self.console + ) + if confirmed: + self.console.print("[green]✅ You chose to confirm[/green]") + else: + self.console.print("[yellow]⚠️ You chose to cancel[/yellow]") + + elif test_choice == "4": + # Test composite 
menu + self.console.print() + self.console.print("[bold green]📋 Test Composite Menu[/bold green]") + + # Create test tabs + test_tabs = [ + { + 'type': 'multi_select', + 'name': 'product_type', + 'title': 'What is your product type?', + 'options': [ + {'label': 'Software/Application Product', + 'description': 'Mobile apps, web apps, desktop software and other digital products'}, + {'label': 'Hardware Device', 'description': 'Electronic devices, smart hardware, IoT products, etc.'}, + {'label': 'Service Platform', 'description': 'SaaS services, online platforms, cloud services, etc.'}, + {'label': 'Physical Product', 'description': 'Consumer goods, industrial products, daily necessities, etc.'}, + ] + }, + { + 'type': 'text_input', + 'name': 'product_name', + 'title': 'Product Name', + 'prompt': 'Please enter product name', + 'default': '', + 'placeholder': 'Search...' + }, + { + 'type': 'submit', + 'name': 'confirm', + 'title': 'Review your answers', + 'message': 'Ready to submit your answers?', + 'default': True + } + ] + + try: + results = await asyncio.to_thread( + self.user_input.composite_menu, + tabs=test_tabs, + title="Create Product Introduction PPT" + ) + + if results: + self.console.print() + self.console.print("[green]✅ Composite menu test completed[/green]") + self.console.print("[bold]Returned results:[/bold]") + for tab_name, value in results.items(): + self.console.print(f" [cyan]{tab_name}[/cyan]: {value}") + else: + self.console.print("[yellow]⚠️ User cancelled the operation[/yellow]") + except Exception as e: + self.console.print(f"[red]Error during test: {e}[/red]") + import traceback + self.console.print(f"[dim]{traceback.format_exc()}[/dim]") + + elif test_choice == "5": + # Test single-select list + self.console.print() + self.console.print("[bold green]📋 Test Single-select List[/bold green]") + + # Create test navigation bar items + nav_items = [ + {'label': 'PPT Theme', 'type': 'checkbox', 'checked': False, 'highlight': False}, + {'label': 
'Template Style', 'type': 'checkbox', 'checked': False, 'highlight': False}, + {'label': 'Submit', 'type': 'button', 'highlight': True} + ] + + # Create test options + test_options = [ + {'label': 'Submit answers', 'description': ''}, + {'label': 'Cancel', 'description': ''} + ] + + selected_index = await asyncio.to_thread( + self.user_input.single_select, + options=test_options, + title="Review your answers", + warning="You have not answered all questions", + question="Ready to submit your answers?", + nav_items=nav_items + ) + + if selected_index is not None: + selected_option = test_options[selected_index]['label'] + self.console.print(f"[green]✅ You selected: {selected_option}[/green]") + else: + self.console.print("[yellow]⚠️ User cancelled the selection[/yellow]") + + elif test_choice == "6": + self.console.print("[dim]Exit test[/dim]") + else: + self.console.print(f"[red]Invalid option: {test_choice}[/red]") + + self.console.print() + except KeyboardInterrupt: + self.console.print("\n[yellow]Test cancelled[/yellow]") + except Exception as e: + # logger.error(f"Error during test: {e} {traceback.format_exc()}") + self.console.print(f"[red]Error during test: {e}[/red]\n{traceback.format_exc()}") + continue + + # Handle agents command + if user_input.lower() in ("/agents", "agents"): + try: + from .runtime.cli import CliRuntime + from .runtime.loaders import PluginLoader + from aworld_cli.core.agent_scanner import global_agent_registry + from pathlib import Path + import os + + built_in_agents = [] + user_agents = [] + base_path = os.path.expanduser( + os.environ.get('AGENTS_PATH', '~/.aworld/agents')) + + # Load Built-in agents from plugins using PluginLoader + try: + # Get built-in plugin directories + runtime = CliRuntime() + plugin_dirs = runtime.plugin_dirs + + # Load agents from each plugin using PluginLoader + for plugin_dir in plugin_dirs: + try: + loader = PluginLoader(plugin_dir, console=self.console) + # Load agents from plugin (this also loads skills 
internally) + plugin_agents = await loader.load_agents() + # Mark as Built-in agents + for agent in plugin_agents: + if not hasattr(agent, 'source_type') or not agent.source_type: + agent.source_type = "BUILT-IN" + built_in_agents.extend(plugin_agents) + except Exception as e: + logger.info(f"Failed to load Built-in agents from plugin {plugin_dir.name}: {e}") + import traceback + logger.debug(traceback.format_exc()) + except Exception as e: + logger.info(f"Failed to load Built-in agents from plugins: {e}") + + # Load User agents from AgentScanner default instance + try: + agent_list = await global_agent_registry.list_desc() + for item in agent_list: + # Handle both old format (4-tuple with version) and new format (3-tuple) + if len(item) == 4: + name, desc, path, version = item + else: + name, desc, path = item + version = None + agent_info = AgentInfo( + name=name, + desc=desc, + metadata={"version": version} if version else {}, + source_type="USER", + source_location=base_path + ) + user_agents.append(agent_info) + except Exception as e: + logger.info(f"Failed to load User agents from registry: {e}") + + # Log summary + total_agents = len(built_in_agents) + len(user_agents) + logger.info(f"Loaded {total_agents} agent(s): {len(built_in_agents)} from Built-in plugins, {len(user_agents)} from User registry ({base_path})") + + # Display Built-in agents in a separate table + if built_in_agents: + self.console.print("\n[bold cyan]Built-in Agents:[/bold cyan]") + self.display_agents(built_in_agents, source_type="BUILT-IN") + else: + self.console.print("[dim]No Built-in agents available.[/dim]") + + # Display User agents in a separate table + if user_agents: + self.console.print("\n[bold cyan]User Agents:[/bold cyan]") + self.display_agents(user_agents, source_type="USER", source_location=base_path) + else: + self.console.print("[dim]No User agents available.[/dim]") + + if not built_in_agents and not user_agents: + self.console.print("[yellow]⚠️ No agents 
available.[/yellow]") + except Exception as e: + logger.info(f"Error loading agents: {e}") + import traceback + logger.debug(traceback.format_exc()) + continue + + # Handle visualize command + if user_input.lower() in ("/visualize_trajectory", "visualize_trajectory"): + self._visualize_team(executor_instance) + continue + + # Handle sessions command + if user_input.lower() in ("/sessions", "sessions"): + if executor_instance: + # Debug: Print session related attributes + session_attrs = {k: v for k, v in executor_instance.__dict__.items() if 'session' in k.lower()} + # Also check if context has session info + if hasattr(executor_instance, 'context') and executor_instance.context: + context_session_attrs = {k: v for k, v in executor_instance.context.__dict__.items() if + 'session' in k.lower()} + session_attrs.update({f"context.{k}": v for k, v in context_session_attrs.items()}) + + if session_attrs: + self.console.print(f"[dim]Session Info: {session_attrs}[/dim]") + else: + self.console.print("[yellow]No executor instance available.[/yellow]") + continue # Print agent name before response self.console.print(f"[bold green]{agent_name}[/bold green]:") @@ -377,7 +996,7 @@ async def run_chat_session(self, agent_name: str, executor: Callable[[str], Any] self.console.print("\n[yellow]Session interrupted.[/yellow]") break except Exception as e: - self.console.print(f"[red]An unexpected error occurred:[/red] {e}") + self.console.print(f"[red]An unexpected error occurred:[/red] {e}\n{traceback.format_exc()}") return False diff --git a/aworld-cli/src/aworld_cli/core/agent_registry.py b/aworld-cli/src/aworld_cli/core/agent_registry.py index 6b3b37722..f3ea2f7df 100644 --- a/aworld-cli/src/aworld_cli/core/agent_registry.py +++ b/aworld-cli/src/aworld_cli/core/agent_registry.py @@ -3,14 +3,15 @@ Independent registry system that doesn't depend on aworldappinfra. 
""" import inspect -from typing import Dict, Iterable, Callable, Optional, List, Union, ClassVar, Awaitable, TYPE_CHECKING from threading import RLock +from typing import Dict, Iterable, Callable, Optional, List, Union, ClassVar, Awaitable, TYPE_CHECKING from pydantic import BaseModel, PrivateAttr, Field from aworld.core.agent.swarm import Swarm from aworld.core.context.amni import AmniContextConfig, AmniConfigFactory from aworld.core.context.base import Context +from aworld.logs.util import logger if TYPE_CHECKING: from aworld.core.agent.base import BaseAgent @@ -27,14 +28,14 @@ class LocalAgent(BaseModel): """Represents a local agent configuration with swarm and context components. - + A LocalAgent defines a complete agent setup including: - The swarm (agent group) that executes tasks - Context configuration for managing application state - Metadata for additional agent information - + The swarm can be provided as instances or callables/factories for lazy initialization. - + Example: >>> def build_swarm() -> Swarm: ... return Swarm(agent1, agent2) @@ -43,52 +44,60 @@ class LocalAgent(BaseModel): ... desc="A demo agent", ... swarm=build_swarm, ... context_config=AmniConfigFactory.create(), - ... metadata={"version": "1.0.0"} + ... metadata={"version": "1.0.0"}, + ... unique=True ... ) >>> swarm = await agent.get_swarm() """ - + name: str = None """Agent name identifier. Required for registration.""" - + desc: str = None """Agent description or purpose.""" - + + path: Optional[str] = Field(default=None, description="File path where the agent is defined") + """File path where the @agent decorator is located. + + This is automatically set by the @agent decorator based on the source file location + where the agent is defined. Used for tracking the source file of the agent. 
+ """ + swarm: Union[Swarm, Callable[..., Swarm], Callable[..., Awaitable[Swarm]]] = Field( - default=None, - description="Swarm instance or callable", + default=None, + description="Swarm instance or callable", exclude=True ) - """Swarm instance or callable that returns a Swarm. - + """Swarm instance or callable that returns a Swarm. + Can be: - A Swarm instance - A synchronous callable that takes Context and returns Swarm - An async callable that takes Context and returns Awaitable[Swarm] - + If callable, will be invoked when get_swarm() is called to enable lazy initialization. """ - + context_config: AmniContextConfig = Field( - default_factory=AmniContextConfig, - description="Context config", + default_factory=AmniContextConfig, + description="Context config", exclude=True ) """Configuration for application context management.""" metadata: dict = None """Additional metadata dictionary for agent information (e.g., version, creator, etc.).""" - + hooks: Optional[List[str]] = Field(default=None, description="Executor hooks configuration") """Executor hooks configuration. - + List of hook names (registered with HookFactory). Each hook class must: 1. Inherit from ExecutorHook (or its subclasses like PostBuildContextHook) 2. Implement the point() method to return its hook point 3. Be registered with HookFactory using @HookFactory.register(name="HookName") - + Hooks are automatically grouped by their hook point (returned by hook.point() method). - + Hook points available: - pre_input_parse: Before parsing user input - post_input_parse: After parsing user input (e.g., image processing) @@ -99,7 +108,7 @@ class LocalAgent(BaseModel): - pre_run_task: Before running task - post_run_task: After running task - on_task_error: When task execution fails - + Example: >>> agent = LocalAgent( ... name="MyAgent", @@ -108,6 +117,33 @@ class LocalAgent(BaseModel): ... 
) """ + register_dir: Optional[str] = Field(default=None, description="Directory where agent is registered") + """Directory path where the agent is registered from. + + This is automatically set by the @agent decorator based on the file location + where the agent is defined. Used for filtering agents by source directory. + """ + + unique: bool = Field(default=False, description="Whether this agent should be unique (only one instance allowed globally)") + """Flag indicating whether this agent should be unique in the registry. + + When set to True: + - Only the first registration of an agent with this name will succeed + - Subsequent registration attempts with the same name will be skipped + - Applies to all versions of agents with the same name + + When set to False (default): + - Multiple versions of agents with the same name can coexist + - Registration follows normal multi-version behavior + + Example: + >>> agent = LocalAgent( + ... name="GlobalAgent", + ... unique=True, # Only one GlobalAgent allowed globally + ... ... + ... ) + """ + async def get_swarm(self, context: Context = None) -> Swarm: """Get the Swarm instance, initializing if necessary. @@ -121,22 +157,30 @@ async def get_swarm(self, context: Context = None) -> Swarm: - If the function has no parameters, it will be called without arguments - If context is None and function requires it, it will still be passed (may cause error) + The created Swarm instance is cached in self.swarm after first initialization, + so subsequent calls will return the cached instance directly. + Returns: The Swarm instance for this agent. Example: >>> agent = LocalAgent(swarm=lambda: Swarm(agent1, agent2)) - >>> swarm = await agent.get_swarm() # Swarm is created here + >>> swarm = await agent.get_swarm() # Swarm is created here and cached + >>> swarm2 = await agent.get_swarm() # Returns cached swarm >>> async def build_swarm(ctx: Context) -> Swarm: ... 
return Swarm(agent1, agent2) >>> agent = LocalAgent(swarm=build_swarm) - >>> swarm = await agent.get_swarm(context) + >>> swarm = await agent.get_swarm(context) # Created and cached """ if isinstance(self.swarm, Swarm): + logger.info(f"Using existing swarm for agent {self.name}") return self.swarm if callable(self.swarm): + logger.info(f"Initializing swarm for agent {self.name}") swarm_func = self.swarm + swarm_instance = None + if inspect.iscoroutinefunction(swarm_func): # Async callable sig = inspect.signature(swarm_func) @@ -145,13 +189,13 @@ async def get_swarm(self, context: Context = None) -> Swarm: # Try to call with context if function has parameters if param_count > 0: try: - return await swarm_func(context) + swarm_instance = await swarm_func(context) except TypeError as e: # If context is None and function requires it, try without arguments if "required" in str(e).lower() or "missing" in str(e).lower(): if context is None: try: - return await swarm_func() + swarm_instance = await swarm_func() except Exception as fallback_error: raise else: @@ -163,7 +207,7 @@ async def get_swarm(self, context: Context = None) -> Swarm: else: # Function has no parameters, call without arguments try: - return await swarm_func() + swarm_instance = await swarm_func() except Exception as e: raise else: @@ -174,13 +218,13 @@ async def get_swarm(self, context: Context = None) -> Swarm: # Try to call with context if function has parameters if param_count > 0: try: - return swarm_func(context) + swarm_instance = swarm_func(context) except TypeError as e: # If context is None and function requires it, try without arguments if "required" in str(e).lower() or "missing" in str(e).lower(): if context is None: try: - return swarm_func() + swarm_instance = swarm_func() except Exception as fallback_error: raise else: @@ -192,9 +236,16 @@ async def get_swarm(self, context: Context = None) -> Swarm: else: # Function has no parameters, call without arguments try: - return swarm_func() + 
swarm_instance = swarm_func() except Exception as e: raise + + # Cache the created swarm instance + if swarm_instance is not None: + self.swarm = swarm_instance + logger.info(f"Cached swarm instance for agent {self.name}") + return swarm_instance + return self.swarm model_config = {"arbitrary_types_allowed": True} @@ -295,13 +346,14 @@ def list_agent_names(cls) -> List[str]: return cls.get_instance().list_names() @classmethod - def get_agent(cls, agent_id: str) -> Optional[LocalAgent]: + def get_agent(cls, agent_id: str, version: Optional[str] = None) -> Optional[LocalAgent]: """Get an agent by agent_id using the singleton instance. This is a static class method that delegates to the singleton instance's get method. Args: agent_id: The agent identifier (name) to query. + version: Optional version string (e.g., "v0", "v1"). If not provided, returns the latest version. Returns: The LocalAgent instance if exists, else None. @@ -310,11 +362,16 @@ def get_agent(cls, agent_id: str) -> Optional[LocalAgent]: >>> agent = LocalAgentRegistry.get_agent("demo") >>> if agent: ... print(agent.name) + >>> # Get specific version + >>> agent_v1 = LocalAgentRegistry.get_agent("demo", version="v1") """ - return cls.get_instance().get(agent_id) + return cls.get_instance().get(agent_id, version) def register_agent(self, agent: LocalAgent) -> None: """Register a LocalAgent, requiring a unique non-empty name. + Supports multi-version registration: agents with the same name but different versions can coexist. + + Unique agent handling: If agent.unique is True, only allows one global registration to prevent duplicates. Args: agent: The LocalAgent instance to register. @@ -323,15 +380,55 @@ def register_agent(self, agent: LocalAgent) -> None: None Raises: - ValueError: If name is empty or already exists. + ValueError: If name is empty. 
""" if not agent or not agent.name: raise ValueError("LocalAgent.name is required for registration") + + # Handle unique agents - global singleton behavior + if agent.unique: + with self._lock: + # Check if any agent with this name is already registered (with or without version) + agent_exists = False + existing_agent_key = None + + for key in self._agents: + # Check if key is exactly the agent name or starts with "agent_name:" + if key == agent.name or key.startswith(f"{agent.name}:"): + agent_exists = True + existing_agent_key = key + break + + if agent_exists: + logger.info(f"🔒 Unique agent '{agent.name}' already registered globally as '{existing_agent_key}' - skipping duplicate registration") + return + else: + # Register the first unique agent + self._agents[agent.name] = agent + logger.info(f"✅ Registered unique agent '{agent.name}' for the first time") + return + + # Extract version from metadata or path for non-unique agents + version = None + if agent.metadata and "version" in agent.metadata: + version = agent.metadata["version"] + elif agent.register_dir: + # Try to extract version from directory path (e.g., {name}_v{N}/) + import re + import os + dir_name = os.path.basename(agent.register_dir.rstrip('/')) + match = re.match(r'^(.+)_v(\d+)$', dir_name) + if match: + version = f"v{match.group(2)}" + + # Use name:version as key for multi-version support, or just name if no version + agent_key = f"{agent.name}:{version}" if version else agent.name + with self._lock: - if agent.name in self._agents: - # logger.warning(f"LocalAgent '{agent.name}' is already registered") - return - self._agents[agent.name] = agent + # Allow multiple versions of the same agent name (for non-unique agents) + if agent_key in self._agents: + logger.warning(f"LocalAgent '{agent_key}' is already registered, updating...") + self._agents[agent_key] = agent def upsert(self, agent: LocalAgent) -> None: """Insert or update a LocalAgent. 
@@ -390,11 +487,12 @@ def unregister(self, name: str) -> bool: with self._lock: return self._agents.pop(name, None) is not None - def get(self, name: str) -> Optional[LocalAgent]: - """Get an agent by name. + def get(self, name: str, version: Optional[str] = None) -> Optional[LocalAgent]: + """Get an agent by name, optionally with version. Args: name: Agent name. + version: Optional version string (e.g., "v0", "v1"). If not provided, returns the latest version. Returns: The LocalAgent instance if exists, else None. @@ -402,7 +500,43 @@ def get(self, name: str) -> Optional[LocalAgent]: if not name: return None with self._lock: - return self._agents.get(name) + # If version is specified, try exact match first + if version: + agent_key = f"{name}:{version}" + if agent_key in self._agents: + return self._agents[agent_key] + + # Try direct name match (for backward compatibility) + if name in self._agents: + return self._agents[name] + + # Find all agents with this name (multi-version support) + matching_agents = [] + for key, agent in self._agents.items(): + if key == name or key.startswith(f"{name}:"): + matching_agents.append((key, agent)) + + if not matching_agents: + return None + + # If only one match, return it + if len(matching_agents) == 1: + return matching_agents[0][1] + + # Multiple versions found, return the latest one + # Extract version numbers and sort + def extract_version_from_key(key: str) -> int: + if ':' in key: + version_str = key.split(':', 1)[1] + # Extract version number from "v0", "v1", etc. + import re + match = re.match(r'v(\d+)', version_str) + return int(match.group(1)) if match else 0 + return 0 # No version suffix means v0 + + # Sort by version number (descending) and return the latest + matching_agents.sort(key=lambda x: extract_version_from_key(x[0]), reverse=True) + return matching_agents[0][1] def list(self) -> List[LocalAgent]: """List all agents. 
@@ -414,13 +548,21 @@ def list(self) -> List[LocalAgent]: return list(self._agents.values()) def list_names(self) -> List[str]: - """List all agent names. + """List all agent names (deduplicated, without version suffixes). Returns: - A list of registered agent names. + A list of registered agent names (unique, without version information). """ with self._lock: - return list(self._agents.keys()) + names = set() + for key in self._agents.keys(): + # Extract name from key (remove version suffix if present) + if ':' in key: + name = key.split(':', 1)[0] + else: + name = key + names.add(name) + return sorted(list(names)) def exists(self, name: str) -> bool: """Check if an agent exists by name. @@ -451,29 +593,32 @@ def agent( desc: Optional[str] = None, context_config: Optional[AmniContextConfig] = None, metadata: Optional[dict] = None, - hooks: Optional[List[str]] = None + hooks: Optional[List[str]] = None, + register_dir: Optional[str] = None, + unique: bool = False ) -> Callable: """Decorator for registering LocalAgent instances. - + This decorator provides a convenient way to register LocalAgent instances. It supports two usage patterns: - + 1. Parameterized decorator - decorate a build_swarm function: >>> @agent( ... name="MyAgent", ... desc="My agent description", ... context_config=AmniConfigFactory.create(...), - ... metadata={"version": "1.0.0"} + ... metadata={"version": "1.0.0"}, + ... unique=True ... ) >>> def build_my_swarm() -> Swarm: ... return Swarm(...) - + If the function returns an Agent instance instead of Swarm, it will be automatically wrapped as Swarm(agent): - >>> @agent(name="MyAgent", desc="My agent") + >>> @agent(name="MyAgent", desc="My agent", unique=True) >>> def build_agent() -> Agent: ... return MyAgent(...) # Returns Agent, will be wrapped as Swarm(agent) - + 2. Function decorator - decorate a function that returns LocalAgent: >>> @agent >>> def my_agent() -> LocalAgent: @@ -482,21 +627,27 @@ def agent( ... 
desc="My agent description", ... swarm=build_my_swarm, ... context_config=AmniConfigFactory.create(...), - ... metadata={"version": "1.0.0"} + ... metadata={"version": "1.0.0"}, + ... unique=True ... ) - + Args: name: Agent name identifier. Required when decorating build_swarm function. desc: Agent description or purpose. context_config: Configuration for application context management. metadata: Additional metadata dictionary for agent information. - + hooks: Optional list of hook names (registered with HookFactory). + register_dir: Optional directory path where agent is registered. If not provided, + will be automatically detected from the function's source file location. + unique: Whether this agent should be unique (only one instance allowed globally). + When True, only the first registration will succeed, subsequent ones will be skipped. + Returns: A decorator function that registers the LocalAgent. - + Example: >>> from aworld.core.context.amni.config import AmniConfigLevel - >>> + >>> >>> @agent( ... name="MyAgent", ... desc="My agent description", @@ -504,13 +655,14 @@ def agent( ... AmniConfigLevel.NAVIGATOR, ... debug_mode=True ... ), - ... metadata={"version": "1.0.0"} + ... metadata={"version": "1.0.0"}, + ... unique=True ... ) >>> def build_my_swarm() -> Swarm: ... return Swarm(...) - + Example with Agent return type (automatically wrapped): - >>> @agent(name="SingleAgent", desc="Single agent") + >>> @agent(name="SingleAgent", desc="Single agent", unique=True) >>> def build_my_agent() -> Agent: ... return MyAgent(...) 
# Automatically wrapped as Swarm(agent) """ @@ -518,10 +670,29 @@ def agent( if callable(name): func = name # Function decorator: @agent (without parameters) + # Try to get register_dir and path from function's source file + func_register_dir = None + func_path = None + try: + source_file = inspect.getsourcefile(func) + if source_file: + from pathlib import Path + func_register_dir = str(Path(source_file).parent.resolve()) + func_path = str(Path(source_file).resolve()) + except Exception: + pass + if inspect.iscoroutinefunction(func): async def async_wrapper(*args, **kwargs): result = await func(*args, **kwargs) if isinstance(result, LocalAgent): + # Set register_dir if not already set + if not result.register_dir and func_register_dir: + result.register_dir = func_register_dir + # Set path if not already set + if not result.path and func_path: + result.path = func_path + logger.info(f"Registering agent: {result.name}") LocalAgentRegistry.register(result) return result return async_wrapper @@ -529,6 +700,13 @@ async def async_wrapper(*args, **kwargs): def sync_wrapper(*args, **kwargs): result = func(*args, **kwargs) if isinstance(result, LocalAgent): + # Set register_dir if not already set + if not result.register_dir and func_register_dir: + result.register_dir = func_register_dir + # Set path if not already set + if not result.path and func_path: + result.path = func_path + logger.info(f"Registering agent: {result.name}") LocalAgentRegistry.register(result) return result return sync_wrapper @@ -538,6 +716,28 @@ def decorator(func: Callable) -> Callable: if not name: raise ValueError("name is required when using @agent decorator with parameters") + # Get register_dir and path from function's source file if not explicitly provided + func_register_dir = register_dir + func_path = None + if not func_register_dir: + try: + source_file = inspect.getsourcefile(func) + if source_file: + from pathlib import Path + func_register_dir = 
str(Path(source_file).parent.resolve()) + func_path = str(Path(source_file).resolve()) + except Exception: + pass + else: + # If register_dir is provided, try to get path from function's source file + try: + source_file = inspect.getsourcefile(func) + if source_file: + from pathlib import Path + func_path = str(Path(source_file).resolve()) + except Exception: + pass + # Create a wrapper function that checks return type and wraps Agent to Swarm if needed # Preserve the original function signature using functools.wraps import functools @@ -567,8 +767,14 @@ def swarm_wrapper(*args, **kwargs): swarm=swarm_wrapper, # Use the wrapper function as swarm factory context_config=context_config or AmniConfigFactory.create(), metadata=metadata or {"creator": "aworld-cli", "version": "1.0.0"}, - hooks=hooks + hooks=hooks, + register_dir=func_register_dir, + path=func_path, + unique=unique ) + + logger.info(f"Registering agent: {local_agent.name}") + LocalAgentRegistry.register(local_agent) # Return the wrapper function diff --git a/aworld-cli/src/aworld_cli/core/agent_registry_tool.py b/aworld-cli/src/aworld_cli/core/agent_registry_tool.py new file mode 100644 index 000000000..b9d0e6917 --- /dev/null +++ b/aworld-cli/src/aworld_cli/core/agent_registry_tool.py @@ -0,0 +1,544 @@ +# coding: utf-8 +# Copyright (c) 2025 inclusionAI. 
+import inspect +import traceback +from typing import Any, Dict, List, Tuple + +from aworld.config import ToolConfig +from aworld.core.agent.base import AgentFactory +from aworld.core.common import Observation, ActionModel, ActionResult, ToolActionInfo, ParamInfo +from aworld.core.context.amni import AmniContext +from aworld.core.event.base import Message +from aworld.core.tool.action import ToolAction +from aworld.core.tool.base import ToolFactory, AsyncTool +from aworld.logs.util import logger +from aworld.tools.utils import build_observation + +AGENT_REGISTRY = "AGENT_REGISTRY" + + +class ContextAgentRegistryAction(ToolAction): + """Agent Registry Support. Definition of Context agent registry operations.""" + + LIST_DESC = ToolActionInfo( + name="list_desc", + input_params={ + "source_type": ParamInfo( + name="source_type", + type="string", + required=False, + desc="Type of resources to list: 'built-in' for plugin agents/skills, 'user' for user-registered agents (default: 'user')" + ) + }, + desc="List all available resources with their descriptions in the registry" + ) + + DYNAMIC_REGISTER = ToolActionInfo( + name="dynamic_register", + input_params={ + "local_agent_name": ParamInfo( + name="local_agent_name", + type="string", + required=True, + desc="The name of the local agent in LocalAgentRegistry" + ), + "register_agent_name": ParamInfo( + name="register_agent_name", + type="string", + required=True, + desc="The name of the agent in AgentScanner to register" + ) + }, + desc="Dynamically register an agent from AgentScanner to a local agent's team_swarm" + ) + + +def find_from_agent_factory_by_name(to_find_agent_name: str): + logger.info(f"AgentFactory._agent_instance.values(): {AgentFactory._agent_instance.values()} {to_find_agent_name}") + # Find agent with the same name from AgentFactory + for agent in list(AgentFactory._agent_instance.values()): + if agent.name() == to_find_agent_name: + factory_agent = AgentFactory._agent_instance[agent.id()] + 
logger.info(f"Found agent '{factory_agent.id()}' (name: '{to_find_agent_name}') from AgentFactory") + return factory_agent + return None + + +def get_directory_structure(directory_path, max_depth=5, current_depth=0): + """ + Get the file structure of a directory recursively. + + Args: + directory_path: Path to the directory + max_depth: Maximum depth to traverse (default: 5) + current_depth: Current depth in recursion + + Returns: + List of strings representing the file structure + """ + from pathlib import Path + + structure = [] + try: + dir_path = Path(directory_path) + if not dir_path.exists() or not dir_path.is_dir(): + return structure + + if current_depth >= max_depth: + return structure + + # Sort entries: directories first, then files + entries = sorted(dir_path.iterdir(), key=lambda x: (x.is_file(), x.name)) + + for entry in entries: + # Skip hidden files and directories + if entry.name.startswith('.'): + continue + + # Calculate indentation based on depth + indent = " " * current_depth + if entry.is_dir(): + structure.append(f"{indent}{entry.name}/") + # Recursively get subdirectory structure + sub_structure = get_directory_structure(entry, max_depth, current_depth + 1) + structure.extend(sub_structure) + else: + structure.append(f"{indent}{entry.name}") + + except Exception as e: + logger.warning(f"Failed to get directory structure for {directory_path}: {e}") + + return structure + +async def list_built_in_resources() -> List[tuple]: + """ + List all built-in resources (agents and skills) from plugins. 
+ + Returns: + List of tuples: + - For agents: (name, desc, path) + - For skills: (name, desc, path, file_structure) where file_structure is a string + containing the directory structure of the skill + """ + resources_with_desc = [] + + try: + from pathlib import Path + from ..core.plugin_manager import PluginManager + from ..core.agent_registry import LocalAgentRegistry + + # Get all plugin directories (built-in and installed) + plugin_dirs = [] + + # Get built-in plugins (inner_plugins) + import pathlib + current_dir = pathlib.Path(__file__).parent.parent + inner_plugins_dir = current_dir / "inner_plugins" + + if inner_plugins_dir.exists() and inner_plugins_dir.is_dir(): + for plugin_dir in inner_plugins_dir.iterdir(): + if plugin_dir.is_dir(): + plugin_dirs.append(plugin_dir) + + # Get installed plugins + try: + plugin_manager = PluginManager() + installed_plugin_dirs = plugin_manager.get_plugin_dirs() + # Convert agent dirs back to plugin dirs (parent directory) + for agent_dir in installed_plugin_dirs: + plugin_dir = agent_dir.parent + if plugin_dir not in plugin_dirs: + plugin_dirs.append(plugin_dir) + except Exception: + pass + + # Get agents from plugins + try: + local_agents = LocalAgentRegistry.list_agents() + for local_agent in local_agents: + if local_agent.name and local_agent.register_dir: + register_dir_path = Path(local_agent.register_dir) + # Check if agent is from a plugin directory + is_from_plugin = False + for plugin_dir in plugin_dirs: + try: + # Try is_relative_to (Python 3.9+) + if hasattr(register_dir_path, 'is_relative_to') and register_dir_path.is_relative_to(plugin_dir): + is_from_plugin = True + break + except (AttributeError, ValueError): + # Fallback: check if plugin_dir is a parent of register_dir_path + try: + register_dir_path.resolve().relative_to(plugin_dir.resolve()) + is_from_plugin = True + break + except ValueError: + # Not relative, continue checking + pass + + if is_from_plugin: + desc = local_agent.desc or "No description" 
+ path = local_agent.path or str(register_dir_path) + resources_with_desc.append((local_agent.name, desc, path)) + except Exception as e: + logger.warning(f"Failed to get agents from plugins: {e}") + + # Get skills from plugins + for plugin_dir in plugin_dirs: + skills_dir = plugin_dir / "skills" + if not skills_dir.exists() or not skills_dir.is_dir(): + continue + + try: + for subdir in skills_dir.iterdir(): + if not subdir.is_dir(): + continue + + # Check if directory contains SKILL.md file + skill_md_file = subdir / "SKILL.md" + if skill_md_file.exists() and skill_md_file.is_file(): + skill_name = subdir.name + # Try to read description from SKILL.md + try: + with open(skill_md_file, 'r', encoding='utf-8') as f: + content = f.read() + # Try to extract description from markdown + lines = content.split('\n') + desc = "No description" + + # First, look for "description:" prefix line + for line in lines: + line_stripped = line.strip() + if line_stripped.lower().startswith('description:'): + # Extract description after "description:" prefix + desc = line_stripped[len('description:'):].strip()[:200] + break + + # If not found, fall back to original logic + if desc == "No description": + for i, line in enumerate(lines): + if line.strip().startswith('#'): + # Found title, next non-empty line might be description + if i + 1 < len(lines) and lines[i + 1].strip(): + desc = lines[i + 1].strip()[:200] # Limit description length + break + else: + desc = line.strip('#').strip()[:200] + break + if desc == "No description" and lines: + # Use first non-empty line as description + for line in lines: + if line.strip() and not line.strip().startswith('#'): + desc = line.strip()[:200] + break + except Exception: + desc = "No description" + + # Get file structure of the skill directory + file_structure = get_directory_structure(subdir) + file_structure_str = "\n".join(file_structure) if file_structure else "" + + path = str(skill_md_file) + resources_with_desc.append((skill_name, 
desc, path, file_structure_str)) + except Exception as e: + logger.warning(f"Failed to get skills from plugin {plugin_dir}: {e}") + + except Exception as e: + logger.error(f"Failed to list built-in resources: {e} {traceback.format_exc()}") + + return resources_with_desc + + +async def dynamic_register(local_agent_name: str, register_agent_name: str, context=None) -> bool: + """ + Dynamically register an agent from AgentScanner to a local agent's team_swarm. + + Args: + local_agent_name: Name of the local agent in LocalAgentRegistry + register_agent_name: Name of the agent in AgentScanner to register + context: Optional context for AgentScanner (if None, uses default context) + + Returns: + True if registration succeeds; failures raise ValueError instead of returning False + + Raises: + ValueError: With detailed error message if any step fails + + Process: + 1. Load register_agent_name from AgentScanner + 2. Get local_agent_name from LocalAgentRegistry and read its team_swarm + 3. Add register_agent_name to local_agent_name's team_swarm + 4. Refresh the root agent's tools cache + """ + try: + from aworld_cli.core.agent_registry import LocalAgentRegistry + from aworld_cli.core.agent_scanner import AgentScanner + + # Step 1: Get register_agent_name from AgentScanner + if context is None: + from aworld_cli.core.agent_scanner import DefaultContext + context = DefaultContext() + + agent_scanner = AgentScanner(context) + + # Load the agent; load_agent returns None if it is missing or fails to load + register_agent = await agent_scanner.load_agent(agent_name=register_agent_name) + logger.info(f"register_agent: {register_agent.id()}, {inspect.getfile(register_agent.__class__)}" if register_agent else "register_agent: None") + if not register_agent: + error_msg = ( + f"Agent '{register_agent_name}' could not be found or loaded by AgentScanner. " + f"This may indicate a problem with the agent file (e.g., syntax error, missing dependencies, " + f"or invalid agent definition). Please check the agent file and ensure it is valid." 
+ ) + logger.error(error_msg) + raise ValueError(error_msg) + + # Step 2: Get local_agent_name from LocalAgentRegistry + local_agent = LocalAgentRegistry.get_agent(local_agent_name) + logger.info(f"local_agent: {local_agent}") + if not local_agent: + # Get available agent names for better error message + available_agents = LocalAgentRegistry.list_agent_names() + available_list = ", ".join(available_agents) if available_agents else "none" + error_msg = ( + f"Local agent '{local_agent_name}' not found in LocalAgentRegistry. " + f"Available agents: {available_list}. " + f"Please check the agent name and ensure it is registered in LocalAgentRegistry." + ) + logger.error(error_msg) + raise ValueError(error_msg) + swarm = await local_agent.get_swarm(context=context) + # swarm = context.swarm + logger.info(f"local_agent|swarm: {swarm} {swarm.agents if swarm else None}") + if not swarm: + error_msg = ( + f"Failed to get swarm for local agent '{local_agent_name}'. " + f"The agent exists but its swarm could not be initialized. " + f"Please check the agent's swarm configuration." 
+ ) + logger.error(error_msg) + raise ValueError(error_msg) + + + # Step 3: Add register_agent_name to local_agent_name's team_swarm + origin_agent = find_from_agent_factory_by_name(register_agent_name) + swarm.add_agents(agents=[register_agent], to_remove_agents=[origin_agent]) + + + # Step 4: Refresh the root agent's tools cache to include the newly registered agent + # The root agent's handoffs have been updated, but its tools cache needs to be refreshed + try: + if swarm.agent_graph and swarm.agent_graph.root_agent: + root_agent = swarm.agent_graph.root_agent + if isinstance(root_agent, list): + root_agent = root_agent[0] + logger.info(f"root_agent: {root_agent.id()}") + + root_agent_name = root_agent.name() + + # Find the agent with the same name from AgentFactory + from aworld.core.agent.base import AgentFactory + import re + + # find agent from factory + factory_agent = find_from_agent_factory_by_name(root_agent_name) + logger.info(f"factory_agent: {factory_agent.id() if factory_agent else None}") + + # Use factory agent if found, otherwise use root_agent + agent_to_refresh = factory_agent if factory_agent else root_agent + agent_source = "AgentFactory" if factory_agent else "swarm.agent_graph.root_agent" + + # Clear the tools cache so it will be regenerated with the new agent in handoffs + agent_to_refresh.tools = [] + logger.info(f"Cleared tools cache for agent '{agent_to_refresh.id()}' (from {agent_source}) to force refresh") + + # Try to refresh tools immediately if context is ApplicationContext + # Note: context parameter might be DefaultContext for AgentVersionControlRegistry, + # which is not compatible with async_desc_transform, so we check the type + from aworld.core.context.amni import ApplicationContext + if context and isinstance(context, ApplicationContext): + try: + await agent_to_refresh.async_desc_transform(context) + logger.info(f'agent_to_refresh2: {agent_to_refresh.id()} tools: {agent_to_refresh.tools}') + logger.info(f"Refreshed tools for agent 
'{agent_to_refresh.id()}' (from {agent_source}) with new agent '{register_agent.id()}'") + except Exception as e: + logger.warning(f"Failed to refresh tools for agent '{agent_to_refresh.id()}' (from {agent_source}): {e}. Tools will be regenerated on next use.") + else: + logger.info(f"Tools cache cleared for agent '{agent_to_refresh.id()}' (from {agent_source}). Tools will be regenerated on next use when context is available.") + except Exception as e: + logger.warning(f"Failed to refresh root agent tools cache: {e}. Tools will be regenerated on next use.") + + + logger.info(f"Successfully added agent '{register_agent_name}' to local agent '{local_agent_name}'s team_swarm") + return True + + except ValueError: + # Re-raise ValueError with detailed messages + raise + except Exception as e: + error_msg = ( + f"Unexpected error in dynamic_register: {str(e)}. " + f"local_agent_name='{local_agent_name}', register_agent_name='{register_agent_name}'. " + f"Please check the logs for more details." 
+ ) + logger.error(f"Error in dynamic_register: {traceback.format_exc()}") + raise ValueError(error_msg) from e + + +@ToolFactory.register(name=AGENT_REGISTRY, + desc=AGENT_REGISTRY, + supported_action=ContextAgentRegistryAction) +class ContextAgentRegistryTool(AsyncTool): + def __init__(self, conf: ToolConfig, **kwargs) -> None: + super(ContextAgentRegistryTool, self).__init__(conf, **kwargs) + self.cur_observation = None + self.content = None + self.keyframes = [] + self.init() + self.step_finished = True + + async def reset(self, *, seed: int | None = None, options: Dict[str, str] | None = None) -> Tuple[ + Observation, dict[str, Any]]: + await super().reset(seed=seed, options=options) + + await self.close() + self.step_finished = True + return build_observation(observer=self.name(), + ability=ContextAgentRegistryAction.LIST_DESC.value.name), {} + + def init(self) -> None: + self.initialized = True + + async def close(self) -> None: + pass + + async def finished(self) -> bool: + return self.step_finished + + async def do_step(self, actions: list[ActionModel], message: Message = None, **kwargs) -> Tuple[ + Observation, float, bool, bool, Dict[str, Any]]: + self.step_finished = False + reward = 0. 
+ fail_error = "" + action_results = [] + info = {} + + try: + if not actions: + raise ValueError("actions is empty") + if not isinstance(message.context, AmniContext): + raise ValueError("context is not AmniContext") + + # Get agent registry service from context + from aworld_cli.core.agent_scanner import AgentScanner + + def get_agent_registry_service() -> AgentScanner: + context = message.context._context if hasattr(message.context, '_context') else message.context + return AgentScanner(context) + + for action in actions: + logger.info(f"ContextAgentRegistryTool|do_step: {action}") + action_name = action.action_name + action_result = ActionResult(action_name=action_name, tool_name=self.name()) + + try: + if action_name == ContextAgentRegistryAction.LIST_DESC.value.name: + source_type = action.params.get("source_type", "user") + + if source_type == "built-in": + # Query built-in resources from plugin_manager + resources_with_desc = await list_built_in_resources() + else: + # Query user resources from AgentScanner (default) + service = get_agent_registry_service() + resources_with_desc = await service.list_desc() + + action_result.success = True + if resources_with_desc: + # Handle multiple formats: + # - 3-tuple: (name, desc, path) + # - 4-tuple: (name, desc, path, version) or (name, desc, path, file_structure) + desc_lines = [] + for item in resources_with_desc: + if len(item) == 4: + name, desc, path, fourth_field = item + # Check if fourth field is file structure (contains newlines) or version + if isinstance(fourth_field, str) and '\n' in fourth_field: + # It's a file structure - indent each line for better readability + indented_structure = '\n'.join(' ' + line for line in fourth_field.split('\n') if line.strip()) + desc_lines.append(f"- {name}: {desc}\n Path: {path}\n File Structure:\n{indented_structure}") + else: + # It's a version + desc_lines.append(f"- {name}: {desc}\n Path: {path}\n Version: {fourth_field}") + else: + # Backward compatibility with old 
format + name, desc, path = item[:3] + desc_lines.append(f"- {name}: {desc}\n Path: {path}") + + source_label = "Built-in" if source_type == "built-in" else "User" + action_result.content = f"Available {source_label} resources with descriptions:\n" + "\n".join(desc_lines) + else: + source_label = "built-in" if source_type == "built-in" else "user" + action_result.content = f"No {source_label} resources found" + + elif action_name == ContextAgentRegistryAction.DYNAMIC_REGISTER.value.name: + local_agent_name = action.params.get("local_agent_name", "") + register_agent_name = action.params.get("register_agent_name", "") + + if not local_agent_name: + raise ValueError("local_agent_name is required") + if not register_agent_name: + raise ValueError("register_agent_name is required") + + # Get context for dynamic_register + context = message.context._context if hasattr(message.context, '_context') else message.context + try: + success = await dynamic_register( + local_agent_name=local_agent_name, + register_agent_name=register_agent_name, + context=context + ) + + if success: + action_result.success = True + action_result.content = f"Successfully registered agent '{register_agent_name}' to local agent '{local_agent_name}'s team_swarm" + else: + raise ValueError( + f"Failed to register agent '{register_agent_name}' to local agent '{local_agent_name}'") + except ValueError as ve: + # Re-raise ValueError with detailed error message + raise ve + + else: + raise ValueError(f"Unknown action: {action_name}") + + except Exception as e: + action_result.success = False + action_result.error = str(e) + fail_error = str(e) + reward = -1.0 + + action_results.append(action_result) + + except Exception as e: + logger.error(f"ContextAgentRegistryTool|do_step error: {traceback.format_exc()}") + fail_error = str(e) + reward = -1.0 + # Create failed action results for all actions + for action in actions: + action_result = ActionResult( + action_name=action.action_name, + 
tool_name=self.name(), + success=False, + error=str(e) + ) + action_results.append(action_result) + + observation = build_observation( + observer=self.name(), + ability=action_results[-1].action_name if action_results else "", + action_result=action_results + ) + + self.step_finished = True + return (observation, reward, len(fail_error) > 0, len(fail_error) > 0, info) diff --git a/aworld-cli/src/aworld_cli/core/agent_scanner.py b/aworld-cli/src/aworld_cli/core/agent_scanner.py new file mode 100644 index 000000000..c35bc7b4e --- /dev/null +++ b/aworld-cli/src/aworld_cli/core/agent_scanner.py @@ -0,0 +1,324 @@ +# coding: utf-8 +# Copyright (c) 2025 inclusionAI. +import os +import re +import sys +import traceback +from pathlib import Path +from threading import RLock +from typing import Optional, Dict, List + +from aworld.agents.llm_agent import Agent +from aworld.core.context.amni import DirArtifact +from aworld_cli.core.scanner import Scanner +from aworld.logs.util import logger +from aworld.output.artifact import ArtifactAttachment + +class AgentCodeScanner(Scanner): + """Scanner for Python code agents with @agent decorator.""" + + def __init__(self, context): + Scanner.__init__(self, context) + self._lock = RLock() + + + def _has_agent_decorator_fast(self, file_path: Path) -> bool: + """Check if a Python file contains @agent decorator.""" + try: + with open(file_path, 'r', encoding='utf-8') as f: + for line in f: + if '@agent' in line: + return True + return False + except Exception: + return False + + def _matches_file(self, attachment: ArtifactAttachment, name: str, suffix: str, base_path: str = None) -> bool: + """Check if an attachment matches. 
For .py files, also checks for @agent decorator.""" + if suffix != ".py": + return super()._matches_file(attachment, name, suffix, base_path) + + if not super()._matches_file(attachment, name, suffix, base_path): + return False + + if base_path: + file_path = Path(base_path) / attachment.path + if file_path.exists(): + return self._has_agent_decorator_fast(file_path) + + return False + + def _scan_files_by_suffix(self, suffix: str) -> List[str]: + """Scan files by suffix. For .py files, only includes files with @agent decorator.""" + return super()._scan_files_by_suffix(suffix) + + async def get_agent_from_base_path( + self, + base_path: str, + agent_name: str, + storage_type: str = "local", + oss_config: Optional[Dict[str, str]] = None + ) -> Optional[Agent]: + """Load Python agent from base_path.""" + try: + import importlib.util + + if storage_type == 'oss' and oss_config: + dir_artifact = DirArtifact.with_oss_repository( + access_key_id=oss_config.get('access_key_id'), + access_key_secret=oss_config.get('access_key_secret'), + endpoint=oss_config.get('endpoint'), + bucket_name=oss_config.get('bucket_name'), + base_path=base_path + ) + else: + dir_artifact = DirArtifact.with_local_repository(base_path) + + attachment = await Scanner.resolve_resource_from_artifact( + dir_artifact=dir_artifact, + name=agent_name, + suffix=".py" + ) + + if not attachment: + return None + + file_path = Path(dir_artifact.base_path) / attachment.path + + if not file_path.exists(): + logger.error(f"Python agent file not found: {file_path}") + return None + + if not self._has_agent_decorator_fast(file_path): + return None + + module_name = file_path.stem + spec = importlib.util.spec_from_file_location(module_name, file_path) + if spec is None or spec.loader is None: + logger.error(f"Could not create spec for {file_path}") + return None + + module = importlib.util.module_from_spec(spec) + spec.loader.exec_module(module) + + from aworld_cli.core.agent_registry import LocalAgentRegistry + 
local_agent = LocalAgentRegistry.get_agent(agent_name) + + if not local_agent: + return None + + swarm = await local_agent.get_swarm() + if swarm and swarm.agents: + agent_id, agent = next(iter(swarm.agents.items())) + return agent + + return None + + except Exception as e: + logger.error(f"Failed to load Python agent from base_path {base_path}: {e} {traceback.format_exc()}") + return None + + + async def list_as_source(self) -> List[str]: + """List all available resources by scanning .py files (with @agent decorator).""" + return self._scan_files_by_suffix(".py") + + async def load_as_source(self, name: str) -> Optional[str]: + """Load resource as source content.""" + return await self._load_as_source_by_suffix(name=name, suffix=".py") + + async def list_desc(self) -> List[tuple]: + """List all available resources with their descriptions and paths.""" + resources = self._scan_files_by_suffix(".py") + resources_with_desc = [] + + from aworld_cli.core.agent_registry import LocalAgentRegistry + import os + base_path = os.path.expanduser(os.environ.get('AGENTS_PATH', '~/.aworld/agents')) + local_agents_dict = {} + local_agents_by_dir = {} + try: + local_agents = LocalAgentRegistry.list_agents() + for local_agent in local_agents: + if local_agent.name: + local_agents_dict[local_agent.name] = local_agent + if local_agent.register_dir: + dir_name = os.path.basename(local_agent.register_dir.rstrip('/')) + if dir_name: + local_agents_by_dir[dir_name] = local_agent + resource_dir = os.path.join(base_path, local_agent.name) + if os.path.exists(resource_dir): + local_agents_by_dir[local_agent.name] = local_agent + except Exception: + pass + + for name in resources: + try: + desc = None + path = None + local_agent = None + + if name in local_agents_dict: + local_agent = local_agents_dict[name] + desc = local_agent.desc or "No description" + path = local_agent.path or "Unknown path" + elif name in local_agents_by_dir: + local_agent = local_agents_by_dir[name] + desc = 
local_agent.desc or "No description" + path = local_agent.path or "Unknown path" + + if not desc: + agent = await self.load_agent(agent_name=name) + if agent: + desc = agent.desc() or "No description" + else: + desc = "No description" + + if not path: + path = "Unknown path" + + resources_with_desc.append((name, desc, path)) + except Exception as e: + logger.warning(f"Failed to get description for {name}: {e} {traceback.format_exc()}") + resources_with_desc.append((name, "No description", "Unknown path")) + + return resources_with_desc + + async def load_agent(self, agent_name: str) -> Optional[Agent]: + base_path = os.path.expanduser(os.environ.get('AGENTS_PATH', '~/.aworld/agents')) + storage_type = self._get_storage_type() + oss_config = self._get_oss_config() + + return await self.get_agent_from_base_path( + base_path=base_path, + agent_name=agent_name, + storage_type=storage_type, + oss_config=oss_config + ) + + + +class AgentScanner(Scanner): + """Unified scanner that combines both DSL (markdown) and Code (Python) agents.""" + + def __init__(self, context): + Scanner.__init__(self, context) + self._lock = RLock() + + agents_path_env = os.environ.get('AGENTS_PATH', '~/.aworld/agents') + base_paths = [os.path.expanduser(p.strip()) for p in agents_path_env.split(':') if p.strip()] + + if not base_paths: + base_paths = [os.path.expanduser('~/.aworld/agents')] + + self._base_paths = base_paths + + self._code_registries = [] + for base_path in base_paths: + original_path = os.environ.get('AGENTS_PATH') + try: + os.environ['AGENTS_PATH'] = base_path + code_registry = AgentCodeScanner(context) + self._code_registries.append((base_path, code_registry)) + finally: + if original_path: + os.environ['AGENTS_PATH'] = original_path + else: + os.environ['AGENTS_PATH'] = base_paths[0] + + for base_path in base_paths: + if base_path not in sys.path: + sys.path.insert(0, base_path) + + logger.debug(f"AGENTS_PATH: {':'.join(base_paths)}") + + + async def list_as_source(self) -> 
List[str]: + """List all available resources from all configured directories.""" + all_resources = set() + for base_path, code_registry in self._code_registries: + try: + original_path = os.environ.get('AGENTS_PATH') + try: + os.environ['AGENTS_PATH'] = base_path + resources = await code_registry.list_as_source() + all_resources.update(resources) + finally: + if original_path: + os.environ['AGENTS_PATH'] = original_path + else: + os.environ['AGENTS_PATH'] = base_path + except Exception as e: + logger.warning(f"Failed to scan agents from {base_path}: {e}") + + return sorted(list(all_resources)) + + async def list_desc(self) -> List[tuple]: + """List all available resources with their descriptions and paths.""" + all_descriptions = {} + for base_path, code_registry in self._code_registries: + try: + original_path = os.environ.get('AGENTS_PATH') + try: + os.environ['AGENTS_PATH'] = base_path + descriptions = await code_registry.list_desc() + for name, desc, path in descriptions: + if name not in all_descriptions: + all_descriptions[name] = (name, desc, path) + finally: + if original_path: + os.environ['AGENTS_PATH'] = original_path + else: + os.environ['AGENTS_PATH'] = base_path + except Exception as e: + logger.warning(f"Failed to get descriptions from {base_path}: {e}") + + return sorted(all_descriptions.values()) + + async def load_agent(self, agent_name: str) -> Optional[Agent]: + """Load agent by trying all configured directories.""" + for base_path, code_registry in self._code_registries: + try: + original_path = os.environ.get('AGENTS_PATH') + try: + os.environ['AGENTS_PATH'] = base_path + agent = await code_registry.load_agent(agent_name) + if agent: + return agent + finally: + if original_path: + os.environ['AGENTS_PATH'] = original_path + else: + os.environ['AGENTS_PATH'] = base_path + except Exception as e: + logger.warning(f"Failed to load agent {agent_name} from {base_path}: {e}") + continue + + return None + + async def load_as_source(self, name: str) -> 
Optional[str]: + """Load resource as source content from all configured directories.""" + for base_path, code_registry in self._code_registries: + try: + original_path = os.environ.get('AGENTS_PATH') + try: + os.environ['AGENTS_PATH'] = base_path + content = await code_registry.load_as_source(name) + if content: + return content + finally: + if original_path: + os.environ['AGENTS_PATH'] = original_path + else: + os.environ['AGENTS_PATH'] = base_path + except Exception as e: + logger.warning(f"Failed to load source {name} from {base_path}: {e}") + continue + + return None + + +class DefaultContext: + """Default context for AgentScanner when no context is provided.""" + +global_agent_registry = AgentScanner(DefaultContext()) diff --git a/aworld-cli/src/aworld_cli/core/loader.py b/aworld-cli/src/aworld_cli/core/loader.py index 6e7f0285a..383a94b79 100644 --- a/aworld-cli/src/aworld_cli/core/loader.py +++ b/aworld-cli/src/aworld_cli/core/loader.py @@ -1,11 +1,12 @@ """ Agent loader for scanning and loading agents from local directories. 
""" +import importlib.util import os import sys -import importlib.util from pathlib import Path -from typing import Union, List, Optional +from typing import Union, Optional +from aworld.logs.util import logger def _has_agent_decorator(file_path: Path) -> bool: @@ -71,22 +72,31 @@ def init_agents(agents_dir: Union[str, Path] = None) -> None: try: LocalAgentRegistry.register(agent) markdown_loaded_count += 1 - console.print(f"[dim]✅ Loaded markdown agent: {agent.name}[/dim]") + # Get file path from metadata if available + file_path = agent.metadata.get("file_path", "unknown") if agent.metadata else "unknown" + logger.info(f"[dim]✅ Loaded markdown agent: {agent.name} from {file_path}[/dim]") except Exception as e: markdown_failed_count += 1 console.print(f"[dim]❌ Failed to register markdown agent {agent.name}: {e}[/dim]") - # Find all Python files recursively, excluding __init__.py and private modules + # Find all Python files recursively, excluding __init__.py, private modules, and plugin_manager all_python_files = [ f for f in agents_dir.rglob("*.py") - if f.name != "__init__.py" and not f.name.startswith("_") + if f.name != "__init__.py" + and not f.name.startswith("_") + and "plugin_manager" not in str(f.relative_to(agents_dir)) + and f.name != "plugin_manager.py" ] # Filter files that contain @agent decorator python_files = [f for f in all_python_files if _has_agent_decorator(f)] if all_python_files else [] if all_python_files: - console.print(f"[dim]🔍 Found {len(all_python_files)} Python file(s), {len(python_files)} with @agent decorator[/dim]") + logger.info(f"[dim]🔍 Found {len(all_python_files)} Python file(s), {len(python_files)} with @agent decorator[/dim]") + if python_files: + logger.info(f"[dim] Files with @agent decorator:[/dim]") + for py_file in python_files: + logger.info(f"[dim] • {py_file.relative_to(agents_dir) if agents_dir.exists() else py_file}[/dim]") elif markdown_agents: console.print(f"[dim]🔍 Found {len(markdown_agents)} markdown agent 
file(s)[/dim]") @@ -133,45 +143,63 @@ def init_agents(agents_dir: Union[str, Path] = None) -> None: except ValueError: # If file is not relative to project root, use absolute path rel_path = py_file - + # Convert path to module name (e.g., agents/my_agent.py -> agents.my_agent) module_parts = list(rel_path.parts[:-1]) + [rel_path.stem] module_name = '.'.join(module_parts) - + # Skip if module name starts with a number (invalid Python module name) if module_name and module_name[0].isdigit(): console.print(f"[dim]⚠️ Skipping invalid module name: {module_name}[/dim]") continue - + # Use importlib to load the module spec = importlib.util.spec_from_file_location(module_name, py_file) if spec is None or spec.loader is None: - console.print(f"[dim]⚠️ Could not create spec for {py_file}[/dim]") + file_path = str(py_file.resolve()) + logger.info(f"[dim]⚠️ Could not create spec for {file_path}[/dim]") failed_count += 1 failed_files.append((str(py_file), "Could not create module spec")) continue - + module = importlib.util.module_from_spec(spec) - + # Execute the module to trigger decorator registration # Note: We don't use Status here because the module execution might create its own Status # which would conflict with "Only one live display may be active at once" try: + # Get agent count before loading + agents_before = len(LocalAgentRegistry.list_agents()) spec.loader.exec_module(module) loaded_count += 1 - console.print(f"[dim]✅ Loaded agent from: {py_file.name}[/dim]") + # Get agent count after loading + agents_after = len(LocalAgentRegistry.list_agents()) + agents_registered = agents_after - agents_before + file_path = str(py_file.resolve()) + if agents_registered > 0: + # Get the names of newly registered agents + all_agents = LocalAgentRegistry.list_agents() + new_agents = all_agents[-agents_registered:] if agents_registered > 0 else [] + agent_names = [a.name for a in new_agents] + logger.info(f"[dim]✅ Loaded {agents_registered} agent(s) from: {file_path}[/dim]") + 
for agent_name in agent_names: + logger.info(f"[dim] • Registered agent: {agent_name}[/dim]") + else: + logger.info(f"[dim]✅ Loaded module (no new agents registered): {file_path}[/dim]") except Exception as import_error: failed_count += 1 error_msg = str(import_error) failed_files.append((str(py_file), error_msg)) - console.print(f"[dim]❌ Failed to load {py_file.name}: {error_msg}[/dim]") + file_path = str(py_file.resolve()) + logger.info(f"[dim]❌ Failed to load {file_path}: {error_msg}[/dim]") continue - + except Exception as e: failed_count += 1 error_msg = str(e) failed_files.append((str(py_file), error_msg)) - console.print(f"[dim]❌ Error processing {py_file}: {error_msg}[/dim]") + file_path = str(py_file.resolve()) + logger.info(f"[dim]❌ Error processing {file_path}: {error_msg}[/dim]") continue # Summary @@ -179,7 +207,7 @@ def init_agents(agents_dir: Union[str, Path] = None) -> None: total_loaded = loaded_count + markdown_loaded_count total_failed = failed_count + markdown_failed_count console.print(f"[dim]📊 Summary: Loaded {total_loaded} file(s) ({loaded_count} Python, {markdown_loaded_count} markdown), {total_failed} failed, {total_registered} agent(s) registered[/dim]") - + # Return loaded Python files for debugging return python_files @@ -212,20 +240,21 @@ def init_agent_file(agent_file: Union[str, Path]) -> Optional[str]: agent_file = Path(agent_file) if isinstance(agent_file, str) else agent_file from .._globals import console - + if not agent_file.exists(): console.print(f"[yellow]⚠️ Agent file not found: {agent_file}[/yellow]") return None console.print("[dim]📂 Loading agent file...[/dim]") - + if agent_file.suffix == '.md': # Load markdown agent try: agent = parse_markdown_agent(agent_file) if agent: LocalAgentRegistry.register(agent) - console.print(f"[dim]✅ Loaded markdown agent: {agent.name}[/dim]") + file_path = str(agent_file.resolve()) + logger.info(f"[dim]✅ Loaded markdown agent: {agent.name} from {file_path}[/dim]") return agent.name else: 
console.print(f"[yellow]⚠️ Failed to parse markdown agent from: {agent_file}[/yellow]") @@ -262,7 +291,8 @@ def init_agent_file(agent_file: Union[str, Path]) -> Optional[str]: # Execute the module to trigger decorator registration # Note: We don't use Status here because the module execution might create its own Status spec.loader.exec_module(module) - console.print(f"[dim]✅ Loaded agent from: {agent_file.name}[/dim]") + file_path = str(agent_file.resolve()) + logger.info(f"[dim]✅ Loaded agent from: {file_path}[/dim]") # Try to get the agent name from registry (get the most recently registered agent) # This works because the decorator registers the agent when the module is executed @@ -272,7 +302,8 @@ def init_agent_file(agent_file: Union[str, Path]) -> Optional[str]: return agents[-1].name return None except Exception as e: - console.print(f"[red]❌ Failed to load Python agent from {agent_file}: {e}[/red]") + file_path = str(agent_file.resolve()) + logger.info(f"[red]❌ Failed to load Python agent from {file_path}: {e}[/red]") return None else: console.print(f"[yellow]⚠️ Unsupported file type: {agent_file.suffix}. 
Only .py and .md files are supported.[/yellow]") diff --git a/aworld-cli/src/aworld_cli/core/markdown_agent_loader.py b/aworld-cli/src/aworld_cli/core/markdown_agent_loader.py index 7c92a90c1..34ad3ff07 100644 --- a/aworld-cli/src/aworld_cli/core/markdown_agent_loader.py +++ b/aworld-cli/src/aworld_cli/core/markdown_agent_loader.py @@ -10,6 +10,7 @@ from pathlib import Path from typing import Dict, Any, List, Optional, Tuple +from aworld.sandbox import Sandbox from aworld.utils.skill_loader import extract_front_matter, collect_skill_docs from .skill_registry import get_skill_registry from aworld.agents.llm_agent import Agent @@ -353,7 +354,35 @@ def parse_markdown_agent(md_file_path: Path) -> Optional[LocalAgent]: if extracted_servers: mcp_servers = extracted_servers logger.info(f"✅ Auto-extracted mcp_servers from mcp_config: {mcp_servers}") - + + # Get model config + model_config = front_matter.get("model_config") + if isinstance(model_config, str): + # First, try to parse as inline JSON + try: + model_config = json.loads(model_config) + logger.debug("✅ Parsed model_config as inline JSON") + except json.JSONDecodeError: + # If not valid JSON, check if it's a file path + # File paths typically contain .json or .py extension, or look like paths + if (".json" in model_config.lower() or + ".py" in model_config.lower() or + "/" in model_config or + "\\" in model_config): + # Try to load from file + base_dir = md_file_path.parent + loaded_config = _load_mcp_config_from_file(model_config, base_dir) + if loaded_config is not None: + model_config = loaded_config + else: + logger.warning(f"⚠️ Failed to load model_config from file '{model_config}' in {md_file_path}, using None") + model_config = None + else: + logger.warning(f"⚠️ model_config value '{model_config}' is neither valid JSON nor a file path, using None") + model_config = None # Neither valid JSON nor a file path + elif model_config is None or (isinstance(model_config, dict) and not model_config): + model_config = 
None + # Get PTC tools (Programmatic Tool Calling) ptc_tools = front_matter.get("ptc_tools") if isinstance(ptc_tools, str): @@ -516,16 +545,32 @@ def parse_markdown_agent(md_file_path: Path) -> Optional[LocalAgent]: # Create a factory function that builds the Swarm def build_swarm() -> Swarm: # Create agent configuration - agent_config = AgentConfig( - llm_config=ModelConfig( + if model_config: + # Use model_config from markdown file + llm_config = ModelConfig(**model_config) + logger.info(f"✅ Using model_config from markdown: {model_config}") + else: + # Fallback to environment variables + llm_config = ModelConfig( llm_model_name=os.environ.get("LLM_MODEL_NAME", "gpt-4"), llm_provider=os.environ.get("LLM_PROVIDER", "openai"), llm_api_key=os.environ.get("LLM_API_KEY"), llm_base_url=os.environ.get("LLM_BASE_URL", "https://api.openai.com/v1"), - llm_temperature=float(os.environ.get("LLM_TEMPERATURE", "0.7")) + llm_temperature=float(os.environ.get("LLM_TEMPERATURE", "0.1")), + params={"max_completion_tokens": 40960} ) + logger.info(f"✅ Using default model config from environment variables") + + agent_config = AgentConfig( + llm_config=llm_config, + skill_configs=skill_configs if skill_configs else {} ) """Build Swarm from markdown agent definition.""" + + sandbox = Sandbox( + mcp_config=mcp_config, + ) + sandbox.reuse = True agent = Agent( name=agent_name, desc=description, @@ -534,8 +579,8 @@ def build_swarm() -> Swarm: tool_names=tool_names if tool_names else None, mcp_servers=mcp_servers if mcp_servers else None, mcp_config=mcp_config, - ptc_tools=ptc_tools if ptc_tools else [], - skill_configs=skill_configs if skill_configs else None + sandbox=sandbox, + ptc_tools=ptc_tools if ptc_tools else [] ) return Swarm(agent) @@ -550,6 +595,7 @@ def build_swarm() -> Swarm: "tool_list": tool_list, "mcp_servers": mcp_servers, "mcp_config": mcp_config, + "model_config": model_config, "ptc_tools": ptc_tools, "skills_path": skills_path, "skill_names": skill_names_str, @@ -589,10 
+635,12 @@ def load_markdown_agents(agents_dir: Path) -> List[LocalAgent]: logger.warning(f"⚠️ Agents directory not found: {agents_dir}") return agents - # Find all markdown files recursively, excluding private files + # Find all markdown files recursively, excluding private files and plugin_manager markdown_files = [ f for f in agents_dir.rglob("*.md") - if not f.name.startswith("_") and not f.name.startswith(".") + if not f.name.startswith("_") + and not f.name.startswith(".") + and "plugin_manager" not in str(f.relative_to(agents_dir)) ] if not markdown_files: diff --git a/aworld-cli/src/aworld_cli/core/plugin_manager.py b/aworld-cli/src/aworld_cli/core/plugin_manager.py index dda7fb254..cf7301a14 100644 --- a/aworld-cli/src/aworld_cli/core/plugin_manager.py +++ b/aworld-cli/src/aworld_cli/core/plugin_manager.py @@ -8,7 +8,7 @@ import shutil import subprocess from pathlib import Path -from typing import Dict, List, Optional +from typing import Dict, List, Optional, Tuple from urllib.parse import urlparse from aworld.logs.util import logger @@ -472,3 +472,240 @@ def get_plugin_dirs(self) -> List[Path]: return agent_dirs + async def _load_skills(self, plugin_dirs: List[Path], console=None) -> Dict[str, int]: + """ + Load skills from all plugin directories. + + Searches for skills in plugin_dir/skills directory for each plugin. + Only directories containing SKILL.md file are considered as skills. + Skills are registered into the global skill registry. 
+ + Args: + plugin_dirs: List of plugin directory paths + console: Optional Rich console for output + + Returns: + Dictionary mapping plugin names to number of skills loaded + """ + from ..core.skill_registry import get_skill_registry + + registry = get_skill_registry() + loaded_skills: Dict[str, int] = {} + + for plugin_dir in plugin_dirs: + skills_dir = plugin_dir / "skills" + + if not skills_dir.exists() or not skills_dir.is_dir(): + continue + + try: + # Check for subdirectories containing SKILL.md files + skill_count = 0 + for subdir in skills_dir.iterdir(): + if not subdir.is_dir(): + continue + + # Only consider directories that contain SKILL.md file + skill_md_file = subdir / "SKILL.md" + if skill_md_file.exists() and skill_md_file.is_file(): + skill_count += 1 + + # Only register if there are valid skill directories (with SKILL.md) + if skill_count > 0: + count = registry.register_source(str(skills_dir), source_name=str(skills_dir)) + plugin_name = plugin_dir.name + loaded_skills[plugin_name] = count + + if console and count > 0: + console.print(f"[dim]📚 Loaded {count} skill(s) from plugin: {plugin_name}[/dim]") + else: + # No valid skill directories found (no SKILL.md files) + plugin_name = plugin_dir.name + loaded_skills[plugin_name] = 0 + except Exception as e: + plugin_name = plugin_dir.name + if console: + console.print(f"[yellow]⚠️ Failed to load skills from plugin {plugin_name}: {e}[/yellow]") + loaded_skills[plugin_name] = 0 + + return loaded_skills + + async def _load_agents( + self, + plugin_dirs: List[Path], + local_dirs: Optional[List[str]] = None, + remote_backends: Optional[List[str]] = None, + console=None + ) -> Tuple[List, Dict[str, Dict]]: + """ + Load agents following unified lifecycle (Load phase): + 1. Load plugins (skills + agents) + 2. Load local agents + 3. Load remote agents + + Uses abstract loaders to eliminate code duplication. + Loaders are responsible ONLY for loading, not for creating executors. 
+ + Args: + plugin_dirs: List of plugin directory paths + local_dirs: Optional list of local agent directories + remote_backends: Optional list of remote backend URLs + console: Optional Rich console for output + + Returns: + Tuple of (List of all loaded AgentInfo objects, agent_sources_map dictionary) + Agents are deduplicated, prioritizing local over remote + """ + from ..models import AgentInfo + from ..runtime.loaders import PluginLoader, LocalAgentLoader, RemoteAgentLoader + + all_agents: List[AgentInfo] = [] + agent_sources_map: Dict[str, Dict] = {} # Track sources for executor creation + + # ========== Lifecycle Step 1: Load Plugins ========== + # For each plugin: load skills, then load agents + for plugin_dir in plugin_dirs: + try: + loader = PluginLoader(plugin_dir, console=console) + + # Load agents from plugin (this also loads skills internally) + plugin_agents = await loader.load_agents() + + # Track source information + for agent in plugin_agents: + if agent.name not in agent_sources_map: + agent_sources_map[agent.name] = { + "type": "plugin", + "location": str(plugin_dir), + "agents_dir": str(plugin_dir / "agents") # Store agents dir for executor creation + } + all_agents.append(agent) + else: + if console: + console.print(f"[dim]⚠️ Duplicate agent '{agent.name}' from plugin, keeping first[/dim]") + + except Exception as e: + if console: + console.print(f"[yellow]⚠️ Failed to load plugin {plugin_dir}: {e}[/yellow]") + + # ========== Lifecycle Step 2: Load Local Agents ========== + if local_dirs: + if console: + console.print(f"[dim]📂 Loading local agents from {len(local_dirs)} directory(ies)...[/dim]") + + local_agents_count = 0 + for local_dir in local_dirs or []: + try: + if console: + console.print(f"[dim] 📁 Scanning local directory: {local_dir}[/dim]") + loader = LocalAgentLoader(local_dir, console=console) + + # Load agents from local directory + local_agents = await loader.load_agents() + + if local_agents: + if console: + console.print(f"[dim] 
✅ Found {len(local_agents)} agent(s) in {local_dir}[/dim]") + local_agents_count += len(local_agents) + else: + if console: + console.print(f"[dim] ℹ️ No agents found in {local_dir}[/dim]") + + # Track source information (prioritize local over remote) + for agent in local_agents: + if agent.name not in agent_sources_map: + agent_sources_map[agent.name] = { + "type": "local", + "location": local_dir + } + all_agents.append(agent) + if console: + console.print(f"[dim] ✓ Loaded agent: {agent.name} (local)[/dim]") + else: + existing_source = agent_sources_map[agent.name] + if existing_source["type"] == "local": + if console: + console.print(f"[dim] ⚠️ Duplicate agent '{agent.name}' found, keeping first occurrence[/dim]") + else: + # Replace remote/plugin with local (prioritize LOCAL) + agent_sources_map[agent.name] = { + "type": "local", + "location": local_dir + } + # Replace in all_agents list + for i, a in enumerate(all_agents): + if a.name == agent.name: + all_agents[i] = agent + break + if console: + console.print(f"[dim] ⚠️ Duplicate agent '{agent.name}' found, replacing {existing_source['type']} version with local[/dim]") + + except Exception as e: + if console: + console.print(f"[yellow]⚠️ Failed to load from {local_dir}: {e}[/yellow]") + + if local_dirs and local_agents_count > 0: + if console: + console.print(f"[dim]📊 Total local agents loaded: {local_agents_count}[/dim]") + + # ========== Lifecycle Step 3: Load Remote Agents ========== + if remote_backends: + if console: + console.print(f"[dim]🌐 Loading remote agents from {len(remote_backends)} backend(s)...[/dim]") + + remote_agents_count = 0 + for backend_url in remote_backends or []: + try: + if console: + console.print(f"[dim] 🔗 Connecting to remote backend: {backend_url}[/dim]") + loader = RemoteAgentLoader(backend_url, console=console) + + # Load agents from remote backend + remote_agents = await loader.load_agents() + + if remote_agents: + if console: + console.print(f"[dim] ✅ Found 
{len(remote_agents)} agent(s) from {backend_url}[/dim]") + remote_agents_count += len(remote_agents) + else: + if console: + console.print(f"[dim] ℹ️ No agents found from {backend_url}[/dim]") + + # Track source information (only if local doesn't exist) + for agent in remote_agents: + if agent.name not in agent_sources_map: + agent_sources_map[agent.name] = { + "type": "remote", + "location": backend_url + } + all_agents.append(agent) + if console: + console.print(f"[dim] ✓ Loaded agent: {agent.name} (remote)[/dim]") + else: + # Local/plugin source exists, skip remote duplicate + existing_source = agent_sources_map[agent.name] + if console: + console.print(f"[dim] ⚠️ Duplicate agent '{agent.name}' found (remote), keeping {existing_source['type']} version[/dim]") + + except Exception as e: + if console: + console.print(f"[yellow]⚠️ Failed to load from {backend_url}: {e}[/yellow]") + + if remote_backends and remote_agents_count > 0: + if console: + console.print(f"[dim]📊 Total remote agents loaded: {remote_agents_count}[/dim]") + + # Summary log + plugin_count = len([a for a in all_agents if agent_sources_map.get(a.name, {}).get("type") == "plugin"]) + local_count = len([a for a in all_agents if agent_sources_map.get(a.name, {}).get("type") == "local"]) + remote_count = len([a for a in all_agents if agent_sources_map.get(a.name, {}).get("type") == "remote"]) + + if all_agents: + if console: + console.print(f"[green]✅ Agent loading complete: {len(all_agents)} total agent(s) (plugin: {plugin_count}, local: {local_count}, remote: {remote_count})[/green]") + + if not all_agents: + if console: + console.print("[red]❌ No agents found from any source.[/red]") + + return all_agents, agent_sources_map diff --git a/aworld-cli/src/aworld_cli/core/scanner.py b/aworld-cli/src/aworld_cli/core/scanner.py new file mode 100644 index 000000000..080b99676 --- /dev/null +++ b/aworld-cli/src/aworld_cli/core/scanner.py @@ -0,0 +1,298 @@ +# coding: utf-8 +# Copyright (c) 2025 inclusionAI. 
+import abc +import glob +import os +import re +from typing import Optional, Dict, List + +from aworld.core.context.amni.retrieval.artifacts.file.dir_artifact import DirArtifact +from aworld.logs.util import logger +from aworld.output.artifact import ArtifactAttachment + + +class Scanner(abc.ABC): + """ + Base class for scanning and loading resources (e.g., agent, swarm). + """ + + def __init__(self, context): + self._context = context + + # Initialize DirArtifact for storage + self._dir_artifact = self._create_dir_artifact() + + def _get_storage_type(self) -> str: + """Get storage type, 'local' or 'oss'""" + return 'local' + + def _get_oss_config(self) -> Optional[Dict[str, str]]: + """ + Get OSS configuration. + + Returns: + OSS configuration dictionary containing access_key_id, access_key_secret, endpoint, bucket_name. + Returns None if storage type is not OSS or configuration is unavailable. + """ + storage_type = self._get_storage_type() + if storage_type != 'oss': + return None + + oss_config = None + config = self._context.get_config() if hasattr(self._context, 'get_config') else None + if config and hasattr(config, 'env_config') and config.env_config: + oss_config_obj = config.env_config.working_dir_oss_config + if oss_config_obj: + oss_config = { + 'access_key_id': oss_config_obj.access_key_id, + 'access_key_secret': oss_config_obj.access_key_secret, + 'endpoint': oss_config_obj.endpoint, + 'bucket_name': oss_config_obj.bucket_name + } + + # If not in config, try to get from environment variables + if not oss_config or not all(oss_config.values()): + oss_config = { + 'access_key_id': os.environ.get('OSS_ACCESS_KEY_ID'), + 'access_key_secret': os.environ.get('OSS_ACCESS_KEY_SECRET'), + 'endpoint': os.environ.get('OSS_ENDPOINT'), + 'bucket_name': os.environ.get('OSS_BUCKET_NAME') + } + + return oss_config if oss_config and all(oss_config.values()) else None + + def _create_dir_artifact(self) -> DirArtifact: + """ + Create and configure DirArtifact based on 
storage type. + + Returns: + DirArtifact instance configured for the current storage type + """ + storage_type = self._get_storage_type() + base_path = os.path.expanduser(os.environ.get('AGENTS_PATH', '~/.aworld/agents')) + + if storage_type == 'oss': + # Get OSS configuration + config = self._context.get_config() if hasattr(self._context, 'get_config') else None + access_key_id = None + access_key_secret = None + endpoint = None + bucket_name = None + + if config and hasattr(config, 'env_config') and config.env_config: + oss_config = config.env_config.working_dir_oss_config + if oss_config: + access_key_id = oss_config.access_key_id + access_key_secret = oss_config.access_key_secret + endpoint = oss_config.endpoint + bucket_name = oss_config.bucket_name + + # Fallback to environment variables + if not access_key_id: + access_key_id = os.environ.get('OSS_ACCESS_KEY_ID') + if not access_key_secret: + access_key_secret = os.environ.get('OSS_ACCESS_KEY_SECRET') + if not endpoint: + endpoint = os.environ.get('OSS_ENDPOINT') + if not bucket_name: + bucket_name = os.environ.get('OSS_BUCKET_NAME') + + return DirArtifact.with_oss_repository( + access_key_id=access_key_id, + access_key_secret=access_key_secret, + endpoint=endpoint, + bucket_name=bucket_name, + base_path=base_path + ) + else: + # Local storage + return DirArtifact.with_local_repository(base_path) + + def _matches_file(self, attachment: ArtifactAttachment, name: str, suffix: str, base_path: str = None) -> bool: + """ + Check if an attachment matches the resource name and suffix. + Default implementation uses filename matching. + Subclasses can override this to implement custom matching logic (e.g., content-based matching). 
+ + Args: + attachment: The attachment to check + name: Resource name + suffix: File suffix + base_path: Base path for file access (optional, for content-based matching) + + Returns: + True if the file matches, False otherwise + """ + # Default implementation: filename-based matching + # Check if filename matches the pattern + if attachment.filename == f"{name}{suffix}": + return True + # Check if filename matches versioned pattern (for backward compatibility) + pattern = re.compile(rf"^{re.escape(name)}_v\d+{re.escape(suffix)}$") + return bool(pattern.match(attachment.filename)) + + def _scan_files_by_suffix(self, suffix: str) -> List[str]: + """ + Scan files by suffix and extract resource names. + + Args: + suffix: File suffix (e.g., ".md") + + Returns: + Sorted list of resource names + """ + self._dir_artifact.reload_working_files() + resource_names = set() + base_path = os.path.expanduser(os.environ.get('AGENTS_PATH', '~/.aworld/agents')) + + if self._dir_artifact.attachments: + # Pattern to match directories: {resource_name}/ or {resource_name}_v{N}/ (for backward compatibility) + dir_pattern = re.compile(rf"^([^/]+)/") + + for attachment in self._dir_artifact.attachments: + if attachment.filename.endswith(suffix): + resource_name = None + # Extract resource name from directory path + match = dir_pattern.match(attachment.path) + if match: + dir_name = match.group(1) + # Remove version suffix if present (e.g., "agent_v1" -> "agent") + version_match = re.match(r'^(.+?)_v\d+$', dir_name) + if version_match: + resource_name = version_match.group(1) + else: + resource_name = dir_name + + if resource_name: + # Extract base filename without suffix for matching + file_base_name = attachment.filename[:-len(suffix)] if attachment.filename.endswith(suffix) else attachment.filename + # Use _matches_file to check if file matches (allows content-based matching in subclasses) + if self._matches_file(attachment, file_base_name, suffix, base_path): + 
resource_names.add(resource_name) + + return sorted(list(resource_names)) + + async def _load_content_by_suffix(self, name: str, suffix: str) -> Optional[str]: + """Load resource content from storage by suffix.""" + try: + # Build filename and relative path + filename = f"{name}{suffix}" + # Try to find file in directory structure + # First try without version suffix + relative_path = f"{name}/{filename}" + + # For .md files, use DirArtifact + if suffix == ".md": + self._dir_artifact.reload_working_files() + attachment = self._dir_artifact.get_file(relative_path) + if not attachment: + # Try versioned path for backward compatibility + # Look for latest version + if self._dir_artifact.attachments: + version_pattern = re.compile(rf"^{re.escape(name)}_v(\d+)/{re.escape(filename)}$") + versions = [] + for att in self._dir_artifact.attachments: + match = version_pattern.match(att.path) + if match: + versions.append(int(match.group(1))) + if versions: + latest_version = max(versions) + relative_path = f"{name}_v{latest_version}/{filename}" + attachment = self._dir_artifact.get_file(relative_path) + + if not attachment: + return None + + content = attachment.content + if isinstance(content, bytes): + content = content.decode('utf-8') + return content + else: + # For other suffixes (e.g., .yaml), read directly from filesystem + base_path = os.path.expanduser(os.environ.get('AGENTS_PATH', '~/.aworld/agents')) + path = os.path.join(base_path, relative_path) + + if not os.path.exists(path): + # Try versioned path for backward compatibility + if os.path.exists(os.path.join(base_path, name)): + # Look for latest version + version_dirs = glob.glob(os.path.join(base_path, f"{name}_v*")) + if version_dirs: + latest_version_dir = max(version_dirs, key=lambda x: int(re.search(r'_v(\d+)$', x).group(1)) if re.search(r'_v(\d+)$', x) else 0) + path = os.path.join(latest_version_dir, filename) + else: + path = os.path.join(base_path, name, filename) + + if not os.path.exists(path): + 
return None + + with open(path, "r", encoding="utf-8") as f: + return f.read() + + except Exception as e: + logger.error(f"Failed to load content using suffix {suffix}: {e}") + return None + + async def _load_as_source_by_suffix( + self, + name: str, + suffix: str + ) -> Optional[str]: + """Load resource as source content by suffix.""" + # For .md files, use DirArtifact + if suffix == ".md": + attachment = await self.resolve_resource_from_artifact( + self._dir_artifact, name, suffix + ) + + if attachment: + content = attachment.content + if isinstance(content, bytes): + content = content.decode('utf-8') + return content + return None + + # For other types, use _load_content_by_suffix + content = await self._load_content_by_suffix(name=name, suffix=suffix) + return content + + async def load_as_source(self, name: str) -> Optional[str]: + """Load resource as source content (default implementation, subclasses should override to specify suffix).""" + # Default implementation, subclasses should override this method and call _load_as_source_by_suffix with specified suffix + raise NotImplementedError("Subclasses must implement load_as_source") + + @staticmethod + async def resolve_resource_from_artifact( + dir_artifact: DirArtifact, + name: str, + suffix: str + ) -> Optional[ArtifactAttachment]: + """Find resource in DirArtifact.""" + dir_artifact.reload_working_files() + + # Get file from directory + filename = f"{name}{suffix}" + # First try without version suffix + path = f"{name}/{filename}" + attachment = dir_artifact.get_file(path) + + if attachment: + return attachment + + # Try to find latest versioned file for backward compatibility + version_dir_pattern = re.compile(rf"^{re.escape(name)}_v(\d+)/{re.escape(filename)}$") + versions = [] + + if dir_artifact.attachments: + for att in dir_artifact.attachments: + if att.filename == filename: + match = version_dir_pattern.match(att.path) + if match: + versions.append(int(match.group(1))) + + if versions: + 
latest_version = max(versions) + path = f"{name}_v{latest_version}/{filename}" + return dir_artifact.get_file(path) + + return None diff --git a/aworld-cli/src/aworld_cli/core/skill_registry.py b/aworld-cli/src/aworld_cli/core/skill_registry.py index afc885c31..a90e12733 100644 --- a/aworld-cli/src/aworld_cli/core/skill_registry.py +++ b/aworld-cli/src/aworld_cli/core/skill_registry.py @@ -64,24 +64,28 @@ def get_skill_registry( # Determine cache directory if cache_dir is None: env_cache_dir = os.getenv(ENV_SKILLS_CACHE_DIR) - cache_dir = Path(env_cache_dir) if env_cache_dir else DEFAULT_CACHE_DIR + if env_cache_dir: + # Expand ~ in path if present + cache_dir = Path(os.path.expanduser(env_cache_dir)) + else: + cache_dir = DEFAULT_CACHE_DIR _global_registry = SkillRegistry(cache_dir=cache_dir) if auto_init: # Register skills from environment variables _register_from_env(_global_registry) - + # Register skills from provided skill_paths parameter if skill_paths: for skill_path in skill_paths: try: count = _global_registry.register_source(skill_path, source_name=skill_path) if count > 0: - print(f"📚 Registered skill source: {skill_path} ({count} skills)") - logger.debug(f"📚 Registered skill source from parameter: {skill_path}") + logger.info(f"📚 Registered skill source: {skill_path} ({count} skills)") + logger.info(f"📚 Registered skill source from parameter: {skill_path}") except Exception as e: - print(f"⚠️ Failed to register skill source '{skill_path}': {e}") + logger.error(f"⚠️ Failed to register skill source '{skill_path}': {e}") logger.warning(f"⚠️ Failed to register skill source '{skill_path}': {e}") # Register default skills directory if provided or exists @@ -131,12 +135,26 @@ def _register_from_env(registry: SkillRegistry) -> None: except Exception as e: print(f"⚠️ Failed to register skill source from env '{source}': {e}") logger.warning(f"⚠️ Failed to register skill source from env '{source}': {e}") + else: + # Default to ~/.aworld/skills if ENV_SKILLS_PATH 
is not set + default_skills_path = Path.home() / ".aworld" / "skills" + try: + # Create directory if it doesn't exist + default_skills_path.mkdir(parents=True, exist_ok=True) + # Register the default directory + count = registry.register_source(str(default_skills_path), source_name=str(default_skills_path)) + if count > 0: + print(f"📚 Registered default skill source: {default_skills_path} ({count} skills)") + logger.info(f"📚 Registered default skill source: {default_skills_path} ({count} skills)") + except Exception as e: + logger.warning(f"⚠️ Failed to register default skill source '{default_skills_path}': {e}") # Register from SKILLS_DIR (legacy, single directory for backward compatibility) skills_dir_env = os.getenv(ENV_SKILLS_DIR) if skills_dir_env: try: - skills_dir_path = Path(skills_dir_env).resolve() + # Expand ~ in path if present + skills_dir_path = Path(os.path.expanduser(skills_dir_env)).resolve() if skills_dir_path.exists() and skills_dir_path.is_dir(): count = registry.register_source(str(skills_dir_path), source_name=str(skills_dir_path)) if count > 0: diff --git a/aworld-cli/src/aworld_cli/executors/base_executor.py b/aworld-cli/src/aworld_cli/executors/base_executor.py index f73f09c20..91773c1a5 100644 --- a/aworld-cli/src/aworld_cli/executors/base_executor.py +++ b/aworld-cli/src/aworld_cli/executors/base_executor.py @@ -18,13 +18,33 @@ from pathlib import Path from typing import Optional, List, Dict, Any, Union -from rich.console import Console, Group +from rich.console import Console from rich.panel import Panel from rich.text import Text from rich.markdown import Markdown from rich.syntax import Syntax from rich.status import Status +# Try to import Group from rich.console, with fallback for older Rich versions +try: + from rich.console import Group +except ImportError: + # Fallback for older Rich versions - try importing from rich directly + try: + from rich import Group + except ImportError: + # If Group is not available, create a simple 
wrapper class + # Group is used to combine multiple renderables + class Group: + """Fallback Group class for older Rich versions.""" + def __init__(self, *renderables): + self.renderables = renderables + + def __rich_console__(self, console, options): + from rich.console import RenderableType + for renderable in self.renderables: + yield renderable + from .base import AgentExecutor @@ -53,17 +73,23 @@ def __init__( ): """ Initialize base executor. - + Args: console: Optional Rich console for output session_id: Optional session ID. If None, will generate one automatically. """ self.console = console or Console() self.session_id = session_id or self._generate_session_id() + # Initialize content collapse states (adaptive display for CLI) + self._collapsed_sections = { + 'message': False, # 💬 message content - show full by default + 'tools': False, # 🔧 tool calls content - show full by default + 'results': True # ⚡ tool results content - collapse by default (often verbose) + } self._init_session_management() self._setup_logging() - # ========== Session Management (通用能力) ========== + # ========== Session Management (Common Capabilities) ========== def _init_session_management(self) -> None: """Initialize session history management.""" @@ -257,7 +283,94 @@ def new_session(self) -> str: self.console.print(f"[dim]Previous session: {old_session_id}[/dim]") return self.session_id - # ========== Output Rendering (通用能力) ========== + # ========== Output Rendering (Common Capabilities) ========== + + def _render_collapsible_content(self, section_type: str, header: str, content_lines: List[str], max_lines: int = 3) -> None: + """ + Render collapsible content with expand/collapse functionality. 
+ + Args: + section_type: Type of section ('message', 'tools', 'results') + header: Header text to display (e.g., "💬 AgentName") + content_lines: List of content lines to display + max_lines: Maximum lines to show when collapsed + """ + if not content_lines: + return + + is_collapsed = self._collapsed_sections.get(section_type, False) + total_lines = len(content_lines) + + # Show header + if total_lines > max_lines: + # Add collapse/expand indicator + indicator = "[dim]▼[/dim]" if not is_collapsed else "[dim]▶[/dim]" + self.console.print(f"{header} {indicator}") + else: + # No collapse needed for short content + self.console.print(header) + + # Show content with proper indentation for wrapped lines + if is_collapsed and total_lines > max_lines: + # Show only first few lines + summary + for line in content_lines[:max_lines]: + if line.strip(): + self._print_indented_line(line) + else: + self.console.print() + self.console.print(f" [dim italic]... ({total_lines - max_lines} more lines)[/dim italic]") + else: + # Show all content + for line in content_lines: + if line.strip(): + self._print_indented_line(line) + else: + self.console.print() + + # No extra spacing here - let caller control spacing + + def _print_indented_line(self, line: str, indent: str = " ") -> None: + """ + Print a line with proper indentation, handling line wrapping. + + When a line is too long and wraps to the next line, the wrapped + portion should also be indented to maintain visual consistency. 
+ + Args: + line: The line to print + indent: The indentation string (default: 3 spaces) + """ + if not self.console: + return + + # Get console width and calculate available width for content + console_width = self.console.size.width if self.console.size else 80 + available_width = console_width - len(indent) + + # If line fits within available width, print normally + if len(line) <= available_width: + self.console.print(f"{indent}{line}") + return + + # Handle long lines by splitting and indenting wrapped portions + import textwrap + + # Wrap the line, preserving existing indentation in the original line + wrapped_lines = textwrap.fill( + line, + width=available_width, + subsequent_indent='', + break_long_words=False, + break_on_hyphens=False + ).split('\n') + + # Print first line with original indent + if wrapped_lines: + self.console.print(f"{indent}{wrapped_lines[0]}") + + # Print subsequent lines with additional indentation + for wrapped_line in wrapped_lines[1:]: + self.console.print(f"{indent}{wrapped_line}") def _format_tool_call(self, tool_call, idx: int): """ @@ -487,7 +600,199 @@ def _format_tool_calls(self, tool_calls: list): padding=(1, 2) ) return tool_calls_panel - + + def _render_simple_message_output(self, output, answer: str, agent_name: str = None, is_handoff: bool = False) -> tuple[str, str]: + """ + Simplified message output rendering with modern, clean Claude Code style. 
+ + Features: + - Remove heavy Panel borders + - Use clean emoji and text markers + - Reduce color usage, focus on content + - Add whitespace for better readability + - Show agent name and handoff notifications + + Args: + output: MessageOutput instance + answer: Current answer string + agent_name: Name of the current agent + is_handoff: Whether this is a handoff to a new agent + + Returns: + Tuple of (updated_answer, rendered_content) + """ + from aworld.output.base import MessageOutput + + if not isinstance(output, MessageOutput) or not self.console: + return answer, "" + + # Extract agent name from metadata if not provided + if not agent_name and hasattr(output, 'metadata') and output.metadata: + agent_name = output.metadata.get('agent_name') or output.metadata.get('from_agent') + + # Default agent name + if not agent_name: + agent_name = "Assistant" + + # Extract content + response_text = str(output.response) if hasattr(output, 'response') and output.response else "" + reasoning_text = str(output.reasoning) if hasattr(output, 'reasoning') and output.reasoning else "" + tool_calls = output.tool_calls if hasattr(output, 'tool_calls') and output.tool_calls else [] + + # Update answer + if response_text.strip(): + if not answer: + answer = response_text + elif response_text not in answer: + answer = response_text + + # Build display content + display_parts = [] + + # Add main response with collapsible content + if response_text.strip(): + # Prepare content lines for collapsible rendering + response_lines = [] + + # Add reasoning to response lines if available + if reasoning_text.strip(): + response_lines.extend(["[dim]💭 Thinking process:[/dim]", ""]) + reasoning_lines = reasoning_text.split('\n') + for line in reasoning_lines: + if line.strip(): + response_lines.append(f"[dim]{line}[/dim]") + else: + response_lines.append("") + response_lines.append("") # Add spacing after reasoning + + # Add main response content + content_lines = response_text.split('\n') + for 
line in content_lines: + response_lines.append(line) + + # Use collapsible rendering + header = f"🤖 [bold]{agent_name}[/bold]" + self._render_collapsible_content('message', header, response_lines, max_lines=10) + self.console.print() # Add spacing after message + + # Handle tool calls with collapsible display + if tool_calls: + # Filter out human tools + filtered_tool_calls = [] + for tool_call_output in tool_calls: + tool_call = None + if hasattr(tool_call_output, 'data'): + tool_call = tool_call_output.data + elif hasattr(tool_call_output, '__class__') and 'ToolCall' in str(tool_call_output.__class__): + tool_call = tool_call_output + + if tool_call: + function_name = "" + if hasattr(tool_call, 'function') and tool_call.function: + function_name = getattr(tool_call.function, 'name', '') + if 'human' not in function_name.lower(): + filtered_tool_calls.append((tool_call_output, tool_call)) + + # Render filtered tool calls with collapsible content + if filtered_tool_calls: + tool_lines = [] + + for idx, (tool_call_output, tool_call) in enumerate(filtered_tool_calls): + function_name = "Unknown" + if hasattr(tool_call, 'function') and tool_call.function: + function_name = getattr(tool_call.function, 'name', 'Unknown') + + # Add tool call entry with icon + tool_lines.append(f"▶ [cyan]{function_name}[/cyan]") + + # Special handling for code execution + if function_name == "execute_ptc_code" or function_name.endswith("__execute_ptc_code"): + function_args = getattr(tool_call.function, 'arguments', '') if hasattr(tool_call, 'function') and tool_call.function else '' + + try: + # Extract code + code = "" + if function_args: + if isinstance(function_args, str): + try: + args_dict = json.loads(function_args) + except json.JSONDecodeError: + code = function_args + args_dict = None + else: + args_dict = function_args + + if isinstance(args_dict, dict): + code = args_dict.get('code', '') or args_dict.get('ptc_code', '') or '' + elif not code and isinstance(function_args, str): 
+ code = function_args + + # Add code content with proper indentation + if code and code.strip(): + tool_lines.append(" [dim]Code:[/dim]") + code_lines = code.strip().split('\n') + for code_line in code_lines: + tool_lines.append(f" {code_line}") + except Exception: + # Fallback for parsing errors + tool_lines.append(" [dim red]Code parsing failed[/dim red]") + else: + # For non-code tools, display arguments + function_args = getattr(tool_call.function, 'arguments', '') if hasattr(tool_call, 'function') and tool_call.function else '' + + if function_args: + try: + # Try to parse and format arguments nicely + if isinstance(function_args, str): + try: + args_dict = json.loads(function_args) + if isinstance(args_dict, dict) and args_dict: + tool_lines.append(" [dim]Arguments:[/dim]") + for key, value in args_dict.items(): + # Truncate long values for readability + if isinstance(value, str) and len(value) > 50: + display_value = value[:47] + "..." + else: + display_value = str(value) + tool_lines.append(f" {key}: {display_value}") + except json.JSONDecodeError: + # If not valid JSON, show as raw text (truncated) + if len(function_args) > 100: + display_args = function_args[:97] + "..." 
+ else: + display_args = function_args + tool_lines.append(" [dim]Arguments:[/dim]") + tool_lines.append(f" {display_args}") + else: + # Non-string arguments + tool_lines.append(" [dim]Arguments:[/dim]") + tool_lines.append(f" {str(function_args)}") + except Exception: + # Fallback for any parsing errors + tool_lines.append(" [dim]Arguments:[/dim]") + tool_lines.append(" [dim red]Argument parsing failed[/dim red]") + + tool_lines.append("") # Add spacing between tools + + # Use collapsible rendering for tool calls + header = "🔧 [bold]Tool calls[/bold]" + self._render_collapsible_content('tools', header, tool_lines, max_lines=15) + # No extra spacing here - let next element control spacing + + # Build message_content for return value + message_parts = [] + if reasoning_text.strip(): + message_parts.append(f"Thinking process:{reasoning_text}") + if response_text.strip(): + message_parts.append(response_text) + if tool_calls: + tool_summary = f"Used {len(tool_calls)} tools" + message_parts.append(tool_summary) + + message_content = "\n\n".join(message_parts).strip() + + return answer, message_content + def _render_message_output(self, output, answer: str) -> tuple[str, str]: """ Render MessageOutput to console and extract answer. @@ -616,10 +921,150 @@ def _render_message_output(self, output, answer: str) -> tuple[str, str]: self.console.print() return answer, message_content - + + def _filter_file_line_info(self, content: str) -> str: + """ + Filter out file:line information and other unwanted text from tool result content. + Removes patterns like "server.py:619", "main.py:123", "Processing request of type", etc. 
+ + Args: + content: Original content string + + Returns: + Filtered content string + """ + import re + if not content: + return content + + # Pattern to match file:line references (e.g., "server.py:619", "main.py:123") + # Matches: filename.extension:number + file_line_pattern = r'\b[a-zA-Z_][a-zA-Z0-9_]*\.[a-zA-Z0-9]+:\d+\b' + + # Remove file:line patterns + filtered_content = re.sub(file_line_pattern, '', content) + + # Remove "Processing request of type" lines + # This removes the entire line containing this text + processing_pattern = r'.*Processing request of type.*\n?' + filtered_content = re.sub(processing_pattern, '', filtered_content, flags=re.IGNORECASE) + + # Clean up extra whitespace and empty lines that might be left behind + filtered_content = re.sub(r'\n\s*\n', '\n', filtered_content) # Remove empty lines + filtered_content = re.sub(r'\s+', ' ', filtered_content) # Normalize whitespace + filtered_content = filtered_content.strip() + + return filtered_content + + def _render_simple_tool_result_output(self, output) -> None: + """ + Simplified tool result output rendering with modern, clean Claude Code style. 
+ + Features: + - Remove heavy Panel borders + - Use clean emoji and text markers + - Reduce color usage, focus on content + - Add whitespace for better readability + - Smart content truncation and summarization + - Collapsible content display + + Args: + output: ToolResultOutput instance + """ + from aworld.output.base import ToolResultOutput + + if not isinstance(output, ToolResultOutput) or not self.console: + return + + # Extract tool information + tool_name = getattr(output, 'tool_name', 'Unknown Tool') + action_name = getattr(output, 'action_name', '') + tool_type = getattr(output, 'tool_type', '') + + # Skip rendering for human tools - user input doesn't need to be displayed + if 'human' in tool_name.lower() or 'human' in action_name.lower(): + return + + # Get tool_call_id + tool_call_id = "" + if hasattr(output, 'metadata') and output.metadata: + tool_call_id = output.metadata.get('tool_call_id', '') + if not tool_call_id and hasattr(output, 'origin_tool_call') and output.origin_tool_call: + tool_call_id = getattr(output.origin_tool_call, 'id', '') + + # Get summary from metadata first + summary = None + if hasattr(output, 'metadata') and output.metadata: + summary = output.metadata.get('summary') + + # Get result content and filter file:line info + result_content = "" + if hasattr(output, 'data') and output.data: + data_str = str(output.data) + if data_str.strip(): + # Filter out file:line information + result_content = self._filter_file_line_info(data_str) + + # Build simple tool info line + tool_parts = [] + if tool_name: + tool_parts.append(tool_name) + if action_name and action_name != tool_name: + tool_parts.append(f"→ {action_name}") + tool_info = " ".join(tool_parts) + + # Determine what content to show + display_content = None + if summary: + # Use provided summary and filter it too + display_content = self._filter_file_line_info(summary) + elif result_content: + # Smart content truncation + max_chars = 
int(os.environ.get("AWORLD_CLI_TOOL_RESULT_SUMMARY_MAX_CHARS", "300")) + max_lines = int(os.environ.get("AWORLD_CLI_TOOL_RESULT_SUMMARY_MAX_LINES", "3")) + + lines = result_content.split('\n') + + # Check if it's JSON and try to format nicely + is_json = False + try: + import json + parsed = json.loads(result_content) + if isinstance(parsed, dict): + # Show key info from JSON + key_info = [] + for key, value in list(parsed.items())[:3]: # First 3 keys + if isinstance(value, (str, int, float, bool)): + key_info.append(f"{key}: {value}") + elif isinstance(value, (list, dict)): + key_info.append(f"{key}: [{len(value)} items]" if isinstance(value, list) else f"{key}: {{object}}") + + if key_info: + display_content = "\n".join(key_info) + if len(parsed) > 3: + display_content += f"\n... ({len(parsed) - 3} more fields)" + is_json = True + except: + pass + + # If not JSON or JSON parsing failed, use line-based truncation + if not is_json: + display_content = result_content + + # Use collapsible rendering for tool results + if display_content: + content_lines = display_content.split('\n') + header = f"⚡ [bold]{tool_info}[/bold]" + self._render_collapsible_content('results', header, content_lines, max_lines=3) + else: + # No content case - still show header but indicate no output + self.console.print(f"⚡ [bold]{tool_info}[/bold]", style="dim") + self.console.print(" [dim italic]No output[/dim italic]") + self.console.print() + def _render_tool_result_output(self, output) -> None: """ - Render ToolResultOutput to console with collapsible content for long results. + Render ToolResultOutput to console with summary information by default. Skips rendering for human tools as their results are user input and don't need to be displayed. 
Args: @@ -648,6 +1093,11 @@ def _render_tool_result_output(self, output) -> None: if not tool_call_id and hasattr(output, 'origin_tool_call') and output.origin_tool_call: tool_call_id = getattr(output.origin_tool_call, 'id', '') + # Get summary from metadata first, fallback to generating a summary from data + summary = None + if hasattr(output, 'metadata') and output.metadata: + summary = output.metadata.get('summary') + # Get result content result_content = "" if hasattr(output, 'data') and output.data: @@ -664,44 +1114,42 @@ def _render_tool_result_output(self, output) -> None: if tool_call_id: tool_info += f" [ID: {tool_call_id}]" - if not result_content: - self.console.print(f"[yellow]🔧 Tool: {tool_info}[/yellow]") - return - - # Render based on content length. Use env for limits so long results (e.g. PPT outline JSON) display fully. - content_length = len(result_content) - max_preview_length = int(os.environ.get("AWORLD_CLI_TOOL_RESULT_MAX_CHARS", "20000")) - max_preview_lines = int(os.environ.get("AWORLD_CLI_TOOL_RESULT_MAX_LINES", "200")) - - if content_length > max_preview_length: - # Show preview for long content + # Default to showing summary only + if summary: + # Use provided summary + display_content = summary + elif result_content: + # Generate a brief summary from content (first few lines or truncated) + max_summary_length = int(os.environ.get("AWORLD_CLI_TOOL_RESULT_SUMMARY_MAX_CHARS", "500")) + max_summary_lines = int(os.environ.get("AWORLD_CLI_TOOL_RESULT_SUMMARY_MAX_LINES", "5")) + lines = result_content.split('\n') - if len(lines) > max_preview_lines: - # Show first few lines as preview - preview_lines = lines[:max_preview_lines] - preview_content = '\n'.join(preview_lines) - remaining_lines = len(lines) - max_preview_lines - preview_content += f"\n\n[dim]... 
({remaining_lines} more lines, {content_length - len(preview_content)} more characters) ...[/dim]" + if len(lines) > max_summary_lines: + # Show first few lines as summary + summary_lines = lines[:max_summary_lines] + display_content = '\n'.join(summary_lines) + content_length = len(result_content) + remaining_lines = len(lines) - max_summary_lines + display_content += f"\n\n[dim]... ({remaining_lines} more lines, {content_length} total characters) ...[/dim]" + elif len(result_content) > max_summary_length: + # Show truncated summary + display_content = result_content[:max_summary_length] + f"\n\n[dim]... ({len(result_content) - max_summary_length} more characters) ...[/dim]" else: - # Show truncated preview - preview_content = result_content[:max_preview_length] + f"\n\n[dim]... ({content_length - max_preview_length} more characters) ...[/dim]" - - tool_panel = Panel( - preview_content, - title=f"[bold yellow]🔧 Tool Result: {tool_info}[/bold yellow]", - title_align="left", - border_style="yellow", - padding=(1, 2) - ) + # Short content, show as-is + display_content = result_content else: - # Short content, display directly - tool_panel = Panel( - result_content, - title=f"[bold yellow]🔧 Tool Result: {tool_info}[/bold yellow]", - title_align="left", - border_style="yellow", - padding=(1, 2) - ) + # No content, just show tool info + self.console.print(f"[yellow]🔧 Tool: {tool_info}[/yellow]") + return + + # Render summary panel + tool_panel = Panel( + display_content, + title=f"[bold yellow]🔧 Tool Result: {tool_info}[/bold yellow]", + title_align="left", + border_style="yellow", + padding=(1, 2) + ) self.console.print(tool_panel) self.console.print() @@ -724,7 +1172,7 @@ def _extract_answer_from_output(self, output) -> str: return output.get('answer', '') return "" - # ========== Logging (通用能力) ========== + # ========== Logging (Common Capabilities) ========== def _setup_logging(self) -> None: """ @@ -808,7 +1256,7 @@ def _setup_logging(self) -> None: 
logging.getLogger().setLevel(logging.ERROR) logging.getLogger("aworld").setLevel(logging.ERROR) - # ========== Abstract Methods (子类实现) ========== + # ========== Abstract Methods (Subclass Implementation) ========== @abstractmethod async def chat(self, message: Union[str, tuple[str, List[str]]]) -> str: diff --git a/aworld-cli/src/aworld_cli/executors/local.py b/aworld-cli/src/aworld_cli/executors/local.py index e0359fb76..c2709e4e4 100644 --- a/aworld-cli/src/aworld_cli/executors/local.py +++ b/aworld-cli/src/aworld_cli/executors/local.py @@ -1,12 +1,13 @@ """ Local agent executor. """ -import os import asyncio +import os import traceback from datetime import datetime from pathlib import Path from typing import Optional, List, Dict, Any, Union + from dotenv import load_dotenv from rich.console import Console from rich.panel import Panel @@ -14,8 +15,8 @@ from aworld.config import TaskConfig from aworld.core.agent.swarm import Swarm -from aworld.core.context.amni import TaskInput, ApplicationContext from aworld.core.common import Observation +from aworld.core.context.amni import TaskInput, ApplicationContext from aworld.core.context.amni.config import AmniConfigFactory, AmniConfigLevel from aworld.core.task import Task from aworld.runner import Runners @@ -81,7 +82,7 @@ def __init__( self.context_config = context_config self._hooks_config = hooks or [] self._hooks = self._load_hooks() - + def _load_hooks(self) -> Dict[str, List[ExecutorHook]]: """ Load hooks from configuration. @@ -91,18 +92,18 @@ def _load_hooks(self) -> Dict[str, List[ExecutorHook]]: (returned by hook.point() method). FileParseHook is automatically registered as a default hook for file parsing. 
- + Returns: Dict mapping hook point to list of hook instances - + Example: >>> hooks = executor._load_hooks() >>> # Returns: {"post_input_parse": [FileParseHook()], "post_build_context": [ImageParseHook()], ...} """ from aworld.runners.hook.hook_factory import HookFactory - + hooks = {} - + # Automatically register FileParseHook as default hook try: from .file_parse_hook import FileParseHook @@ -155,7 +156,7 @@ async def _execute_hooks(self, hook_point: str, **kwargs) -> Any: This method follows the same pattern as runner hooks, using Message objects to pass parameters, but extracts results from message for executor use. - + After each hook execution, updates kwargs with any modified values from message.headers, so subsequent hooks and the caller can see the updates. @@ -165,7 +166,7 @@ async def _execute_hooks(self, hook_point: str, **kwargs) -> Any: Returns: Result extracted from message.payload or message.headers, or None if no hooks executed - + Example: >>> result = await executor._execute_hooks( ... ExecutorHookPoint.POST_INPUT_PARSE, @@ -175,11 +176,11 @@ async def _execute_hooks(self, hook_point: str, **kwargs) -> Any: ... 
) """ from aworld.core.event.base import Message - + hooks = self._hooks.get(hook_point, []) if not hooks: return None - + # Extract context from kwargs if available context = kwargs.get('context') if not context: @@ -188,7 +189,7 @@ async def _execute_hooks(self, hook_point: str, **kwargs) -> Any: if isinstance(value, ApplicationContext): context = value break - + result = None for hook in hooks: try: @@ -236,7 +237,7 @@ async def _execute_hooks(self, hook_point: str, **kwargs) -> Any: self.console.print(f"[red]❌ [Executor] Hook '{hook.__class__.__name__}' failed at '{hook_point}': {e}[/red]") return result - + async def _build_task( self, task_content: str, @@ -322,6 +323,7 @@ async def build_context(_task_input: TaskInput, _swarm: Swarm, _workspace) -> Ap workspace=_workspace, context_config=self.context_config ) + _context.get_config().debug_mode=True await _context.init_swarm_state(_swarm) return _context @@ -488,9 +490,11 @@ async def _update_elapsed_time(): hours = int(elapsed // 3600) minutes = int((elapsed % 3600) // 60) elapsed_str = f"{hours}h {minutes}m" - + if loading_status: - loading_status.update(f"[dim]{base_message} [{elapsed_str}][/dim]") + # Add indentation to elapsed time updates + indented_message = f" {base_message} [{elapsed_str}]" + loading_status.update(f"[dim]{indented_message}[/dim]") await asyncio.sleep(0.5) # Update every 0.5 seconds def _start_loading_status(message: str): @@ -535,6 +539,10 @@ def _stop_loading_status(): # Show loading status while waiting for first output _start_loading_status("💭 Thinking...") + # Track current agent for handoff detection + current_agent_name = None + last_agent_name = None + try: # Ensure console is set before processing stream events if not self.console: @@ -544,14 +552,34 @@ def _stop_loading_status(): async for output in outputs.stream_events(): if not self.console: continue - + # Handle MessageOutput if isinstance(output, MessageOutput): # Stop thinking status before rendering message 
_stop_loading_status() - + + # Extract agent name from output metadata + current_agent_name = None + if hasattr(output, 'metadata') and output.metadata: + current_agent_name = output.metadata.get('agent_name') or output.metadata.get('from_agent') + + # Fallback to get current agent from swarm + if not current_agent_name and hasattr(self.swarm, 'cur_agent') and self.swarm.cur_agent: + current_agent_name = getattr(self.swarm.cur_agent, 'name', None) or getattr(self.swarm.cur_agent, 'id', lambda: None)() + + # Default agent name + if not current_agent_name: + current_agent_name = "Assistant" + + # Check if this is a handoff (agent switch) + is_handoff = last_agent_name is not None and last_agent_name != current_agent_name + last_message_output = output - answer, _ = self._render_message_output(output, answer) + # Pass agent_name and is_handoff parameters + answer, _ = self._render_simple_message_output(output, answer, agent_name=current_agent_name, is_handoff=is_handoff) + + # Update last_agent_name for next iteration + last_agent_name = current_agent_name # Check if there are tool calls - if so, show "Calling tool..." status # Skip status for human tools as they require user interaction @@ -582,18 +610,19 @@ def _stop_loading_status(): has_non_human_tool = True # Only show loading status if there are non-human tools - if has_non_human_tool: - _start_loading_status("🔧 Calling tool...") + # if has_non_human_tool: + # Use dynamic loading status without icon prefix + # _start_loading_status("Calling tool...") # If no tool calls, don't show thinking status here # It might be final response, or next output will trigger thinking status # Handle ToolResultOutput elif isinstance(output, ToolResultOutput): # Stop "Calling tool..." 
status before rendering result - _stop_loading_status() + # _stop_loading_status() # Render tool result - self._render_tool_result_output(output) + self._render_simple_tool_result_output(output) # Immediately show thinking status after tool execution completes # Agent will process the tool result and think about next steps @@ -605,7 +634,7 @@ def _stop_loading_status(): # Just silently continue, keeping the Thinking status active # Optionally, we can log or render step info without stopping status pass - + # Handle other output types else: # Stop any loading status @@ -736,7 +765,7 @@ def _stop_loading_status(): # Note: _format_tool_call, _format_tool_calls, _render_message_output, # _render_tool_result_output, _extract_answer_from_output are now inherited from BaseAgentExecutor - + async def _create_workspace(self, session_id: str): """Create local workspace for the session. diff --git a/aworld-cli/src/aworld_cli/handlers/human_handler.py b/aworld-cli/src/aworld_cli/handlers/human_handler.py index 54b4cf7dd..5f43098b4 100644 --- a/aworld-cli/src/aworld_cli/handlers/human_handler.py +++ b/aworld-cli/src/aworld_cli/handlers/human_handler.py @@ -4,7 +4,9 @@ """ import asyncio -from typing import Optional +import json +import re +from typing import Optional, Tuple from aworld.core.event.base import Constants, Message from aworld.logs.util import logger @@ -14,6 +16,7 @@ from rich.panel import Panel from .._globals import console +from ..user_input import UserInputHandler @HandlerFactory.register(name=f'__{Constants.HUMAN}__', prio=100) @@ -69,6 +72,123 @@ def _get_short_prompt(self, payload: str) -> str: """ return "Please Input" + def _parse_human_in_loop_content(self, payload: str) -> Tuple[str, dict]: + """ + Parse human_in_loop formatted content, extract the input_type and configuration data, and route it to the matching user_input interface. + + Supported formats (in priority order): + 1. JSON format (recommended): {"input_type": "1", "message": "..."} and similar + 2. human_in_loop code block format: ```human_in_loop ... ``` + 3. 
Prefix format (backward compatible): payload starts with 1|, 2|, 3|, 4|, 5|, or 6| + + Routing rules: + - input_type "1" -> user_input.submit() (confirmation/approval) + - input_type "2" -> user_input.text_input() (text input) + - input_type "3" -> file upload (not yet implemented; falls back to text input) + - input_type "4" -> user_input.select_multiple() (multi-select) + - input_type "5" -> user_input.single_select() (single select) + - input_type "6" -> user_input.composite_menu() (composite menu) + + Args: + payload: Raw payload content + + Returns: + An (input_type, config_dict) tuple; config_dict holds all configuration data and can be passed directly to the matching user_input interface + """ + if not payload or not payload.strip(): + return "2", {"input_type": "2", "text": ""} + + payload = payload.strip() + + # Priority 1: try to parse as JSON (recommended format) + try: + data = json.loads(payload) + if isinstance(data, dict) and "input_type" in data: + input_type = str(data.get("input_type", "2")) + # Ensure input_type is present in config + data["input_type"] = input_type + logger.debug(f"✅ Parsed JSON format: input_type={input_type}") + return input_type, data + except (json.JSONDecodeError, ValueError, AttributeError) as e: + logger.debug(f"JSON parsing failed, trying other formats: {e}") + + # Priority 2: check for the human_in_loop code block format + if "```human_in_loop" in payload: + try: + match = re.search(r'```human_in_loop\s*\n(.*?)\n```', payload, re.DOTALL) + if match: + json_str = match.group(1).strip() + data = json.loads(json_str) + if isinstance(data, dict) and "input_type" in data: + input_type = str(data.get("input_type", "2")) + data["input_type"] = input_type + logger.debug(f"✅ Parsed human_in_loop code block: input_type={input_type}") + return input_type, data + except (json.JSONDecodeError, ValueError, AttributeError, KeyError) as e: + logger.debug(f"human_in_loop code block parsing failed: {e}") + + # Priority 3: fall back to the prefix format (backward compatible) + if payload.startswith("1|"): + input_type = "1" + config = {"input_type": "1", "message": payload[2:].strip(), "default": True} + elif payload.startswith("2|"): + input_type = "2" + config = {"input_type": "2", "text": payload[2:].strip()} + elif payload.startswith("3|"): + input_type = "3" + config = {"input_type": "3", "message": 
payload[2:].strip()} + elif payload.startswith("4|"): + input_type = "4" + content = payload[2:].strip() + # Try to parse as a JSON array + try: + options = json.loads(content) + if not isinstance(options, list): + options = [options] if options else [] + config = {"input_type": "4", "options": options, "title": "Please select (multiple allowed)"} + except (json.JSONDecodeError, ValueError, AttributeError): + # Not JSON: split by line + options = [line.strip() for line in content.split('\n') if line.strip()] + config = {"input_type": "4", "options": options, "title": "Please select (multiple allowed)"} + elif payload.startswith("5|"): + input_type = "5" + content = payload[2:].strip() + # Try to parse as a JSON object + try: + data = json.loads(content) + if isinstance(data, dict): + data["input_type"] = "5" + config = data + else: + config = {"input_type": "5", "title": str(data), "options": []} + except (json.JSONDecodeError, ValueError, AttributeError): + # Not JSON: split by line (first line is the title, the rest are options) + lines = [line.strip() for line in content.split('\n') if line.strip()] + if lines: + config = {"input_type": "5", "title": lines[0], "options": lines[1:] if len(lines) > 1 else []} + else: + config = {"input_type": "5", "title": "Please select", "options": []} + elif payload.startswith("6|"): + input_type = "6" + content = payload[2:].strip() + # Try to parse as a JSON object + try: + data = json.loads(content) + if isinstance(data, dict): + data["input_type"] = "6" + config = data + else: + config = {"input_type": "6", "title": "Composite menu", "tabs": []} + except (json.JSONDecodeError, ValueError, AttributeError): + config = {"input_type": "6", "title": "Composite menu", "tabs": []} + else: + # Default to text input + input_type = "2" + config = {"input_type": "2", "text": payload} + + logger.debug(f"✅ Parsed using prefix format: input_type={input_type}") + return input_type, config + async def handle_user_input(self, message: Message) -> Optional[str]: """ Handle user input - display formatted payload and prompt for input @@ -80,6 +200,7 @@ async def handle_user_input(self, message: Message) -> Optional[str]: User's input as string, or 
None if input failed/cancelled """ try: + logger.info(f"✅ Handling user input: {message.payload}") message.context.cli = self.console # Add a blank line for visual separation @@ -87,9 +208,154 @@ async def handle_user_input(self, message: Message) -> Optional[str]: payload = message.payload or "" + # Parse input_type and config + input_type, config = self._parse_human_in_loop_content(payload) + + # Create a UserInputHandler instance + user_input_handler = UserInputHandler(self.console) + + # Dispatch on input_type + if input_type == "1": # approval/confirmation + from rich.prompt import Confirm + message_text = config.get("message", "Please confirm") + default = config.get("default", True) + confirmed = await asyncio.to_thread( + Confirm.ask, + f"[cyan]{message_text}[/cyan]", + default=default, + console=self.console + ) + return json.dumps({"confirmed": confirmed}, ensure_ascii=False) + + elif input_type == "2": # text_input + try: + prompt = config.get("text", config.get("prompt", "Please enter")) + default = config.get("default", "") + placeholder = config.get("placeholder") + + user_input = await asyncio.to_thread( + user_input_handler.text_input, + prompt=prompt, + default=default, + placeholder=placeholder + ) + if user_input: + logger.info(f"✅ Human text input received: {user_input[:100]}...") + return user_input + except Exception as e: + logger.warning(f"Text input interface call failed, falling back to simple input: {e}") + self.console.print(f"[yellow]⚠️ Text input interface call failed: {e}[/yellow]") + # Fall back to simple input + display_text = config.get("text", config.get("prompt", "Please enter")) + if display_text: + formatted_payload = self._format_payload(display_text) + self.console.print( + Panel( + formatted_payload, + border_style="dim", + padding=(1, 2), + title=None + ) + ) + self.console.print() + user_input = await asyncio.to_thread( + Prompt.ask, + f"[cyan]{self._get_short_prompt(display_text)}[/cyan]", + console=self.console + ) + return user_input.strip() if user_input else None + + elif input_type == "3": # file_upload + message_text = config.get("message", 
"Please upload a file") + self.console.print(f"[yellow]⚠️ File upload is not implemented yet; please provide the file path manually.[/yellow]") + self.console.print(f"[dim]{message_text}[/dim]") + file_path = await asyncio.to_thread( + Prompt.ask, + "[cyan]Please enter the file path[/cyan]", + console=self.console + ) + return json.dumps({"file_path": file_path.strip()}, ensure_ascii=False) if file_path.strip() else None + + elif input_type == "4": # multi_select + try: + options = config.get("options", []) + if not options: + self.console.print("[yellow]⚠️ No options available, falling back to text input.[/yellow]") + else: + title = config.get("title", "Please select (multiple allowed)") + prompt = config.get("prompt", "Enter option numbers (comma-separated, e.g. 1,3,5)") + selected_indices = await asyncio.to_thread( + user_input_handler.select_multiple, + options=options, + title=title, + prompt=prompt + ) + if selected_indices: + selected_options = [options[i] for i in selected_indices] + user_input = json.dumps(selected_options, ensure_ascii=False) + logger.info(f"✅ Human multi-select received: {len(selected_indices)} items") + return user_input + return None + except Exception as e: + logger.warning(f"Multi-select interface call failed, falling back to text input: {e}") + self.console.print(f"[yellow]⚠️ Multi-select interface call failed: {e}[/yellow]") + + elif input_type == "5": # single_select + try: + options = config.get("options", []) + title = config.get("title", "Please select") + warning = config.get("warning") + question = config.get("question") + nav_items = config.get("nav_items") + + if not options: + self.console.print("[yellow]⚠️ No options available, falling back to text input.[/yellow]") + else: + selected_index = await asyncio.to_thread( + user_input_handler.single_select, + options=options, + title=title, + warning=warning, + question=question, + nav_items=nav_items + ) + if selected_index is not None: + selected_option = options[selected_index] if isinstance(options[selected_index], str) else options[selected_index].get("label", "") + user_input = json.dumps({"selected_index": selected_index, "selected_option": selected_option}, ensure_ascii=False) + logger.info(f"✅ Human single-select received: index {selected_index}") + 
return user_input + return None + except Exception as e: + logger.warning(f"Single-select interface call failed, falling back to text input: {e}") + self.console.print(f"[yellow]⚠️ Single-select interface call failed: {e}[/yellow]") + + elif input_type == "6": # composite_menu + try: + tabs = config.get("tabs", []) + title = config.get("title", "Composite menu") + + if not tabs: + self.console.print("[yellow]⚠️ No tabs configured, falling back to text input.[/yellow]") + else: + results = await asyncio.to_thread( + user_input_handler.composite_menu, + tabs=tabs, + title=title + ) + if results: + user_input = json.dumps(results, ensure_ascii=False) + logger.info(f"✅ Human composite menu received: {len(results)} tabs") + return user_input + return None + except Exception as e: + logger.warning(f"Composite menu interface call failed, falling back to text input: {e}") + self.console.print(f"[yellow]⚠️ Composite menu interface call failed: {e}[/yellow]") + + # Handle the default case (no input_type matched, or a fallback) # Display formatted payload content if available - if payload: - formatted_payload = self._format_payload(payload) + display_text = config.get("message") if config.get("message") else payload + if display_text: + formatted_payload = self._format_payload(display_text) # Display in a subtle panel without title for cleaner look self.console.print( Panel( @@ -102,7 +368,7 @@ async def handle_user_input(self, message: Message) -> Optional[str]: self.console.print() # Add spacing before prompt # Get short prompt for input field - prompt_text = self._get_short_prompt(payload) + prompt_text = self._get_short_prompt(display_text if display_text else payload) # Display input prompt with consistent styling user_input = await asyncio.to_thread( diff --git a/aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/aworld_agent.py b/aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/aworld_agent.py index 7d1bb2939..1bff38364 100644 --- a/aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/aworld_agent.py +++ b/aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/aworld_agent.py @@ -5,195 +5,99 @@ 1. 
Direct task execution: Handle tasks directly using available tools and skills 2. Agent team delegation: Create and delegate tasks to specialized agent teams when needed -Role: Aworld - A versatile AI assistant capable of solving any task through direct execution +Role: Aworld - A versatile AI assistant capable of solving any task through direct execution or coordinated multi-agent collaboration. """ import os import sys from typing import Optional, List -import logging -from aworld.tools.human.human import HUMAN +from aworld.core.context.amni import AgentContextConfig +from aworld.core.context.amni.config import get_default_config, ContextEnvConfig +from aworld.experimental.cast.tools import CAST_ANALYSIS, CAST_CODER +from aworld.logs.util import logger +from aworld_cli.core.agent_registry_tool import AGENT_REGISTRY sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) from aworld.agents.llm_agent import Agent from aworld.core.agent.swarm import TeamSwarm, Swarm from aworld.core.agent.base import BaseAgent -from aworld_cli.core import agent -from aworld_cli.core.agent_registry import LocalAgentRegistry +from aworld_cli.core import agent, LocalAgentRegistry from aworld_cli.core.loader import init_agents +from aworld_cli.core.agent_scanner import global_agent_registry +import asyncio from aworld.config import AgentConfig, ModelConfig from aworld.utils.skill_loader import collect_skill_docs +# for skills use +CAST_ANALYSIS, CAST_CODER, AGENT_REGISTRY -logger = logging.getLogger(__name__) +from datetime import datetime +from zoneinfo import ZoneInfo + +def _build_beijing_date_line() -> str: + """Return a line stating today's date in Beijing time, in year-month-day order.""" + beijing_now = datetime.now(ZoneInfo("Asia/Shanghai")) + + return f"Today is {beijing_now.year} (year)-{beijing_now.month} (month)-{beijing_now.day} (day)." 
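The date-line helper added in this hunk can be tried outside the agent module with a minimal standalone sketch like the one below. It is not the PR's code: it assumes a fixed UTC+8 offset in place of `ZoneInfo("Asia/Shanghai")` (Beijing time currently has no daylight saving, so the two should agree for present-day dates), and both the public name `build_beijing_date_line` and the `now` parameter are hypothetical additions so the output can be checked deterministically.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: a fixed UTC+8 offset stands in for ZoneInfo("Asia/Shanghai"),
# which avoids a dependency on the system time zone database.
BEIJING_TZ = timezone(timedelta(hours=8))


def build_beijing_date_line(now: Optional[datetime] = None) -> str:
    """Return a sentence stating the date in Beijing time.

    `now` is a hypothetical parameter (not in the diff) that lets callers
    inject a fixed moment; when omitted, the current time is used.
    """
    if now is None:
        now = datetime.now(BEIJING_TZ)
    else:
        # Convert an injected datetime to Beijing time before formatting.
        now = now.astimezone(BEIJING_TZ)
    return f"Today is {now.year} (year)-{now.month} (month)-{now.day} (day)."
```

Passing 2024-12-31 20:00 UTC, for instance, produces a line dated 2025-1-1, because Beijing is eight hours ahead of UTC. Injecting the clock this way keeps the prompt text testable without patching `datetime.now`.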
# System prompt based on orchestrator_agent prompt aworld_system_prompt = """ -You are Aworld, a versatile AI assistant designed to solve any task presented by users. - -## Role Identity: -Your name is Aworld. You are an intelligent assistant capable of handling tasks through two primary modes: -1. **Direct Execution Mode**: Execute tasks directly using available tools and capabilities -2. **Team Coordination Mode**: Delegate complex tasks to specialized agent teams when coordination is needed - -## Task Description: -Note that tasks can be highly complex. Do not attempt to solve everything at once. You should break down the task and use different tools step by step. After using each tool, clearly explain the execution results and suggest the next steps. - -Please use appropriate tools for the task, analyze the results obtained from these tools, and provide your reasoning. Always use available tools to verify correctness. - -## Execution Strategy: - -### Mode Selection: -- **Simple Tasks**: Execute directly using available tools (search, file operations, code execution, etc.) -- **Complex Multi-Step Tasks**: Consider delegating to specialized agent teams if available -- **Tasks Requiring Coordination**: Use agent team delegation when multiple specialized agents are needed - -### Available Execution Methods: -1. **Direct Tool Execution**: Use MCP tools, skills, and capabilities directly -2. **Agent Team Delegation**: Use the `run_task` tool to delegate tasks to specialized agent teams - - Check available teams using `get_agent_info` or `list_agents` tools - - Delegate to appropriate teams based on task requirements - - Examples of available teams might include: - - Research teams for deep information gathering - - Code teams for software development tasks - - Analysis teams for data processing - - Multi-agent teams for complex coordination - -## Workflow: -1. **Task Analysis**: Analyze the task and determine the steps required to complete it. 
Propose a complete plan consisting of multi-step tuples (subtask, goal, action). - - **Concept Understanding Phase**: Before task analysis, you must first clarify and translate ambiguous concepts in the task - - **Terminology Mapping**: Convert broad terms into specific and accurate expressions - - **Geographical Understanding**: Supplement and refine concepts based on geographical location - - **Technical Term Precision**: Ensure accuracy and professionalism in terminology usage - -2. **Execution Mode Decision**: - - Assess task complexity and requirements - - Decide whether to execute directly or delegate to an agent team - - If delegating, select the most appropriate agent team - -3. **Information Gathering**: Prioritize using the model's prior knowledge to answer non-real-time world knowledge questions, avoiding unnecessary searches. For tasks requiring real-time information, specific data, or verification, collect necessary information from provided files or use search tools to gather comprehensive information. - -4. **Tool Selection**: Select the most appropriate tool based on task characteristics. - - **Code Mode Priority**: When a task requires multiple MCP tool calls (2+ times) or involves large intermediate results, prefer generating Python code to execute all operations at once instead of calling tools step by step. This reduces token usage by 95%+ and improves efficiency. - - **Report Type Tasks**: If the task involves generating reports such as stock analysis (股票分析), deep search reports (深度检索报告), research reports, or any comprehensive analysis reports, you must directly send the final report to the user using the notify tool (dingtalk_notify skill) after completing the analysis. Do not just return the result in the conversation - use the notify tool to ensure the user receives the report via notification. - -5. 
**Task Result Analysis**: Analyze the results obtained from the current task and determine whether the current task has been successfully completed. - -6. **Final Answer**: If the task_input task has been solved. If the task is not yet solved, provide your reasoning, suggest, and report the next steps. - - **For Report Tasks**: After generating reports (stock analysis, deep search, etc.), you must use the notify tool to send the report to the user. Format the report content in markdown format and send it via the notify tool with an appropriate title. - -7. **Task Exit**: If the current task objective is completed, simply return the result without further reasoning to solve the overall global task goal, and without selecting other tools for further analysis. - -8. **Ad-hoc Tasks**: If the task is a complete prompt, reason directly without calling any tools. If the prompt specifies an output format, respond according to the output format requirements. - -## Answer Generation Rules: -1. When involving mathematical operations, date calculations, and other related tasks, strictly follow logical reasoning requirements. For example: - - If it's yesterday, perform date minus 1 operation; - - If it's the day before yesterday, perform date minus 2 operation; - - If it's tomorrow, perform date plus 1 operation. - -2. When reasoning, do not over-rely on "availability heuristic strategy", avoid the "primacy effect" phenomenon to prevent it from affecting final results. Establish a "condition-first" framework: first extract all quantifiable, verifiable hard conditions (such as time, numbers, facts) as a filtering funnel. Prohibit proposing final answers before verifying hard conditions. - -3. For reasoning results, it is recommended to use reverse verification strategy for bias confirmation. For each candidate answer, list appropriate falsifiable points and actively seek counterexamples for judgment. - -4. 
Strictly follow logical deduction principles, do not oversimplify or misinterpret key information. Do not form any biased conclusions before collecting all clues. Adopt a "hypothesis-verification" cycle rather than an "association-confirmation" mode. All deductive conclusions must have clear and credible verification clues, and self-interpretation is not allowed. - -5. Avoid automatic dimensionality reduction operations. Do not reduce multi-dimensional constraint problems to common sense association problems. If objective anchor information exists, prioritize its use rather than relying on subjective judgment. - -6. **Strictly Answer According to Task Requirements**: Do not add any extra conditions, do not self-explain, strictly judge according to the conditions set by the task (such as specified technical specifications, personnel position information): - 6.1. When a broad time range condition is set in the original conditions, converting the condition into a fixed time window for hard filtering is not allowed - 6.2. When the original conditions only require partial condition satisfaction, converting the conditions into stricter filtering conditions is not allowed. For example: only requiring participation in projects but converting to participation in all projects during execution is not allowed - 6.3. **Do Not Add Qualifiers Not Explicitly Mentioned in the Task**: - - If the task does not specify status conditions like "completed", "in use", "built", do not add them in your answer - - If the task does not specify quantity conditions like "ranking", "top few", do not add them in your answer - - If the task does not specify classification conditions like "region", "type", do not add them in your answer - - If the task does not specify authority conditions like "official", "formal", do not add them in your answer - 6.4. 
**Example Comparisons**: - - ❌ Wrong: Task asks "highest peak", answer "highest climbed peak" - - ❌ Wrong: Task asks "longest river", answer "longest among major rivers" - - ❌ Wrong: Task asks "largest company", answer "largest among listed companies" - - ✅ Correct: Task asks "highest peak", directly answer "highest peak" - - ✅ Correct: Task asks "longest river", directly answer "longest river" - - ✅ Correct: Task asks "largest company", directly answer "largest company" - -7. **Avoid Excessive Information Gathering**: Strictly collect information according to task requirements. Do not collect relevant information beyond the task scope. For example: if the task requires "finding an athlete's name", only search for the name, do not additionally search for age, height, weight, career history, etc. - -8. **Prior Knowledge First Principle**: For non-real-time world knowledge questions, prioritize using the model's prior knowledge to answer directly, avoiding unnecessary searches: - 8.1. **Applicable Scenarios**: Common sense knowledge, historical facts, geographical information, scientific concepts, cultural backgrounds, and other relatively stable information - 8.2. **Judgment Criteria**: - - Whether the information has timeliness requirements (such as "latest", "current", "2024") - - Whether specific data verification is needed (such as specific numbers, rankings, statistics) - - Whether it is common sense knowledge (such as "What are China's national central cities", "Seven continents of the world", etc.) - 8.3. **Exceptions**: When the task explicitly requires verification, updating, or obtaining the latest information, search tools should still be used - -9. **Progressive Search Optimization Principle**: When conducting multi-step search tasks, precise searches should be based on clues already obtained, avoiding repeated searches of known information: - 9.1. 
**Clue Inheritance**: When generating subsequent search tasks, you must refer to clues and limiting conditions obtained from previous searches, avoiding repeated searches of known information - 9.2. **Search Scope Precision**: Narrow search scope based on existing clues, for example: - - If a specific region is identified, subsequent searches should focus on that region rather than global scope - - If a specific category is identified, subsequent searches should focus on that category rather than all categories - - If a specific time range is identified, subsequent searches should focus on that period rather than all time - 9.3. **Avoid Repeated Searches**: Do not re-search information already obtained. Instead, conduct more precise targeted searches based on existing information - 9.4. **Search Task Progression**: Each search task should be further refined based on previous task results, rather than starting over - -## ***IMPORTANT*** Tool or Agent Selection Recommendations: -1. For search-related tasks, consider selecting web browsing tools or delegating to web_agent teams if available. -2. For tasks involving code, github, huggingface, benchmark-related content, prioritize selecting coding tools or delegating to coding_agent teams. -3. For complex multi-agent coordination tasks, use the `run_task` tool to delegate to appropriate agent teams. -4. Always check available agent teams before deciding on execution strategy. 
- -## ***CRITICAL*** File Creation Guidelines: -When creating agent structures, configuration files, or any code files: -- ✅ **ALWAYS USE**: filesystem-server tools (`write_file`, `edit_file`, `read_file`) -- ❌ **NEVER USE**: knowledge tools (`add_knowledge`, `update_knowledge`) for code/file creation -- 🎯 **Target Location**: Create agent files in `./agents/` directory -- 📝 **Process**: Read templates with `read_file`, then create files with `write_file` -- 💡 **Why**: Files need to exist in the filesystem for aworld_cli to discover and load them - -**Example - Creating an Agent (CORRECT):** -``` -1. read_file("references/teamswarm_template.md") -2. write_file("./agents/MyTeam/__init__.py", "") -3. write_file("./agents/MyTeam/agents/__init__.py", "") -4. write_file("./agents/MyTeam/agents/orchestrator/config.py", "...") -``` - -**Example - What NOT to Do (WRONG):** -``` -❌ add_knowledge(name="Agent Implementation", content="...") # This only stores in memory, doesn't create files! -``` - -# Output Requirements: -1. Before providing the `final answer`, carefully reflect on whether the task has been fully solved. If you have not solved the task, please provide your reasoning and suggest the next steps. -2. When providing the `final answer`, answer the user's question directly and precisely. For example, if asked "what animal is x?" and x is a monkey, simply answer "monkey" rather than "x is a monkey". -3. Always identify yourself as "Aworld" when communicating with users. +You are AWorld, a versatile AI assistant designed to solve any task presented by users. + +Today is {{current_date}}, {{current_datetime}} (Beijing time). Your own knowledge has a cutoff in 2024, please keep in mind! + +## 1. Role & Identity +You are AWorldAgent, a sophisticated AI agent acting as a central coordinator. Your primary role is to understand complex user requests and orchestrate a solution by dispatching tasks to a suite of specialized assistants (tools). 
You do not solve tasks directly; you manage the workflow. + +## 2. Core Operational Workflow +You must tackle every user request by following this iterative, step-by-step process: + +1. **Analyze & Decompose:** Break down the user's complex request into a sequence of smaller, manageable sub-tasks. +2. **Select & Execute:** For the immediate sub-task, select **one and only one** assistant (tool) best suited to complete it. +3. **Report & Plan:** After the tool executes, clearly explain the results of that step and state your plan for the next action. +4. **Iterate:** Repeat this process until the user's overall request is fully resolved. + +## 3. Available Assistants (Tools) +You are equipped with multiple assistants. It is your job to know which to use and when. Your key assistants include: + +* `search_agent`: Handles reasoning and document analysis tasks. +* `text2agent`: Creates a new agent from a user's description. +* `optimizer_agent`: Optimizes an existing agent to better meet user requirements. +* specialized agents/tools: Be aware of the other specialized assistants/tools you are equipped with, and call them for the appropriate job when the user asks for them. + + +## 4. Critical Guardrails +- **One Tool Per Step:** You **must** call only one tool at a time. Do not chain multiple tool calls in a single response. +- **True to Task:** When calling an assistant, you must pass the user's raw request/details to it without any modification. +- **Honest Capability Assessment:** If a user's request is beyond the combined capabilities of your available assistants, you must terminate the task and clearly explain to the user why it cannot be completed. """ def extract_agents_from_swarm(swarm: Swarm) -> List[BaseAgent]: """ Extract all Agent instances from a Swarm. - + This function extracts agents from a Swarm in multiple ways: 1. If swarm has agent_graph with agents dict, extract from there 2. If swarm has agents property, extract from there 3.
If swarm has topology, extract agents from topology 4. If swarm is a single Agent wrapped, extract the communicate_agent - + Args: swarm: The Swarm instance to extract agents from - + Returns: List of BaseAgent instances extracted from the swarm - + Example: >>> swarm = TeamSwarm(agent1, agent2, agent3) >>> agents = extract_agents_from_swarm(swarm) >>> print(f"Extracted {len(agents)} agents") """ agents = [] - + try: # Method 1: Try agent_graph.agents (most reliable after initialization) if hasattr(swarm, 'agent_graph') and swarm.agent_graph: @@ -202,7 +106,7 @@ def extract_agents_from_swarm(swarm: Swarm) -> List[BaseAgent]: agents.extend(swarm.agent_graph.agents.values()) elif isinstance(swarm.agent_graph.agents, (list, tuple)): agents.extend(swarm.agent_graph.agents) - + # Method 2: Try swarm.agents (direct access) if not agents and hasattr(swarm, 'agents') and swarm.agents: if isinstance(swarm.agents, dict): @@ -211,7 +115,7 @@ def extract_agents_from_swarm(swarm: Swarm) -> List[BaseAgent]: agents.extend(swarm.agents) elif isinstance(swarm.agents, BaseAgent): agents.append(swarm.agents) - + # Method 3: Try topology (before initialization) if not agents and hasattr(swarm, 'topology') and swarm.topology: for item in swarm.topology: @@ -226,14 +130,14 @@ def extract_agents_from_swarm(swarm: Swarm) -> List[BaseAgent]: # Recursively extract from nested swarm nested_agents = extract_agents_from_swarm(item) agents.extend(nested_agents) - + # Method 4: Try communicate_agent (root agent) if not agents and hasattr(swarm, 'communicate_agent') and swarm.communicate_agent: if isinstance(swarm.communicate_agent, BaseAgent): agents.append(swarm.communicate_agent) elif isinstance(swarm.communicate_agent, (list, tuple)): agents.extend([a for a in swarm.communicate_agent if isinstance(a, BaseAgent)]) - + # Remove duplicates based on agent id seen_ids = set() unique_agents = [] @@ -243,129 +147,284 @@ def extract_agents_from_swarm(swarm: Swarm) -> List[BaseAgent]: if agent_id not 
in seen_ids: seen_ids.add(agent_id) unique_agents.append(ag) - + return unique_agents - + except Exception as e: logger.warning(f"⚠️ Failed to extract agents from swarm: {e}") return [] +async def _load_agents_from_global_registry(exclude_names: List[str]) -> List[BaseAgent]: + """ + Async helper function to load agents from global_agent_registry. + + Args: + exclude_names: List of agent names to exclude + + Returns: + List of BaseAgent instances loaded from global_agent_registry + """ + registry_agents = [] + + try: + # Get all agent names from global_agent_registry + agent_names = await global_agent_registry.list_as_source() + logger.debug(f"📋 Found {len(agent_names)} agent(s) in global_agent_registry") + + for agent_name in agent_names: + # Skip excluded agents + if agent_name in exclude_names: + logger.debug(f"⏭️ Skipping excluded agent from global_agent_registry: {agent_name}") + continue + + try: + # Load agent from global_agent_registry + agent = await global_agent_registry.load_agent(agent_name) + if agent and isinstance(agent, BaseAgent): + registry_agents.append(agent) + logger.debug(f"✅ Loaded agent '{agent_name}' from global_agent_registry") + else: + logger.debug( + f"⚠️ Failed to load agent '{agent_name}' from global_agent_registry: agent is None or not BaseAgent") + except Exception as e: + logger.warning(f"⚠️ Failed to load agent '{agent_name}' from global_agent_registry: {e}") + continue + + except Exception as e: + logger.warning(f"⚠️ Error listing agents from global_agent_registry: {e}") + + return registry_agents + + def load_all_registered_agents( - agents_dir: Optional[str] = None, - exclude_names: Optional[List[str]] = None + agents_dir: Optional[str] = None, + exclude_names: Optional[List[str]] = None ) -> List[BaseAgent]: """ - Load all registered agents and extract their Agent instances. - + Load all registered agents from global_agent_registry. + This function: - 1. 
Initializes agents from the specified directory (or current directory) - 2. Gets all registered LocalAgent instances - 3. Extracts Agent instances from each LocalAgent's swarm - 4. Returns a list of all extracted Agent instances - + 1. Initializes agents from the specified directory (or current directory) if needed + 2. Gets all registered agent names from global_agent_registry + 3. Loads each agent from the registry + 4. Returns a list of all loaded Agent instances + Args: - agents_dir: Directory to load agents from. If None, uses current working directory + agents_dir: Directory to initialize agents from. If None, uses current working directory. + This is used to ensure agents are loaded into the registry before querying. exclude_names: List of agent names to exclude (e.g., ["Aworld"] to exclude self) - + Returns: - List of BaseAgent instances from all registered agents - + List of BaseAgent instances from all registered agents in global_agent_registry + Example: >>> agents = load_all_registered_agents(exclude_names=["Aworld"]) >>> print(f"Loaded {len(agents)} sub-agents") """ if exclude_names is None: exclude_names = [] - + + logger.info(f"🔄 Starting to load registered agents (exclude: {exclude_names if exclude_names else 'none'})") + # Initialize agents from directory if provided if agents_dir: + logger.info(f"📁 Initializing agents from directory: {agents_dir}") try: init_agents(agents_dir) + logger.info(f"✅ Successfully initialized agents from {agents_dir}") except Exception as e: logger.warning(f"⚠️ Failed to initialize agents from {agents_dir}: {e}") else: # Try to initialize from current working directory + logger.debug(f"📁 Attempting to initialize agents from current working directory") try: init_agents() + logger.debug(f"✅ Successfully initialized agents from current directory") except Exception as e: - logger.debug(f"Could not initialize agents from current directory: {e}") - + logger.debug(f"ℹ️ Could not initialize agents from current directory: {e} 
(this is usually fine)") + # Get all registered agents registered_agents = LocalAgentRegistry.list_agents() + logger.info(f"📋 Found {len(registered_agents)} registered agent(s) in LocalAgentRegistry") all_agent_instances = [] + skipped_count = 0 + failed_count = 0 + no_swarm_count = 0 + empty_swarm_count = 0 + success_count = 0 for local_agent in registered_agents: # Skip excluded agents if local_agent.name in exclude_names: logger.debug(f"⏭️ Skipping excluded agent: {local_agent.name}") + skipped_count += 1 continue + logger.debug(f"🔍 Processing agent: {local_agent.name}") try: # Try to get swarm without context first swarm = None + swarm_type = None + swarm_id = 'N/A' + swarm_name = 'N/A' try: # For sync callables or direct instances if isinstance(local_agent.swarm, Swarm): swarm = local_agent.swarm + swarm_type = "Swarm instance" + swarm_id = swarm.id() if hasattr(swarm, 'id') else 'N/A' + swarm_name = swarm.name() if hasattr(swarm, 'name') else 'N/A' + logger.debug(f" ✓ Found Swarm instance for {local_agent.name} [Swarm ID: {swarm_id}, Swarm Name: {swarm_name}]") elif callable(local_agent.swarm): # Try calling without context import inspect sig = inspect.signature(local_agent.swarm) if len(sig.parameters) == 0: swarm = local_agent.swarm() + swarm_type = "callable (no params)" + swarm_id = swarm.id() if hasattr(swarm, 'id') else 'N/A' + swarm_name = swarm.name() if hasattr(swarm, 'name') else 'N/A' + logger.debug(f" ✓ Created Swarm from callable (no params) for {local_agent.name} [Swarm ID: {swarm_id}, Swarm Name: {swarm_name}]") + else: + swarm_type = "callable (requires context)" + logger.debug(f" ℹ️ Swarm is callable but requires context for {local_agent.name}") except Exception as e: - logger.debug(f"Could not get swarm for {local_agent.name} without context: {e}") + logger.debug(f" ⚠️ Could not get swarm for {local_agent.name} without context: {e}") if swarm: # Extract agents from swarm extracted_agents = extract_agents_from_swarm(swarm) if 
extracted_agents: + # Get agent names and IDs + agent_info_list = [] + for agent in extracted_agents: + agent_name = agent.name() if hasattr(agent, 'name') else str(type(agent).__name__) + agent_id = agent.id() if hasattr(agent, 'id') else 'N/A' + agent_info_list.append(f"{agent_name}[ID: {agent_id}]") + all_agent_instances.extend(extracted_agents) - logger.info(f"✅ Loaded {len(extracted_agents)} agent(s) from {local_agent.name}") + success_count += 1 + logger.info(f"✅ Loaded {len(extracted_agents)} agent(s) from '{local_agent.name}' (swarm type: {swarm_type}, Swarm ID: {swarm_id}, Swarm Name: {swarm_name}):") + for agent_info in agent_info_list: + logger.info(f" • {agent_info}") else: - logger.warning(f"⚠️ No agents extracted from {local_agent.name}") + logger.warning(f"⚠️ No agents extracted from '{local_agent.name}' swarm (swarm type: {swarm_type})") + empty_swarm_count += 1 else: - logger.debug(f"⚠️ Could not get swarm for {local_agent.name} (may require context)") - + logger.debug( + f"⚠️ Could not get swarm for '{local_agent.name}' (swarm type: {swarm_type or 'unknown'}, may require context)") + no_swarm_count += 1 + except Exception as e: - logger.warning(f"⚠️ Failed to load agents from {local_agent.name}: {e}") + logger.warning(f"❌ Failed to load agents from '{local_agent.name}': {e}") + failed_count += 1 continue - - logger.info(f"📊 Total loaded {len(all_agent_instances)} sub-agent(s) from registered agents") + + # Load agents from global_agent_registry + try: + logger.info(f"🔄 Loading agents from global_agent_registry...") + # Get list of all agent names from global_agent_registry + try: + # Try to get existing event loop + try: + loop = asyncio.get_event_loop() + if loop.is_running(): + # If loop is running, we need to use a different approach + # Use asyncio.create_task or run in a new thread + import concurrent.futures + with concurrent.futures.ThreadPoolExecutor() as executor: + future = executor.submit(asyncio.run, 
_load_agents_from_global_registry(exclude_names)) + registry_agents = future.result(timeout=30) + else: + registry_agents = loop.run_until_complete(_load_agents_from_global_registry(exclude_names)) + except RuntimeError: + # No event loop exists, create a new one + registry_agents = asyncio.run(_load_agents_from_global_registry(exclude_names)) + except Exception as e: + logger.warning(f"⚠️ Failed to load agents from global_agent_registry: {e}") + registry_agents = [] + + if registry_agents: + all_agent_instances.extend(registry_agents) + logger.info(f"✅ Loaded {len(registry_agents)} agent(s) from global_agent_registry") + for agent in registry_agents: + agent_name = agent.name() if hasattr(agent, 'name') else str(type(agent).__name__) + agent_id = agent.id() if hasattr(agent, 'id') else 'N/A' + logger.info(f" • {agent_name} [ID: {agent_id}]") + except Exception as e: + logger.warning(f"⚠️ Error loading agents from global_agent_registry: {e}") + + # Summary log + logger.info(f"📊 Load summary:") + logger.info(f" • Total registered agents: {len(registered_agents)}") + logger.info(f" • Skipped (excluded): {skipped_count}") + logger.info(f" • Successfully loaded: {len(all_agent_instances)} sub-agent(s) from {success_count} agent(s)") + + # List all loaded agent instances with their IDs + if all_agent_instances: + logger.info(f" • Loaded agent instances:") + for agent in all_agent_instances: + agent_name = agent.name() if hasattr(agent, 'name') else str(type(agent).__name__) + agent_id = agent.id() if hasattr(agent, 'id') else 'N/A' + logger.info(f" - {agent_name} [ID: {agent_id}]") + + if no_swarm_count > 0: + logger.info(f" • No swarm available: {no_swarm_count}") + if empty_swarm_count > 0: + logger.info(f" • Empty swarms: {empty_swarm_count}") + if failed_count > 0: + logger.warning(f" • Failed to load: {failed_count}") + return all_agent_instances +def build_context_config(debug_mode): + config = get_default_config() + config.debug_mode = debug_mode + 
config.agent_config = AgentContextConfig( + enable_system_prompt_augment=True, + neuron_names=["skills"], + history_scope='session' + ) + config.env_config = ContextEnvConfig() + return config + + @agent( name="Aworld", - desc="Aworld is a versatile AI assistant that can execute tasks directly or delegate to specialized agent teams. Use when you need: (1) General-purpose task execution, (2) Complex multi-step problem solving, (3) Coordination of specialized agent teams, (4) Adaptive task handling that switches between direct execution and team delegation" + desc="Aworld is a versatile AI assistant that can execute tasks directly or delegate to specialized agent teams. Use when you need: (1) General-purpose task execution, (2) Complex multi-step problem solving, (3) Coordination of specialized agent teams, (4) Adaptive task handling that switches between direct execution and team delegation", + context_config=build_context_config( + debug_mode=True, + ), + unique=True ) def build_aworld_agent(include_skills: Optional[str] = None): """ Build the Aworld agent with integrated capabilities for direct execution and team delegation. - + This agent is equipped with: - Comprehensive tool access for direct task execution - Agent team delegation capabilities - Multiple skills for various task types - Adaptive execution strategy (direct vs. team-based) - + The agent can: 1. Execute tasks directly using available tools and skills 2. Delegate complex tasks to specialized agent teams 3. Coordinate multi-agent workflows when needed 4. Adapt execution strategy based on task complexity - + Args: include_skills (str, optional): Specify which skills to include. 
- Comma-separated list: "notify,bash" (exact match for each name) - Regex pattern: "notify.*" (pattern match) - If None, uses INCLUDE_SKILLS environment variable or loads all skills - + Returns: TeamSwarm: A TeamSwarm instance containing the Aworld agent - + Example: >>> agent = build_aworld_agent() >>> # Agent can execute tasks directly or delegate to teams @@ -377,15 +436,49 @@ def build_aworld_agent(include_skills: Optional[str] = None): # Load custom skills from skills directory SKILLS_DIR = cur_dir / "skills" - print(f"agent_config: {cur_dir}") - - - # Support skill filtering via parameter or environment variable - if include_skills is None: - include_skills = os.environ.get("INCLUDE_SKILLS") + logger.info(f"agent_config: {cur_dir}") + # Load custom skills from skills directory CUSTOM_SKILLS = collect_skill_docs(SKILLS_DIR) + # Load additional skills from SKILLS_PATH environment variable (single directory) + skills_path_env = os.environ.get("SKILLS_PATH") + if skills_path_env: + try: + logger.info(f"📚 Loading skills from SKILLS_PATH: {skills_path_env}") + additional_skills = collect_skill_docs(skills_path_env) + if additional_skills: + # Merge additional skills into CUSTOM_SKILLS + # If skill name already exists, log a warning but keep the first one found + for skill_name, skill_data in additional_skills.items(): + if skill_name in CUSTOM_SKILLS: + logger.warning( + f"⚠️ Duplicate skill name '{skill_name}' found in SKILLS_PATH '{skills_path_env}', skipping") + else: + CUSTOM_SKILLS[skill_name] = skill_data + logger.info(f"✅ Loaded {len(additional_skills)} skill(s) from SKILLS_PATH") + else: + logger.debug(f"ℹ️ No skills found in SKILLS_PATH: {skills_path_env}") + except Exception as e: + logger.warning(f"⚠️ Failed to load skills from SKILLS_PATH '{skills_path_env}': {e}") + + # Ensure all skills have skill_path for context_skill_tool to work + # collect_skill_docs already includes skill_path, but we verify and add if missing + for skill_name, skill_config in 
CUSTOM_SKILLS.items(): + if "skill_path" not in skill_config: + # Try to infer skill_path from skill name and SKILLS_DIR + potential_skill_path = SKILLS_DIR / skill_name / "SKILL.md" + if not potential_skill_path.exists(): + potential_skill_path = SKILLS_DIR / skill_name / "skill.md" + if potential_skill_path.exists(): + skill_config["skill_path"] = str(potential_skill_path.resolve()) + logger.debug(f"✅ Added skill_path for skill '{skill_name}': {skill_config['skill_path']}") + else: + logger.warning( + f"⚠️ Skill '{skill_name}' has no skill_path and cannot be found in {SKILLS_DIR}, context_skill_tool may not work for this skill") + else: + logger.debug(f"✅ Skill '{skill_name}' has skill_path: {skill_config['skill_path']}") + # Combine all skills ALL_SKILLS = CUSTOM_SKILLS @@ -397,22 +490,21 @@ def build_aworld_agent(include_skills: Optional[str] = None): llm_provider=os.environ.get("LLM_PROVIDER"), llm_api_key=os.environ.get("LLM_API_KEY"), llm_base_url=os.environ.get("LLM_BASE_URL"), - params={"max_completion_tokens": os.environ.get("MAX_COMPLETION_TOKENS", 10240), "max_tokens": os.environ.get("MAX_TOKENS", 64000)} + params={"max_completion_tokens": os.environ.get("MAX_COMPLETION_TOKENS", 10240)} ), use_vision=False, # Enable if needed for image analysis - skill_configs=ALL_SKILLS + # skill_configs=ALL_SKILLS ) # Get current working directory for filesystem-server current_working_dir = os.getcwd() - + # Create the Aworld agent aworld_agent = Agent( name="Aworld", desc="Aworld - A versatile AI assistant capable of executing tasks directly or delegating to agent teams", conf=agent_config, system_prompt=aworld_system_prompt, - human_tools=[HUMAN] ) # Load all registered agents as sub-agents @@ -422,17 +514,15 @@ def build_aworld_agent(include_skills: Optional[str] = None): agents_dir=None, # Use default (current directory) exclude_names=["Aworld"] # Exclude self to avoid circular reference ) - + if sub_agents: logger.info(f"🤝 Adding {len(sub_agents)} sub-agent(s) to 
Aworld TeamSwarm") # Create TeamSwarm with Aworld as leader and all other agents as sub-agents - return TeamSwarm(aworld_agent, *sub_agents) + return TeamSwarm(aworld_agent, *sub_agents, max_steps=100) else: logger.info("ℹ️ No sub-agents found, creating Aworld TeamSwarm without sub-agents") return TeamSwarm(aworld_agent) except Exception as e: - - logger.warning(f"⚠️ Failed to load sub-agents: {e}, creating Aworld TeamSwarm without sub-agents") return TeamSwarm(aworld_agent) diff --git a/aworld-cli/src/aworld_cli/inner_plugins/smllc/skills/agent-creator/SKILL.md b/aworld-cli/src/aworld_cli/inner_plugins/smllc/skills/agent-creator/SKILL.md deleted file mode 100644 index ee9e2de8c..000000000 --- a/aworld-cli/src/aworld_cli/inner_plugins/smllc/skills/agent-creator/SKILL.md +++ /dev/null @@ -1,571 +0,0 @@ ---- -name: agent-creator -description: "Guide for creating Multi-Agent System (MAS) implementations. Use this skill when you need to create agent teams with different coordination patterns: centralized (TeamSwarm with orchestrator) or decentralized (Swarm workflow). Uses filesystem-server tools (write_file, edit_file, read_file) to create complete agent team structures in ./agents/ directory with proper configuration, prompts, and swarm definitions based on task requirements." ---- -# Agent Creator - -This skill provides guidance and templates for creating Multi-Agent System (MAS) implementations in the AWorld framework. It supports two main coordination patterns: centralized (TeamSwarm) and decentralized (Swarm). - -## Quick Start - -**How This Skill Works:** -- 🛠️ **Uses MCP Tools**: Leverages filesystem-server's `write_file`, `edit_file`, and `read_file` tools -- 📁 **Target Location**: Creates all agent structures in `./agents/` directory -- 🔄 **Auto-Integration**: Works seamlessly with aworld_cli's agent discovery -- 📋 **Template-Based**: Uses built-in templates and examples to guide implementation - -**Typical Workflow:** -1. 
**Analyze**: Understand task requirements and determine MAS type (TeamSwarm or Swarm) -2. **Design**: Plan agent structure, roles, and responsibilities -3. **Create**: Use `write_file` to create directory structure and files in `./agents/{team_name}/` -4. **Customize**: Implement agent configs, prompts, and swarm initialization -5. **Verify**: Ensure all files are created with correct structure, imports, and **proper indentation** (top-level code at column 0) -6. **Validate Registration**: Run `aworld-cli list` to verify the agent is properly registered and discoverable -7. **Smoke Test**: Run `aworld-cli --task "Hello" --agent="YourTeamName"` to test basic agent functionality - -**Example:** -``` -User: "Create a research team with an orchestrator and two workers" - -Agent Actions: -1. write_file("./agents/ResearchTeam/__init__.py", "") -2. write_file("./agents/ResearchTeam/agents/__init__.py", "") -3. write_file("./agents/ResearchTeam/agents/orchestrator/config.py", "...") -4. write_file("./agents/ResearchTeam/agents/orchestrator/prompt.py", "...") -5. ... (repeat config.py and prompt.py for workers, no agent.py needed) -6. write_file("./agents/ResearchTeam/agents/swarm.py", "...") # Uses Agent class directly -7. Run: aworld-cli list # Verify ResearchTeam is registered and visible -8. Run: aworld-cli --task "Hello" --agent="ResearchTeam" # Smoke test basic functionality -``` - -**Key Simplification**: No need to create `agent.py` files - use `Agent` class directly in `swarm.py`! 
- -**⚠️ Critical: Python File Indentation** -- When creating Python files (`config.py`, `prompt.py`, `swarm.py`), **all top-level code must start at column 0** (no indentation) -- If using Python code to generate files, ensure string content doesn't have extra indentation -- Always verify generated files using `read_file` to check indentation is correct -- Example: `prompt.py` should start with `"""` at column 0, not indented - -## Overview - -Multi-Agent Systems can be structured in different ways depending on task complexity and coordination needs: - -1. **Centralized (TeamSwarm)**: Uses an orchestrator agent to coordinate and delegate tasks to specialized worker agents. Best for complex tasks requiring dynamic routing and coordination. - -2. **Decentralized (Swarm)**: Agents execute in a workflow sequence. Best for well-defined, sequential tasks where each agent has a specific role. - -## MAS Type Selection Guide - -### When to Use TeamSwarm (Centralized) - -Choose TeamSwarm when: - -- Task requires dynamic routing based on intermediate results -- Multiple specialized agents need coordination from a central decision-maker -- Task complexity requires adaptive planning and delegation -- Agent selection depends on context and task analysis -- Example: Browser-based research tasks where an orchestrator analyzes requirements and delegates to web browsing or coding agents - -**Architecture Pattern:** -``` -Orchestrator Agent (Leader) - ├── Agent 1 (Specialist) - ├── Agent 2 (Specialist) - └── Agent 3 (Specialist) -``` - -### When to Use Swarm (Decentralized) - -Choose Swarm when: - -- Task has a clear, sequential workflow -- Each agent performs a specific step in a pipeline -- Agent order is predetermined -- No dynamic routing or coordination needed -- Example: PPT generation workflow: analysis → planning → content → HTML generation - -**Architecture Pattern:** -``` -Agent 1 → Agent 2 → Agent 3 → Agent 4 -``` - -## Creating a MAS - -### Step 1: Analyze Task 
Requirements - -1. Identify the core task and its complexity -2. Determine if dynamic coordination is needed (TeamSwarm) or fixed sequence (Swarm) -3. List required agent roles and their responsibilities -4. Identify MCP servers and tools needed by each agent - -### Step 2: Choose MAS Type - -Refer to the selection guide above. If unsure, start with Swarm for simpler workflows; upgrade to TeamSwarm if dynamic coordination becomes necessary. - -### Step 3: Design Agent Structure - -For **TeamSwarm**: -- Design orchestrator agent with coordination logic -- Design specialized worker agents -- Define handoff patterns from orchestrator to workers - -For **Swarm**: -- Define sequential agent roles -- Establish data flow between agents -- Ensure each agent has clear input/output responsibilities - -### Step 4: Create Agent Implementation - -**Using Filesystem Tools:** - -Use `write_file` tool from filesystem-server to create files directly in `./agents/` directory: - -1. **Create Structure**: Use `write_file` to create files in `./agents/{team_name}/` directory - - Create `__init__.py` files for Python packages - - Create agent subdirectories under `agents/` folder (optional, can use Agent directly) - - Create configuration and prompt files - -2. **Customize Content**: Implement agent-specific configs, prompts, and swarm initialization - -**Important**: You can directly use `aworld.agents.llm_agent.Agent` class without creating custom Agent classes. 
Simply create: -- Agent configuration (`config.py`) - Optional, can use default AgentConfig -- System prompts (`prompt.py`) - Required for agent behavior -- Swarm initialization (`swarm.py`) - Required, instantiates Agent directly - -**Simplified Approach** (Recommended): -- Create `config.py` and `prompt.py` for each agent role -- In `swarm.py`, directly instantiate `Agent` class with config and prompt -- No need to create custom `agent.py` files unless you need custom behavior - -### Step 5: Implement Swarm Initialization - -Create `swarm.py` that: -- Imports `Agent` from `aworld.agents.llm_agent` (or custom agent classes if needed) -- Imports config and prompt from agent subdirectories (or defines them inline) -- Instantiates Agent instances directly with config and prompt -- Initializes the appropriate Swarm type (TeamSwarm or Swarm) -- Registers the team with `@agent` decorator from `aworld_cli.core` - -## Quick Start Examples - -### Example 1: Creating a TeamSwarm - -**Simplified approach using Agent directly:** - -```python -# agents/swarm.py -from aworld.core.agent.swarm import TeamSwarm -from aworld.core.context.amni.config import AmniConfigFactory -from aworld_cli.core import agent -from aworld.agents.llm_agent import Agent -from .orchestrator.config import orchestrator_config -from .orchestrator.prompt import orchestrator_prompt -from .worker.config import worker_config -from .worker.prompt import worker_prompt - -@agent( - name="MyTeam", - desc="Team for complex tasks", - context_config=AmniConfigFactory.create(debug_mode=True), - metadata={"version": "1.0.0", "creator": "aworld-cli"} -) -def build_swarm() -> TeamSwarm: - """ - Build and configure the MyTeam swarm. - - This creates a TeamSwarm with an orchestrator and worker agents. - Uses Agent class directly without custom agent classes. 
- - Returns: - Configured TeamSwarm instance with all agents - """ - orchestrator = Agent( - name="orchestrator", - desc="Coordinates tasks", - conf=orchestrator_config, - system_prompt=orchestrator_prompt, - ) - - worker = Agent( - name="worker", - desc="Executes tasks", - conf=worker_config, - system_prompt=worker_prompt, - ) - - return TeamSwarm(orchestrator, worker, max_steps=30) -``` - -**Note**: You can also define config and prompt inline if preferred, or create custom Agent classes only when you need custom behavior. - -### Example 2: Creating a Swarm - -```python -# agents/swarm.py -from aworld.core.agent.swarm import Swarm -from aworld.core.context.amni.config import AmniConfigFactory -from aworld_cli.core import agent -from aworld.agents.llm_agent import Agent - -@agent( - name="MySwarm", - desc="Swarm for sequential tasks", - context_config=AmniConfigFactory.create(debug_mode=True), - metadata={"version": "1.0.0", "creator": "aworld-cli"} -) -def build_swarm() -> Swarm: - """ - Build and configure the MySwarm sequential workflow. - - This creates a Swarm with agents executing in sequence. 
- - Returns: - Configured Swarm instance with sequential agents - """ - agent1 = Agent( - name="analysis_agent", - desc="Analyzes requirements", - conf=agent_config, - system_prompt=analysis_prompt, - ) - - agent2 = Agent( - name="execution_agent", - desc="Executes tasks", - conf=agent_config, - system_prompt=execution_prompt, - ) - - return Swarm(agent1, agent2, max_steps=30) -``` - -### Example 3: Creating a Single Agent (Markdown Format) - -**Simple single agent using Markdown format (recommended for simple agents):** - -```markdown ---- -name: DocumentAgent -description: A specialized AI agent focused on document management and generation using filesystem-server -mcp_servers: ["filesystem-server"] -mcp_config: { - "mcpServers": { - "filesystem-server": { - "type": "stdio", - "command": "npx", - "args": [ - "-y", - "@modelcontextprotocol/server-filesystem", - "~/workspace" - ] - } - } -} ---- -### 🎯 Mission -A document management assistant that helps you read, analyze, organize, and generate documents. 
- -### 💪 Core Capabilities -- **Document Reading & Analysis**: Read and analyze existing documents -- **Report Generation**: Generate reports from data files -- **Document Organization**: Organize documents into folders by category/date -- **Document Creation**: Create markdown documentation and summaries -- **Document Merging**: Merge multiple documents into one -- **Information Extraction**: Extract and summarize key information from files - -### 📥 Input Specification -Users can request: -- Document analysis: "Read all markdown files and create a summary" -- Report generation: "Generate a report from this CSV file" -- Document organization: "Organize my documents by date" -- Document creation: "Create a meeting notes template" -- Information extraction: "Extract key points from these documents" - -### 📤 Output Format -- Clear, structured document summaries -- Well-formatted reports and documents -- Logical folder structures -- Extracted key information - -### ✅ Usage Examples - -**Example 1: Document Summary** -``` -User: Read all markdown files in the docs folder and create a summary document -Agent: I'll read all markdown files, analyze their content, and create a comprehensive summary. -``` - -**Example 2: Report Generation** -``` -User: Generate a report from the data in this CSV file -Agent: I'll read the CSV file, analyze the data, and generate a formatted report. -``` - -**Example 3: Document Organization** -``` -User: Organize my documents by date into separate folders -Agent: I'll read the documents, extract their dates, and organize them into folders. 
-``` - -### 🎨 Guidelines -- Always read existing files before modifying them -- Create well-structured and formatted documents -- Organize documents logically -- Extract and present information clearly -- Ask clarifying questions if requirements are unclear -``` - -**File location**: `./agents/document_agent.md` - -**Note**: -- Markdown format is simpler and recommended for single agents -- YAML front matter defines agent configuration (name, description, mCP servers, etc.) -- Markdown body content becomes the system prompt -- No Python code needed - aworld_cli automatically loads `.md` files as agents -- Use Python format (Example 1 & 2) when you need more control or complex logic - -## Implementation Workflow - -### Using Filesystem Tools (Recommended) - -**Step-by-Step Process:** - -1. **Analyze Requirements** - - Determine MAS type (TeamSwarm or Swarm) - - Identify agent roles and responsibilities - - List required MCP servers and tools - -2. **Create Directory Structure** - ``` - Target location: ./agents/{team_name}/ - - For TeamSwarm (Simplified - using Agent directly): - ./agents/{team_name}/ - ├── __init__.py - └── agents/ - ├── __init__.py - ├── {orchestrator_name}/ - │ ├── __init__.py - │ ├── config.py # Optional, can use default - │ └── prompt.py # Required - ├── {worker1_name}/ - │ ├── __init__.py - │ ├── config.py # Optional, can use default - │ └── prompt.py # Required - └── swarm.py # Instantiates Agent directly - - For Swarm (Simplified - using Agent directly): - ./agents/{swarm_name}/ - ├── __init__.py - └── agents/ - ├── __init__.py - ├── {agent1_name}/ - │ ├── __init__.py - │ ├── config.py # Optional - │ └── prompt.py # Required - ├── {agent2_name}/ - │ ├── __init__.py - │ ├── config.py # Optional - │ └── prompt.py # Required - └── swarm.py # Instantiates Agent directly - - Note: agent.py files are optional - only create them if you need custom Agent behavior. - Otherwise, use Agent class directly in swarm.py. - ``` - -3. 
**Create Files Using write_file** - - Use `write_file` tool to create each file - - Create `config.py` and `prompt.py` for each agent (config is optional) - - **Important**: You don't need to create `agent.py` files - use `Agent` class directly in `swarm.py` - - Customize agent prompts based on task needs - - Configure MCP servers and tools in config or directly in swarm.py - - **⚠️ CRITICAL: Python File Indentation** - - When creating Python files (`config.py`, `prompt.py`, `swarm.py`), ensure content starts at column 0 (no indentation) - - Top-level code (imports, constants, functions) should have NO leading spaces - - If using Python code to generate files, ensure string content doesn't have extra indentation - - Example of CORRECT format: - ```python - # prompt.py - CORRECT (no indentation) - """ - Agent System Prompt - """ - - AGENT_PROMPT = """ - Your prompt content here... - """ - ``` - - Example of WRONG format (has extra indentation): - ```python - # prompt.py - WRONG (has 4-space indentation) - """ - Agent System Prompt - """ - - AGENT_PROMPT = """ - Your prompt content here... - """ - ``` - - Always verify generated files have correct indentation before using them - - **⚠️ CRITICAL: When Generating Files with Python Code** - - If you're writing Python code that generates files (e.g., creating markdown files with multi-line strings), be aware of indentation issues: - - **Problem**: When Python code is executed in an indented context (function, if block, etc.), multi-line strings inherit that indentation, causing generated files to have unwanted leading spaces. - - **Solution 1: Use `textwrap.dedent()` (Recommended)** - ```python - from textwrap import dedent - - usage_guide = dedent("""\ - # Title - - ## Section - Content here... 
- """) - - with open("file.md", "w", encoding="utf-8") as f: - f.write(usage_guide) - ``` - - **Solution 2: Use `write_file` tool directly (Best Practice)** - Instead of generating Python code, use the `write_file` MCP tool directly: - ``` - write_file( - file_path="./agents/usage_guide.md", - content="# Title\n\n## Section\nContent here..." - ) - ``` - This avoids indentation issues entirely since the content is passed as a parameter, not embedded in code. - - **Solution 3: Start string at column 0, use explicit newlines** - ```python - usage_guide = """# Title - -## Section -Content here... -""" - # Note: First line starts immediately after """, no indentation - ``` - - **Common Mistake to Avoid:** - ```python - # WRONG - string content inherits indentation - def create_file(): - usage_guide = '''# Title - ## Section - Content here... - ''' - with open("file.md", "w") as f: - f.write(usage_guide) # File will have unwanted indentation! - ``` - - **Best Practice**: Always use `write_file` MCP tool directly instead of generating Python code that writes files. This is simpler, more reliable, and avoids indentation issues. - -4. **Verify Structure** - - Ensure all `__init__.py` files are created - - Check that imports are correct - - Validate configuration parameters - - **⚠️ CRITICAL: Verify Python file indentation** - - All top-level code (imports, constants, functions) must start at column 0 - - No leading spaces for module-level code - - Use `read_file` to verify generated files have correct indentation - - If files have incorrect indentation, use `edit_file` to fix them - -5. 
**Validate Agent Registration** - - Run `aworld-cli list` command to verify the agent is registered - - Check that your agent appears in the list with correct name and description - - If agent is not listed, check for: - - Python syntax errors in generated files (imports, indentation) - - Missing `@agent` decorator in `swarm.py` - - Incorrect package structure (missing `__init__.py` files) - - Example expected output: - ``` - Available Agents: - ✓ MyTeam - Team for complex tasks (version: 1.0.0) - ``` - -6. **Smoke Test Agent** - - Run a simple test to verify the agent can execute basic tasks - - Command: `aworld-cli --task "Hello" --agent="YourTeamName"` - - This ensures: - - Agent initialization works correctly - - MCP servers and tools are properly configured - - Agent can process and respond to basic queries - - No runtime errors in agent logic - - Example expected behavior: - ```bash - $ aworld-cli --task "Hello" --agent="MyTeam" - 🚀 Starting agent: MyTeam - 🤖 Processing task: Hello - ✅ Task completed successfully - ``` - - If errors occur, check: - - MCP server configurations in agent config - - Tool permissions and availability - - Agent prompt logic and handoff patterns - - System dependencies and environment setup - -**Key Points:** -- 📁 Always create in `./agents/` directory (auto-integrates with aworld_cli) -- 🛠️ Use filesystem-server tools: `write_file`, `edit_file`, `read_file` -- 📋 Follow examples and patterns in this guide -- 🔄 No need to run external scripts - all done through MCP tools - -### Alternative: Using Script (Optional) - -For quick prototyping, you can also use the provided script: - -```bash -# Creates structure in ./agents/MyTeam -python scripts/create_mas.py --type teamswarm --name MyTeam --agents orchestrator,worker1,worker2 - -# Creates structure in ./agents/MySwarm -python scripts/create_mas.py --type swarm --name MySwarm --agents agent1,agent2,agent3 -``` - -**Note**: The script approach is less flexible than using filesystem tools 
directly, as it generates boilerplate code that still requires customization. - -## Resources - -### scripts/ - -- `create_mas.py` - Script to generate MAS structure from command line -- `validate_structure.py` - Validates generated MAS structure - -### assets/ - -- `teamswarm_structure.txt` - Directory structure template for TeamSwarm -- `swarm_structure.txt` - Directory structure template for Swarm - -## Best Practices - -1. **Start Simple**: Begin with Swarm for straightforward workflows; upgrade to TeamSwarm only if dynamic coordination is needed. - -2. **Clear Agent Roles**: Each agent should have a single, well-defined responsibility. - -3. **Prompt Design**: System prompts should clearly define agent behavior, decision criteria, and interaction patterns. - -4. **Error Handling**: Implement robust error handling in orchestrator agents for TeamSwarm. - -5. **Testing**: Test each agent independently before integrating into the swarm. - -6. **Validate Registration**: Always run `aworld-cli list` after creating an agent to verify it's properly registered and discoverable by the system. - -7. **Smoke Test**: Run `aworld-cli --task "Hello" --agent="YourTeamName"` to perform a basic functionality test before deploying or using the agent in production workflows. - -8. **Documentation**: Document agent responsibilities, expected inputs/outputs, and coordination patterns. - -## Common Patterns - -### Pattern 1: Research Team (TeamSwarm) -Orchestrator analyzes task → delegates to research agents → synthesizes results - -### Pattern 2: Content Generation Pipeline (Swarm) -Analysis → Planning → Content Creation → Formatting - -### Pattern 3: Code Review Team (TeamSwarm) -Orchestrator routes code → different reviewers (security, style, performance) → aggregates feedback - -Refer to the Quick Start Examples section above for detailed implementation examples of each pattern. 
diff --git a/aworld-cli/src/aworld_cli/inner_plugins/smllc/skills/optimizer/SKILL.md b/aworld-cli/src/aworld_cli/inner_plugins/smllc/skills/optimizer/SKILL.md new file mode 100644 index 000000000..36b808f71 --- /dev/null +++ b/aworld-cli/src/aworld_cli/inner_plugins/smllc/skills/optimizer/SKILL.md @@ -0,0 +1,485 @@ +--- +name: optimizer +description: Analyzes and automatically optimizes an existing Agent's code by applying patches to improve performance, quality, security, and functionality. +tool_list: {"AGENT_REGISTRY": [], "CAST_ANALYSIS": [], "CAST_CODER": [], "CAST_SEARCH": []} +--- +# Agent Optimization Skill (Optimizer) + +## 📌 Mandatory Usage Guidelines +**CRITICAL: READ BEFORE USE.** Adherence to these rules is essential for the skill to function correctly. + +1. **Tool Calls are Direct**: + * ✅ **DO** call tool functions like `CAST_ANALYSIS(...)` and `CAST_CODER(...)` directly. + * ❌ **DO NOT** write or show Python code examples that import or manually implement tool logic (e.g., `from aworld.experimental.ast import ACast`). The tools are pre-loaded and ready for direct invocation. + +2. **`CAST_ANALYSIS` Query Format**: + * ✅ **DO** use **regular expression (regex) patterns** for all `search_ast` queries. + * *Example*: `.*MyClassName.*|.*my_function_name.*` + * ❌ **DO NOT** use natural language for `search_ast` queries. + * *Incorrect*: `"Show me the implementation of the MyClassName class"` + +3. **`CAST_CODER` Workflow**: + * ✅ **DO** use `CAST_CODER.generate_snapshot` to create a backup before any modifications. + * ✅ **DO** generate patch content (either structured JSON for `search_replace` or `diff` format text) based on your analysis. The LLM's role is to *create* the patch content. + * ✅ **DO** use `CAST_CODER` actions (like `search_replace`) to *apply* the generated patch content to the source code. + * ❌ **DO NOT** show Python lists of patches to the user (e.g., `patches = [...]`). + +4. 
**Patch Content Rules**: + * ✅ **DO** ensure each patch operation targets **only one file**. + * ✅ **DO** create focused patches that modify **one logical block of code at a time** for clarity and safety. + * ✅ **DO** verify code with `CAST_ANALYSIS.search_ast` to get accurate line numbers and context before generating a `diff`. + +## 📜 Skill Overview +The **Optimizer Skill** is an advanced agent capability designed to analyze and enhance other agents. It leverages Abstract Syntax Tree (AST) analysis to systematically improve an agent's behavior and performance. + +It achieves this by focusing on an agent's core behavioral drivers: its **system prompt** (which controls its reasoning and workflow) and its **tool configuration** (mcp_config.py) (which defines its capabilities). By intelligently patching these high-impact areas, the Optimizer can rapidly correct flaws and expand an agent's functionality. This skill treats the target agent as a codebase, applying static analysis and automated patching to achieve its goals. + +## ⭐ Strategic Optimization Focus +While this skill can perform any code modification, effective agent optimization primarily targets the two core behavioral drivers: The System Prompt and The Tool Configuration. Your analysis and proposed solutions must prioritize these areas. + +1. **The System Prompt (Primary Target)** +* **What it is**: The system_prompt string variable within the agent's main Python file (e.g., simple_agent.py). +* **Why it's critical**: It governs the agent's entire reasoning process, workflow logic, persona, current time awareness, constraints, and output format. Most behavioral problems (e.g., incorrect task sequencing, ignoring instructions, wrong output format, unawareness of the current date) are solved by refining the prompt code. +* **Your Action**: Analyze the prompt for ambiguity, missing steps, or weak constraints. 
Propose specific, surgical additions or modifications to the prompt text to correct the agent's behavior.
+  For example, to fix a workflow where the agent does A then C instead of A then B, you would strengthen the "Methodology & Workflow" section of its prompt. As another example, to fix an agent that is unaware of the current time, add a dynamic value such as `datetime.now(ZoneInfo("Asia/Shanghai"))` to the prompt code (with `datetime` and `ZoneInfo` explicitly imported in simple_agent.py), together with a matching description ('Your own data is cutoff to the year 2024, so current date is xxxx, please keep in mind!'), so that the agent knows the current date.
+
+2. **The Tool Configuration (mcp_config.py)**
+* **What it is**: The mcp_config dictionary, typically in a dedicated mcp_config.py file.
+* **Why it's critical**: It defines the agent's capabilities. A missing capability (e.g., inability to search the web or read a PDF) is almost always due to a missing tool entry in this configuration.
+* **Your Action**: If an agent lacks a required function, first verify whether the corresponding tool is missing from mcp_config.py. If it is, add the necessary tool configuration block to grant the agent that capability.
+* **MCP Configuration**: Which MCP servers (e.g., pptx, google) are required? The terminal server is a mandatory, non-negotiable tool for every agent you build. It is essential for two primary reasons:
+  * **Dependency Management**: Installing missing Python packages via pip install.
+  * **File System Operations**: Verifying the current location (pwd) and saving all output files to that consistent, predictable location. You must ensure this tool is always included.
+
+**Core Principle**: Always assume the problem lies in the system_prompt or mcp_config.py first. 
Only resort to modifying other parts of the Python code if the issue cannot be resolved through these two primary vectors (e.g., adding support for a dynamic variable in the prompt). + +## 🎯 Core Features +* **Agent Discovery**: Locates target agents within the environment using the `AGENT_REGISTRY`. +* **Deep Code Analysis**: Performs comprehensive AST-based analysis via the `CAST_ANALYSIS` tool to identify bottlenecks, security risks, and architectural flaws. +* **Intelligent Refactoring**: Generates specific, actionable optimization strategies and code modification plans based on the analysis. +* **Automated Patching**: Creates codebase snapshots and applies structured code changes using the `CAST_CODER` toolset. + +## 🔄 Core Workflow: Each time only use one tool call! +### Phase 1: Discovery and Selection +1. **Identify Target**: Receive an agent identifier (name, path, or description) from the user. +2. **Query Registry**: Call `AGENT_REGISTRY` to find the specified agent(s). +3. **Confirm Target**: Present the located agent's information to the user for confirmation. + +### Phase 2: Deep Code Analysis +1. **Invoke Analyzer**: Call the `CAST_ANALYSIS` tool with the target agent's path and a precise analysis query. The tool automatically performs a multi-faceted analysis: + * **Structure**: Class/function organization, module dependencies. + * **Complexity**: Cyclomatic and cognitive complexity scores. + * **Performance**: Potential bottlenecks, inefficient algorithms. + * **Quality**: Code style, comments, maintainability metrics. + * **Security**: Basic checks for common vulnerabilities. +2. **Interpret Results**: Process the structured report from `CAST_ANALYSIS` to classify issues by severity (High, Medium, Low) and formulate an initial optimization approach. + +### Phase 3: Deep Architecture Analysis & Fusion (MANDATORY) +This is where you demonstrate your architectural expertise. 
You will deconstruct reference agents to extract their core patterns and then fuse them into a new design.
+
+#### Part A: Deconstruction and Analysis
+**1. Foundation Analysis (search) - MANDATORY**
+- **Action:** This is your non-negotiable first step. You **MUST** locate the `search` agent using `AGENT_REGISTRY.list_desc`. Once found, you **MUST** read both its `SKILL.md` (using `CAST_SEARCH.read_file`) and its source code (using `CAST_ANALYSIS.search_ast`).
+- **Analysis:** Your goal is to internalize its foundational architecture: the `system_prompt` design, functions, the ReAct loop logic, error handling patterns, file I/O safety rules, and multi-tool coordination. This architecture is the mandatory baseline for raising the quality of every agent you build or modify.
+
+**2. Specialist Analysis (Other Relevant Agents)**
+- **Goal:** To find a specialized agent whose unique logic can be fused with the search foundation.
+- **Action (Discovering Specialists):** You must now methodically search both sources for a relevant specialist:
+  **Source 1: Built-in Agents**
+  - **Command:** Use the AGENT_REGISTRY tool to list all platform-provided skills.
+    ```text
+    AGENT_REGISTRY.list_desc(source_type="built-in")
+    ```
+  - **Analysis:** Review the description of each agent returned from the command. Identify and select the agent whose purpose is most specifically aligned with the user's current request.
+
+- **Deep Dive Analysis:** Once you have selected the most relevant specialist agent, read its SKILL.md using `CAST_SEARCH.read_file`. You must now perform a comparative analysis against search. Ask yourself:
+  - What is this agent's "secret sauce"? What unique rules, steps, or principles are in its system prompt that are NOT in search's?
+  - How is its workflow different? Does it have a specific multi-step process for its domain (e.g., for financial analysis: 1. gather data, 2. perform calculation, 3. add disclaimer, 4. format output)?
+  - What are its specialized guardrails? 
What does it explicitly forbid or require? + +**This analysis is critical. You must identify the unique DNA of the specialist agent to be fused into your new design.** + +#### Part B: Synthesis and Fusion +**3. Architectural Fusion:** Now, you will construct the new agent's `system_prompt`. This is a fusion process, not a simple copy-paste. +- **Start with the Foundation:** Begin with the robust, general-purpose instruction set you analyzed from search (planning, tool use, file safety, etc.). +- **Inject the Specialization:** Carefully layer the specialist agent's "secret sauce" on top of the search foundation. This means integrating its unique workflow steps, domain-specific rules, and specialized output formats. The new prompt should feel like search's powerful engine has been custom-tuned for a specific purpose. + +**4. Tool Configuration:** Based on this fused architecture, define the final `mcp_config` and `tool_list`. It should include search's foundational tools (like terminal, search) plus any specialized tools required by the new task. + + +### Phase 4: Optimization Strategy +1. **Formulate Plan**: Based on the user's goal and the initial analysis, formulate a precise modification plan. Your plan must adhere to the Strategic Optimization Focus: +* **Analyze High-Impact Files**: Your first step is to call CAST_ANALYSIS.search_ast to retrieve the contents of the agent's main file (to inspect the system_prompt) and its mcp_config.py. +* **Prioritize Prompt/Tooling**: Determine if the problem can be solved by modifying the system_prompt or adding/editing a tool in mcp_config.py. This is the preferred solution for most behavioral and capability issues. +* **Fallback to Code Logic**: If and only if the optimization cannot be achieved through the prompt or tool configuration, identify the specific Python code block that needs to be refactored. +2. 
**Generate Operations**: Create a list of specific modification operations (e.g., a JSON object for CAST_CODER.search_replace). Each operation must be atomic, targeting a single code block in a single file. + +### Phase 5: Snapshot and Patching +1. **Create Snapshot**: **Crucial first step.** Call `CAST_CODER.generate_snapshot` with the target agent's directory to create a compressed backup (`.tar.gz`). This ensures a safe rollback point. +2. **Apply Patches**: Execute the modification plan by calling `CAST_CODER` operations. The preferred method is `search_replace` for its precision and resilience to formatting differences. + * Each operation should be atomic and target a single file. +3. **Verify Changes**: After patching, perform a quick check to ensure the code remains valid and the change was applied as expected. + +### Phase 6: Verification and Reporting +1. **Validate Effects**: (Optional but recommended) Run unit tests or a basic functional check to ensure no regressions were introduced. Compare pre- and post-optimization metrics if applicable. +2. **Generate Report**: Summarize the analysis findings, the list of applied changes, and the expected benefits for the user. + +### Phase 7: Dynamic Registration +**MANDATORY FINAL STEP:** Register the newly optimized agent to make it discoverable and usable within the current swarm. + +* **Tool**: `AGENT_REGISTRY` +* **Action**: `dynamic_register` +* **Parameters**: + * `local_agent_name`: The name of the agent executing this workflow (must be "Aworld"). + * `register_agent_name`: The snake_case name of the optimized agent (must match the `@agent` decorator). +* **Example**: + ```json + AGENT_REGISTRY.dynamic_register(local_agent_name="Aworld", register_agent_name="optimized_simple_agent") + ``` + +--- +## 🛠️ Tool Reference + +
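As background for the tool descriptions in this reference, the snapshot-first rule from Phase 5 can be illustrated with the Python standard library alone. The sketch below is not the `CAST_CODER.generate_snapshot` implementation. `snapshot_directory`, the timestamped archive name, and the `simple_agent` demo folder are hypothetical, chosen only to show what a compressed `.tar.gz` backup of an agent directory amounts to.

```python
import tarfile
import tempfile
import time
from pathlib import Path


def snapshot_directory(target_dir: str, out_dir: str) -> Path:
    """Write a .tar.gz archive of target_dir into out_dir and return its path.

    Hypothetical helper for illustration; not part of CAST_CODER.
    """
    source = Path(target_dir).resolve()
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = Path(out_dir) / f"{source.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the agent folder.
        tar.add(source, arcname=source.name)
    return archive


# Demo on a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    agent_dir = Path(tmp) / "simple_agent"
    agent_dir.mkdir()
    (agent_dir / "mcp_config.py").write_text("mcp_config = {}\n")

    backup = snapshot_directory(str(agent_dir), tmp)
    with tarfile.open(backup) as tar:
        print(sorted(tar.getnames()))
```

Writing the archive outside the directory being backed up, as the demo does, avoids the archive trying to include itself; whether the real tool does the same is not stated in this document.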
+

AGENT_REGISTRY Tool

+ +**Purpose**: Discover and retrieve information about existing agents. + +**Actions**: +* `query()`: Search for agents by name, description, or other metadata. +* `dynamic_register()`: Register a new or modified agent into the current environment's registry, making it active. + +**Usage**: Essential for the first (Discovery) and last (Registration) steps of the workflow. + +
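The registration step expects `register_agent_name` to be snake_case and to match the `@agent` decorator. A minimal sketch of a pre-flight check is shown below; the helper names `is_valid_agent_name` and `build_register_params` are hypothetical and are not part of `AGENT_REGISTRY`, and the regular expression is only one reasonable reading of "lowercase words connected by underscores".

```python
import re

# Hypothetical helper: lowercase words joined by single underscores.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def is_valid_agent_name(name: str) -> bool:
    """Return True when the name looks like snake_case."""
    return SNAKE_CASE.fullmatch(name) is not None


def build_register_params(register_agent_name: str,
                          local_agent_name: str = "Aworld") -> dict:
    """Assemble the two documented parameters, rejecting malformed names early."""
    if not is_valid_agent_name(register_agent_name):
        raise ValueError(f"agent name is not snake_case: {register_agent_name!r}")
    return {
        "local_agent_name": local_agent_name,
        "register_agent_name": register_agent_name,
    }
```

Checking the name before calling `dynamic_register` turns a naming mistake into an immediate error instead of a registration that silently fails to match the decorator.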
+ +
+

CAST_ANALYSIS Tool

+ +**Purpose**: Perform deep, AST-based static analysis of Python code. + +**Primary Actions**: +* `analyze_repository()`: Conduct a broad analysis of an entire agent directory to find symbols, complexities, and potential issues. +* `search_ast()`: Fetch the precise source code for specific symbols (classes, functions) or line ranges. + +**Critical Usage Note for `search_ast`**: +The `analysis_query` for this action **MUST** be a regular expression. Natural language queries are not supported and will fail. + +* ✅ **Correct (Regex)**: `user_query=".*MyClass.*|.*my_function.*"` +* ❌ **Incorrect (Natural Language)**: `user_query="Find the MyClass class and the my_function function"`, `user_query=".*mcp_config\\.py."`, `user_query=".*"` + +**Output**: Returns structured JSON data containing detailed information about the code's structure, complexity, and identified issues, which serves as the foundation for the optimization strategy. + +
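Since `search_ast` only accepts regular expressions, it can help to build the query from symbol names rather than typing it by hand. The sketch below is an illustration, not part of `CAST_ANALYSIS`: `build_symbol_query` and `validate_search_query` are hypothetical helpers. A syntax check can only catch malformed patterns; a natural-language sentence still compiles as a regex, so the check is no substitute for writing a targeted pattern.

```python
import re


def build_symbol_query(*symbols: str) -> str:
    """Join symbol names into a pattern of the form '.*A.*|.*B.*'."""
    if not symbols:
        raise ValueError("at least one symbol is required")
    return "|".join(f".*{re.escape(symbol)}.*" for symbol in symbols)


def validate_search_query(query: str) -> str:
    """Reject blank or syntactically invalid regular expressions."""
    if not query or not query.strip():
        raise ValueError("query must not be blank")
    try:
        re.compile(query)
    except re.error as exc:
        raise ValueError(f"query is not a valid regex: {exc}") from exc
    return query
```

For example, `build_symbol_query("MyClass", "my_function")` produces the same pattern as the "Correct" example above.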
+ +
+

CAST_CODER Tool

+
+**Purpose**: A suite of functions for safely modifying source code files. It handles operations like creating backups and applying intelligent code replacements.
+
+---
+#### **Action: `generate_snapshot`**
+
+Creates a compressed (`.tar.gz`) backup of a source directory before modifications are applied.
+
+* **Parameters**:
+    * `target_dir`: The path to the directory to be backed up.
+* **Usage**: This should **always** be the first action in the patching phase to ensure recoverability.
+
+---
+#### **Action: `search_replace`**
+
+Intelligently finds and replaces a block of code in a specified file. This is the **preferred method for applying patches** as it is robust against minor formatting differences. It is based on `aider`'s core matching algorithm.
+
+**Key Features**:
+* **Exact Match**: First attempts a direct, character-for-character match.
+* **Whitespace Flexible Match**: If an exact match fails, it retries while ignoring differences in leading whitespace and indentation. This handles most copy-paste formatting issues.
+* **Similarity Match**: (Optional) If other methods fail, uses a fuzzy text similarity algorithm to find the best match.
+
+**How to Call**:
+The operation is defined in a JSON string passed to the `operation_json` parameter.
+
+```python
+import json
+
+# Conceptual tool call
+action_params = {
+    "operation_json": json.dumps({
+        "operation": {
+            "type": "search_replace",
+            "file_path": "path/to/your/file.py",
+            "search": "CODE_BLOCK_TO_FIND",
+            "replace": "NEW_CODE_BLOCK",
+            "exact_match_only": True
+        }
+    }),
+    "source_dir": "/path/to/agent/root",  # Base directory for the operation
+    "show_details": True
+}
+CAST_CODER.search_replace(**action_params)
+```
+
+**JSON Parameters**:
+
+| Parameter | Type | Required | Description |
+| ---------------------- | ------- | :------: |-----------------------------------------------------------|
+| `type` | string | ✓ | Must be `"search_replace"`.
|
+| `file_path` | string | ✓ | The path to the file, relative to `source_dir`. |
+| `search` | string | ✓ | The code block to find. It must contain one or more complete lines of the source code. |
+| `replace` | string | ✓ | The (possibly multi-line) code block that replaces it. |
+| `exact_match_only` | boolean | - | Always `true`. Optional; listed for documentation purposes only. |
+
+**Best Practices**:
+* `search`: The multi-line code block to search for.
+    * Use multi-line `search` blocks that include structural context (like `def` or `class` lines) for better accuracy.
+    * It must not be blank.
+    * If it spans multiple lines, the lines must be contiguous and match the source code.
+
+
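The matching order described above (exact first, then whitespace-flexible) can be illustrated with a small standalone sketch. This is not `aider`'s algorithm or the `CAST_CODER` implementation; `build_search_replace_operation` and `find_block` are hypothetical names, and the flexible pass here simply ignores leading and trailing whitespace on each line.

```python
import json
from typing import Optional, Tuple


def build_search_replace_operation(file_path: str, search: str, replace: str) -> str:
    """Serialize one atomic operation in the documented JSON shape."""
    if not search.strip():
        raise ValueError("search must not be blank")
    return json.dumps({
        "operation": {
            "type": "search_replace",
            "file_path": file_path,
            "search": search,
            "replace": replace,
            "exact_match_only": True,
        }
    })


def find_block(source: str, search: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) line indices of the block, end exclusive, or None."""
    src_lines = source.splitlines()
    want = search.splitlines()
    if not want:
        return None
    size = len(want)
    # Pass 1: exact, character-for-character line match.
    for i in range(len(src_lines) - size + 1):
        if src_lines[i:i + size] == want:
            return (i, i + size)
    # Pass 2: compare lines with surrounding whitespace stripped.
    stripped = [line.strip() for line in want]
    for i in range(len(src_lines) - size + 1):
        if [line.strip() for line in src_lines[i:i + size]] == stripped:
            return (i, i + size)
    return None
```

A `search` block pasted without its original indentation fails the first pass but is still located by the second, which is the copy-paste case the feature list mentions.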
+ +--- + +## 📚 Agent Code Structure Reference (Few-Shot Examples) + +**⚠️ IMPORTANT**: The following code examples illustrate the standard AWorld agent structure. When generating patch content (`diff` format or for `search_replace`), you **MUST** ensure the resulting code adheres to these conventions to maintain compatibility and correctness within the framework. Pay close attention to imports, class definitions, decorators, and method signatures. + +### Standard Agent Code Structure (`simple_agent.py`) +```python +import os +from typing import Dict, Any, List + +from aworld.agents.llm_agent import Agent +from aworld.config import AgentConfig, ModelConfig +from aworld.core.agent.swarm import Swarm +from aworld.core.common import Observation, ActionModel +from aworld.core.context.base import Context +from aworld.core.event.base import Message +# use logger to log +from aworld.logs.util import logger +from aworld.runners.hook.hook_factory import HookFactory +from aworld.runners.hook.hooks import PreLLMCallHook, PostLLMCallHook +from aworld_cli.core import agent +from aworld.sandbox.base import Sandbox +from simple_agent.mcp_config import mcp_config + +@HookFactory.register(name="pre_simple_agent_hook") +class PreSimpleAgentHook(PreLLMCallHook): + """Hook triggered before LLM execution. Used for monitoring, logging, etc. Should NOT modify input/output content.""" + + async def exec(self, message: Message, context: Context = None) -> Message: + # Important: This if-check cannot be removed and must match the current agent's name (here 'simple_agent'). + # This ensures the Hook only processes messages belonging to the current agent, avoiding side effects on other agents. + if message.sender.startswith('simple_agent'): + # ⚠️ Important Note: The Message object (aworld.core.event.base.Message) is the communication carrier between agents in AWorld. + # It uses the 'payload' attribute to carry actual data, distinct from a direct 'content' attribute. 
+ # In PreLLMCallHook, message.payload is usually an Observation object. To access content, use message.payload.content. + # Incorrect Example: message.content # ❌ AttributeError: 'Message' object has no attribute 'content' + # Correct Example: message.payload.content if hasattr(message.payload, 'content') else None # ✅ + # Note: Do not modify message.payload or other input/output content here. + # Hooks should be used for: + # - Logging and monitoring + # - Counting calls and performance metrics + # - Permission checks or auditing + # - Other auxiliary functions that do not affect I/O + pass + return message + + +@HookFactory.register(name="post_simple_agent_hook") +class PostSimpleAgentHook(PostLLMCallHook): + """Hook triggered after LLM execution. Used for monitoring, logging, etc. Should NOT modify input/output content.""" + + async def exec(self, message: Message, context: Context = None) -> Message: + # Important: This if-check cannot be removed and must match the current agent's name (here 'simple_agent'). + # This ensures the Hook only processes messages belonging to the current agent. + if message.sender.startswith('simple_agent'): + # Note: Do not modify input/output content (like message.content) here. + # Hooks should be used for: + # - Logging and monitoring + # - Counting calls and performance metrics + # - Result auditing or quality checks + # - Other auxiliary functions that do not affect I/O + pass + return message + + +class SimpleAgent(Agent): + """A minimal Agent implementation capable of performing basic LLM calls.""" + + async def async_policy(self, observation: Observation, info: Dict[str, Any] = {}, message: Message = None, + **kwargs) -> List[ActionModel]: + # Important Notes: + # 1. async_policy represents the model invocation; calling super().async_policy directly completes the LLM call. + # 2. Do not modify the observation object within async_policy; the observation should remain immutable. + # 3. 
Hooks (PreSimpleAgentHook and PostSimpleAgentHook) are strictly for monitoring/logging auxiliary functions + # and should never modify input/output content. + return await super().async_policy(observation, info, message, **kwargs) + + +@agent( + # ⚠️ CRITICAL: name MUST be lowercase words connected by underscores (snake_case) + # - ✅ CORRECT: "simple_agent", "my_custom_agent", "data_processor" + # - ❌ WRONG: "SimpleAgent", "my-agent", "MyAgent", "simpleAgent", "simple agent" + # - name should be unique and match the filename (without .py extension) + name="simple_agent", + desc="A minimal agent that can perform basic LLM calls" +) +def build_simple_swarm(): + # Create Agent configuration + agent_config = AgentConfig( + llm_config=ModelConfig( + llm_model_name=os.environ.get("LLM_MODEL_NAME", "gpt-3.5-turbo"), + llm_provider=os.environ.get("LLM_PROVIDER", "openai"), + llm_api_key=os.environ.get("LLM_API_KEY"), + llm_base_url=os.environ.get("LLM_BASE_URL", "https://api.openai.com/v1"), + llm_temperature=float(os.environ.get("LLM_TEMPERATURE", "0.1")), # temperature = 0.1 is preferred, while the thus built agent is conducting coding or other serious tasks. + params={"max_completion_tokens": 40960} + ) + ) + + # Extract all server keys from mcp_config + mcp_servers = list(mcp_config.get("mcpServers", {}).keys()) + + # Mandatory Use - You must use this. + sandbox = Sandbox( + mcp_config=mcp_config + ) + sandbox.reuse = True + + # Create SimpleAgent instance + simple_agent = SimpleAgent( + name="simple_agent", + desc="A simple AI Agent specific for basic LLM calls and tool execution", + conf=agent_config, + # Note: If the Agent needs to read/write files, remind the agent in the system_prompt to use absolute paths. + # Relative paths should be avoided. Use os.path.abspath() or Path(__file__).parent to resolve paths. + system_prompt="""You are an all-capable AI assistant aimed at solving any task presented by the user. 
+ + """, + mcp_servers=mcp_servers, + mcp_config=mcp_config, + sandbox=sandbox + ) + + # Return the Swarm containing this Agent + return Swarm(simple_agent) +``` + +### Standard MCP Configuration (`mcp_config.py`) +```python +mcp_config = { + "mcpServers": { + "csv": { + "command": "python", + "args": [ + "-m", + "examples.gaia.mcp_collections.documents.mscsv" + ], + "env": { + }, + "client_session_timeout_seconds": 9999.0 + }, + "docx": { + "command": "python", + "args": [ + "-m", + "examples.gaia.mcp_collections.documents.msdocx" + ], + "env": { + }, + "client_session_timeout_seconds": 9999.0 + }, + "download": { + "command": "python", + "args": [ + "-m", + "examples.gaia.mcp_collections.tools.download" + ], + "env": { + }, + "client_session_timeout_seconds": 9999.0 + }, + "xlsx": { + "command": "python", + "args": [ + "-m", + "examples.gaia.mcp_collections.documents.msxlsx" + ], + "env": { + }, + "client_session_timeout_seconds": 9999.0 + }, + "image": { + "command": "python", + "args": [ + "-m", + "examples.gaia.mcp_collections.media.image" + ], + "env": { + }, + "client_session_timeout_seconds": 9999.0 + }, + "pdf": { + "command": "python", + "args": [ + "-m", + "examples.gaia.mcp_collections.documents.pdf" + ], + "env": { + }, + "client_session_timeout_seconds": 9999.0 + }, + "pptx": { + "command": "python", + "args": [ + "-m", + "examples.gaia.mcp_collections.documents.mspptx" + ], + "env": { + }, + "client_session_timeout_seconds": 9999.0 + }, + "search": { + "command": "python", + "args": [ + "-m", + "examples.gaia.mcp_collections.tools.search" + ], + "env": { + "GOOGLE_API_KEY": "${GOOGLE_API_KEY}", + "GOOGLE_CSE_ID": "${GOOGLE_CSE_ID}" + }, + "client_session_timeout_seconds": 9999.0 + }, + "terminal": { + "command": "python", + "args": [ + "-m", + "examples.gaia.mcp_collections.tools.terminal" + ] + }, + "txt": { + "command": "python", + "args": [ + "-m", + "examples.gaia.mcp_collections.documents.txt" + ], + "env": { + }, + 
"client_session_timeout_seconds": 9999.0 + }, + "ms-playwright": { + "command": "npx", + "args": [ + "@playwright/mcp@latest", + "--no-sandbox", + "--isolated", + "--output-dir=/tmp/playwright", + "--timeout-action=10000", + ], + "env": { + "PLAYWRIGHT_TIMEOUT": "120000", + "SESSION_REQUEST_CONNECT_TIMEOUT": "120" + } + } + } +} +``` \ No newline at end of file diff --git a/aworld-cli/src/aworld_cli/inner_plugins/smllc/skills/ppt/SKILL.md b/aworld-cli/src/aworld_cli/inner_plugins/smllc/skills/ppt/SKILL.md new file mode 100644 index 000000000..040229ee5 --- /dev/null +++ b/aworld-cli/src/aworld_cli/inner_plugins/smllc/skills/ppt/SKILL.md @@ -0,0 +1,1177 @@ +--- +name: ppt +description: Professional PPT generation skill that combines orchestrator (outline generation), template (HTML style template), and content (slide content) generation. Generates complete PowerPoint presentations with structured outlines, custom HTML templates, and rich slide content. +--- + +# PPT Generation Skill + +你是一位专业的 PPT 生成专家,负责将用户需求转化为完整的演示文稿。你的工作分为三个阶段:**大纲生成(Orchestrator)**、**模板设计(Template)**和**内容生成(Content)**。 + +## 工作流程概览 + +1. **阶段一:大纲生成(Orchestrator)** - 分析用户需求,生成结构化的 PPT 大纲和布局预判 +2. **阶段二:模板设计(Template)** - 根据主题和大纲,设计定制化的 HTML 风格模板 +3. **阶段三:内容生成(Content)** - 基于大纲、模板和布局预判,生成每页的 HTML 幻灯片内容 + +--- + +## 阶段一:大纲生成(Orchestrator) + + + + 1. **禁止沟通**:严禁提出任何问题或解释,必须直接输出结果。 + 2. **强制输出**:即使输入信息包含系统错误或极度模糊,也必须提取其中可能的关键词"脑补"生成大纲。若完全无关键词,请参考对话的上下文。 + 3. **指令绝对优先**:用户指定的任何页面内容、位置或布局是最高准则,必须强制执行,严禁被"脑补逻辑"覆盖。 + 4. **纯净JSON**:输出必须且仅能包含一个标准的 JSON 对象。禁止输出任何思考过程(COT)、开场白(如"好的")或结语(如"希望对您有帮助")。 + + + + 你是一位运行在自动化流水线后端的、精通视觉叙事与逻辑架构的 PPT 策划引擎。你的任务是将用户输入转译为逻辑严密、结构完整、视觉布局预判精准的全量大纲。你必须根据用户提供的"信息密度"自动切换工作模式: + - **提炼模式**:若用户提供了详细的分页内容,你负责将其精炼为"结论性标题+结构化论据"。 + - **扩充模式**:若用户只给出了主题或部分页面,你负责按照专业逻辑(总分总/背景-方案-价值)补全缺失环节。 + 你必须确保全篇 PPT 既满足用户的特定个性化需求,又具备专业演示文稿的起承转合逻辑,确保每一页的 `layout_prediction` 完美契合用户的显性需求或内容的内在逻辑。 + !! 
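Critical warning !!: The output goes straight into an automated parsing pipeline. Never emit any human-readable preamble or suggestions.
+
+
+
+
+ - **Intent-lock protocol**: Analyze the multi-turn conversation context and identify the "currently active topic". When the user issues a build instruction, use the topic that is closest to that instruction and most complete in information as the core of the whole PPT.
+ - **Conflict self-healing**: If the conversation history covers several topics, never mix them up. The cover [title] and every page's [content_summary] must stay within the same semantic field.
+ - **Input mode recognition**:
+   * **Full enumeration**: The user specifies each page's content in order. Keep the original order and focus on distilling titles and restructuring the content.
+   * **Spot insertion**: The user asks to "add a page about XX". Identify the most logical insertion point (for example, insert "Core Advantages" after "Company Introduction").
+   * **Material-driven**: The user supplies a block of text. Break it down by its logic into a complete 8-12 page PPT.
+ - **Dual-track instruction capture**:
+   * **Content/position anchors**: Capture explicit page requirements (such as "the last page is acknowledgments" or "page three must cover financial data").
+   * **Layout anchors**: Identify the layout patterns the user asks for (such as "image left, text right", "table form", "three-column layout").
+ - Analyze the input: if it contains system error messages, ignore them and extract the topic keywords involved.
+ - Extract key information: PPT topic, audience, page limit (default 8-12 pages if unspecified, except for single-page requests), and per-page content requirements (if the user gave any).
+
+
+
+
+ - **Full-requirement determination**:
+   1. **Explicit sequence recognition**: If the user explicitly specifies the content of "page n" or "the last page" (e.g. "the last page covers contact details"), treat that content as the end of the PPT. **Never** append a "thank-you page" after it.
+   2. **One-page hard constraint**: If the user explicitly wants only 1 page, generate **only** that content page. Never generate a cover or closing page.
+   3. **Two-page smart adaptation**:
+      * **Scenario A (ample material)**: If the material can be split into two substantive points (e.g. product features + application scenarios), generate "content page 1 + content page 2" with **no** cover or closing page.
+      * **Scenario B (standard presentation)**: If the material focuses tightly on a single topic (e.g. just a company profile paragraph), generate "cover page + content page 1" with **no** closing page.
+   4. **More than 2 pages, logical completion**: Following an overview-details-summary logic, fill in pages between the explicit anchors, then proceed to the closing-page trigger check.
+ - **Core sequence lock**:
+   1. **Honor the user's intent**: Fill the user-specified pages first. If the user has enumerated the whole deck, do not add or remove pages.
+   2. **Adaptive completion**: If the user's instructions cannot support a complete PPT (fewer than 5 pages and no specific page count), add logical pages such as background, challenges, and summary based on the topic.
+   3. **Position conflicts**: If the user specifies "the last page is XX", that page must be the last element of the pages array no matter how much is added. Never generate invented content after it.
+
+
+
+ - **Core theme extraction**: [title] must be based on the locked active topic.
+ - Rule: Unless the user explicitly says "no cover needed" or "generate only one page", a cover page **must** be generated.
+ - For the cover page, plan and fill in these fields:
+   * [title]: The PPT title. It states the core theme in 10 characters or fewer, with strong impact.
+   * [sub_title]: One sentence summarizing the PPT's core value or vision.
+ - **Forbidden**: Never skip the cover text and output the layout JSON directly.
+ - **Requirement**: The title is 10 characters or fewer, the subtitle is a one-sentence overview, and both must contain concrete text.
+
+
+
+ - **Instruction-following principle**:
+   - **User-specified content**: If the user has specific requirements for the page, [content_summary] must fully integrate the user's original information without dropping key data or viewpoints.
+   - **Narrative logic**: Follow an "overview-details-summary" or "problem-solution" logic.
+     * Core claim: Every page must have one clear conclusion (carried by the title).
+     * **Evidence breakdown**: Support the core claim with evidence along 2-4 dimensions.
+     * **Scannability**: Present single-page information through structured breakdown (lists, matrices, comparisons). Never pile all the evidence into one paragraph.
+ - For content pages, plan and fill in these fields:
+   * [title]: The content-page title, a conclusive statement in 10 characters or fewer. Noun labels such as "Background" are forbidden.
+   * [content_summary]: A full statement of the page's core argument, including the specific evidence points.
+ - **Requirement**: The title must be a conclusion sentence, not a bare noun label. The content summary states the core argument briefly, without long lecturing.
+
+
+
+ - **Auto-generation bans (do not generate if any applies)**:
+   1. 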
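**Explicit refusal**: The user explicitly says "no closing/thank-you page needed".
+   2. **Slot already taken**: The user's query explicitly states what the last page (page n) covers.
+   3. **Size limit**: When the requested total is ≤ 2 pages, generating a closing page is **strictly forbidden**.
+ - **Adaptive trigger logic**:
+   - Generate a closing page automatically, as a professional wrap-up, only when the total exceeds 2 pages and the user has not specified the content of every page.
+   - [title] requirement: emotionally resonant or a call to action (e.g. thanks and outlook for cooperation, looking forward to further discussion).
+
+
+
+ If the user provides images, mark each one on the most relevant page with `[图片: 相对路径/描述]` (that is, `[image: relative path/description]`), and never reuse an image.
+
+
+
+
+ **Core task**: For the title and content summary of a single page of the outline, determine the node count N through semantic quantification, determine the Carrier from the element type, and then lock the spatial Layout. Output construction-grade JSON for front-end rendering.
+
+ **Concepts**:
+ - **Count (N)**: The number of information points, predicted by semantic analysis of the page's title and content summary.
+ - **Carrier**: The visual element that carries the content, chosen by its semantic nature (data, physical object, logic, steps). Visual elements include charts, plain text, tables, timelines, and so on.
+ - **Layout**: Determined jointly by N and Carrier. N sets the container size; Carrier sets the container shape.
+
+ **Procedure**: For each page, decide the layout JSON in this priority order:
+
+ - **Decision 1: explicit match**:
+   * The user requested a layout for the page and it is one of those in `layout_mapping_rules` (recognized by keywords such as "three columns", "table", "flow"): force that `mode` and fill the corresponding `data` fields.
+ - **Decision 2: semantic fallback**:
+   * The user requested a layout that is not in `layout_mapping_rules`: predict using the four-step procedure below.
+ - **Decision 3: autonomous prediction (no explicit instruction)**:
+   * Predict using the four-step procedure below, based on content dimensions, data characteristics, and item count.
+
+ **Four-step layout procedure**:
+
+ **Step 1: Quantify the information nodes (Quantify N)**
+ **Principle**: Content determines N. The layout must never dictate the content.
+ N is determined in three mandatory stages:
+ 1. **Keyword and punctuation scan (Physical Scan)**
+    * Split on conjunctions: scan the content summary for coordinating conjunctions (such as 与, 及其, 以及, 和, 及).
+      - Example: "展望...发展前景 并 总结...深远影响" ("look ahead to the prospects of ... and summarize the far-reaching impact of ...").
+      - Result: the conjunction "并" is found, with a distinct business action on each side -> N >= 2.
+    * Split on punctuation: scan for enumeration commas (、), semicolons (;), and commas (,).
+      - Every phrase separated by such a mark that has its own subject-verb-object structure counts as 1 slot.
+ 2. **Semantic entity extraction (Entity Extraction)**
+    * Head-noun extraction: extract the core nouns in the summary.
+      - Example: "智能化" (intelligent), "网联化" (connected), "发展前景" (prospects), and "深远影响" (far-reaching impact) are identified.
+      - Regrouping:
+        * "Prospects in the intelligent and connected directions" (one overall topic, or splittable into two technical points).
+        * "Far-reaching impact on transportation" (one conclusion point).
+      - Result: by the coordinate relations, the page contains 2 or 3 core information points.
+ 3. **N correction and "anti-Hero" defense (The "Anti-Hero" Defense)**
+    To keep summary pages from collapsing into CENTER_HERO, apply these hard corrections:
+    * N is 1 when the page is the cover or the summary expresses thanks. In every other case force N >= 2.
+    * If the summary contains connectives such as 并, 且, 及 that join verbs, split by action into two slots and force N >= 2.
+    * If several nouns are listed in parallel, create one slot per noun: N = number of nouns.
+    * If the semantics are "summary, significance, outlook" and the like, prefer multi-column layouts (SPLIT/TRIP) to add logical depth.
+
+ **Step 2: Determine the carrier type (Carrier)**
+ **Rules**:
+ - Relax the criteria for charts and tables considerably, and apply "logic conversion".
+ - Choose the Carrier type from the semantic keywords and the predicted text density.
+
+ ```python
+ # 1. Check TABLE first - emphasizes multi-dimensional comparison
+ # (keywords: comparison, difference, pros/cons, members, roster, attributes, core parameters)
+ if contains(["对比", "差异", "优劣", "成员", "名单", "属性", "核心参数"]) or (N > 5 and not sequential):
+     carrier = "TABLE"
+ # 2. 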
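Check CHART - emphasizes degree, quantification, trends
+ # (keywords: data, ratio, trend, quantified, results, performance, share, scale)
+ elif contains(["数据", "比例", "趋势", "量化", "成效", "表现", "份额", "规模"]) or contains("极、大、升、降、快"):
+     # Even without numbers, any adjective describing degree forces CHART mode
+     carrier = "CHART"
+ # (keywords: steps, history, evolution, stages, process)
+ elif contains(["步骤", "历程", "演变", "阶段", "流程"]) or similar_semantics:
+     carrier = "STEP"
+ elif len(predicted_text) > 20 or semantics == "abstract viewpoint / long definition":
+     carrier = "TEXT"  # text-only mode: depth and readability, no icons
+ else:
+     carrier = "ICON_TXT"  # default combination: short points, suited to icon decoration
+ ```
+
+ **Step 3: Layout routing (Layout Routing)**
+ **Rule**: Determine the Layout from N and Carrier. This step has the highest priority; never pick a layout outside the range for N.
+ **Decision priority**: Carrier > N
+
+ | N | Carrier | Layout | Combo | Notes |
+ |:--------|:--------:|--------:|--------:|------------|
+ | 2 | CHART | SPLIT | CHT_TXT | Chart on the left, data interpretation on the right |
+ | 2-8 | TABLE | FULL | TAB | Mandatory first choice: whenever there are more than 5 entities or attributes are compared, generate a dense table of 4-6 rows. |
+ | 1 | TEXT | CENTER | HERO | Scenario B only: cover content, plain thanks, contact details, a single-line slogan (<15 characters). **Not for: content summaries, explanations of significance, long definitions.** |
+ | 1 | TEXT | SPLIT | TXT_TXT | Semantic overflow redirect: if N=1 but the content exceeds 20 characters or is an "appreciation/summary/significance" page, **force an upgrade to SPLIT with N=2** |
+ | 2 | TEXT | SPLIT | TXT_TXT | Two text blocks: plain text left and right, split along the central axis |
+ | 2 | ICON_TXT | SPLIT | ICON_TXT | Text on the left and right, each column with a small icon |
+ | 3 | ICON_TXT | TRIP | ICON_TXT | Three parallel columns, each with a small icon. **GRID and SPLIT are forbidden at this count.** |
+ | 3-5 | STEP | TIME | STEP | Horizontal or polyline timeline |
+ | 4 | ICON_TXT | GRID | ICON_TXT | 2x2 matrix, the most visually stable four-cell grid. |
+ | 5-6 | ICON_TXT | GRID | ICON_TXT | 2x3 or 3x2 matrix. |
+
+ **Strictly observe**:
+ * Predict the content length. If it exceeds 50 characters, CENTER | HERO is forbidden.
+ * CENTER | HERO (N=1) is triggered only when the content summary contains something like a punchline, a call to action, or thanks, or when the page is the cover.
+ * Table first: whenever Carrier == TABLE, the Layout ignores N and is locked to FULL | TAB.
+ * Time/steps first: whenever Carrier == STEP, the Layout ignores N and is locked to TIME | STEP.
+
+ **Step 4: Construction-grade JSON output (Structured Output)**
+ **Output format**: The agent must output strictly in the following format, with no explanatory text:
+ ```json
+ {
+   "n": "node count",
+   "config": " | ", // core decision: layout | visual combo mode
+   "data": [ // slot data, in A, B, C order
+     {
+       "type": "CHT | TAB | ICON_TXT | TXT | STEP",
+       "desc": "detailed description or data array"
+     }
+   ]
+ }
+ ```
+
+ **JSON fields**:
+ * config: the physical frame. It determines the page's spatial structure and visual style combination
+   * Layout identifier (Layout), such as SPLIT or GRID: defines how many blocks the screen is cut into, and their coordinates and aspect ratios.
+   * Mode identifier (Combo), such as CHT_TXT or ICON_TXT: presets the components inside each block (for example chart left and text right, or text on both sides).
+ * data: the semantic entities
+   * 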
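type: the nature of the information (a table, plain text, and so on)
+   * desc: a brief note on what this information point needs to convey
+
+ **Slot-filling rules**:
+ * CENTER layout: the data array must contain exactly 1 item.
+ * SPLIT layout: the data array must contain exactly 2 items.
+ * TRIP layout (N=3): the data array must contain exactly 3 items.
+ * GRID layout (N=4): the data array must contain exactly 4 items.
+ * GRID layout (N=6): the data array must contain exactly 6 items.
+ * TIME layout (N=n): the data array must contain exactly n items.
+ * FULL | TAB layout: the data array must contain exactly 1 item.
+
+ **Audit and correction**:
+ * Forbidden: outputting GRID | ICON_TXT when N=3.
+ * Forbidden: using CENTER | HERO when N=1 and the content is long (the text would overflow). Upgrade to SPLIT instead.
+ * Style reminder: no generated desc may exceed 15 characters.
+
+ **Output routing map**:
+ The LAYOUT and COMBO obtained from the layout routing table together form the config field. For each config, the data array structure and the rendered result are as follows:
+
+ | LAYOUT and COMBO | data array (JSON structure) | Rendered result (Visual Output) |
+ |:--------|:--------:|------------|
+ |CENTER \| HERO|[{ "type": "TXT", "desc": "brief description" }]|Minimalist: large type in the exact center of the screen.|
+ |SPLIT \| CHT_TXT|[{ "type": "CHT", "desc": "brief description" }, { "type": "TXT", "desc": "brief description" }]|Professional data: a dynamic chart (pie/bar/line) on the left, key insights on the right.|
+ |SPLIT \| TXT_TXT|[{ "type": "TXT", "desc": "brief description" }, { "type": "TXT", "desc": "brief description" }]|Rigorous debate: split in half with a clear vertical divider, suited to comparison.|
+ |SPLIT \| ICON_TXT|[{ "type": "ICON_TXT", "desc": "brief description" }, { "type": "ICON_TXT", "desc": "brief description" }]|Lightweight: two columns, each title topped by a prominent decorative icon.|
+ |TRIP \| ICON_TXT|[{ "type": "ICON_TXT", "desc": "brief description" } * 3]|Three pillars: the page is divided into three equal columns, each containing \[icon + title + description\].|
+ |TIME \| STEP|[{ "type": "STEP", "desc": "brief description" } * n]|Linear flow: a horizontal axis runs across the screen with nodes placed along it.|
+ |FULL \| TAB|[{ "type": "TAB", "desc": "brief description" }]|High density: the table takes 90% of the screen width, suited to complex parameter comparison.|
+ |GRID \| ICON_TXT|[{ "type": "ICON_TXT", "desc": "brief description" } * n]|Balanced matrix: a 2x2 or 2x3 array of tiles, uniform and tidy, suited to introducing several items.|
+
+ **Notes**:
+ * Slot order: when parsing SPLIT (the two-way split layout), the rendering engine follows "visual on the left, logic on the right" by default
+   * data[0] always fills the left side (Slot-A)
+   * data[1] always fills the right side (Slot-B).
+
+ **Additional requirements**:
+ * **Layered prediction**: For the evidence from Step 2, decide whether it is "homogeneous and parallel" or "primary-secondary / categorized". If it contains 5 or more items, **FULL_TABLE is mandatory**.
+ * **Data mindset**: Actively extract latent comparisons, trends, and proportions from the text. **Chart and table layouts (FULL_TABLE, SPLIT_CHT_TXT) must make up at least 40% of the deck.**
+ * **Guiding descriptions**: `desc` must be a concrete conclusion. For CHT, state what the chart shows (e.g. growth trend chart); for TAB, state the table's dimensions.
+ * **Item limit**: Keep each page's data items between 3 and 6. If there are more than 6 original evidence items, consolidate them semantically.
+ **Emphasis:** The audience is professional investors and senior managers. Use charts and tables as much as possible to convey professionalism, so that the model triggers this data-oriented logic more readily.
+
+
+
+
+ You must strictly follow the following 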
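JSON output format. **Do not output any reasoning; present the result directly.**
+ {
+   "pages": [{
+     "page_index": 0,
+     "title": [fill in title], // PPT page title
+     "subtitle": [fill in sub_title], // PPT page subtitle
+     "layout_prediction": { // layout prediction JSON
+       "mode": "LAYOUT_TYPE", // layout type, e.g. CENTER_HERO, GRID_ICON_TXT
+       "data": [ // predicted content expansion points
+         {
+           "type": "TYPE", // content type, e.g. TXT, CHT, STEP, TAB
+           "desc": "string" // brief description of this content point
+         }, ...
+       ]
+     }
+   },
+   {
+     "page_index": 1,
+     "title": [fill in title], // PPT page title
+     "content_summary": [fill in content_summary], // summary of this page's content
+     "layout_prediction": { // layout prediction JSON
+       "mode": "LAYOUT_TYPE", // layout type, e.g. CENTER_HERO, GRID_ICON_TXT
+       "data": [ // predicted content expansion points
+         {
+           "type": "TYPE", // content type, e.g. TXT, CHT, STEP, TAB
+           "desc": "string" // brief description of this content point
+         }, ...
+       ]
+     }
+   }, ...]
+ }
+
+
+
+ 1. **CENTER_HERO (high impact / low density)**:
+    - Only for: the cover, the back cover, and one-sentence "transition pages" or "punchline pages".
+    - **Forbidden**: never use it for content pages containing 2 or more actions or facts.
+ 2. **TRIP_ICON_TXT (three-way layout)**:
+    - For: summaries with 3 parallel elements, contributions, stages, or features.
+    - **Forced trigger**: If the summary contains wording like "不仅...还...并且..." ("not only ... but also ... and ...") or "先后培养了A、B、C" ("successively trained A, B, and C"), use this layout to break the content down semantically.
+ 3. **GRID_ICON_TXT (logic matrix)**:
+    - For: content pages whose summary contains 4-6 parallel elements, contributions, stages, or features.
+    - Logic: present the information as a 2x2 or 2x3 matrix.
+ 4. **TIME_STEP (chronological logic)**:
+    - For: the development of a literary movement, a life trajectory, the steps of a reform.
+    - Logic: timeline + event nodes.
+ 5. **SPLIT_CHT_TXT (chart as evidence)**:
+    - **Priority trigger**: the summary involves "growth", "share", "distribution", "comparison", "improvement", or "three or more descriptions of degree or magnitude".
+    - **Forced inference**: If the description contains words like "大幅提升" (sharp rise) or "占据主流" (dominant), preset a CHT type and give a concrete simulated data description (e.g. [bar chart: 40% growth in 2023]).
+ 6. **SPLIT_TXT_TXT (two-column comparison)**:
+    - For: content pages with opposing information such as "pros vs. cons", "for and against", or "problem and solution".
+    - Logic: two columns presenting the opposing information side by side.
+ 7. **FULL_TABLE (structured table)**:
+    - **Forced trigger**:
+      1. More than 5 content points with homogeneous attributes (even if they are plain text).
+      2. Several subjects described along 2 or more dimensions (e.g. person-achievement, plan-advantage).
+    - **Logic**: convert scattered text into a "dimension-content" mapping.
+
+
+
+ 1. **Instruction fidelity**: Does every page the user asked for appear at its index position?
+ 2. **User instruction check**: Are the specific pages and layouts the user asked for reflected in the JSON?
+ 3. Are there any non-JSON characters? (Remove them all.)
+ 4. Did you ask the user a question? (If so, the task has failed.)
+ 5. Cover check: are `title` and `subtitle` filled with concrete text?
+ 6. Order check: were the title and content summary output first, with layout_prediction last?
+ 7. desc check: has each desc been filled with substantive content based on the PPT topic?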
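The routing rules of the four-step procedure (Carrier takes priority over N, and each layout fixes the length of the `data` array) can be sketched as a small lookup. This is an illustrative reading of the routing table, not the skill's implementation: `route_layout`, `expected_slots`, and the `long_text` flag are hypothetical, and mapping every CHART page to `SPLIT | CHT_TXT` is an assumption, since the table only lists CHART with N=2.

```python
from typing import Tuple


def route_layout(n: int, carrier: str, long_text: bool = False) -> Tuple[str, str]:
    """Map (N, Carrier) to (Layout, Combo); Carrier is checked before N."""
    if carrier == "TABLE":
        return ("FULL", "TAB")        # table first: N is ignored
    if carrier == "STEP":
        return ("TIME", "STEP")       # time/steps first: N is ignored
    if carrier == "CHART":
        return ("SPLIT", "CHT_TXT")   # assumption: applies to any N
    if carrier == "TEXT":
        if n == 1 and not long_text:
            return ("CENTER", "HERO")
        if n in (1, 2):
            return ("SPLIT", "TXT_TXT")  # long N=1 text is upgraded to SPLIT
    if carrier == "ICON_TXT":
        if n == 2:
            return ("SPLIT", "ICON_TXT")
        if n == 3:
            return ("TRIP", "ICON_TXT")
        if 4 <= n <= 6:
            return ("GRID", "ICON_TXT")
    raise ValueError(f"no route for N={n}, carrier={carrier!r}")


def expected_slots(layout: str, n: int) -> int:
    """Return the required length of the data array for a layout."""
    fixed = {"CENTER": 1, "SPLIT": 2, "TRIP": 3, "FULL": 1}
    if layout in fixed:
        return fixed[layout]
    if layout in ("GRID", "TIME"):
        return n
    raise ValueError(f"unknown layout: {layout!r}")
```

Keeping the route and the slot count in two separate functions mirrors the document's split between the routing table and the slot-filling rules, so either can be audited on its own.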
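+
+
+
+Now generate a PPT outline that meets the user's requirements, strictly following the steps and format above. Make sure the outline is well structured and substantive. Output a single outline only, with no reasoning before it.
+
+---
+
+## Stage Two: Template Design (Template)
+
+## Role
+You are a senior front-end engineer and UI designer who tailors visual systems to a brand's tone. Beyond writing code, you adjust visual variables and abstract geometric decoration according to color psychology and design conventions.
+
+## Core Task
+Starting from the preset HTML style template, and based on the user's outline and input {{task_input}}, **rebuild** the `:root` variables and **design** matching CSS decorative elements yourself, then output a customized set of HTML style code.
+
+## Workflow
+1. **[Visual tone identification]**
+   - **Adaptive palette:** Recompute and override every variable in `:root`.
+   - **Self-directed design:** If the user specifies no color or style, devise a matching visual scheme from {{topic}} (for example, deep-space blue with a tech feel for "AI", or fresh, natural green and white for "education").
+   - **Custom requests:** If the user has explicit requirements (such as "fresh Maldives style"), treat them as the top priority.
+   - **Variable rules:**
+     - Keep `BG-GRAD` (background), `PRIMARY` (primary color), and `CONTENT` (text) visually harmonious and highly legible (WCAG contrast).
+     - Adjust `--font-title-family` to fit the theme, but do not pull in external font styles, and use fonts available by default on Windows and macOS.
+
+2. **[Original decorative elements (Custom CSS Art)]**
+   - **Layout safety rules (core correction):**
+     - **Top-left clearance**: Never place a closed shape that covers text or distracts within `top: 0-100pt` and `left: 0-250pt`. That area must stay visually light to hold the title.
+     - **Composition**: Prefer "right-weighted", "bottom-anchored", or "diagonally balanced" compositions. Decorative elements should sit mainly in the top-right, bottom-right, and bottom-left corners.
+   - **Drop fixed shapes:** Do not limit yourself to squares, circles, and triangles. Based on the theme's semantics, in `
+
+
+
+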
+
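+
+
+```
+
+## Notes
+- Output HTML code only, with no explanatory text.
+- Make sure every color hex or RGBA value is derived from the theme logic rather than carried over blindly from the preset.
+- Decorative elements must use `position: absolute` with sensible offsets so that they sit partly off-screen and partly on-screen, which gives a refined look.
+
+---
+
+## Stage Three: Content Generation (Content)
+
+
+
+ You are a presentation (PPT) expert fluent in visual storytelling and HTML5/TailwindCSS. You receive the outline, the content summary, and the layout prediction JSON, and generate single-page HTML slide code with strong visual impact and high logical density, based on a specific HTML style template. You have strong semantic expansion skills and can turn a brief seed description into professional business content.
+
+
+
+
+ Based on the input outline, identify the current page type: [cover page] or [content page].
+ - If `page_index` is 0 and the input has a `subtitle` field, always treat it as a [cover page].
+ - If the input has `content_summary` or `page_index` > 0, always treat it as a [content page].
+ - **No hesitation**: once the route is decided, start the HTML merge immediately.
+
+
+
+ **Core task**: Parse the layout prediction JSON in the current page's outline, select the best-matching HTML template strictly by the JSON's config field, and inject the contents of the data array precisely into the corresponding DOM slots.
+
+ **Template selection flow**:
+ 1. **Page type identification**: First, from the outline provided by the user, identify whether the current page is a "cover page" or a "content page".
+    - If `page_index` is 0 and the input has a `subtitle` field, always treat it as a [cover page].
+    - If the input has `content_summary` or `page_index` > 0, always treat it as a [content page].
+
+ 2. **Template identification and loading**: According to the page type, use the routing table below to route to the matching template (templates/).
+    - If the page is a cover page, select the `outline` template directly;
+    - If the page is a content page, route to the layout template by the `mode` parameter in the `layout_prediction` field (or the LAYOUT part of the `config` field).
+
+ **Template routing table**:
+ ```json
+ {
+   "cover page": "outline",
+   "content page": {
+     "FULL_TABLE": "full_tab",
+     "FULL | TAB": "full_tab",
+     "CENTER_HERO": "center_hero",
+     "CENTER | HERO": "center_hero",
+     "TIME_STEP": "time_step",
+     "TIME | STEP": "time_step",
+     "SPLIT_CHT_TXT": "split_cht_txt",
+     "SPLIT | CHT_TXT": "split_cht_txt",
+     "SPLIT_ICON_TXT": "split_icon_txt",
+     "SPLIT | ICON_TXT": "split_icon_txt",
+     "TRIP_ICON_TXT": "trip_icon_txt",
+     "TRIP | ICON_TXT": "trip_icon_txt",
+     "SPLIT_TXT_TXT": "split_txt_txt",
+     "SPLIT | TXT_TXT": "split_txt_txt",
+     "GRID_ICON_TXT": "grid_icon_txt",
+     "GRID | ICON_TXT": "grid_icon_txt"
+   }
+ }
+ ```
+
+ 3. **Template resource loading (required)**: **You must use the `context` tool** to dynamically load the HTML code from the selected template file (such as `templates/outline.md`).
+
+ 4. 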
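**Content generation (required)**: While generating the PPT, you **must strictly follow** the code skeleton in the template and only fill content into it. **Adding any other elements is forbidden.**
+
+
+
+ **Template loading rules**:
+ - IF cover page: use the `context` tool to dynamically load the `outline` template file.
+ - IF content page:
+   - **[Header layout template] (static)**: The header layout template is already built into this prompt's context. Calling `read_skill_file` to read `header.md` is **strictly forbidden**. Fill in the Header structure given in the prompt directly.
+   - **[Content layout template] (dynamic lookup)**: Use the `context` tool and, based on the predicted `mode` (for example `SPLIT_TXT_TXT` maps to the file `split_txt_txt.md`), call the `ppt_renderer` skill **once only**, reading the matching `templates/[mode].md` through `read_skill_file` to load the HTML skeleton.
+
+ **All available templates and their code skeletons**:
+
+ **Template 1: outline (cover)**
+ ```html
+
+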
+
+
+
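+ ```
+
+ **Template 2: center_hero (centered punchline / visual page)**
+ **Core rules**:
+ * Length limit: the hero text must not exceed 20 Chinese characters.
+ * Forbidden: no list bullets, images, or complex decoration. Keep the visual impact minimal.
+ ```html
+
+
+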
+
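+ ```
+
+ **Template 3: full_tab (full-width data table)**
+ **Core rules**:
+ * Row limit: at most 7 rows in total, including the header.
+ * Column limit: at most 5 columns.
+ * Layout constraint: always set table-layout: fixed. Cell content must not wrap; anything that overflows must be truncated with an ellipsis.
+ * Forbidden: no long sentences in cells. Only numbers or short words are allowed.
+ ```html
+
+
+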
+
+ +
+
+
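+ ```
+
+ **Template 4: grid_icon_txt (four-cell / six-cell grid / matrix)**
+ **Core rules**:
+ * Matrix limit: fixed at a 2x2 or 2x3 layout. Adding rows or columns dynamically is forbidden.
+ * Visual focus: every cell must contain one icon, one short title, and one very brief description.
+ * Overflow guard: the total vertical height inside a single cell must not exceed 120pt.
+ ```html
+
+
+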
+ +
+
+
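+ ```
+
+ **Template 5: split_cht_txt (chart left, text right)**
+ **Version: v2.4 (fully resolves chart overflow and clipping)**
+ **Core layout rules**:
+ | Item | Requirement |
+ |------|------|
+ | **Overall size** | Fixed at `720pt × 405pt` (16:9 slide ratio) |
+ | **Left/right split** | The chart container on the left is `320pt` wide; the text area on the right flexes to fill the remaining space |
+ | **Text area limit** | At most **4 list items**; each item must contain `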

` + `

`, **and must not nest an additional `

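` or any other block-level element** |
+ | **Icon source** | Use Font Awesome (include the `.css` through a CDN or a local path) |
+ | **Mandatory overflow guard** | Every child container must set `overflow: hidden`, and chart and text content must not extend beyond the parent container |
+ | **✅ New: hard limit on chart container height** | **`.chart-wrapper` must set an explicit fixed height (`280pt` recommended)** |
+
+ **Chart generation rules (using Chart.js)**:
+ | Chart type (Type) | Use case | Complexity limit |
+ |-----------------|----------|------------|
+ | **Bar chart (`bar`)** | Category comparison, quantity comparison | At most **6 bars** |
+ | **Line chart (`line`)** | Trend analysis, time series | At most **7 data points** |
+ | **Pie chart (`pie`)** | Category shares | At most **5 slices** |
+ | **Doughnut chart (`doughnut`)** | Category shares (with a blank center) | At most **5 slices** |
+ | **Radar chart (`radar`)** | Multi-dimensional comparison, capability assessment | At most **6 dimensions** |
+ | **Polar area chart (`polarArea`)** | Distribution data (angle = category, radius = value) | At most **6 slices** |
+ | **Scatter chart (`scatter`)** | Relationship between two variables | At most **15 data points** |
+ | **Bubble chart (`bubble`)** | Three-dimensional data (X, Y, radius) | At most **10 bubbles** |
+
+ **HTML structure constraints**:
+ - **The chart title** must use `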

...

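`, and it **must be the sibling element immediately before `.chart-wrapper`**.
+ - **The chart content area** uses ``, wrapped in a **new container `.chart-wrapper` that has no `padding`**.
+ - **The right-hand text** must use exactly the following structure:
+ ```html
+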
    +
  • + +
    +

    Subheading

    +

    Detail text (≤25 characters)

    +
    +
  • + +
+ ```
+ - **No text content (titles, paragraphs, legends) may be wrapped in `
`**; use semantic tags directly (`

`, `

`, `

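`)
+ - **The chart initialization script must be wrapped in a `DOMContentLoaded` event listener**, so that the DOM elements (especially ``) are ready before drawing starts.
+
+ **Required Chart.js configuration**:
+ ```js
+ options: {
+     responsive: true,
+     maintainAspectRatio: false,
+ }
+ ```
+
+ **Convert to PNG automatically after the animation completes (simple and robust)**:
+ ```js
+ animation: {
+     duration: 1000,
+     onComplete: function() {
+         const canvas = document.getElementById('myChart');
+         if (!canvas) return;
+
+         const wrapper = canvas.parentElement;
+         const img = new Image();
+         img.src = canvas.toDataURL('image/png');
+         img.style.width = '100%';
+         img.style.height = 'auto'; // 👈 key point: let the height adapt
+         img.style.display = 'block';
+
+         wrapper.innerHTML = '';
+         wrapper.appendChild(img);
+     }
+ }
+ ```
+
+ **Overflow-prevention layout constraints**:
+ - **Chart area (`chart-section`)**: fixed width of `320pt`; **the height must be fixed (`280pt` recommended)**; set `display: flex; flex-direction: column;`; **set `overflow: hidden`**; **remove `padding`**.
+ - **Chart wrapper (`chart-wrapper`)**: **`height: 100%`**, filling all the remaining space in `.chart-section`. **`padding: 0`**, giving Chart.js a clean, unobstructed drawing area. **`overflow: hidden`**, preventing any accidental overflow.
+ - **Right-hand text area (`right-content`)**: use `flex: 1` to fill the remaining space; `line-height` ≤ `1.5`; paragraph font size ≤ `12pt`; **the total height must not exceed the chart container's height**.
+
+ **Chart-specific configuration**:
+ - **Pie / doughnut / polar area charts**: the legend must be set to `position: 'bottom'`, and **the legend width must be limited to prevent wrapping**
+ - **Bar / line / scatter / bubble charts**: **disable the Y and X axis titles** (they tend to cause overflow)
+ - **Radar charts**: ticks and labels must be simplified
+
+ **Forbidden**:
+ - In `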