Description
Transform Genesis from a tool-calling chatbot into a true agent that can plan and execute multi-step tasks autonomously. This is the key capability gap between Genesis and the 2026 AI agent landscape.
Current State
Genesis has individual tools (web_fetch, shell, calendar, repository, etc.) but they operate in isolation. The LLM calls one tool at a time in a single request-response cycle. There is no ability to:
- Plan a sequence of actions
- Execute multiple steps toward a goal
- Handle errors mid-execution and retry/adjust
- Report progress on long-running tasks
- Chain tool outputs as inputs to subsequent tools
Target State
Users should be able to say things like:
- "Research the top 5 restaurants near me, check their reviews, and create a comparison table"
- "Back up my important files, compress them, and upload to my server"
- "Find all Python files that import deprecated modules and suggest replacements"
Genesis should decompose each such request into steps, execute them, handle failures, and deliver the results.
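To make "decompose into steps" concrete, here is a minimal sketch of how a plan could be represented so that one step's output feeds a later step. Everything in it is an assumption for illustration: the `Step` class, the `"$<step_id>"` reference syntax, and the `compress` and `upload` tool names are hypothetical and not part of the existing Genesis tool registry.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Step:
    """One planned action (hypothetical shape, not an existing Genesis type)."""
    id: str
    tool: str  # name of a tool in the registry, e.g. "shell"
    args: dict[str, Any] = field(default_factory=dict)


def resolve_args(args: dict[str, Any], outputs: dict[str, Any]) -> dict[str, Any]:
    """Replace string values of the form "$<step_id>" with that step's output."""
    resolved = {}
    for key, value in args.items():
        if isinstance(value, str) and value.startswith("$"):
            resolved[key] = outputs[value[1:]]  # KeyError if the step never ran
        else:
            resolved[key] = value
    return resolved


# Hypothetical decomposition of "back up, compress, and upload".
plan = [
    Step("find", "shell", {"command": "find ~/Documents -name '*.pdf'"}),
    Step("archive", "compress", {"files": "$find"}),
    Step("upload", "upload", {"path": "$archive", "dest": "backup-server"}),
]
```

Keeping references symbolic until execution time means the LLM can emit the whole plan up front, which also fits a plan review step before anything runs.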
Acceptance Criteria
- Task planning: LLM decomposes user request into ordered steps
- Step execution: Each step can invoke one or more tools
- Progress reporting: User sees real-time progress updates
- Error handling: Failed steps can be retried or skipped with explanation
- Tool chaining: Output of one tool can be input to another
- Task history: Completed tasks stored for reference
- Cancellation: User can cancel a running task
- Permission checks: Each step validates permissions before execution
- Tests: Unit tests for task planning and execution
- Streaming: Progress updates streamed to frontend via SSE
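Several of the criteria above (step execution, retry or skip, tool chaining, cancellation, permission checks, progress events) could meet in a single execution loop. The following is only a sketch under stated assumptions: `run_plan`, its parameters, the event dictionaries, and the `"$<step_id>"` reference syntax are all hypothetical, and tools are modeled as plain callables rather than the real tool registry.

```python
import threading
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Step:
    id: str
    tool: str
    args: dict[str, Any]


def is_ref(value: Any) -> bool:
    """True for "$<step_id>" placeholders (hypothetical chaining syntax)."""
    return isinstance(value, str) and value.startswith("$")


def run_plan(
    steps: list[Step],
    tools: dict[str, Callable[..., Any]],
    is_allowed: Callable[[Step], bool],
    emit: Callable[[dict], None],
    cancel: threading.Event,
    max_attempts: int = 2,
) -> dict[str, Any]:
    """Run steps in order; return {step_id: output} for the steps that succeeded."""
    outputs: dict[str, Any] = {}
    for step in steps:
        if cancel.is_set():  # cancellation is checked between steps
            emit({"type": "cancelled", "step": step.id})
            break
        if not is_allowed(step):  # permission check before execution
            emit({"type": "skipped", "step": step.id, "reason": "permission denied"})
            continue
        if any(is_ref(v) and v[1:] not in outputs for v in step.args.values()):
            emit({"type": "skipped", "step": step.id, "reason": "missing input"})
            continue
        # Tool chaining: substitute earlier outputs for "$<step_id>" values.
        args = {k: outputs[v[1:]] if is_ref(v) else v for k, v in step.args.items()}
        for attempt in range(1, max_attempts + 1):
            try:
                outputs[step.id] = tools[step.tool](**args)
                emit({"type": "done", "step": step.id, "attempt": attempt})
                break
            except Exception as exc:  # report the failure, then retry
                emit({"type": "error", "step": step.id, "attempt": attempt, "reason": str(exc)})
        else:
            emit({"type": "skipped", "step": step.id, "reason": "retries exhausted"})
    return outputs
```

In this sketch a step whose input is missing is skipped rather than aborting the whole task. Whether a failed step should instead stop the task is a design decision this issue leaves open.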
User Research
- "Agentic AI" is the defining trend of 2026 (Microsoft, MIT Technology Review, TechCrunch)
- Users want AI that DOES things, not just TALKS about things
- OpenClaw's popularity (171k stars) proves demand for autonomous task execution
- The Model Context Protocol (MCP) enables multi-agent workflows where agents chain actions
- Organizations report 40-60% efficiency gains with agentic workflows
Priority Rationale
HIGH for Phase 9. This differentiates Genesis from chatbots. Combined with MCP support (#51), this creates a genuine agent platform. However, it depends on the existing tool system being stable and MCP integration providing more tools to chain.
Technical Notes
- Use ReAct (Reason + Act) pattern for task decomposition
- Leverage existing tool registry for execution
- SSE events for progress streaming (existing infrastructure)
- SQLite for task state persistence
- Consider a "plan review" step where user approves the plan before execution
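As a rough illustration of the SSE and SQLite notes above, the sketch below formats a progress update as a Server-Sent Events message and records step state in SQLite using only the Python standard library. The `task_steps` table, its columns, and the function names are assumptions made for this example; the existing Genesis SSE infrastructure and schema may look different.

```python
import json
import sqlite3


def sse_event(event: str, data: dict) -> str:
    """Format one SSE message: an event line, a data line, then a blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the task store; the schema here is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS task_steps (
               task_id TEXT NOT NULL,
               step_id TEXT NOT NULL,
               status  TEXT NOT NULL,
               output  TEXT,
               PRIMARY KEY (task_id, step_id)
           )"""
    )
    return conn


def record_step(conn: sqlite3.Connection, task_id: str, step_id: str,
                status: str, output=None) -> None:
    """Insert or overwrite the state of one step; output is stored as JSON."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO task_steps VALUES (?, ?, ?, ?)",
            (task_id, step_id, status, json.dumps(output)),
        )


def task_history(conn: sqlite3.Connection, task_id: str) -> list[tuple[str, str]]:
    """Return (step_id, status) pairs for one task, for the task-history view."""
    rows = conn.execute(
        "SELECT step_id, status FROM task_steps WHERE task_id = ? ORDER BY step_id",
        (task_id,),
    )
    return rows.fetchall()
```

Persisting each step as it completes, rather than only the final result, would let an interrupted task be inspected after a restart.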