[Feature] Multi-step task execution (agentic workflows) #53

@Wan-ZL

Description

Transform Genesis from a tool-calling chatbot into an agent that can plan and execute multi-step tasks autonomously. Autonomous multi-step execution is the main capability Genesis lacks compared with the 2026 AI agent landscape.

Current State

Genesis has individual tools (web_fetch, shell, calendar, repository, etc.), but they operate in isolation: the LLM calls one tool at a time within a single request-response cycle. Genesis currently cannot:

  • Plan a sequence of actions
  • Execute multiple steps toward a goal
  • Handle errors mid-execution and retry/adjust
  • Report progress on long-running tasks
  • Chain tool outputs as inputs to subsequent tools

Target State

Users should be able to say things like:

  • "Research the top 5 restaurants near me, check their reviews, and create a comparison table"
  • "Back up my important files, compress them, and upload to my server"
  • "Find all Python files that import deprecated modules and suggest replacements"

Genesis should decompose each such request into steps, execute them, handle failures, and deliver the results.
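
As a rough illustration of what a decomposed request could look like, here is a minimal sketch of a plan for the backup example above. The `Step` class, its fields, the `ordered` helper, and the shell commands are all hypothetical and not part of Genesis today; only the `shell` tool name comes from the existing tool list.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Step:
    """One planned action (hypothetical shape, not an existing Genesis type)."""
    id: str
    tool: str                                   # name of a tool in the registry
    args: dict[str, Any] = field(default_factory=dict)
    depends_on: list[str] = field(default_factory=list)


def ordered(steps: list[Step]) -> list[Step]:
    """Return the steps so that each one comes after the steps it depends on."""
    done: set[str] = set()
    result: list[Step] = []
    pending = list(steps)
    while pending:
        ready = [s for s in pending if all(d in done for d in s.depends_on)]
        if not ready:
            raise ValueError("cyclic or unresolved dependency in plan")
        for step in ready:
            result.append(step)
            done.add(step.id)
        pending = [s for s in pending if s.id not in done]
    return result


# Placeholder commands for illustration only.
plan = [
    Step("upload", "shell", {"cmd": "scp backup.tar.gz server:backups/"}, ["compress"]),
    Step("collect", "shell", {"cmd": "cp -r ~/important ./staging"}),
    Step("compress", "shell", {"cmd": "tar czf backup.tar.gz staging"}, ["collect"]),
]

for step in ordered(plan):
    print(step.id, "->", step.tool)
```

Explicit `depends_on` links are one possible design; a plain ordered list would also satisfy the "ordered steps" criterion and is simpler if steps never run in parallel.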

Acceptance Criteria

  • Task planning: LLM decomposes user request into ordered steps
  • Step execution: Each step can invoke one or more tools
  • Progress reporting: User sees real-time progress updates
  • Error handling: Failed steps can be retried or skipped with explanation
  • Tool chaining: Output of one tool can be input to another
  • Task history: Completed tasks stored for reference
  • Cancellation: User can cancel a running task
  • Permission checks: Each step validates permissions before execution
  • Tests: Unit tests for task planning and execution
  • Streaming: Progress updates streamed to frontend via SSE
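
Several of the criteria above (tool chaining, retry or skip, cancellation, permission checks, SSE progress) can be sketched in one small executor loop. This is only a sketch under assumptions: `run_task`, the event names, the `allowed` set, and the convention that each tool receives the previous output as its first argument are all hypothetical, not the existing Genesis API.

```python
import json
import threading
from typing import Any, Callable, Iterator


def sse(event: str, data: dict) -> str:
    """Format one Server-Sent Events message (event name plus JSON payload)."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def run_task(
    steps: list[tuple[str, dict]],
    tools: dict[str, Callable[..., Any]],
    allowed: set[str],
    cancel: threading.Event,
    max_retries: int = 2,
) -> Iterator[str]:
    """Run steps in order, yielding one SSE message per progress update."""
    previous = None  # output of the last successful step, chained into the next
    for index, (tool_name, args) in enumerate(steps):
        if cancel.is_set():
            yield sse("cancelled", {"step": index})
            return
        if tool_name not in allowed:
            yield sse("step_skipped", {"step": index, "reason": "permission denied"})
            continue
        for attempt in range(1, max_retries + 2):
            try:
                previous = tools[tool_name](previous, **args)
            except Exception as exc:
                last = attempt == max_retries + 1
                yield sse(
                    "step_skipped" if last else "step_retry",
                    {"step": index, "attempt": attempt, "reason": str(exc)},
                )
            else:
                yield sse("step_done", {"step": index, "attempt": attempt})
                break
    yield sse("task_done", {"result": previous})


# Stub tools for illustration; real tools would come from the tool registry.
demo_tools = {
    "web_fetch": lambda prev, url: "page:" + url,
    "upper": lambda prev: prev.upper(),
}
demo_steps = [("web_fetch", {"url": "x"}), ("upper", {}), ("shell", {})]

for message in run_task(demo_steps, demo_tools, {"web_fetch", "upper"}, threading.Event()):
    print(message, end="")
```

Writing the executor as a generator keeps it independent of the web framework: the SSE endpoint only has to forward each yielded string to the client.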

User Research

  • "Agentic AI" is the defining trend of 2026 (Microsoft, MIT Technology Review, TechCrunch)
  • Users want AI that DOES things, not just TALKS about things
  • OpenClaw's popularity (171k stars) proves demand for autonomous task execution
  • MCP enables multi-agent workflows where agents chain actions
  • Organizations report 40-60% efficiency gains with agentic workflows

Priority Rationale

HIGH for Phase 9. This differentiates Genesis from chatbots, and combined with MCP support (#51) it creates a genuine agent platform. However, it has two prerequisites: the existing tool system must be stable, and MCP integration must provide more tools to chain.

Technical Notes

  • Use ReAct (Reason + Act) pattern for task decomposition
  • Leverage existing tool registry for execution
  • Use SSE for progress streaming (existing infrastructure)
  • SQLite for task state persistence
  • Consider a "plan review" step where user approves the plan before execution
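
To make the ReAct and SQLite notes concrete, here is a minimal sketch of a reason-act-observe loop that records each step in SQLite. It assumes a hypothetical `decide` callable standing in for the LLM, a hypothetical `task_steps` table, and a stubbed `web_fetch` tool; none of these reflect the actual Genesis code.

```python
import json
import sqlite3
from typing import Any, Callable


def react_loop(
    goal: str,
    decide: Callable[[str, list], dict],
    tools: dict[str, Callable[..., Any]],
    db: sqlite3.Connection,
    max_steps: int = 8,
) -> str:
    """Alternate reasoning (decide) and acting (tool call) until a final answer."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS task_steps "
        "(goal TEXT, step INTEGER, thought TEXT, tool TEXT, observation TEXT)"
    )
    history: list[tuple[str, str, Any]] = []
    for step in range(max_steps):
        decision = decide(goal, history)          # reason: pick next action or finish
        if "final" in decision:
            return decision["final"]
        observation = tools[decision["tool"]](**decision["args"])   # act
        history.append((decision["thought"], decision["tool"], observation))
        db.execute(                                # persist so the task can be inspected later
            "INSERT INTO task_steps VALUES (?, ?, ?, ?, ?)",
            (goal, step, decision["thought"], decision["tool"], json.dumps(observation)),
        )
        db.commit()
    raise RuntimeError("step budget exhausted without a final answer")


def scripted_decide(goal: str, history: list) -> dict:
    """Stand-in for the LLM: fetch once, then answer from the observation."""
    if not history:
        return {
            "thought": "I need the page content first",
            "tool": "web_fetch",
            "args": {"url": "https://example.com"},
        }
    return {"final": f"fetched {len(history[-1][2])} characters"}


stub_tools = {"web_fetch": lambda url: "<html>stub</html>"}
connection = sqlite3.connect(":memory:")
answer = react_loop("summarize example.com", scripted_decide, stub_tools, connection)
print(answer)
```

A "plan review" step would fit before the loop starts: show the user the first proposed actions and only call `react_loop` after approval. The `max_steps` budget is one simple guard against a loop that never produces a final answer.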
