API: Add parent-child task relationships and kelos-mcp-server for agent-initiated task spawning #829

@kelos-bot

Summary

Enable agents to dynamically create child tasks at runtime, transforming them from passive task executors into active orchestrators. This unlocks recursive problem decomposition, dynamic parallelism, and agent-to-agent delegation — capabilities that static pipelines (dependsOn, taskTemplates) cannot provide.

Problem

Today, all task orchestration in Kelos is statically defined at template time, for example through dependsOn and taskTemplates.

Real-world scenarios requiring dynamic task creation:

  1. Issue decomposition: A planning agent reads a complex issue, breaks it into 3 focused subtasks (API change, frontend update, test addition), spawns each as a child task, then creates a coordinating PR
  2. Multi-repo changes: An agent discovers that fixing a bug requires updating a shared library AND two consumer services — spawns child tasks per repo
  3. Exploratory parallelism: An agent needs to evaluate 3 competing approaches — spawns parallel child tasks and picks the best result
  4. Recursive delegation: An architect agent designs a system, spawns implementation tasks per component, each of which may spawn test-writing subtasks

None of the existing proposals address this.

Proposal

1. API changes to TaskSpec and TaskStatus

Add parentRef to TaskSpec so child tasks track their lineage:

type TaskSpec struct {
    // ... existing fields ...

    // ParentRef references the Task that spawned this child task.
    // Set automatically by the kelos-mcp-server when an agent creates
    // a child task. Enables parent-child relationship tracking and
    // lifecycle management (e.g., cancelling children when parent fails).
    // +optional
    ParentRef *TaskReference `json:"parentRef,omitempty"`
}

type TaskReference struct {
    // Name is the name of the referenced Task.
    Name string `json:"name"`
}

Add childTasks to TaskStatus for observability:

type TaskStatus struct {
    // ... existing fields ...

    // ChildTasks lists tasks that were spawned by this task's agent
    // via the kelos-mcp-server. Updated by the controller when child
    // tasks with a matching parentRef are created.
    // +optional
    ChildTasks []ChildTaskStatus `json:"childTasks,omitempty"`
}

type ChildTaskStatus struct {
    // Name of the child Task.
    Name string `json:"name"`
    // Phase of the child Task.
    Phase TaskPhase `json:"phase,omitempty"`
}

2. kelos-mcp-server binary

A new MCP server that wraps the Kelos Kubernetes API, giving agents native tool access for task management. It runs as a stdio MCP server inside the agent pod.

Tools exposed:

| Tool | Description |
| --- | --- |
| create_child_task | Create a new Task with parentRef set to the current task. Accepts: prompt, branch (optional), model (optional). Inherits workspace, credentials, and agentConfig from parent. |
| list_child_tasks | List child tasks of the current task with their phases and results. |
| get_task_status | Get the status of a specific task by name (phase, results, outputs). |
| wait_for_tasks | Poll until specified tasks reach a terminal phase. Returns aggregated results. |

Implementation:

  • Written in Go, ships as a binary in agent images (or as a sidecar)
  • Uses in-cluster Kubernetes client (via ServiceAccount token) to create/read Task resources
  • Reads KELOS_TASK_NAME and KELOS_TASK_NAMESPACE env vars (injected by JobBuilder) to set parentRef
  • Child tasks inherit workspaceRef, credentials, agentConfigRef from the parent task unless overridden
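
The inheritance rule in the bullets above can be sketched roughly as follows. The trimmed `TaskSpec`, its string-typed fields, and the helper names `ParentIdentity`, `orInherit`, and `NewChildSpec` are assumptions for illustration; the real workspaceRef, credentials, and agentConfigRef types are not shown in this issue.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// TaskReference matches the type proposed above.
type TaskReference struct {
	Name string
}

// TaskSpec is a trimmed stand-in; the string-typed fields are assumptions.
type TaskSpec struct {
	Prompt         string
	Branch         string
	Model          string
	WorkspaceRef   string
	Credentials    string
	AgentConfigRef string
	ParentRef      *TaskReference
}

// ParentIdentity reads the env vars that JobBuilder is proposed to inject.
func ParentIdentity() (name, namespace string, err error) {
	name = os.Getenv("KELOS_TASK_NAME")
	namespace = os.Getenv("KELOS_TASK_NAMESPACE")
	if name == "" || namespace == "" {
		return "", "", errors.New("KELOS_TASK_NAME and KELOS_TASK_NAMESPACE must be set")
	}
	return name, namespace, nil
}

// orInherit returns the override when set, otherwise the parent's value.
func orInherit(override, inherited string) string {
	if override != "" {
		return override
	}
	return inherited
}

// NewChildSpec builds a child spec from the request and the parent spec.
// ParentRef is always set by the server, never taken from the request.
func NewChildSpec(parentName string, parent, req TaskSpec) TaskSpec {
	return TaskSpec{
		Prompt:         req.Prompt,
		Branch:         req.Branch,
		Model:          req.Model,
		WorkspaceRef:   orInherit(req.WorkspaceRef, parent.WorkspaceRef),
		Credentials:    orInherit(req.Credentials, parent.Credentials),
		AgentConfigRef: orInherit(req.AgentConfigRef, parent.AgentConfigRef),
		ParentRef:      &TaskReference{Name: parentName},
	}
}

func main() {
	// Simulate the injected environment for this demo.
	os.Setenv("KELOS_TASK_NAME", "planner")
	os.Setenv("KELOS_TASK_NAMESPACE", "default")

	name, ns, err := ParentIdentity()
	if err != nil {
		fmt.Println(err)
		return
	}
	parent := TaskSpec{WorkspaceRef: "repo-ws", Credentials: "gh-token", AgentConfigRef: "planner-with-delegation"}
	child := NewChildSpec(name, parent, TaskSpec{Prompt: "Update pkg/api/v1 to v2 types", Branch: "migrate-api-types"})
	fmt.Println(ns, child.ParentRef.Name, child.WorkspaceRef, child.Branch)
}
```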

Example MCP config in AgentConfig:

apiVersion: kelos.dev/v1alpha1
kind: AgentConfig
metadata:
  name: orchestrator-agent
spec:
  mcpServers:
  - name: kelos
    type: stdio
    command: kelos-mcp-server
    # No args needed — reads config from KELOS_* env vars

3. RBAC automation

When a Task has an AgentConfig with a kelos MCP server configured, the JobBuilder should:

  1. Create a Role granting create, get, list, watch on tasks.kelos.dev in the task's namespace
  2. Create a RoleBinding linking the task pod's ServiceAccount to this Role
  3. Clean up via ownerReferences when the Task is deleted

Alternatively, users can pre-create a ServiceAccount with appropriate permissions and reference it via podOverrides.serviceAccountName.
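
As a rough illustration of the three RBAC steps above, the sketch below builds the Role and RoleBinding as plain values. The struct types are local stand-ins that mirror the shape of the Kubernetes RBAC objects (real code would use k8s.io/api/rbac/v1). The `-kelos-mcp` naming scheme, the `TaskRBAC` helper, and the owner apiVersion `kelos.dev/v1alpha1` for Task are assumptions.

```go
package main

import "fmt"

// Local stand-ins mirroring the shape of Kubernetes RBAC objects.
type PolicyRule struct {
	APIGroups []string
	Resources []string
	Verbs     []string
}

type OwnerReference struct {
	APIVersion string
	Kind       string
	Name       string
	UID        string
}

type Subject struct {
	Kind      string
	Name      string
	Namespace string
}

type Role struct {
	Name, Namespace string
	Owners          []OwnerReference
	Rules           []PolicyRule
}

type RoleBinding struct {
	Name, Namespace string
	Owners          []OwnerReference
	RoleName        string
	Subjects        []Subject
}

// TaskRBAC returns a namespaced Role and RoleBinding owned by the Task, so
// both are garbage-collected when the Task is deleted.
func TaskRBAC(taskName, taskUID, namespace, serviceAccount string) (Role, RoleBinding) {
	name := taskName + "-kelos-mcp" // hypothetical naming scheme
	owners := []OwnerReference{{
		APIVersion: "kelos.dev/v1alpha1", // assumed to match AgentConfig's version
		Kind:       "Task",
		Name:       taskName,
		UID:        taskUID,
	}}
	role := Role{
		Name:      name,
		Namespace: namespace,
		Owners:    owners,
		Rules: []PolicyRule{{
			APIGroups: []string{"kelos.dev"},
			Resources: []string{"tasks"},
			Verbs:     []string{"create", "get", "list", "watch"},
		}},
	}
	binding := RoleBinding{
		Name:      name,
		Namespace: namespace,
		Owners:    owners,
		RoleName:  role.Name,
		Subjects: []Subject{{
			Kind:      "ServiceAccount",
			Name:      serviceAccount,
			Namespace: namespace,
		}},
	}
	return role, binding
}

func main() {
	role, binding := TaskRBAC("planner", "uid-123", "default", "planner-sa")
	fmt.Printf("%+v\n%+v\n", role, binding)
}
```

Using a Role instead of a ClusterRole keeps the grant confined to the task's namespace, which matches the scope described for the MCP server.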

4. Controller enhancements

  • Child task tracking: When a Task with parentRef is created, update the parent's status.childTasks
  • Lifecycle propagation (optional, future): When a parent task fails/is deleted, optionally cancel running children (controlled by a childPolicy field)
  • CLI integration: kelos get tasks could show parent-child relationships in a tree view
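
One possible shape for the child-tracking step is a pure function that derives status.childTasks from the Tasks in a namespace. This is only a sketch: the `Task` struct is a trimmed stand-in, `ChildStatuses` is a hypothetical helper, and the phase strings are assumed values.

```go
package main

import (
	"fmt"
	"sort"
)

type TaskPhase string

type TaskReference struct {
	Name string
}

// ChildTaskStatus matches the type proposed above.
type ChildTaskStatus struct {
	Name  string
	Phase TaskPhase
}

// Task is a trimmed stand-in holding only what child tracking needs.
type Task struct {
	Name      string
	ParentRef *TaskReference
	Phase     TaskPhase
}

// ChildStatuses returns one entry per task whose parentRef names parentName.
// Entries are sorted by name so repeated reconciles produce identical status.
func ChildStatuses(parentName string, tasks []Task) []ChildTaskStatus {
	var out []ChildTaskStatus
	for _, t := range tasks {
		if t.ParentRef != nil && t.ParentRef.Name == parentName {
			out = append(out, ChildTaskStatus{Name: t.Name, Phase: t.Phase})
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	tasks := []Task{
		{Name: "planner", Phase: "Running"},
		{Name: "migrate-handlers", ParentRef: &TaskReference{Name: "planner"}, Phase: "Running"},
		{Name: "migrate-api-types", ParentRef: &TaskReference{Name: "planner"}, Phase: "Succeeded"},
		{Name: "unrelated", ParentRef: &TaskReference{Name: "other"}, Phase: "Running"},
	}
	fmt.Println(ChildStatuses("planner", tasks))
}
```

Recomputing the whole list on each reconcile, instead of appending on create events, also keeps each child's phase current without separate bookkeeping.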

Example: Dynamic issue decomposition

# AgentConfig for a planning agent that can spawn subtasks
apiVersion: kelos.dev/v1alpha1
kind: AgentConfig
metadata:
  name: planner-with-delegation
spec:
  agentsMD: |
    You are a planning agent. For complex issues:
    1. Analyze the issue and break it into focused subtasks
    2. Use the `create_child_task` MCP tool to spawn a task for each subtask
    3. Use `wait_for_tasks` to wait for all subtasks to complete
    4. Aggregate results and create a coordinating PR or summary comment
  mcpServers:
  - name: kelos
    type: stdio
    command: kelos-mcp-server

An agent receiving a complex issue like "Migrate API from v1 to v2" could then:

Agent thinks: This requires changes to 3 packages. Let me delegate.

→ create_child_task(prompt="Update pkg/api/v1 to v2 types", branch="migrate-api-types")
→ create_child_task(prompt="Update pkg/handlers to use v2 API", branch="migrate-handlers")
→ create_child_task(prompt="Update test fixtures for v2 API", branch="migrate-tests")
→ wait_for_tasks(["migrate-api-types-task", "migrate-handlers-task", "migrate-tests-task"])

All succeeded → create coordinating PR that merges all branches

Why MCP server (not just kubectl/kelos CLI)?

  1. Agent-native: MCP tools appear as structured tools in the agent's tool palette — the agent can reason about parameters, get structured responses, and handle errors naturally
  2. Safe by default: The MCP server validates inputs, enforces parentRef, and limits scope to the task's namespace
  3. Portable: Works across all agent types (Claude Code, Codex, Gemini, Cursor) since Kelos already has MCP injection for all of them
  4. Declarative: No shell commands or YAML templating — the agent just calls create_child_task with a prompt
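
To illustrate the "safe by default" point, the sketch below shows one way the server might decode create_child_task arguments so that an agent cannot supply its own parentRef or namespace. The input struct and the `ParseCreateChildTaskInput` helper are assumptions; only the three parameter names come from the tool description.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// CreateChildTaskInput lists the only arguments the tool is described as
// accepting. Anything else is rejected during decoding.
type CreateChildTaskInput struct {
	Prompt string `json:"prompt"`
	Branch string `json:"branch,omitempty"`
	Model  string `json:"model,omitempty"`
}

// ParseCreateChildTaskInput decodes and validates raw tool arguments.
func ParseCreateChildTaskInput(raw []byte) (CreateChildTaskInput, error) {
	var in CreateChildTaskInput
	dec := json.NewDecoder(bytes.NewReader(raw))
	// Unknown keys such as "namespace" or "parentRef" cause an error, so the
	// agent cannot widen its own scope.
	dec.DisallowUnknownFields()
	if err := dec.Decode(&in); err != nil {
		return CreateChildTaskInput{}, fmt.Errorf("invalid arguments: %w", err)
	}
	if strings.TrimSpace(in.Prompt) == "" {
		return CreateChildTaskInput{}, errors.New("prompt is required")
	}
	return in, nil
}

func main() {
	ok, err := ParseCreateChildTaskInput([]byte(`{"prompt":"Update pkg/handlers to use v2 API","branch":"migrate-handlers"}`))
	fmt.Println(ok, err)

	_, err = ParseCreateChildTaskInput([]byte(`{"prompt":"x","namespace":"kube-system"}`))
	fmt.Println(err)
}
```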

Backward compatibility

  • parentRef and childTasks are optional fields — existing Tasks are unaffected
  • The kelos-mcp-server is opt-in via AgentConfig
  • No changes to existing TaskSpawner behavior
  • The RBAC automation only activates when the MCP server is configured

Incremental adoption path

  1. Phase 1: Add parentRef to TaskSpec and childTasks to TaskStatus (API-only, no behavior change)
  2. Phase 2: Build kelos-mcp-server with create_child_task and get_task_status
  3. Phase 3: Add wait_for_tasks and controller-side child tracking
  4. Phase 4: Lifecycle propagation (cancel children on parent failure) and CLI tree view

/kind feature
