Description
🤖 Kelos Strategist Agent @gjkim42
Problem
Kelos task pipelines (dependsOn) currently run fully autonomously — once an upstream task succeeds, all downstream dependents start immediately. There is no mechanism to pause a pipeline for human review before high-impact downstream tasks proceed.
This is a critical gap for production adoption. Real-world agent workflows frequently require human checkpoints:
- Code review gate: Agent scaffolds a feature → human reviews the PR → agent writes tests and merges
- Deployment gate: Agent creates a hotfix PR → human approves → agent merges and triggers deploy
- Security gate: Agent proposes dependency upgrades → security team approves → agent applies changes across repos
- Compliance gate: Agent generates a data migration → DBA reviews → agent executes migration
Today, the only workaround is to split these into separate, manually-triggered TaskSpawners, losing pipeline context and dependency output passing ({{.Deps}}).
Proposal
Add an approvalPolicy field to TaskSpec that causes a task to enter a new AwaitingApproval phase after the agent completes successfully, holding downstream dependents until approval is granted.
New TaskPhase
const (
    // TaskPhaseAwaitingApproval means the agent succeeded but downstream
    // dependents are blocked pending human approval.
    TaskPhaseAwaitingApproval TaskPhase = "AwaitingApproval"
)
New API Fields
// ApprovalPolicy defines how a task awaits and receives human approval
// before unblocking downstream dependents.
type ApprovalPolicy struct {
    // Mode specifies how approval is delivered.
    // "annotation": approve by setting a kubectl annotation on the Task
    // "githubComment": approve via a comment on the associated GitHub item
    // +kubebuilder:validation:Enum=annotation;githubComment
    // +kubebuilder:default=annotation
    Mode string `json:"mode,omitempty"`

    // ApproveCommand is the comment text that grants approval (e.g., "/approve").
    // Only used when mode is "githubComment".
    // +optional
    ApproveCommand string `json:"approveCommand,omitempty"`

    // RejectCommand is the comment text that rejects and fails the task (e.g., "/reject").
    // Only used when mode is "githubComment".
    // +optional
    RejectCommand string `json:"rejectCommand,omitempty"`

    // TimeoutSeconds is the maximum time to wait for approval before
    // auto-failing the task. Zero means wait indefinitely.
    // +optional
    // +kubebuilder:validation:Minimum=0
    TimeoutSeconds *int32 `json:"timeoutSeconds,omitempty"`
}
Added to TaskSpec:
type TaskSpec struct {
    // ... existing fields ...

    // ApprovalPolicy, when set, causes the task to enter AwaitingApproval
    // phase after the agent succeeds instead of immediately transitioning
    // to Succeeded. Downstream dependents remain blocked until approval
    // is granted. If the task fails, it transitions directly to Failed
    // (approval is only requested on success).
    // +optional
    ApprovalPolicy *ApprovalPolicy `json:"approvalPolicy,omitempty"`
}
Controller Behavior
- Agent completes successfully → task enters AwaitingApproval (not Succeeded)
- status.outputs and status.results are captured normally (available for inspection)
- Downstream dependsOn tasks remain in Waiting phase (the existing checkDependencies already handles this: it only unblocks on Succeeded)
- Approval received → task transitions to Succeeded → dependents unblock
- Rejection received → task transitions to Failed → dependents fail with "dependency failed"
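The transitions listed above can be read as a small state machine. The following is a minimal sketch of that reading, not the controller's actual code: the Decision type and the function names phaseAfterAgent, resolveApproval, and dependentAction are hypothetical and exist only for illustration. The phase names are taken from this proposal.

```go
package main

import "fmt"

// TaskPhase mirrors the phase names used in this proposal.
type TaskPhase string

const (
    TaskPhaseAwaitingApproval TaskPhase = "AwaitingApproval"
    TaskPhaseSucceeded        TaskPhase = "Succeeded"
    TaskPhaseFailed           TaskPhase = "Failed"
)

// Decision is a hypothetical type for the reviewer's verdict.
type Decision int

const (
    DecisionPending Decision = iota
    DecisionApproved
    DecisionRejected
)

// phaseAfterAgent returns the phase a task enters when its agent finishes.
// Approval is only requested on success; failures go straight to Failed.
func phaseAfterAgent(agentSucceeded, hasApprovalPolicy bool) TaskPhase {
    if !agentSucceeded {
        return TaskPhaseFailed
    }
    if hasApprovalPolicy {
        return TaskPhaseAwaitingApproval
    }
    return TaskPhaseSucceeded
}

// resolveApproval maps a verdict onto the next phase of a task that is
// currently in AwaitingApproval.
func resolveApproval(d Decision) TaskPhase {
    switch d {
    case DecisionApproved:
        return TaskPhaseSucceeded
    case DecisionRejected:
        return TaskPhaseFailed
    default:
        return TaskPhaseAwaitingApproval
    }
}

// dependentAction describes what a downstream task does for a given
// upstream phase: it starts only on Succeeded and fails on Failed.
func dependentAction(upstream TaskPhase) string {
    switch upstream {
    case TaskPhaseSucceeded:
        return "start"
    case TaskPhaseFailed:
        return "fail: dependency failed"
    default:
        return "wait"
    }
}

func main() {
    p := phaseAfterAgent(true, true)
    fmt.Println(p, "->", dependentAction(p))

    p = resolveApproval(DecisionApproved)
    fmt.Println(p, "->", dependentAction(p))
}
```

Because dependentAction treats every phase other than Succeeded and Failed as "wait", a new AwaitingApproval phase blocks dependents without a dedicated case.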
Annotation mode (default, simplest):
kubectl annotate task write-tests kelos.dev/approved=true
GitHub comment mode (for PR/issue-driven workflows):
The spawner's GitHub polling loop checks for approval comments on the associated GitHub item, similar to the existing commentPolicy mechanism.
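As a rough illustration of how a polling loop might interpret a comment, the sketch below matches approveCommand and rejectCommand against whole lines of the comment body. The matching rules are assumptions, not part of the proposal: a command must stand alone on a line, rejection wins if both commands appear, and an author must be in an allow list. The Decision type and the helpers isAllowed and decisionFromComment are hypothetical names.

```go
package main

import (
    "fmt"
    "strings"
)

// Decision is a hypothetical result type; the proposal does not define one.
type Decision string

const (
    DecisionNone     Decision = "none"
    DecisionApproved Decision = "approved"
    DecisionRejected Decision = "rejected"
)

// isAllowed reports whether author appears in allowedUsers.
// Assumption: an empty list allows nobody (fail closed).
func isAllowed(author string, allowedUsers []string) bool {
    for _, u := range allowedUsers {
        if strings.EqualFold(u, author) {
            return true
        }
    }
    return false
}

// decisionFromComment scans a comment body line by line. A command only
// counts when it is the entire (trimmed) line, so prose that merely
// mentions "/lgtm" does not approve anything.
func decisionFromComment(body, approveCmd, rejectCmd string) Decision {
    approved := false
    for _, line := range strings.Split(body, "\n") {
        line = strings.TrimSpace(line)
        if line == "" {
            continue // also prevents an unset (empty) command from matching
        }
        if line == rejectCmd {
            return DecisionRejected // assumption: rejection takes precedence
        }
        if line == approveCmd {
            approved = true
        }
    }
    if approved {
        return DecisionApproved
    }
    return DecisionNone
}

func main() {
    body := "Looks good to me.\n/lgtm"
    if isAllowed("alice", []string{"alice"}) {
        fmt.Println(decisionFromComment(body, "/lgtm", "/reject"))
    }
}
```

Whole-line matching is a deliberate choice in this sketch: the generated prompt or PR description may itself contain the command text in a sentence, and that should not count as approval.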
Example: Feature Pipeline with Review Gate
# Stage 1: Agent scaffolds the feature
apiVersion: kelos.dev/v1alpha1
kind: Task
metadata:
  name: scaffold
spec:
  type: claude-code
  credentials:
    type: oauth
    secretRef:
      name: claude-credentials
  workspaceRef:
    name: my-workspace
  branch: feature/auth
  # Gate: human must approve before tests are written
  approvalPolicy:
    mode: annotation
    timeoutSeconds: 86400 # 24h timeout
  prompt: |
    Scaffold a user authentication module with login and registration endpoints.
    Create the code, commit, and push. Open a draft PR for review.
---
# Stage 2: Only runs after human approves stage 1
apiVersion: kelos.dev/v1alpha1
kind: Task
metadata:
  name: write-tests
spec:
  type: claude-code
  credentials:
    type: oauth
    secretRef:
      name: claude-credentials
  workspaceRef:
    name: my-workspace
  branch: feature/auth
  dependsOn:
    - scaffold
  prompt: |
    Write comprehensive tests for the auth module on branch
    {{index .Deps "scaffold" "Results" "branch"}}.
Example: TaskSpawner with GitHub Comment Approval
apiVersion: kelos.dev/v1alpha1
kind: TaskSpawner
metadata:
  name: reviewed-fixes
spec:
  when:
    githubIssues:
      labels: ["bug", "approved-for-agent"]
  taskTemplate:
    type: claude-code
    credentials:
      type: api-key
      secretRef:
        name: anthropic-key
    workspaceRef:
      name: my-workspace
    branch: "kelos-fix-{{.Number}}"
    approvalPolicy:
      mode: githubComment
      approveCommand: "/lgtm"
      rejectCommand: "/reject"
    promptTemplate: |
      Fix the bug described in issue #{{.Number}}: {{.Title}}
      {{.Body}}
      Create a PR. A human will review and comment /lgtm to proceed.
Why This Matters
- Production safety: Agents can prepare changes, but humans retain control over when they're applied
- Incremental trust: Teams can start with approval gates on every stage, then remove them as they gain confidence in their agent workflows
- Compliance: Regulated industries require human sign-off before code reaches production
- Natural fit: Builds on existing dependsOn mechanics and checkDependencies controller logic. The controller already blocks on non-Succeeded phases, so AwaitingApproval slots in naturally
- Composable: Works with all existing features (TTL, branch locking, prompt templates, output passing)
Implementation Notes
- The checkDependencies function in task_controller.go:684 already only unblocks on TaskPhaseSucceeded, so AwaitingApproval would naturally block dependents without changes to that logic
- For annotation mode: the controller watches for annotation changes on Tasks (already reconciles on Task updates)
- For githubComment mode: could reuse the existing GitHubCommentPolicy authorization framework (allowedUsers, allowedTeams, minimumPermission) for controlling who can approve
- kelos get tasks should display the AwaitingApproval phase clearly, and kelos logs should hint at how to approve
- The existing BranchLocker continues to hold the lock during AwaitingApproval so no other task modifies the branch
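For annotation mode combined with timeoutSeconds, the check a reconcile pass might perform can be sketched as below. This is an illustration under stated assumptions rather than the controller's implementation: the function name evaluateApproval and the idea of passing in the time already spent waiting are hypothetical. The proposal does not say how a rejection is expressed in annotation mode, so any annotation value other than "true" is treated as still pending.

```go
package main

import (
    "fmt"
    "time"
)

// TaskPhase mirrors the phase names used in this proposal.
type TaskPhase string

const (
    TaskPhaseAwaitingApproval TaskPhase = "AwaitingApproval"
    TaskPhaseSucceeded        TaskPhase = "Succeeded"
    TaskPhaseFailed           TaskPhase = "Failed"
)

// approvedAnnotation is the key from the proposal's kubectl example.
const approvedAnnotation = "kelos.dev/approved"

// evaluateApproval decides the next phase of a task in AwaitingApproval.
// waited is how long the task has been in that phase (hypothetical input;
// a real controller would derive it from a status timestamp).
func evaluateApproval(annotations map[string]string, waited time.Duration, timeoutSeconds *int32) TaskPhase {
    // Approval wins even if the deadline has also passed.
    if annotations[approvedAnnotation] == "true" {
        return TaskPhaseSucceeded
    }
    // A nil or zero timeout means wait indefinitely.
    if timeoutSeconds != nil && *timeoutSeconds > 0 {
        limit := time.Duration(*timeoutSeconds) * time.Second
        if waited >= limit {
            return TaskPhaseFailed
        }
    }
    return TaskPhaseAwaitingApproval
}

func main() {
    timeout := int32(86400) // 24h, as in the scaffold example

    pending := map[string]string{}
    approved := map[string]string{approvedAnnotation: "true"}

    fmt.Println(evaluateApproval(pending, 2*time.Hour, &timeout))
    fmt.Println(evaluateApproval(pending, 25*time.Hour, &timeout))
    fmt.Println(evaluateApproval(approved, 2*time.Hour, &timeout))
}
```

Checking the annotation before the deadline is a design choice in this sketch; the proposal leaves the ordering open for the case where both conditions hold in the same reconcile.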
Alternatives Considered
- onCompletion hooks (API: Add onCompletion notification hooks to TaskSpawner for outbound event delivery on task terminal phases #749): Outbound-only; can notify but cannot block the pipeline
- Conditional dependencies (API: Add conditional dependencies for result-based workflow branching in task pipelines #747): Automated branching based on results; no human input
- Manual TaskSpawner chaining: Loses pipeline context, {{.Deps}} output passing, and branch locking continuity
- External webhook + suspend/resume: Possible but requires custom infrastructure and doesn't integrate with GitHub comment workflows
/kind feature