fix(preprocess): deterministic Path-A for pre-validated harness + nudge prose-only agent turns by iraj465 · Pull Request #274 · AMD-AGI/GEAK

iraj465 · 2026-06-10T14:16:57Z

Summary

Two robust, workload-agnostic fixes for the v3 preprocess + agent loop. Both were found while running the gpt-oss-120b fused_moe kernel-optimization flow in mixed mode, but neither is specific to that workload — they are general correctness fixes.

1. Pre-validated harness → deterministic Path-A bypass

run_preprocess_v3 always drove preprocess through the LLM orchestrator, even when the caller supplied a pre-validated harness. The orchestrator's Step-0 "shapes pre-check" then either:

diverted a shape-bearing task to the harness-generator (regenerate a harness from scratch), or
simply failed to converge (100+ LLM steps),

…burning the entire preprocess budget (900s soft cap) without ever producing a benchmark_baseline, so the kernel run aborted.

A pre-validated harness already encodes its authoritative shapes internally — the whole A1 sequence (collect_baseline → collect_profile → render_commandment) is deterministic and needs no LLM. _run_prevalidated_path_a() runs it directly and returns PreprocessResult(path_taken="A"), preserving the existing worktree-bypass validate_harness gate. Profiling stays advisory/non-fatal (matches the orchestrator escape-hatch contract). Opt-out: GEAK_NO_PREVALIDATED_BYPASS=1. A prompt-level exemption in the Step-0 classifier is kept as a secondary guard.
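The control flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the simplified `PreprocessResult`, the `bypass_enabled` and `run_preprocess` helpers, the `steps` mapping, and the choice to raise when validation fails are all assumptions made for the example. Only the step order, `path_taken="A"`, the advisory profiling, and the `GEAK_NO_PREVALIDATED_BYPASS=1` opt-out come from the description.

```python
import os
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class PreprocessResult:
    """Simplified stand-in; the real result type likely carries more fields."""
    path_taken: str
    baseline: Any = None
    profile: Any = None
    commandment: Any = None


def bypass_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Bypass is on by default; GEAK_NO_PREVALIDATED_BYPASS=1 opts out."""
    env = os.environ if env is None else env
    return env.get("GEAK_NO_PREVALIDATED_BYPASS") != "1"


def run_prevalidated_path_a(harness, *, validate_harness, collect_baseline,
                            collect_profile, render_commandment):
    """Run the deterministic A1 sequence without any LLM involvement."""
    # Keep the validation gate. Raising on failure is an assumption of this sketch.
    if not validate_harness(harness):
        raise ValueError("pre-validated harness failed validate_harness")
    baseline = collect_baseline(harness)
    try:
        profile = collect_profile(harness)
    except Exception:
        # Profiling is advisory: a failure here must not abort preprocess.
        profile = None
    commandment = render_commandment(baseline, profile)
    return PreprocessResult("A", baseline, profile, commandment)


def run_preprocess(harness, *, prevalidated, llm_orchestrator, steps, env=None):
    """Take the deterministic path when allowed, else defer to the orchestrator."""
    if prevalidated and bypass_enabled(env):
        return run_prevalidated_path_a(harness, **steps)
    return llm_orchestrator(harness)
```

Passing the step functions in as callables keeps the sketch self-contained and easy to exercise with fakes; the actual change presumably calls the project's own helpers directly.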

Result: preprocess completes in ~260s instead of timing out at 900s.

2. Prose-only agent turns are nudged, not silently accepted

`DefaultAgent.parse_action` returned `{"output":"","returncode":0}` for a turn with no fenced bash and no tool call, and the final check `if all_action["output"] or all_action["returncode"] == 0:` accepted it as a successful no-op. A model that believes it already finished (e.g. narrates "Done." / "tasks submitted") then repeats that prose every step with no corrective signal, looping until the step limit.

Observed in the heterogeneous task-planner: 143 prose turns → 0 tool calls → LimitsExceeded. Fix: track whether any action (bash / tool / skill) actually dispatched; if not, raise FormatError (a NonTerminatingException) so the model is nudged to emit a real action. Multi-action turns still raise the existing format error. test_empty_actions_handling still passes (the nudge is non-terminating → the next turn submits).
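The nudge can be illustrated with a small, self-contained sketch. The exception classes below are stand-ins that only mirror the names used in this description, and the simplified `parse_action` signature, the regex, the return shape, and the toy `run_turns` loop are assumptions for illustration rather than the real `DefaultAgent` code.

```python
import re


class NonTerminatingException(Exception):
    """Stand-in: reported back to the model; the agent loop keeps going."""


class FormatError(NonTerminatingException):
    """Stand-in: the turn did not contain exactly one usable action."""


FENCE = "`" * 3  # built indirectly so this example can live inside a fenced block
BASH_BLOCK = re.compile(FENCE + r"bash\s*\n(.*?)\n" + FENCE, re.DOTALL)


def parse_action(content, tool_calls=()):
    """Return the single action in a turn, or raise FormatError as a nudge."""
    bash_blocks = BASH_BLOCK.findall(content)
    dispatched = len(bash_blocks) + len(tool_calls)
    if dispatched == 0:
        # Prose-only turn: previously accepted as a successful no-op.
        raise FormatError("No action found; emit one bash block or tool call.")
    if dispatched > 1:
        raise FormatError("Multiple actions found; emit exactly one.")
    if bash_blocks:
        return {"kind": "bash", "action": bash_blocks[0]}
    return {"kind": "tool", "action": tool_calls[0]}


def run_turns(model_outputs):
    """Toy loop: each FormatError becomes a corrective message, not a success."""
    nudges = []
    for content in model_outputs:
        try:
            return parse_action(content), nudges
        except NonTerminatingException as exc:
            nudges.append(str(exc))
    return None, nudges
```

In this sketch two prose turns followed by a real bash block produce two nudges and then one dispatched action, instead of the prose being counted as progress on every step.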

Test plan

tests/agents/test_default.py — new test_prose_only_turn_is_nudged_not_silently_accepted; full suite (13) passes
tests/run/test_preprocess_v3_bugfixes.py — new test_prevalidated_harness_bypasses_llm_orchestrator + test_prevalidated_bypass_opt_out_env; full suite (16) passes
End-to-end: gpt-oss-120b fused_moe mixed-mode run now completes preprocess (263s) and the planner submits tasks that dispatch to workers

🤖 Generated with Claude Code

…ge prose-only agent turns

Two robust, workload-agnostic fixes for the v3 preprocess + agent loop that caused fused_moe (and any shape-bearing) kernel runs to stall in mixed mode.

1. Pre-validated harness deterministic bypass (adapter.py, orchestrator.py). When the caller supplies a harness it already validated end-to-end, the entire A1 preprocess (collect_baseline -> collect_profile -> render_commandment) is deterministic: there is nothing for the LLM orchestrator to decide. Driving it through the LLM anyway let the Step-0 classifier misroute a shape-bearing task to the harness-GENERATOR (regenerate from scratch) or fail to converge, burning the whole preprocess budget (900s soft cap) with no benchmark_baseline. `_run_prevalidated_path_a()` runs the deterministic sequence directly and returns PreprocessResult(path_taken="A"), keeping the same worktree-bypass validate_harness gate. Opt-out: GEAK_NO_PREVALIDATED_BYPASS=1. A prompt-level exemption in the orchestrator Step-0 classifier is kept as a secondary guard. Result: preprocess completes in ~260s instead of timing out.

2. Prose-only agent turns are nudged, not silently accepted (default.py). `parse_action` returned {"output":"","returncode":0} for a turn with no fenced bash and no tool call, and the `returncode == 0` check accepted it as a successful no-op. A model that believes it already finished then repeats prose every step with no corrective signal, looping until the step limit (observed: heterogeneous task-planner, 143 prose turns -> 0 tool calls -> LimitsExceeded). Track whether any action (bash/tool/skill) actually dispatched; if not, raise FormatError (NonTerminating) so the model is told to emit a real action. Multi-action turns still raise the existing error.

Tests: new prose-only-nudge regression in tests/agents/test_default.py; new pre-validated bypass + opt-out tests in tests/run/test_preprocess_v3_bugfixes.py. All agent + preprocess suites pass.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>

iraj465 requested review from Umangatamd, amd-ethany, chao-xu-spec, jianghui-jianghui, sdubagun-amd and yueliu14 as code owners June 10, 2026 14:16

Base automatically changed from gwiab-scheduler to main June 12, 2026 12:01

iraj465 wants to merge 1 commit into main from fix/preprocess-prevalidated-and-prose-stall
