Skip to content

Simplify harness trigger and async run::start#174

Merged
ytallo merged 18 commits into
mainfrom
feat/harness-trigger-run-start-simplify
May 25, 2026
Merged

Simplify harness trigger and async run::start#174
ytallo merged 18 commits into
mainfrom
feat/harness-trigger-run-start-simplify

Conversation

@ytallo
Copy link
Copy Markdown
Contributor

@ytallo ytallo commented May 20, 2026

Summary

Simplifies the agent-run entrypoint and restructures the turn-orchestrator into a durable, per-state queue-driven FSM. The console now sends a flat harness::trigger payload and the harness always starts a run via run::start; the turn loop advances through per-state turn::{state} handlers enqueued on a per-session FIFO queue, behind a single dispatchWithHook approval chokepoint.

Harness ingress & run::start

  • Console sends messages via harness::trigger with a flat payload only; the harness always invokes run::start (no client-supplied function_id).
  • run::start uses Zod defaults, drops the unused cwd / cwd_hash from the run payload and iii state, and no longer replays synthetic agent::events at kickoff.
  • Removes the synchronous run::start_and_wait path (the on-terminal waiter, its policy entry, and sync_default_timeout_ms).

Turn-orchestrator restructure

  • Splits the monolithic dispatcher into per-state queue handlers (turn::{state}), colocated with their registration.
  • Enqueues turn wakes on a durable turn-step FIFO queue (grouped by session_id, concurrency 1) and gates durable steps so each session advances in order.
  • Routes the approval chokepoint through dispatchWithHook, replacing the agent::trigger dispatcher.
  • Parses turn_state trigger payloads at the Zod boundary; removes the isStepableRecordWrite wrapper and inlines parseStepableWrite.

Turn FSM

Each state is a registered turn::{state} function run via runTransition and enqueued on the turn-step FIFO queue by wakeState. failed is reachable from any state via an unhandled handler throw (not shown as edges).

stateDiagram-v2
  [*] --> provisioning
  provisioning --> assistant_streaming
  assistant_streaming --> function_execute: has function calls
  assistant_streaming --> steering_check: no function calls
  assistant_streaming --> tearing_down: error or aborted
  function_execute --> function_awaiting_approval: any call needs approval
  function_execute --> steering_check: batch complete
  function_execute --> tearing_down: all calls terminate session
  function_awaiting_approval --> function_execute: all decisions written
  steering_check --> assistant_streaming: continue turn
  steering_check --> tearing_down: stop or max turns
  tearing_down --> stopped
  stopped --> [*]
  failed --> [*]
Loading

Previous FSM

stateDiagram-v2
    [*] --> provisioning

    provisioning --> awaiting_assistant : system prompt built

    awaiting_assistant --> assistant_streaming : turn_count++ / emit turn_start
    awaiting_assistant --> tearing_down : max_turns reached (emit turn_end)

    assistant_streaming --> assistant_finished : stream done (final or synthetic error)

    assistant_finished --> tearing_down : stop_reason error/aborted (emit turn_end)
    assistant_finished --> steering_check : no function calls
    assistant_finished --> function_prepare : has function calls

    function_prepare --> function_execute : unwrap + persist prepared

    function_execute --> function_awaiting_approval : call needs approval (pending)
    function_execute --> function_finalize : all calls executed

    function_awaiting_approval --> function_awaiting_approval : decisions still pending (wait)
    function_awaiting_approval --> function_execute : decisions resolved (allow/deny applied)

    function_finalize --> tearing_down : all results terminate (emit turn_end)
    function_finalize --> steering_check : has non-terminal results

    steering_check --> awaiting_assistant : steering msg drained
    steering_check --> awaiting_assistant : followup msg drained
    steering_check --> awaiting_assistant : continue after function
    steering_check --> tearing_down : abort flag set
    steering_check --> tearing_down : end_turn (nothing queued)

    tearing_down --> stopped : stop sandbox + emit agent_end

    stopped --> [*]
Loading

Event translation

  • Renames message_end -> message_complete across harness, console, and ACP.
  • Consolidates event translation into a single createAgentEventTranslator used by the console backend.

Config, docs & cleanup

  • Adds harness/engine.config.yaml defining the turn-step FIFO queue.
  • Updates iii-permissions.yaml and the architecture / turn-orchestrator docs to the new function surface (run::start, turn::{state}, turn::get_state).
  • Removes the generated harness-flow.html and pnpm-lock.yaml; ignores the local harness-node/ build output; applies biome 2.4.10 formatting.

Test plan

  • biome ci harness passes locally (0 errors)
  • CI: harness node lint + test green

Summary by CodeRabbit

  • New Features

    • More reliable message completion handling with clearer streaming vs batch behavior and improved assistant stop/reason reporting.
    • Unified event translation for consistent UI updates and tool-call lifecycle signals.
  • Bug Fixes

    • Fewer duplicated messages and improved idempotency when persisting assistant/function results.
    • Better handling of transient provider/channel failures and clearer error messages.
  • Chores

    • Simplified trigger payload shape for starting chat turns.
    • Documentation updates for orchestration and harness behavior.

Review Change Stack

Console calls harness::trigger with a flat payload; harness always
starts run::start. Remove run::start_and_wait, cwd state keys, and
initial agent::events replay at kickoff so the FSM owns turn lifecycle.
@vercel
Copy link
Copy Markdown

vercel Bot commented May 20, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
workers Ready Ready Preview, Comment May 25, 2026 6:14pm

Request Review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 20, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 85bea7cb-0b6c-4df5-bcc1-2226cf545c36

📥 Commits

Reviewing files that changed from the base of the PR and between b93ad52 and 608742e.

📒 Files selected for processing (1)
  • harness/tests/turn-orchestrator/approval-resume.test.ts

📝 Walkthrough

Walkthrough

Unifies event translation around a new message_complete event and replaces legacy translators. Repackages the turn orchestrator into a durable, queue-driven FSM with Zod schemas, wake/persist helpers, new state handlers, simplified harness trigger, many test updates, and docs/config adjustments.

Changes

Agent event model and translators

Layer / File(s) Summary
Events contract and ACP mapping
acp/README.md, acp/src/handler.rs, console/web/src/types/iii-agent-event.ts
Switches assistant lifecycle mapping to message_complete, adjusts ACP handler and type unions.
Console translator factory and wiring
console/web/src/lib/backend/translate.ts, console/web/src/lib/backend/real.ts
Adds stateful createAgentEventTranslator(), centralizes event -> StreamEvent[] mapping, removes legacy helpers, and updates real stream wiring.
Translator tests and ACP tests
console/web/src/lib/backend/translate.test.ts, acp/src/handler.rs
Migrates tests to validate message_complete handling, compaction, mirroring, and stop-reason semantics.

Durable turn orchestrator and wake queue

Layer / File(s) Summary
Schemas, run request, and types
harness/src/turn-orchestrator/schemas.ts, .../run-request.ts, harness/src/types/agent-event.ts
Adds Zod schemas for registered functions, typed RunRequest parsing, and updates AgentEvent union with message_complete.
Transition runner, persistence, and emit
harness/src/turn-orchestrator/run-transition.ts, persistence.ts, turn-state-write.ts
Adds runTransition, persistRecord/saveRecord changes (emit + conditional wake), and emitTurnStateChanged helper.
Wake queue and runtime wiring
harness/src/turn-orchestrator/wake.ts, harness/engine.config.yaml, harness/src/runtime/iii.ts
Implements FIFO enqueue wake (turn-step), gating predicates, engine config, and re-exports TriggerAction.
Entrypoints and registration
harness/src/turn-orchestrator/run-start.ts, get-state.ts, register.ts
Type-safe run::start and turn::get_state handlers; simplified register sequence and bootstrap wiring.
Abort and approval adapters
harness/src/turn-orchestrator/on-abort-signal.ts, approval-resume.ts
Schema-driven abort-signal parsing, wakeFromRecord usage for approvals/resume, updated registrations and logging.
Agent trigger chokepoint
harness/src/turn-orchestrator/agent-trigger.ts
Splits dispatchWithHook and triggerFunctionCall, normalizes inputs, and zod-validates error shapes.
Function execute & approval states
harness/src/turn-orchestrator/states/function-execute.ts, function-awaiting-approval.ts
New durable handlers for function execution, awaiting-approval processing, commit/finalize semantics, and registration.
Assistant streaming & finished transitions
harness/src/turn-orchestrator/states/assistant-streaming.ts, assistant-finished.ts
Provider streaming, partial message_update deltas, body_streamed tracking, message_complete emission, and function-call preparation.
Provisioning, steering, tearing down
harness/src/turn-orchestrator/states/*.ts
Provisioning parses directory bodies and advances to streaming; steering-check routes inboxes; tearing-down emits agent_end and stops.

Harness bridge, bootstrap, and docs

Layer / File(s) Summary
Harness trigger bridge
harness/src/harness/trigger.ts
Replaces dynamic function_id bridge with typed flat payload that forwards payload to run::start.
Bootstrap and engine config
harness/src/harness/register.ts, harness/engine.config.yaml
Reorders FanoutState init before config load and adds local engine worker config.
Docs, permissions, gitignore
harness/docs/*, harness/README.md, .gitignore, iii-permissions.yaml
Updates architecture and worker docs, removes run::start_and_wait deny, and adjusts gitignore entries.

Tests aligned to durable orchestrator

Layer / File(s) Summary
Translator and ACP tests
console/web/src/lib/backend/translate.test.ts, acp/src/handler.rs
Validates message_complete, compaction, and mirroring behavior.
Harness trigger and policy tests
harness/tests/harness/*
Validates flat WS payload and updated denylist.
Integration: saveRecord wake & reads
harness/tests/integration/on-record-written.e2e.test.ts
Asserts enqueue behavior from persistence.saveRecord and read-elimination.
Integration: approval resume & abort-signal
harness/tests/integration/approval-resume.e2e.test.ts
Switches tests to wakeTriggers and microtask flushing for approval resumes and abort writes.
Unit tests for abort adapter, wake, emit
harness/tests/turn-orchestrator/*on-abort-signal.test.ts, *wake*.test.ts, turn-state-write.test.ts
Covers schema parsing, enqueueing, and non-throwing emit behavior.
Provisioning, assistant, function, steering tests
harness/tests/turn-orchestrator/*
Extensive revisions to cover provisioning, streaming, completion, function execution, approvals, steering, teardown, and run-transition behaviors.
Run-start/run-request/run-transition/get-state tests
harness/tests/turn-orchestrator/*run-*.test.ts
Validates schemas, registration, enqueueing, runRequest parsing, and runTransition semantics.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Suggested reviewers

  • sergiofilhowz

Poem

"A rabbit hops the queue so light,
message_complete shines through the night.
Schemas bloom and wake calls sing,
transitions hum — the orchestrator springs.
Carrot cheers for durable things."

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/harness-trigger-run-start-simplify

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 20, 2026

skill-check — worker

0 verified, 12 skipped (no docs/).

Layer Result
structure
vale
ai

Three for three. Nicely done.

ytallo added 4 commits May 20, 2026 20:08
Replace manual typeof/Record parsing with shared TurnStateWriteEventSchema
and parseTurnStateWrite, matching the run-start/trigger refactor style.
Each state-trigger adapter now exports register() with inline function ids, matching run-start.ts. register.ts only composes modules; exported ID constants removed.
…ng is type-enforced.

Share Zod turn_state write parsing between adapters; add handler tests for direct invokes that bypass the engine condition gate.
…ype safety

- Simplified `parseStepableWrite` to only return `session_id`, removing the state from its output.
- Inlined the handling of stepable writes directly in `handleStepableRecordWrite`, eliminating the fallback to durable publish.
- Updated tests to reflect changes in the `parseStepableWrite` output and removed obsolete tests related to fallback logic.
- Introduced `StepPayloadSchema` for consistent payload validation in the `turn::step` function, ensuring only valid session IDs are processed.
- Added new tests for `StepPayloadSchema` to validate input shapes and error handling.

These changes enhance clarity and maintainability of the turn orchestrator's state management logic.
ytallo added 2 commits May 22, 2026 19:27
Replace turn::step_requested publishes with queue enqueues in approval resume
and abort handling, add turn::should_step filtering for unknown or terminal
sessions, and tighten get_state with Zod payload parsing.
Replace the monolithic turn::step subscriber with turn::{state} functions
enqueued on turn-step, consolidate UI turn_state_changed emits into
persistence saves, and merge assistant/function states for a simpler FSM.
Reconcile the turn-orchestrator per-state queue refactor with main's
harness-node/ -> harness/ rename and parallel changes:

- Map the refactor (per-state handlers, turn-step FIFO queue, Zod
  boundary parsing, simplified harness::trigger) onto the harness/ tree.
- Adopt main's idempotency guards: function_result dedup lands in
  function-execute.ts via the rename merge; the assistant-message guard
  is ported into the split assistant-finished.ts handler.
- Take main's new providers (kimi/lmstudio/llamacpp), functionNotFoundHint
  hint, and system-prompt guidance (auto-merged, untouched by the refactor).
- Drop on-record-written/on-terminal/on-turn-state-changed/subscriber/
  transitions removed by the refactor; keep main's provider tests.
- Update functions error-result test for the new executedCalls clear, and
  re-add idempotency regression tests adapted to the split handlers.

typecheck clean; 826 tests pass.
…ranslations

- Updated the event type from `message_end` to `message_complete` in the README and codebase for consistency.
- Adjusted the translation logic in `handler.rs` to handle the new event type, ensuring proper rendering of assistant messages.
- Modified tests to reflect the change in event type, ensuring all related functionality is covered.
- Cleaned up comments and documentation to align with the new terminology.
- Deleted the pnpm-lock.yaml file from the project, which may indicate a shift in package management strategy or a cleanup of unused files.
…anslator

- Replaced the previous createTurnStateTranslator with a new createAgentEventTranslator function that handles all agent event translations, including `turn_state_changed`.
- Updated the realStream function to utilize the new translator, simplifying the event handling process.
- Removed deprecated translation functions and cleaned up related comments and documentation for clarity.
- Adjusted tests to ensure coverage of the new translation logic and verify correct event handling.
- Deleted the harness-flow.html file, which contained the end-to-end information flow documentation for the project. This may indicate a shift in documentation strategy or a move towards a different format for presenting this information.
@ytallo ytallo marked this pull request as ready for review May 25, 2026 08:32
CI lints with biome 2.4.10 while the repo pins ^1.9.4; reformat the
files whose 2.x layout differed to clear the 7 format errors.
harness::trigger no longer forwards a client-supplied function_id; it
takes a flat payload and always invokes run::start. Update harness.md and
the architecture telemetry/ingestion-bridge section to match.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
harness/src/turn-orchestrator/approval-resume.ts (1)

83-98: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Only unregister the resume hook after the wake succeeds.

The decision is persisted before wakeFromRecord(), so retries are already safe. If the wake throws, Line 98 still unregisters the callback and the session can stay stuck in function_awaiting_approval with no way to retry locally.

Suggested fix
   try {
     await wakeFromRecord(iii, session_id);
+    unregisterApprovalResume(fnId);
   } catch (err) {
     logger.warn('approval resume: turn step wake failed', { session_id, err: String(err) });
+    throw err;
   }
-
-  unregisterApprovalResume(fnId);
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/turn-orchestrator/approval-resume.ts` around lines 83 - 98, The
unregisterApprovalResume(fnId) call is currently executed regardless of
wakeFromRecord() success, which can leave the session stuck if wake fails;
change the flow so that after computing key via pendingKey(session_id,
function_call_id) and persisting decision with stateSet (guarded by
hasStoredDecision(existing)), you only call unregisterApprovalResume(fnId)
inside the try block after await wakeFromRecord(iii, session_id) completes
successfully (keep the logger.warn in the catch and do not unregister there) so
the resume hook remains available for retries if wakeFromRecord throws.
harness/src/turn-orchestrator/persistence.ts (1)

228-239: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep the legacy staged-call key during rollout.

These loaders now accept only function_call. Any in-flight session persisted with the previous tool_call shape will deserialize as empty, which drops approval/idempotency state and can re-run or skip calls on resume.

Also applies to: 251-265

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/turn-orchestrator/persistence.ts` around lines 228 - 239,
loadPreparedCalls currently only reads obj.function_call and will drop legacy
staged entries that use obj.tool_call; update loadPreparedCalls to accept either
obj.function_call or obj.tool_call (prefer function_call when both exist), map
the legacy tool_call into the same FunctionCall shape, and still preserve
blocked (obj.blocked) and pre_approved (obj.pre_approved) when pushing
PreparedEntry. Apply the same change to the other staged-call loader in this
module (the sibling loader that iterates staged items around the later block) so
both loaders are backwards-compatible with the legacy tool_call key.
🧹 Nitpick comments (2)
harness/src/turn-orchestrator/states/provisioning.ts (1)

81-85: ⚡ Quick win

Fetch the default skills concurrently.

This startup path currently inherits the serial fetchDefaultSkills() loop, so first-token latency grows with the sum of every directory::skills::get call. Promise.all(...) keeps the configured order and removes the unnecessary wall-clock delay.

♻️ Suggested refactor
 async function fetchDefaultSkills(iii: ISdk, uris: readonly string[]): Promise<DefaultSkillBody[]> {
-  const bodies: DefaultSkillBody[] = [];
-  for (const uri of uris) {
-    const id = uri.startsWith('iii://') ? uri.slice('iii://'.length) : uri;
-    const body = await fetchSkill(iii, id);
-    bodies.push(defaultSkillBody(uri, body));
-  }
-  return bodies;
+  return Promise.all(
+    uris.map(async (uri) => {
+      const id = uri.startsWith('iii://') ? uri.slice('iii://'.length) : uri;
+      const body = await fetchSkill(iii, id);
+      return defaultSkillBody(uri, body);
+    }),
+  );
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/turn-orchestrator/states/provisioning.ts` around lines 81 - 85,
The code currently awaits fetchSkillsIndex and then fetchDefaultSkills serially
which increases startup latency; change the call site that builds the prompt
(where buildSystemPrompt is invoked with skillsIndex and bodies) to fetch both
values concurrently using Promise.all by calling
Promise.all([fetchSkillsIndex(iii), fetchDefaultSkills(iii,
cfg.system_default_skills)]) and destructuring the result into [skillsIndex,
bodies] so the order is preserved before calling buildSystemPrompt(bodies, null,
override, request.mode, skillsIndex).
harness/src/types/agent-event.ts (1)

26-29: ⚡ Quick win

Narrow message_complete.message to AssistantMessage.

AgentMessage also permits user/function/custom payloads, but this event represents assistant-stream completion. Keeping the wire type this broad weakens downstream exhaustiveness and lets invalid emits compile.

♻️ Proposed fix
-import type { AgentMessage, FunctionResultMessage } from './agent-message.js';
+import type { AgentMessage, AssistantMessage, FunctionResultMessage } from './agent-message.js';
@@
   | {
       type: 'message_complete';
-      message: AgentMessage;
+      message: AssistantMessage;
       /** When true, text/thinking were already delivered via message_update. */
       body_streamed?: boolean;
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/types/agent-event.ts` around lines 26 - 29, The message_complete
event's message field is too broad: change the type of the message field on the
'message_complete' event from AgentMessage to AssistantMessage so only assistant
messages can be emitted; update the declaration of the union/variant where type:
'message_complete' is defined (the message field) to use AssistantMessage and
adjust any related imports/exports or type guards (e.g., places expecting
AgentMessage for 'message_complete') to preserve exhaustiveness and prevent
invalid emits.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@console/web/src/lib/backend/translate.ts`:
- Around line 96-97: The switch case in translate.ts that maps the raw
'agent_end' event to a StreamEvent [{ kind: 'assistant-end' }] causes a
duplicate close because assistant replies already close on 'message_complete';
update the 'agent_end' branch in the translateRawEvent/translate (the switch
handling event kinds) to not emit an additional 'assistant-end' (e.g., return no
StreamEvent/empty array or null) so realStream() can still terminate on the raw
'agent_end' without producing a redundant assistant segment in the UI.

In `@harness/src/harness/trigger.ts`:
- Around line 23-27: HarnessTriggerInputSchema currently uses z.object(...)
which strips unknown keys; change it to a strict schema so unexpected/stale
fields are rejected. Locate the HarnessTriggerInputSchema declaration and call
the Zod strict API on that object (i.e., convert the z.object({...}) definition
to use .strict() so unknown keys like legacy function_id will cause validation
to fail), keeping RunStartPayloadSchema and the existing optional
session_id/message_id shape unchanged.

In `@harness/src/turn-orchestrator/on-abort-signal.ts`:
- Around line 37-45: The handler currently swallows failures from wakeFromRecord
in execute(iii, write) by logging and returning success; instead, after logging
the warning (logger.warn) re-throw the caught error so the caller/adapter can
retry the abort wake; modify the catch block to log the error with session_id
and then throw the original err (or rethrow) rather than returning silently.

In `@harness/src/turn-orchestrator/persistence.ts`:
- Around line 85-87: The current flow lets wakeState enqueue failures be
swallowed (so saveRecord can return without a scheduled wake); change behavior
so durable-wake failures surface as errors: in saveRecord (or the caller that
runs shouldWakeStep), call wakeState(iii, rec.session_id, rec.state) and
propagate any error it returns/throws (do not ignore/log-and-continue), or alter
wakeState in harness/src/turn-orchestrator/wake.ts to rethrow enqueue failures
(or return an explicit error) so saveRecord can detect it and fail the save.
Ensure you reference shouldWakeStep and wakeState and make the call await and
let exceptions bubble up instead of treating wake as best-effort.
- Around line 66-74: The code currently calls stateSet(...) and then always
emits emitTurnStateChanged(...), but stateSet swallows errors so a failed write
still triggers the event; change the flow so that failed state writes prevent
emitting: either make stateSet propagate/throw on error or wrap the await
stateSet(...) call in a try/catch and on catch do not call emitTurnStateChanged
(instead rethrow or return an error), ensuring
emitTurnStateChanged(rec.session_id, ...) is only invoked when stateSet
successfully completes; reference the stateSet, turnStateKey, and
emitTurnStateChanged symbols when implementing the guard.

In `@harness/src/turn-orchestrator/schemas.ts`:
- Around line 21-22: The schema currently allows empty provider and model
strings; update the provider and model validators in the schema (the zod fields
named provider and model) to reject empty/whitespace-only values by using
trimmed non-empty constraints such as z.string().trim().min(1) or
z.string().nonempty(). Do this change on the provider and model entries so
invalid requests are rejected at the schema boundary.
- Line 24: RunStartPayloadSchema currently uses messages:
z.custom<AgentMessage[]>((v) => Array.isArray(v)).default([]) which only checks
the value is an array and allows invalid entries; replace this with a proper
messages: z.array(...) using a validated AgentMessageSchema (or a
z.discriminatedUnion on the "role" field that defines each variant's required
fields and nested content validation) so each element is runtime-validated;
update or add a reusable AgentMessageSchema (or per-role schemas) and use it in
RunStartPayloadSchema.messages to reject malformed entries at the run::start
boundary instead of casting later.

In `@harness/src/turn-orchestrator/states/assistant-streaming.ts`:
- Around line 149-212: The read loop can hang if iii.trigger (triggerPromise)
resolves or rejects without a terminal 'done'/'error' event; after creating
triggerPromise (the iii.trigger call) attach a finally handler that sets done =
true and wakes the reader by calling resolveNext (clearing it) if present so
readPromise's loop unblocks; ensure this is added alongside the existing .catch
so both success and failure paths will signal completion to the read loop (refer
to triggerPromise, readPromise, done, resolveNext, messageQueue, and
iii.trigger).

In `@harness/src/turn-orchestrator/states/function-execute.ts`:
- Around line 97-116: The code uses e.is_error when pushing into
function_results but publishAfter (called on e.function_call/result) can replace
result with merged, so is_error may be stale; update the logic after obtaining
merged (in the loop that iterates executed and calls publishAfter) to derive
is_error from the final result/merged (e.g., inspect the merged object's
error/denial shape or a boolean property on the resulting FunctionResult)
instead of using e.is_error, and set the is_error field in the pushed object
from that computed value so function_results reflects post-hook changes.

In `@harness/src/turn-orchestrator/states/steering-check.ts`:
- Around line 129-137: The steering/followup branch currently drains and appends
inbox items destructively (using steering/followup, emitTurnEndOnce,
persistence.loadMessages and persistence.saveMessages) before the transition is
made durable, risking data loss on failure; instead implement a claim/ack or
durable staging step: atomically move or persist the drained items (e.g., write
them to a staged buffer on the record like rec.staged_inbox or call a new
persistence.stageMessages API) before removing them from the session inbox, only
clear/ack the original inbox after persistence.saveMessages and
persistence.saveRecord succeed, and keep rec.function_results reset and
transitionTo(rec, 'assistant_streaming') only after the durable acknowledgement
completes so retries see the staged items.

In `@harness/src/turn-orchestrator/states/tearing-down.ts`:
- Around line 17-19: The code calls emit(iii, rec.session_id, { type:
'agent_end', messages }) and immediately calls transitionTo(rec, 'stopped'), but
emit() can swallow stream::set failures so the terminal event may never be
delivered; update the teardown in tearing-down.ts to either propagate emit's
error (i.e., await and bubble up any failure from emit) or implement a durable
terminal-emitted/idempotency marker and retry logic: after
persistence.loadMessages(...) call emit(...) and if it fails, persist a
terminal-marker tied to rec.session_id (or retry emit a configurable number of
times with backoff) and only call transitionTo(rec, 'stopped') once emit has
succeeded or the durable marker confirms the terminal event was recorded;
reference functions/variables: persistence.loadMessages, emit, transitionTo,
rec.session_id, and the 'agent_end' message type.

In `@harness/src/turn-orchestrator/wake.ts`:
- Around line 29-37: The current wakeState call catches failures from
iii.trigger (the enqueue to TURN_STEP_QUEUE) and only logs a warning, which can
stall the session; change the catch to log the error using the existing logger
(keeping session_id and state) and then rethrow the original error so callers
can observe/fail fast; specifically update the try/catch around iii.trigger (the
call that uses function_id: turnFnId(state) and action: TriggerAction.Enqueue({
queue: TURN_STEP_QUEUE })) to logger.warn(..., { session_id, state, err })
followed by rethrowing the caught err.

---

Outside diff comments:
In `@harness/src/turn-orchestrator/approval-resume.ts`:
- Around line 83-98: The unregisterApprovalResume(fnId) call is currently
executed regardless of wakeFromRecord() success, which can leave the session
stuck if wake fails; change the flow so that after computing key via
pendingKey(session_id, function_call_id) and persisting decision with stateSet
(guarded by hasStoredDecision(existing)), you only call
unregisterApprovalResume(fnId) inside the try block after await
wakeFromRecord(iii, session_id) completes successfully (keep the logger.warn in
the catch and do not unregister there) so the resume hook remains available for
retries if wakeFromRecord throws.

In `@harness/src/turn-orchestrator/persistence.ts`:
- Around line 228-239: loadPreparedCalls currently only reads obj.function_call
and will drop legacy staged entries that use obj.tool_call; update
loadPreparedCalls to accept either obj.function_call or obj.tool_call (prefer
function_call when both exist), map the legacy tool_call into the same
FunctionCall shape, and still preserve blocked (obj.blocked) and pre_approved
(obj.pre_approved) when pushing PreparedEntry. Apply the same change to the
other staged-call loader in this module (the sibling loader that iterates staged
items around the later block) so both loaders are backwards-compatible with the
legacy tool_call key.

---

Nitpick comments:
In `@harness/src/turn-orchestrator/states/provisioning.ts`:
- Around line 81-85: The code currently awaits fetchSkillsIndex and then
fetchDefaultSkills serially which increases startup latency; change the call
site that builds the prompt (where buildSystemPrompt is invoked with skillsIndex
and bodies) to fetch both values concurrently using Promise.all by calling
Promise.all([fetchSkillsIndex(iii), fetchDefaultSkills(iii,
cfg.system_default_skills)]) and destructuring the result into [skillsIndex,
bodies] so the order is preserved before calling buildSystemPrompt(bodies, null,
override, request.mode, skillsIndex).

In `@harness/src/types/agent-event.ts`:
- Around line 26-29: The message_complete event's message field is too broad:
change the type of the message field on the 'message_complete' event from
AgentMessage to AssistantMessage so only assistant messages can be emitted;
update the declaration of the union/variant where type: 'message_complete' is
defined (the message field) to use AssistantMessage and adjust any related
imports/exports or type guards (e.g., places expecting AgentMessage for
'message_complete') to preserve exhaustiveness and prevent invalid emits.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d2ba998b-d58d-4f60-bba3-a1f1d9199527

📥 Commits

Reviewing files that changed from the base of the PR and between a28ff9d and b93ad52.

📒 Files selected for processing (71)
  • .gitignore
  • acp/README.md
  • acp/src/handler.rs
  • console/web/src/lib/backend/real.ts
  • console/web/src/lib/backend/translate.test.ts
  • console/web/src/lib/backend/translate.ts
  • console/web/src/types/iii-agent-event.ts
  • harness/README.md
  • harness/docs/architecture.md
  • harness/docs/workers/harness.md
  • harness/docs/workers/turn-orchestrator.md
  • harness/engine.config.yaml
  • harness/src/harness/register.ts
  • harness/src/harness/trigger.ts
  • harness/src/runtime/iii.ts
  • harness/src/turn-orchestrator/agent-trigger.ts
  • harness/src/turn-orchestrator/approval-resume.ts
  • harness/src/turn-orchestrator/config.ts
  • harness/src/turn-orchestrator/get-state.ts
  • harness/src/turn-orchestrator/on-abort-signal.ts
  • harness/src/turn-orchestrator/on-record-written.ts
  • harness/src/turn-orchestrator/on-terminal.ts
  • harness/src/turn-orchestrator/on-turn-state-changed.ts
  • harness/src/turn-orchestrator/persistence.ts
  • harness/src/turn-orchestrator/register.ts
  • harness/src/turn-orchestrator/run-request.ts
  • harness/src/turn-orchestrator/run-start.ts
  • harness/src/turn-orchestrator/run-transition.ts
  • harness/src/turn-orchestrator/schemas.ts
  • harness/src/turn-orchestrator/state.ts
  • harness/src/turn-orchestrator/states/assistant-finished.ts
  • harness/src/turn-orchestrator/states/assistant-streaming.ts
  • harness/src/turn-orchestrator/states/assistant.ts
  • harness/src/turn-orchestrator/states/function-awaiting-approval.ts
  • harness/src/turn-orchestrator/states/function-execute.ts
  • harness/src/turn-orchestrator/states/functions.ts
  • harness/src/turn-orchestrator/states/index.ts
  • harness/src/turn-orchestrator/states/provisioning.ts
  • harness/src/turn-orchestrator/states/steering-check.ts
  • harness/src/turn-orchestrator/states/tearing-down.ts
  • harness/src/turn-orchestrator/subscriber.ts
  • harness/src/turn-orchestrator/transitions.ts
  • harness/src/turn-orchestrator/turn-state-write.ts
  • harness/src/turn-orchestrator/wake.ts
  • harness/src/types/agent-event.ts
  • harness/tests/harness/policy.test.ts
  • harness/tests/harness/trigger.test.ts
  • harness/tests/integration/approval-resume.e2e.test.ts
  • harness/tests/integration/on-record-written.e2e.test.ts
  • harness/tests/integration/wire-parity.test.ts
  • harness/tests/turn-orchestrator/agent-trigger.test.ts
  • harness/tests/turn-orchestrator/approval-resume.test.ts
  • harness/tests/turn-orchestrator/assistant.test.ts
  • harness/tests/turn-orchestrator/awaiting-approval.test.ts
  • harness/tests/turn-orchestrator/config.test.ts
  • harness/tests/turn-orchestrator/functions.test.ts
  • harness/tests/turn-orchestrator/get-state.test.ts
  • harness/tests/turn-orchestrator/on-abort-signal.test.ts
  • harness/tests/turn-orchestrator/on-record-written.test.ts
  • harness/tests/turn-orchestrator/on-terminal.test.ts
  • harness/tests/turn-orchestrator/on-turn-state-changed.test.ts
  • harness/tests/turn-orchestrator/provisioning.test.ts
  • harness/tests/turn-orchestrator/run-request.test.ts
  • harness/tests/turn-orchestrator/run-start.test.ts
  • harness/tests/turn-orchestrator/run-transition.test.ts
  • harness/tests/turn-orchestrator/state.test.ts
  • harness/tests/turn-orchestrator/steering.test.ts
  • harness/tests/turn-orchestrator/tearing-down.test.ts
  • harness/tests/turn-orchestrator/turn-state-write.test.ts
  • harness/tests/turn-orchestrator/wake.test.ts
  • iii-permissions.yaml
💤 Files with no reviewable changes (13)
  • harness/tests/turn-orchestrator/on-turn-state-changed.test.ts
  • harness/tests/integration/wire-parity.test.ts
  • harness/src/turn-orchestrator/states/assistant.ts
  • harness/src/turn-orchestrator/states/functions.ts
  • iii-permissions.yaml
  • harness/tests/turn-orchestrator/on-record-written.test.ts
  • harness/src/turn-orchestrator/on-turn-state-changed.ts
  • harness/tests/turn-orchestrator/on-terminal.test.ts
  • harness/src/turn-orchestrator/subscriber.ts
  • harness/tests/harness/policy.test.ts
  • harness/src/turn-orchestrator/transitions.ts
  • harness/src/turn-orchestrator/on-record-written.ts
  • harness/src/turn-orchestrator/on-terminal.ts

Comment on lines +96 to +97
case 'agent_end':
return [{ kind: 'assistant-end' }]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't emit a second assistant-end on agent_end.

With this refactor, the assistant already closes on message_complete, so translating teardown's agent_end into another assistant-end will double-close the last reply. realStream() still terminates on the raw agent_end, so this extra StreamEvent is redundant and can create an empty trailing assistant segment in the UI.

Suggested fix
      case 'agent_end':
-        return [{ kind: 'assistant-end' }]
+        return []
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
case 'agent_end':
return [{ kind: 'assistant-end' }]
case 'agent_end':
return []
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@console/web/src/lib/backend/translate.ts` around lines 96 - 97, The switch
case in translate.ts that maps the raw 'agent_end' event to a StreamEvent [{
kind: 'assistant-end' }] causes a duplicate close because assistant replies
already close on 'message_complete'; update the 'agent_end' branch in the
translateRawEvent/translate (the switch handling event kinds) to not emit an
additional 'assistant-end' (e.g., return no StreamEvent/empty array or null) so
realStream() can still terminate on the raw 'agent_end' without producing a
redundant assistant segment in the UI.

Comment on lines +23 to +27
const HarnessTriggerInputSchema = z.object({
session_id: z.string().optional(),
message_id: z.string().optional(),
payload: RunStartPayloadSchema,
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In Zod 3.23.8, does z.object({...}).parse(...)strip unknown keys by default unless.strict() is used?

💡 Result:

Yes. In Zod (including 3.23.8), z.object({...}).parse(...) uses the default “unknownKeys” policy of stripping unrecognized keys from the parsed output; it does not throw just because unknown keys exist. You only get an error for unknown keys when you explicitly add.strict.[1][2][3] So the behavior is: - Default: unknown keys are stripped from the result.[1][2][3] -.strict: unknown keys cause a validation error.[1][2][3] If you want to keep unknown keys instead of stripping them, you can use.passthrough.[1][3]

Citations:


Make HarnessTriggerInputSchema strict to reject unknown/stale client fields.

In harness/src/harness/trigger.ts (lines 23-27), z.object(...) strips unknown keys by default in Zod 3.23.8, so deprecated shapes like { function_id, payload } can still be accepted (with the extra keys removed) instead of failing the new contract.

Proposed fix
-const HarnessTriggerInputSchema = z.object({
-  session_id: z.string().optional(),
-  message_id: z.string().optional(),
-  payload: RunStartPayloadSchema,
-});
+const HarnessTriggerInputSchema = z
+  .object({
+    session_id: z.string().optional(),
+    message_id: z.string().optional(),
+    payload: RunStartPayloadSchema,
+  })
+  .strict();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const HarnessTriggerInputSchema = z.object({
session_id: z.string().optional(),
message_id: z.string().optional(),
payload: RunStartPayloadSchema,
});
const HarnessTriggerInputSchema = z
.object({
session_id: z.string().optional(),
message_id: z.string().optional(),
payload: RunStartPayloadSchema,
})
.strict();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/harness/trigger.ts` around lines 23 - 27,
HarnessTriggerInputSchema currently uses z.object(...) which strips unknown
keys; change it to a strict schema so unexpected/stale fields are rejected.
Locate the HarnessTriggerInputSchema declaration and call the Zod strict API on
that object (i.e., convert the z.object({...}) definition to use .strict() so
unknown keys like legacy function_id will cause validation to fail), keeping
RunStartPayloadSchema and the existing optional session_id/message_id shape
unchanged.

Comment on lines +37 to 45
export async function execute(iii: ISdk, write: ParsedAbortSignalWrite): Promise<void> {
try {
await iii.trigger<unknown, unknown>({
function_id: 'iii::durable::publish',
payload: { topic: STEP_TOPIC, data: { session_id } },
});
await wakeFromRecord(iii, write.session_id);
} catch (err) {
logger.warn('turn::on_abort_signal: publish failed', {
session_id,
logger.warn('turn::on_abort_signal: wake failed', {
session_id: write.session_id,
err: String(err),
});
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Retry abort wakes instead of acknowledging failures.

This adapter is the only reaction to the abort_signal write. If wakeFromRecord() throws and we return success anyway, that trigger event is gone and the abort may never schedule the follow-up FSM step. Re-throw after logging so the state trigger can retry.

Suggested fix
 export async function execute(iii: ISdk, write: ParsedAbortSignalWrite): Promise<void> {
   try {
     await wakeFromRecord(iii, write.session_id);
   } catch (err) {
     logger.warn('turn::on_abort_signal: wake failed', {
       session_id: write.session_id,
       err: String(err),
     });
+    throw err;
   }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export async function execute(iii: ISdk, write: ParsedAbortSignalWrite): Promise<void> {
try {
await iii.trigger<unknown, unknown>({
function_id: 'iii::durable::publish',
payload: { topic: STEP_TOPIC, data: { session_id } },
});
await wakeFromRecord(iii, write.session_id);
} catch (err) {
logger.warn('turn::on_abort_signal: publish failed', {
session_id,
logger.warn('turn::on_abort_signal: wake failed', {
session_id: write.session_id,
err: String(err),
});
}
export async function execute(iii: ISdk, write: ParsedAbortSignalWrite): Promise<void> {
try {
await wakeFromRecord(iii, write.session_id);
} catch (err) {
logger.warn('turn::on_abort_signal: wake failed', {
session_id: write.session_id,
err: String(err),
});
throw err;
}
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/turn-orchestrator/on-abort-signal.ts` around lines 37 - 45, The
handler currently swallows failures from wakeFromRecord in execute(iii, write)
by logging and returning success; instead, after logging the warning
(logger.warn) re-throw the caught error so the caller/adapter can retry the
abort wake; modify the catch block to log the error with session_id and then
throw the original err (or rethrow) rather than returning silently.

Comment on lines 66 to +74
await stateSet(iii, turnStateKey(rec.session_id), rec);

await emitTurnStateChanged(
iii,
rec.session_id,
eventType,
rec as unknown as Record<string, unknown>,
prev !== null ? (prev as unknown as Record<string, unknown>) : undefined,
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t emit turn_state_changed after a failed state write.

stateSet() swallows state::set errors on Lines 36-44, so this path still publishes a new turn_state_changed event even when session/<sid>/turn_state was never updated. That can desync the console from durable state.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/turn-orchestrator/persistence.ts` around lines 66 - 74, The code
currently calls stateSet(...) and then always emits emitTurnStateChanged(...),
but stateSet swallows errors so a failed write still triggers the event; change
the flow so that failed state writes prevent emitting: either make stateSet
propagate/throw on error or wrap the await stateSet(...) call in a try/catch and
on catch do not call emitTurnStateChanged (instead rethrow or return an error),
ensuring emitTurnStateChanged(rec.session_id, ...) is only invoked when stateSet
successfully completes; reference the stateSet, turnStateKey, and
emitTurnStateChanged symbols when implementing the guard.

Comment on lines +85 to +87
if (shouldWakeStep(prev?.state ?? null, rec.state)) {
await wakeState(iii, rec.session_id, rec.state);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Surface durable-wake failures here instead of treating them as best effort.

wakeState() in harness/src/turn-orchestrator/wake.ts currently catches and logs enqueue failures, so saveRecord() can persist a stepable state and return with no wake scheduled. A transient queue error would strand the session until some unrelated write happens.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/turn-orchestrator/persistence.ts` around lines 85 - 87, The
current flow lets wakeState enqueue failures be swallowed (so saveRecord can
return without a scheduled wake); change behavior so durable-wake failures
surface as errors: in saveRecord (or the caller that runs shouldWakeStep), call
wakeState(iii, rec.session_id, rec.state) and propagate any error it
returns/throws (do not ignore/log-and-continue), or alter wakeState in
harness/src/turn-orchestrator/wake.ts to rethrow enqueue failures (or return an
explicit error) so saveRecord can detect it and fail the save. Ensure you
reference shouldWakeStep and wakeState and make the call await and let
exceptions bubble up instead of treating wake as best-effort.

Comment on lines +149 to +212
let triggerError: string | null = null;
const triggerPromise = iii
.trigger<unknown, unknown>({
function_id: targetFn,
payload: input,
timeoutMs: 300_000,
})
.catch((err) => {
logger.warn('provider stream trigger failed', { targetFn, err: String(err) });
triggerError = formatProviderError(err);
done = true;
if (resolveNext) {
const fn = resolveNext;
resolveNext = null;
fn();
}
return null;
});

const readPromise = (async (): Promise<AssistantMessage | null> => {
let final: AssistantMessage | null = null;
while (!done) {
while (messageQueue.length > 0) {
const text = messageQueue.shift();
if (text === undefined) break;
let event: AssistantMessageEvent | null = null;
try {
event = JSON.parse(text) as AssistantMessageEvent;
} catch (err) {
logger.warn('decode AssistantMessageEvent failed', {
session_id: rec.session_id,
err: String(err),
});
continue;
}
const partial = eventPartial(event);
if (partial) final = partial;
if (event.type !== 'done' && event.type !== 'error') {
if (partial) {
await emit(iii, rec.session_id, {
type: 'message_update',
message: partial,
llm_event: event,
});
if (event.type === 'text_delta' || event.type === 'thinking_delta') {
rec.assistant_body_streamed = true;
}
}
continue;
}
if (event.type === 'done') final = event.message;
else final = event.error;
done = true;
break;
}
if (done) break;
await new Promise<void>((r) => {
resolveNext = r;
});
}
return final;
})();

const [, finalMsg] = await Promise.all([triggerPromise, readPromise]);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Prevent indefinite wait when provider completes without a terminal stream event.

At Line 170 and Line 205, the read loop can block forever if the trigger resolves but no done/error frame is ever enqueued. This stalls the FSM transition.

Proposed fix
   let triggerError: string | null = null;
+  let triggerSettled = false;
   const triggerPromise = iii
     .trigger<unknown, unknown>({
       function_id: targetFn,
       payload: input,
       timeoutMs: 300_000,
     })
     .catch((err) => {
       logger.warn('provider stream trigger failed', { targetFn, err: String(err) });
       triggerError = formatProviderError(err);
       done = true;
       if (resolveNext) {
         const fn = resolveNext;
         resolveNext = null;
         fn();
       }
       return null;
-    });
+    })
+    .finally(() => {
+      triggerSettled = true;
+      if (resolveNext) {
+        const fn = resolveNext;
+        resolveNext = null;
+        fn();
+      }
+    });

   const readPromise = (async (): Promise<AssistantMessage | null> => {
     let final: AssistantMessage | null = null;
     while (!done) {
       while (messageQueue.length > 0) {
         const text = messageQueue.shift();
@@
       }
       if (done) break;
+      if (triggerSettled && messageQueue.length === 0) {
+        done = true;
+        break;
+      }
       await new Promise<void>((r) => {
         resolveNext = r;
       });
     }
     return final;
   })();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/turn-orchestrator/states/assistant-streaming.ts` around lines 149
- 212, The read loop can hang if iii.trigger (triggerPromise) resolves or
rejects without a terminal 'done'/'error' event; after creating triggerPromise
(the iii.trigger call) attach a finally handler that sets done = true and wakes
the reader by calling resolveNext (clearing it) if present so readPromise's loop
unblocks; ensure this is added alongside the existing .catch so both success and
failure paths will signal completion to the read loop (refer to triggerPromise,
readPromise, done, resolveNext, messageQueue, and iii.trigger).

Comment on lines +97 to +116
for (const e of executed) {
let result = e.result;
const merged = await publishAfter(iii, e.function_call, result);
if (
merged &&
typeof merged === 'object' &&
Array.isArray((merged as Record<string, unknown>).content)
) {
result = merged as FunctionResult;
}
if (!result.terminate) all_terminate = false;
function_results.push({
role: 'function_result',
function_call_id: e.function_call.id,
function_id: e.function_call.function_id,
content: result.content,
details: result.details,
is_error: e.is_error,
timestamp: Date.now(),
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Keep is_error consistent with the post-hook merged result.

At Line 114, is_error is taken from e.is_error even after Line 99–106 may replace result. If hooks change details to an error/denial shape, is_error can become stale.

Proposed fix
   for (const e of executed) {
     let result = e.result;
@@
     if (
       merged &&
       typeof merged === 'object' &&
       Array.isArray((merged as Record<string, unknown>).content)
     ) {
       result = merged as FunctionResult;
     }
+    const is_error = isErrorResult(result);
     if (!result.terminate) all_terminate = false;
     function_results.push({
       role: 'function_result',
       function_call_id: e.function_call.id,
       function_id: e.function_call.function_id,
       content: result.content,
       details: result.details,
-      is_error: e.is_error,
+      is_error,
       timestamp: Date.now(),
     });
   }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/turn-orchestrator/states/function-execute.ts` around lines 97 -
116, The code uses e.is_error when pushing into function_results but
publishAfter (called on e.function_call/result) can replace result with merged,
so is_error may be stale; update the logic after obtaining merged (in the loop
that iterates executed and calls publishAfter) to derive is_error from the final
result/merged (e.g., inspect the merged object's error/denial shape or a boolean
property on the resulting FunctionResult) instead of using e.is_error, and set
the is_error field in the pushed object from that computed value so
function_results reflects post-hook changes.

Comment on lines +129 to +137
case 'steering':
case 'followup': {
const inbox = decision === 'steering' ? steering : followup;
await emitTurnEndOnce(iii, rec);
const messages = await persistence.loadMessages(iii, rec.session_id);
messages.push(...followup);
messages.push(...inbox);
await persistence.saveMessages(iii, rec.session_id, messages);
rec.function_results = [];
transitionTo(rec, 'awaiting_assistant');
transitionTo(rec, 'assistant_streaming');
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Avoid destructive inbox drains before the transition is durable.

This branch consumes messages that were already removed by session-inbox::drain on Lines 101-103. If saveMessages() here or persistence.saveRecord() in harness/src/turn-orchestrator/run-transition.ts fails afterward, the retried step still sees steering_check but those user inputs are gone. Please switch this to a claim/ack flow, or durably stage the drained items before removing them from the inbox.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/turn-orchestrator/states/steering-check.ts` around lines 129 -
137, The steering/followup branch currently drains and appends inbox items
destructively (using steering/followup, emitTurnEndOnce,
persistence.loadMessages and persistence.saveMessages) before the transition is
made durable, risking data loss on failure; instead implement a claim/ack or
durable staging step: atomically move or persist the drained items (e.g., write
them to a staged buffer on the record like rec.staged_inbox or call a new
persistence.stageMessages API) before removing them from the session inbox, only
clear/ack the original inbox after persistence.saveMessages and
persistence.saveRecord succeed, and keep rec.function_results reset and
transitionTo(rec, 'assistant_streaming') only after the durable acknowledgement
completes so retries see the staged items.

Comment on lines +17 to 19
const messages: AgentMessage[] = await persistence.loadMessages(iii, rec.session_id);
await emit(iii, rec.session_id, { type: 'agent_end', messages });
transitionTo(rec, 'stopped');
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Don't mark the session stopped when agent_end is only best-effort.

emit() swallows stream::set failures, so Line 18 can fail silently and Line 19 still advances the record to stopped. That permanently drops the terminal event for downstream consumers. Please propagate the emit failure, or persist a terminal-emitted/idempotency marker and retry agent_end before finalizing teardown.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/turn-orchestrator/states/tearing-down.ts` around lines 17 - 19,
The code calls emit(iii, rec.session_id, { type: 'agent_end', messages }) and
immediately calls transitionTo(rec, 'stopped'), but emit() can swallow
stream::set failures so the terminal event may never be delivered; update the
teardown in tearing-down.ts to either propagate emit's error (i.e., await and
bubble up any failure from emit) or implement a durable
terminal-emitted/idempotency marker and retry logic: after
persistence.loadMessages(...) call emit(...) and if it fails, persist a
terminal-marker tied to rec.session_id (or retry emit a configurable number of
times with backoff) and only call transitionTo(rec, 'stopped') once emit has
succeeded or the durable marker confirms the terminal event was recorded;
reference functions/variables: persistence.loadMessages, emit, transitionTo,
rec.session_id, and the 'agent_end' message type.

Comment on lines +29 to +37
try {
await iii.trigger({
function_id: turnFnId(state),
payload: { session_id },
action: TriggerAction.Enqueue({ queue: TURN_STEP_QUEUE }),
});
} catch (err) {
logger.warn('wakeState failed', { session_id, state, err: String(err) });
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t swallow enqueue failures in wakeState; propagate them after logging.

If enqueue fails and this function only warns, the session can stall indefinitely with no guaranteed follow-up wake. This is a reliability break for the durable FSM path.

Suggested fix
 export async function wakeState(iii: ISdk, session_id: string, state: TurnState): Promise<void> {
   try {
     await iii.trigger({
       function_id: turnFnId(state),
       payload: { session_id },
       action: TriggerAction.Enqueue({ queue: TURN_STEP_QUEUE }),
     });
   } catch (err) {
     logger.warn('wakeState failed', { session_id, state, err: String(err) });
+    throw err;
   }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
try {
await iii.trigger({
function_id: turnFnId(state),
payload: { session_id },
action: TriggerAction.Enqueue({ queue: TURN_STEP_QUEUE }),
});
} catch (err) {
logger.warn('wakeState failed', { session_id, state, err: String(err) });
}
try {
await iii.trigger({
function_id: turnFnId(state),
payload: { session_id },
action: TriggerAction.Enqueue({ queue: TURN_STEP_QUEUE }),
});
} catch (err) {
logger.warn('wakeState failed', { session_id, state, err: String(err) });
throw err;
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/turn-orchestrator/wake.ts` around lines 29 - 37, The current
wakeState call catches failures from iii.trigger (the enqueue to
TURN_STEP_QUEUE) and only logs a warning, which can stall the session; change
the catch to log the error using the existing logger (keeping session_id and
state) and then rethrow the original error so callers can observe/fail fast;
specifically update the try/catch around iii.trigger (the call that uses
function_id: turnFnId(state) and action: TriggerAction.Enqueue({ queue:
TURN_STEP_QUEUE })) to logger.warn(..., { session_id, state, err }) followed by
rethrowing the caught err.

…run-start-simplify

# Conflicts:
#	harness/tests/turn-orchestrator/approval-resume.test.ts
@ytallo ytallo merged commit b330907 into main May 25, 2026
15 of 16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants