Simplify harness trigger and async run::start by ytallo · Pull Request #174 · iii-hq/workers

ytallo · 2026-05-20T23:01:14Z

Summary

Simplifies the agent-run entrypoint and restructures the turn-orchestrator into a durable, per-state queue-driven FSM. The console now sends a flat harness::trigger payload and the harness always starts a run via run::start; the turn loop advances through per-state turn::{state} handlers enqueued on a per-session FIFO queue, behind a single dispatchWithHook approval chokepoint.

Harness ingress & `run::start`

Console sends messages via harness::trigger with a flat payload only; the harness always invokes run::start (no client-supplied function_id).
run::start uses Zod defaults, drops the unused cwd / cwd_hash from the run payload and iii state, and no longer replays synthetic agent::events at kickoff.
Removes the synchronous run::start_and_wait path (the on-terminal waiter, its policy entry, and sync_default_timeout_ms).

Turn-orchestrator restructure

Splits the monolithic dispatcher into per-state queue handlers (turn::{state}), colocated with their registration.
Enqueues turn wakes on a durable turn-step FIFO queue (grouped by session_id, concurrency 1) and gates durable steps so each session advances in order.
Routes the approval chokepoint through dispatchWithHook, replacing the agent::trigger dispatcher.
Parses turn_state trigger payloads at the Zod boundary; removes the isStepableRecordWrite wrapper and inlines parseStepableWrite.

Turn FSM

Each state is a registered turn::{state} function run via runTransition and enqueued on the turn-step FIFO queue by wakeState. failed is reachable from any state via an unhandled handler throw (not shown as edges).

stateDiagram-v2
  [*] --> provisioning
  provisioning --> assistant_streaming
  assistant_streaming --> function_execute: has function calls
  assistant_streaming --> steering_check: no function calls
  assistant_streaming --> tearing_down: error or aborted
  function_execute --> function_awaiting_approval: any call needs approval
  function_execute --> steering_check: batch complete
  function_execute --> tearing_down: all calls terminate session
  function_awaiting_approval --> function_execute: all decisions written
  steering_check --> assistant_streaming: continue turn
  steering_check --> tearing_down: stop or max turns
  tearing_down --> stopped
  stopped --> [*]
  failed --> [*]

Previous FSM

stateDiagram-v2
    [*] --> provisioning

    provisioning --> awaiting_assistant : system prompt built

    awaiting_assistant --> assistant_streaming : turn_count++ / emit turn_start
    awaiting_assistant --> tearing_down : max_turns reached (emit turn_end)

    assistant_streaming --> assistant_finished : stream done (final or synthetic error)

    assistant_finished --> tearing_down : stop_reason error/aborted (emit turn_end)
    assistant_finished --> steering_check : no function calls
    assistant_finished --> function_prepare : has function calls

    function_prepare --> function_execute : unwrap + persist prepared

    function_execute --> function_awaiting_approval : call needs approval (pending)
    function_execute --> function_finalize : all calls executed

    function_awaiting_approval --> function_awaiting_approval : decisions still pending (wait)
    function_awaiting_approval --> function_execute : decisions resolved (allow/deny applied)

    function_finalize --> tearing_down : all results terminate (emit turn_end)
    function_finalize --> steering_check : has non-terminal results

    steering_check --> awaiting_assistant : steering msg drained
    steering_check --> awaiting_assistant : followup msg drained
    steering_check --> awaiting_assistant : continue after function
    steering_check --> tearing_down : abort flag set
    steering_check --> tearing_down : end_turn (nothing queued)

    tearing_down --> stopped : stop sandbox + emit agent_end

    stopped --> [*]

Event translation

Renames message_end -> message_complete across harness, console, and ACP.
Consolidates event translation into a single createAgentEventTranslator used by the console backend.

Config, docs & cleanup

Adds harness/engine.config.yaml defining the turn-step FIFO queue.
Updates iii-permissions.yaml and the architecture / turn-orchestrator docs to the new function surface (run::start, turn::{state}, turn::get_state).
Removes the generated harness-flow.html and pnpm-lock.yaml; ignores the local harness-node/ build output; applies biome 2.4.10 formatting.

Test plan

biome ci harness passes locally (0 errors)
CI: harness node lint + test green

Summary by CodeRabbit

New Features
- More reliable message completion handling with clearer streaming vs batch behavior and improved assistant stop/reason reporting.
- Unified event translation for consistent UI updates and tool-call lifecycle signals.
Bug Fixes
- Fewer duplicated messages and improved idempotency when persisting assistant/function results.
- Better handling of transient provider/channel failures and clearer error messages.
Chores
- Simplified trigger payload shape for starting chat turns.
- Documentation updates for orchestration and harness behavior.

Console calls harness::trigger with a flat payload; harness always starts run::start. Remove run::start_and_wait, cwd state keys, and initial agent::events replay at kickoff so the FSM owns turn lifecycle.

vercel · 2026-05-20T23:01:16Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
workers	Ready	Preview, Comment	May 25, 2026 6:14pm

coderabbitai · 2026-05-20T23:01:20Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 85bea7cb-0b6c-4df5-bcc1-2226cf545c36

📥 Commits

Reviewing files that changed from the base of the PR and between b93ad52 and 608742e.

📒 Files selected for processing (1)

harness/tests/turn-orchestrator/approval-resume.test.ts

📝 Walkthrough

Walkthrough

Unifies event translation around a new message_complete event and replaces legacy translators. Repackages the turn orchestrator into a durable, queue-driven FSM with Zod schemas, wake/persist helpers, new state handlers, simplified harness trigger, many test updates, and docs/config adjustments.

Changes

Agent event model and translators

Layer / File(s)	Summary
Events contract and ACP mapping `acp/README.md`, `acp/src/handler.rs`, `console/web/src/types/iii-agent-event.ts`	Switches assistant lifecycle mapping to `message_complete`, adjusts ACP handler and type unions.
Console translator factory and wiring `console/web/src/lib/backend/translate.ts`, `console/web/src/lib/backend/real.ts`	Adds stateful `createAgentEventTranslator()`, centralizes event -> StreamEvent[] mapping, removes legacy helpers, and updates real stream wiring.
Translator tests and ACP tests `console/web/src/lib/backend/translate.test.ts`, `acp/src/handler.rs`	Migrates tests to validate `message_complete` handling, compaction, mirroring, and stop-reason semantics.

Durable turn orchestrator and wake queue

Layer / File(s)	Summary
Schemas, run request, and types `harness/src/turn-orchestrator/schemas.ts`, `.../run-request.ts`, `harness/src/types/agent-event.ts`	Adds Zod schemas for registered functions, typed RunRequest parsing, and updates AgentEvent union with `message_complete`.
Transition runner, persistence, and emit `harness/src/turn-orchestrator/run-transition.ts`, `persistence.ts`, `turn-state-write.ts`	Adds runTransition, persistRecord/saveRecord changes (emit + conditional wake), and emitTurnStateChanged helper.
Wake queue and runtime wiring `harness/src/turn-orchestrator/wake.ts`, `harness/engine.config.yaml`, `harness/src/runtime/iii.ts`	Implements FIFO enqueue wake (turn-step), gating predicates, engine config, and re-exports TriggerAction.
Entrypoints and registration `harness/src/turn-orchestrator/run-start.ts`, `get-state.ts`, `register.ts`	Type-safe run::start and turn::get_state handlers; simplified register sequence and bootstrap wiring.
Abort and approval adapters `harness/src/turn-orchestrator/on-abort-signal.ts`, `approval-resume.ts`	Schema-driven abort-signal parsing, wakeFromRecord usage for approvals/resume, updated registrations and logging.
Agent trigger chokepoint `harness/src/turn-orchestrator/agent-trigger.ts`	Splits dispatchWithHook and triggerFunctionCall, normalizes inputs, and zod-validates error shapes.
Function execute & approval states `harness/src/turn-orchestrator/states/function-execute.ts`, `function-awaiting-approval.ts`	New durable handlers for function execution, awaiting-approval processing, commit/finalize semantics, and registration.
Assistant streaming & finished transitions `harness/src/turn-orchestrator/states/assistant-streaming.ts`, `assistant-finished.ts`	Provider streaming, partial `message_update` deltas, body_streamed tracking, `message_complete` emission, and function-call preparation.
Provisioning, steering, tearing down `harness/src/turn-orchestrator/states/*.ts`	Provisioning parses directory bodies and advances to streaming; steering-check routes inboxes; tearing-down emits agent_end and stops.

Harness bridge, bootstrap, and docs

Layer / File(s)	Summary
Harness trigger bridge `harness/src/harness/trigger.ts`	Replaces dynamic function_id bridge with typed flat payload that forwards `payload` to `run::start`.
Bootstrap and engine config `harness/src/harness/register.ts`, `harness/engine.config.yaml`	Reorders FanoutState init before config load and adds local engine worker config.
Docs, permissions, gitignore `harness/docs/*`, `harness/README.md`, `.gitignore`, `iii-permissions.yaml`	Updates architecture and worker docs, removes `run::start_and_wait` deny, and adjusts gitignore entries.

Tests aligned to durable orchestrator

Layer / File(s)	Summary
Translator and ACP tests `console/web/src/lib/backend/translate.test.ts`, `acp/src/handler.rs`	Validates `message_complete`, compaction, and mirroring behavior.
Harness trigger and policy tests `harness/tests/harness/*`	Validates flat WS payload and updated denylist.
Integration: saveRecord wake & reads `harness/tests/integration/on-record-written.e2e.test.ts`	Asserts enqueue behavior from persistence.saveRecord and read-elimination.
Integration: approval resume & abort-signal `harness/tests/integration/approval-resume.e2e.test.ts`	Switches tests to wakeTriggers and microtask flushing for approval resumes and abort writes.
Unit tests for abort adapter, wake, emit `harness/tests/turn-orchestrator/on-abort-signal.test.ts`, `wake*.test.ts`, `turn-state-write.test.ts`	Covers schema parsing, enqueueing, and non-throwing emit behavior.
Provisioning, assistant, function, steering tests `harness/tests/turn-orchestrator/*`	Extensive revisions to cover provisioning, streaming, completion, function execution, approvals, steering, teardown, and run-transition behaviors.
Run-start/run-request/run-transition/get-state tests `harness/tests/turn-orchestrator/run-.test.ts`	Validates schemas, registration, enqueueing, runRequest parsing, and runTransition semantics.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

iii-hq/workers#159: Related console/backend translation and pending-approval mirror work.
iii-hq/workers#156: Overlaps on approval resume and orchestrator changes.
iii-hq/workers#151: Prior work on message lifecycle translation adjustments.

Suggested reviewers

sergiofilhowz

Poem

"A rabbit hops the queue so light,
message_complete shines through the night.
Schemas bloom and wake calls sing,
transitions hum — the orchestrator springs.
Carrot cheers for durable things."

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/harness-trigger-run-start-simplify

github-actions · 2026-05-20T23:01:35Z

skill-check — worker

0 verified, 12 skipped (no docs/).

Layer	Result
structure	✓
vale	✓
ai	✓

Three for three. Nicely done.

Replace manual typeof/Record parsing with shared TurnStateWriteEventSchema and parseTurnStateWrite, matching the run-start/trigger refactor style.

Each state-trigger adapter now exports register() with inline function ids, matching run-start.ts. register.ts only composes modules; exported ID constants removed.

… the register condition.

…ng is type-enforced. Share Zod turn_state write parsing between adapters; add handler tests for direct invokes that bypass the engine condition gate.

…ype safety - Simplified `parseStepableWrite` to only return `session_id`, removing the state from its output. - Inlined the handling of stepable writes directly in `handleStepableRecordWrite`, eliminating the fallback to durable publish. - Updated tests to reflect changes in the `parseStepableWrite` output and removed obsolete tests related to fallback logic. - Introduced `StepPayloadSchema` for consistent payload validation in the `turn::step` function, ensuring only valid session IDs are processed. - Added new tests for `StepPayloadSchema` to validate input shapes and error handling. These changes enhance clarity and maintainability of the turn orchestrator's state management logic.

Replace turn::step_requested publishes with queue enqueues in approval resume and abort handling, add turn::should_step filtering for unknown or terminal sessions, and tighten get_state with Zod payload parsing.

Replace the monolithic turn::step subscriber with turn::{state} functions enqueued on turn-step, consolidate UI turn_state_changed emits into persistence saves, and merge assistant/function states for a simpler FSM.

Reconcile the turn-orchestrator per-state queue refactor with main's harness-node/ -> harness/ rename and parallel changes: - Map the refactor (per-state handlers, turn-step FIFO queue, Zod boundary parsing, simplified harness::trigger) onto the harness/ tree. - Adopt main's idempotency guards: function_result dedup lands in function-execute.ts via the rename merge; the assistant-message guard is ported into the split assistant-finished.ts handler. - Take main's new providers (kimi/lmstudio/llamacpp), functionNotFoundHint hint, and system-prompt guidance (auto-merged, untouched by the refactor). - Drop on-record-written/on-terminal/on-turn-state-changed/subscriber/ transitions removed by the refactor; keep main's provider tests. - Update functions error-result test for the new executedCalls clear, and re-add idempotency regression tests adapted to the split handlers. typecheck clean; 826 tests pass.

…ranslations - Updated the event type from `message_end` to `message_complete` in the README and codebase for consistency. - Adjusted the translation logic in `handler.rs` to handle the new event type, ensuring proper rendering of assistant messages. - Modified tests to reflect the change in event type, ensuring all related functionality is covered. - Cleaned up comments and documentation to align with the new terminology.

- Deleted the pnpm-lock.yaml file from the project, which may indicate a shift in package management strategy or a cleanup of unused files.

…anslator - Replaced the previous createTurnStateTranslator with a new createAgentEventTranslator function that handles all agent event translations, including `turn_state_changed`. - Updated the realStream function to utilize the new translator, simplifying the event handling process. - Removed deprecated translation functions and cleaned up related comments and documentation for clarity. - Adjusted tests to ensure coverage of the new translation logic and verify correct event handling.

- Deleted the harness-flow.html file, which contained the end-to-end information flow documentation for the project. This may indicate a shift in documentation strategy or a move towards a different format for presenting this information.

CI lints with biome 2.4.10 while the repo pins ^1.9.4; reformat the files whose 2.x layout differed to clear the 7 format errors.

harness::trigger no longer forwards a client-supplied function_id; it takes a flat payload and always invokes run::start. Update harness.md and the architecture telemetry/ingestion-bridge section to match.

coderabbitai

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

harness/src/turn-orchestrator/approval-resume.ts (1)

83-98: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Only unregister the resume hook after the wake succeeds.

The decision is persisted before wakeFromRecord(), so retries are already safe. If the wake throws, Line 98 still unregisters the callback and the session can stay stuck in function_awaiting_approval with no way to retry locally.

Suggested fix

   try {
     await wakeFromRecord(iii, session_id);
+    unregisterApprovalResume(fnId);
   } catch (err) {
     logger.warn('approval resume: turn step wake failed', { session_id, err: String(err) });
+    throw err;
   }
-
-  unregisterApprovalResume(fnId);
 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/turn-orchestrator/approval-resume.ts` around lines 83 - 98, The
unregisterApprovalResume(fnId) call is currently executed regardless of
wakeFromRecord() success, which can leave the session stuck if wake fails;
change the flow so that after computing key via pendingKey(session_id,
function_call_id) and persisting decision with stateSet (guarded by
hasStoredDecision(existing)), you only call unregisterApprovalResume(fnId)
inside the try block after await wakeFromRecord(iii, session_id) completes
successfully (keep the logger.warn in the catch and do not unregister there) so
the resume hook remains available for retries if wakeFromRecord throws.

harness/src/turn-orchestrator/persistence.ts (1)

228-239: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep the legacy staged-call key during rollout.

These loaders now accept only function_call. Any in-flight session persisted with the previous tool_call shape will deserialize as empty, which drops approval/idempotency state and can re-run or skip calls on resume.

Also applies to: 251-265
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/turn-orchestrator/persistence.ts` around lines 228 - 239,
loadPreparedCalls currently only reads obj.function_call and will drop legacy
staged entries that use obj.tool_call; update loadPreparedCalls to accept either
obj.function_call or obj.tool_call (prefer function_call when both exist), map
the legacy tool_call into the same FunctionCall shape, and still preserve
blocked (obj.blocked) and pre_approved (obj.pre_approved) when pushing
PreparedEntry. Apply the same change to the other staged-call loader in this
module (the sibling loader that iterates staged items around the later block) so
both loaders are backwards-compatible with the legacy tool_call key.

🧹 Nitpick comments (2)

harness/src/turn-orchestrator/states/provisioning.ts (1)

81-85: ⚡ Quick win

Fetch the default skills concurrently.

This startup path currently inherits the serial fetchDefaultSkills() loop, so first-token latency grows with the sum of every directory::skills::get call. Promise.all(...) keeps the configured order and removes the unnecessary wall-clock delay.

♻️ Suggested refactor

 async function fetchDefaultSkills(iii: ISdk, uris: readonly string[]): Promise<DefaultSkillBody[]> {
-  const bodies: DefaultSkillBody[] = [];
-  for (const uri of uris) {
-    const id = uri.startsWith('iii://') ? uri.slice('iii://'.length) : uri;
-    const body = await fetchSkill(iii, id);
-    bodies.push(defaultSkillBody(uri, body));
-  }
-  return bodies;
+  return Promise.all(
+    uris.map(async (uri) => {
+      const id = uri.startsWith('iii://') ? uri.slice('iii://'.length) : uri;
+      const body = await fetchSkill(iii, id);
+      return defaultSkillBody(uri, body);
+    }),
+  );
 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/turn-orchestrator/states/provisioning.ts` around lines 81 - 85,
The code currently awaits fetchSkillsIndex and then fetchDefaultSkills serially
which increases startup latency; change the call site that builds the prompt
(where buildSystemPrompt is invoked with skillsIndex and bodies) to fetch both
values concurrently using Promise.all by calling
Promise.all([fetchSkillsIndex(iii), fetchDefaultSkills(iii,
cfg.system_default_skills)]) and destructuring the result into [skillsIndex,
bodies] so the order is preserved before calling buildSystemPrompt(bodies, null,
override, request.mode, skillsIndex).

harness/src/types/agent-event.ts (1)

26-29: ⚡ Quick win

Narrow message_complete.message to AssistantMessage.

AgentMessage also permits user/function/custom payloads, but this event represents assistant-stream completion. Keeping the wire type this broad weakens downstream exhaustiveness and lets invalid emits compile.

♻️ Proposed fix

-import type { AgentMessage, FunctionResultMessage } from './agent-message.js';
+import type { AgentMessage, AssistantMessage, FunctionResultMessage } from './agent-message.js';
@@
   | {
       type: 'message_complete';
-      message: AgentMessage;
+      message: AssistantMessage;
       /** When true, text/thinking were already delivered via message_update. */
       body_streamed?: boolean;
     }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/types/agent-event.ts` around lines 26 - 29, The message_complete
event's message field is too broad: change the type of the message field on the
'message_complete' event from AgentMessage to AssistantMessage so only assistant
messages can be emitted; update the declaration of the union/variant where type:
'message_complete' is defined (the message field) to use AssistantMessage and
adjust any related imports/exports or type guards (e.g., places expecting
AgentMessage for 'message_complete') to preserve exhaustiveness and prevent
invalid emits.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@console/web/src/lib/backend/translate.ts`:
- Around line 96-97: The switch case in translate.ts that maps the raw
'agent_end' event to a StreamEvent [{ kind: 'assistant-end' }] causes a
duplicate close because assistant replies already close on 'message_complete';
update the 'agent_end' branch in the translateRawEvent/translate (the switch
handling event kinds) to not emit an additional 'assistant-end' (e.g., return no
StreamEvent/empty array or null) so realStream() can still terminate on the raw
'agent_end' without producing a redundant assistant segment in the UI.

In `@harness/src/harness/trigger.ts`:
- Around line 23-27: HarnessTriggerInputSchema currently uses z.object(...)
which strips unknown keys; change it to a strict schema so unexpected/stale
fields are rejected. Locate the HarnessTriggerInputSchema declaration and call
the Zod strict API on that object (i.e., convert the z.object({...}) definition
to use .strict() so unknown keys like legacy function_id will cause validation
to fail), keeping RunStartPayloadSchema and the existing optional
session_id/message_id shape unchanged.

In `@harness/src/turn-orchestrator/on-abort-signal.ts`:
- Around line 37-45: The handler currently swallows failures from wakeFromRecord
in execute(iii, write) by logging and returning success; instead, after logging
the warning (logger.warn) re-throw the caught error so the caller/adapter can
retry the abort wake; modify the catch block to log the error with session_id
and then throw the original err (or rethrow) rather than returning silently.

In `@harness/src/turn-orchestrator/persistence.ts`:
- Around line 85-87: The current flow lets wakeState enqueue failures be
swallowed (so saveRecord can return without a scheduled wake); change behavior
so durable-wake failures surface as errors: in saveRecord (or the caller that
runs shouldWakeStep), call wakeState(iii, rec.session_id, rec.state) and
propagate any error it returns/throws (do not ignore/log-and-continue), or alter
wakeState in harness/src/turn-orchestrator/wake.ts to rethrow enqueue failures
(or return an explicit error) so saveRecord can detect it and fail the save.
Ensure you reference shouldWakeStep and wakeState and make the call await and
let exceptions bubble up instead of treating wake as best-effort.
- Around line 66-74: The code currently calls stateSet(...) and then always
emits emitTurnStateChanged(...), but stateSet swallows errors so a failed write
still triggers the event; change the flow so that failed state writes prevent
emitting: either make stateSet propagate/throw on error or wrap the await
stateSet(...) call in a try/catch and on catch do not call emitTurnStateChanged
(instead rethrow or return an error), ensuring
emitTurnStateChanged(rec.session_id, ...) is only invoked when stateSet
successfully completes; reference the stateSet, turnStateKey, and
emitTurnStateChanged symbols when implementing the guard.

In `@harness/src/turn-orchestrator/schemas.ts`:
- Around line 21-22: The schema currently allows empty provider and model
strings; update the provider and model validators in the schema (the zod fields
named provider and model) to reject empty/whitespace-only values by using
trimmed non-empty constraints such as z.string().trim().min(1) or
z.string().nonempty(). Do this change on the provider and model entries so
invalid requests are rejected at the schema boundary.
- Line 24: RunStartPayloadSchema currently uses messages:
z.custom<AgentMessage[]>((v) => Array.isArray(v)).default([]) which only checks
the value is an array and allows invalid entries; replace this with a proper
messages: z.array(...) using a validated AgentMessageSchema (or a
z.discriminatedUnion on the "role" field that defines each variant's required
fields and nested content validation) so each element is runtime-validated;
update or add a reusable AgentMessageSchema (or per-role schemas) and use it in
RunStartPayloadSchema.messages to reject malformed entries at the run::start
boundary instead of casting later.

In `@harness/src/turn-orchestrator/states/assistant-streaming.ts`:
- Around line 149-212: The read loop can hang if iii.trigger (triggerPromise)
resolves or rejects without a terminal 'done'/'error' event; after creating
triggerPromise (the iii.trigger call) attach a finally handler that sets done =
true and wakes the reader by calling resolveNext (clearing it) if present so
readPromise's loop unblocks; ensure this is added alongside the existing .catch
so both success and failure paths will signal completion to the read loop (refer
to triggerPromise, readPromise, done, resolveNext, messageQueue, and
iii.trigger).

In `@harness/src/turn-orchestrator/states/function-execute.ts`:
- Around line 97-116: The code uses e.is_error when pushing into
function_results but publishAfter (called on e.function_call/result) can replace
result with merged, so is_error may be stale; update the logic after obtaining
merged (in the loop that iterates executed and calls publishAfter) to derive
is_error from the final result/merged (e.g., inspect the merged object's
error/denial shape or a boolean property on the resulting FunctionResult)
instead of using e.is_error, and set the is_error field in the pushed object
from that computed value so function_results reflects post-hook changes.

In `@harness/src/turn-orchestrator/states/steering-check.ts`:
- Around line 129-137: The steering/followup branch currently drains and appends
inbox items destructively (using steering/followup, emitTurnEndOnce,
persistence.loadMessages and persistence.saveMessages) before the transition is
made durable, risking data loss on failure; instead implement a claim/ack or
durable staging step: atomically move or persist the drained items (e.g., write
them to a staged buffer on the record like rec.staged_inbox or call a new
persistence.stageMessages API) before removing them from the session inbox, only
clear/ack the original inbox after persistence.saveMessages and
persistence.saveRecord succeed, and keep rec.function_results reset and
transitionTo(rec, 'assistant_streaming') only after the durable acknowledgement
completes so retries see the staged items.

In `@harness/src/turn-orchestrator/states/tearing-down.ts`:
- Around line 17-19: The code calls emit(iii, rec.session_id, { type:
'agent_end', messages }) and immediately calls transitionTo(rec, 'stopped'), but
emit() can swallow stream::set failures so the terminal event may never be
delivered; update the teardown in tearing-down.ts to either propagate emit's
error (i.e., await and bubble up any failure from emit) or implement a durable
terminal-emitted/idempotency marker and retry logic: after
persistence.loadMessages(...) call emit(...) and if it fails, persist a
terminal-marker tied to rec.session_id (or retry emit a configurable number of
times with backoff) and only call transitionTo(rec, 'stopped') once emit has
succeeded or the durable marker confirms the terminal event was recorded;
reference functions/variables: persistence.loadMessages, emit, transitionTo,
rec.session_id, and the 'agent_end' message type.

In `@harness/src/turn-orchestrator/wake.ts`:
- Around line 29-37: The current wakeState call catches failures from
iii.trigger (the enqueue to TURN_STEP_QUEUE) and only logs a warning, which can
stall the session; change the catch to log the error using the existing logger
(keeping session_id and state) and then rethrow the original error so callers
can observe/fail fast; specifically update the try/catch around iii.trigger (the
call that uses function_id: turnFnId(state) and action: TriggerAction.Enqueue({
queue: TURN_STEP_QUEUE })) to logger.warn(..., { session_id, state, err })
followed by rethrowing the caught err.

---

Outside diff comments:
In `@harness/src/turn-orchestrator/approval-resume.ts`:
- Around line 83-98: The unregisterApprovalResume(fnId) call is currently
executed regardless of wakeFromRecord() success, which can leave the session
stuck if wake fails; change the flow so that after computing key via
pendingKey(session_id, function_call_id) and persisting decision with stateSet
(guarded by hasStoredDecision(existing)), you only call
unregisterApprovalResume(fnId) inside the try block after await
wakeFromRecord(iii, session_id) completes successfully (keep the logger.warn in
the catch and do not unregister there) so the resume hook remains available for
retries if wakeFromRecord throws.

In `@harness/src/turn-orchestrator/persistence.ts`:
- Around line 228-239: loadPreparedCalls currently only reads obj.function_call
and will drop legacy staged entries that use obj.tool_call; update
loadPreparedCalls to accept either obj.function_call or obj.tool_call (prefer
function_call when both exist), map the legacy tool_call into the same
FunctionCall shape, and still preserve blocked (obj.blocked) and pre_approved
(obj.pre_approved) when pushing PreparedEntry. Apply the same change to the
other staged-call loader in this module (the sibling loader that iterates staged
items around the later block) so both loaders are backwards-compatible with the
legacy tool_call key.

---

Nitpick comments:
In `@harness/src/turn-orchestrator/states/provisioning.ts`:
- Around line 81-85: The code currently awaits fetchSkillsIndex and then
fetchDefaultSkills serially which increases startup latency; change the call
site that builds the prompt (where buildSystemPrompt is invoked with skillsIndex
and bodies) to fetch both values concurrently using Promise.all by calling
Promise.all([fetchSkillsIndex(iii), fetchDefaultSkills(iii,
cfg.system_default_skills)]) and destructuring the result into [skillsIndex,
bodies] so the order is preserved before calling buildSystemPrompt(bodies, null,
override, request.mode, skillsIndex).

In `@harness/src/types/agent-event.ts`:
- Around line 26-29: The message_complete event's message field is too broad:
change the type of the message field on the 'message_complete' event from
AgentMessage to AssistantMessage so only assistant messages can be emitted;
update the declaration of the union/variant where type: 'message_complete' is
defined (the message field) to use AssistantMessage and adjust any related
imports/exports or type guards (e.g., places expecting AgentMessage for
'message_complete') to preserve exhaustiveness and prevent invalid emits.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d2ba998b-d58d-4f60-bba3-a1f1d9199527

📥 Commits

Reviewing files that changed from the base of the PR and between a28ff9d and b93ad52.

📒 Files selected for processing (71)

.gitignore
acp/README.md
acp/src/handler.rs
console/web/src/lib/backend/real.ts
console/web/src/lib/backend/translate.test.ts
console/web/src/lib/backend/translate.ts
console/web/src/types/iii-agent-event.ts
harness/README.md
harness/docs/architecture.md
harness/docs/workers/harness.md
harness/docs/workers/turn-orchestrator.md
harness/engine.config.yaml
harness/src/harness/register.ts
harness/src/harness/trigger.ts
harness/src/runtime/iii.ts
harness/src/turn-orchestrator/agent-trigger.ts
harness/src/turn-orchestrator/approval-resume.ts
harness/src/turn-orchestrator/config.ts
harness/src/turn-orchestrator/get-state.ts
harness/src/turn-orchestrator/on-abort-signal.ts
harness/src/turn-orchestrator/on-record-written.ts
harness/src/turn-orchestrator/on-terminal.ts
harness/src/turn-orchestrator/on-turn-state-changed.ts
harness/src/turn-orchestrator/persistence.ts
harness/src/turn-orchestrator/register.ts
harness/src/turn-orchestrator/run-request.ts
harness/src/turn-orchestrator/run-start.ts
harness/src/turn-orchestrator/run-transition.ts
harness/src/turn-orchestrator/schemas.ts
harness/src/turn-orchestrator/state.ts
harness/src/turn-orchestrator/states/assistant-finished.ts
harness/src/turn-orchestrator/states/assistant-streaming.ts
harness/src/turn-orchestrator/states/assistant.ts
harness/src/turn-orchestrator/states/function-awaiting-approval.ts
harness/src/turn-orchestrator/states/function-execute.ts
harness/src/turn-orchestrator/states/functions.ts
harness/src/turn-orchestrator/states/index.ts
harness/src/turn-orchestrator/states/provisioning.ts
harness/src/turn-orchestrator/states/steering-check.ts
harness/src/turn-orchestrator/states/tearing-down.ts
harness/src/turn-orchestrator/subscriber.ts
harness/src/turn-orchestrator/transitions.ts
harness/src/turn-orchestrator/turn-state-write.ts
harness/src/turn-orchestrator/wake.ts
harness/src/types/agent-event.ts
harness/tests/harness/policy.test.ts
harness/tests/harness/trigger.test.ts
harness/tests/integration/approval-resume.e2e.test.ts
harness/tests/integration/on-record-written.e2e.test.ts
harness/tests/integration/wire-parity.test.ts
harness/tests/turn-orchestrator/agent-trigger.test.ts
harness/tests/turn-orchestrator/approval-resume.test.ts
harness/tests/turn-orchestrator/assistant.test.ts
harness/tests/turn-orchestrator/awaiting-approval.test.ts
harness/tests/turn-orchestrator/config.test.ts
harness/tests/turn-orchestrator/functions.test.ts
harness/tests/turn-orchestrator/get-state.test.ts
harness/tests/turn-orchestrator/on-abort-signal.test.ts
harness/tests/turn-orchestrator/on-record-written.test.ts
harness/tests/turn-orchestrator/on-terminal.test.ts
harness/tests/turn-orchestrator/on-turn-state-changed.test.ts
harness/tests/turn-orchestrator/provisioning.test.ts
harness/tests/turn-orchestrator/run-request.test.ts
harness/tests/turn-orchestrator/run-start.test.ts
harness/tests/turn-orchestrator/run-transition.test.ts
harness/tests/turn-orchestrator/state.test.ts
harness/tests/turn-orchestrator/steering.test.ts
harness/tests/turn-orchestrator/tearing-down.test.ts
harness/tests/turn-orchestrator/turn-state-write.test.ts
harness/tests/turn-orchestrator/wake.test.ts
iii-permissions.yaml

💤 Files with no reviewable changes (13)

harness/tests/turn-orchestrator/on-turn-state-changed.test.ts
harness/tests/integration/wire-parity.test.ts
harness/src/turn-orchestrator/states/assistant.ts
harness/src/turn-orchestrator/states/functions.ts
iii-permissions.yaml
harness/tests/turn-orchestrator/on-record-written.test.ts
harness/src/turn-orchestrator/on-turn-state-changed.ts
harness/tests/turn-orchestrator/on-terminal.test.ts
harness/src/turn-orchestrator/subscriber.ts
harness/tests/harness/policy.test.ts
harness/src/turn-orchestrator/transitions.ts
harness/src/turn-orchestrator/on-record-written.ts
harness/src/turn-orchestrator/on-terminal.ts

coderabbitai · 2026-05-25T10:03:21Z

+      case 'agent_end':
+        return [{ kind: 'assistant-end' }]


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't emit a second assistant-end on agent_end.

With this refactor, the assistant already closes on message_complete, so translating teardown's agent_end into another assistant-end will double-close the last reply. realStream() still terminates on the raw agent_end, so this extra StreamEvent is redundant and can create an empty trailing assistant segment in the UI.

Suggested fix

case 'agent_end': - return [{ kind: 'assistant-end' }] + return []

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

case 'agent_end':

return [{ kind: 'assistant-end' }]

case 'agent_end':

return []

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@console/web/src/lib/backend/translate.ts` around lines 96 - 97, The switch case in translate.ts that maps the raw 'agent_end' event to a StreamEvent [{ kind: 'assistant-end' }] causes a duplicate close because assistant replies already close on 'message_complete'; update the 'agent_end' branch in the translateRawEvent/translate (the switch handling event kinds) to not emit an additional 'assistant-end' (e.g., return no StreamEvent/empty array or null) so realStream() can still terminate on the raw 'agent_end' without producing a redundant assistant segment in the UI.

coderabbitai · 2026-05-25T10:03:21Z

+const HarnessTriggerInputSchema = z.object({
+  session_id: z.string().optional(),
+  message_id: z.string().optional(),
+  payload: RunStartPayloadSchema,
+});


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In Zod 3.23.8, does z.object({...}).parse(...)strip unknown keys by default unless.strict() is used?

💡 Result:

Yes. In Zod (including 3.23.8), z.object({...}).parse(...) uses the default “unknownKeys” policy of stripping unrecognized keys from the parsed output; it does not throw just because unknown keys exist. You only get an error for unknown keys when you explicitly add.strict.[1][2][3] So the behavior is: - Default: unknown keys are stripped from the result.[1][2][3] -.strict: unknown keys cause a validation error.[1][2][3] If you want to keep unknown keys instead of stripping them, you can use.passthrough.[1][3]

Citations:

1: https://v3.zod.dev/?id=parse

2: https://zod.dev/api

3: https://github.com/colinhacks/zod/blob/cc8ad1981ba580d1250520fde8878073d4b7d40a/README.md

Make HarnessTriggerInputSchema strict to reject unknown/stale client fields.

In harness/src/harness/trigger.ts (lines 23-27), z.object(...) strips unknown keys by default in Zod 3.23.8, so deprecated shapes like { function_id, payload } can still be accepted (with the extra keys removed) instead of failing the new contract.

Proposed fix

-const HarnessTriggerInputSchema = z.object({ - session_id: z.string().optional(), - message_id: z.string().optional(), - payload: RunStartPayloadSchema, -}); +const HarnessTriggerInputSchema = z + .object({ + session_id: z.string().optional(), + message_id: z.string().optional(), + payload: RunStartPayloadSchema, + }) + .strict();

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const HarnessTriggerInputSchema = z.object({

session_id: z.string().optional(),

message_id: z.string().optional(),

payload: RunStartPayloadSchema,

});

const HarnessTriggerInputSchema = z

.object({

session_id: z.string().optional(),

message_id: z.string().optional(),

payload: RunStartPayloadSchema,

})

.strict();

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness/src/harness/trigger.ts` around lines 23 - 27, HarnessTriggerInputSchema currently uses z.object(...) which strips unknown keys; change it to a strict schema so unexpected/stale fields are rejected. Locate the HarnessTriggerInputSchema declaration and call the Zod strict API on that object (i.e., convert the z.object({...}) definition to use .strict() so unknown keys like legacy function_id will cause validation to fail), keeping RunStartPayloadSchema and the existing optional session_id/message_id shape unchanged.

coderabbitai · 2026-05-25T10:03:21Z

+export async function execute(iii: ISdk, write: ParsedAbortSignalWrite): Promise<void> {
  try {
-    await iii.trigger<unknown, unknown>({
-      function_id: 'iii::durable::publish',
-      payload: { topic: STEP_TOPIC, data: { session_id } },
-    });
+    await wakeFromRecord(iii, write.session_id);
  } catch (err) {
-    logger.warn('turn::on_abort_signal: publish failed', {
-      session_id,
+    logger.warn('turn::on_abort_signal: wake failed', {
+      session_id: write.session_id,
      err: String(err),
    });
  }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Retry abort wakes instead of acknowledging failures.

This adapter is the only reaction to the abort_signal write. If wakeFromRecord() throws and we return success anyway, that trigger event is gone and the abort may never schedule the follow-up FSM step. Re-throw after logging so the state trigger can retry.

Suggested fix

export async function execute(iii: ISdk, write: ParsedAbortSignalWrite): Promise<void> { try { await wakeFromRecord(iii, write.session_id); } catch (err) { logger.warn('turn::on_abort_signal: wake failed', { session_id: write.session_id, err: String(err), }); + throw err; } }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

export async function execute(iii: ISdk, write: ParsedAbortSignalWrite): Promise<void> {

try {

await iii.trigger<unknown, unknown>({

function_id: 'iii::durable::publish',

payload: { topic: STEP_TOPIC, data: { session_id } },

});

await wakeFromRecord(iii, write.session_id);

} catch (err) {

logger.warn('turn::on_abort_signal: publish failed', {

session_id,

logger.warn('turn::on_abort_signal: wake failed', {

session_id: write.session_id,

err: String(err),

});

}

export async function execute(iii: ISdk, write: ParsedAbortSignalWrite): Promise<void> {

try {

await wakeFromRecord(iii, write.session_id);

} catch (err) {

logger.warn('turn::on_abort_signal: wake failed', {

session_id: write.session_id,

err: String(err),

});

throw err;

}

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness/src/turn-orchestrator/on-abort-signal.ts` around lines 37 - 45, The handler currently swallows failures from wakeFromRecord in execute(iii, write) by logging and returning success; instead, after logging the warning (logger.warn) re-throw the caught error so the caller/adapter can retry the abort wake; modify the catch block to log the error with session_id and then throw the original err (or rethrow) rather than returning silently.

coderabbitai · 2026-05-25T10:03:22Z

  await stateSet(iii, turnStateKey(rec.session_id), rec);
+
+  await emitTurnStateChanged(
+    iii,
+    rec.session_id,
+    eventType,
+    rec as unknown as Record<string, unknown>,
+    prev !== null ? (prev as unknown as Record<string, unknown>) : undefined,
+  );


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t emit turn_state_changed after a failed state write.

stateSet() swallows state::set errors on Lines 36-44, so this path still publishes a new turn_state_changed event even when session/<sid>/turn_state was never updated. That can desync the console from durable state.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness/src/turn-orchestrator/persistence.ts` around lines 66 - 74, The code currently calls stateSet(...) and then always emits emitTurnStateChanged(...), but stateSet swallows errors so a failed write still triggers the event; change the flow so that failed state writes prevent emitting: either make stateSet propagate/throw on error or wrap the await stateSet(...) call in a try/catch and on catch do not call emitTurnStateChanged (instead rethrow or return an error), ensuring emitTurnStateChanged(rec.session_id, ...) is only invoked when stateSet successfully completes; reference the stateSet, turnStateKey, and emitTurnStateChanged symbols when implementing the guard.

coderabbitai · 2026-05-25T10:03:22Z

+  if (shouldWakeStep(prev?.state ?? null, rec.state)) {
+    await wakeState(iii, rec.session_id, rec.state);
+  }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Surface durable-wake failures here instead of treating them as best effort.

wakeState() in harness/src/turn-orchestrator/wake.ts currently catches and logs enqueue failures, so saveRecord() can persist a stepable state and return with no wake scheduled. A transient queue error would strand the session until some unrelated write happens.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness/src/turn-orchestrator/persistence.ts` around lines 85 - 87, The current flow lets wakeState enqueue failures be swallowed (so saveRecord can return without a scheduled wake); change behavior so durable-wake failures surface as errors: in saveRecord (or the caller that runs shouldWakeStep), call wakeState(iii, rec.session_id, rec.state) and propagate any error it returns/throws (do not ignore/log-and-continue), or alter wakeState in harness/src/turn-orchestrator/wake.ts to rethrow enqueue failures (or return an explicit error) so saveRecord can detect it and fail the save. Ensure you reference shouldWakeStep and wakeState and make the call await and let exceptions bubble up instead of treating wake as best-effort.

coderabbitai · 2026-05-25T10:03:22Z

+  let triggerError: string | null = null;
+  const triggerPromise = iii
+    .trigger<unknown, unknown>({
+      function_id: targetFn,
+      payload: input,
+      timeoutMs: 300_000,
+    })
+    .catch((err) => {
+      logger.warn('provider stream trigger failed', { targetFn, err: String(err) });
+      triggerError = formatProviderError(err);
+      done = true;
+      if (resolveNext) {
+        const fn = resolveNext;
+        resolveNext = null;
+        fn();
+      }
+      return null;
+    });
+
+  const readPromise = (async (): Promise<AssistantMessage | null> => {
+    let final: AssistantMessage | null = null;
+    while (!done) {
+      while (messageQueue.length > 0) {
+        const text = messageQueue.shift();
+        if (text === undefined) break;
+        let event: AssistantMessageEvent | null = null;
+        try {
+          event = JSON.parse(text) as AssistantMessageEvent;
+        } catch (err) {
+          logger.warn('decode AssistantMessageEvent failed', {
+            session_id: rec.session_id,
+            err: String(err),
+          });
+          continue;
+        }
+        const partial = eventPartial(event);
+        if (partial) final = partial;
+        if (event.type !== 'done' && event.type !== 'error') {
+          if (partial) {
+            await emit(iii, rec.session_id, {
+              type: 'message_update',
+              message: partial,
+              llm_event: event,
+            });
+            if (event.type === 'text_delta' || event.type === 'thinking_delta') {
+              rec.assistant_body_streamed = true;
+            }
+          }
+          continue;
+        }
+        if (event.type === 'done') final = event.message;
+        else final = event.error;
+        done = true;
+        break;
+      }
+      if (done) break;
+      await new Promise<void>((r) => {
+        resolveNext = r;
+      });
+    }
+    return final;
+  })();
+
+  const [, finalMsg] = await Promise.all([triggerPromise, readPromise]);


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Prevent indefinite wait when provider completes without a terminal stream event.

At Line 170 and Line 205, the read loop can block forever if the trigger resolves but no done/error frame is ever enqueued. This stalls the FSM transition.

Proposed fix

let triggerError: string | null = null; + let triggerSettled = false; const triggerPromise = iii .trigger<unknown, unknown>({ function_id: targetFn, payload: input, timeoutMs: 300_000, }) .catch((err) => { logger.warn('provider stream trigger failed', { targetFn, err: String(err) }); triggerError = formatProviderError(err); done = true; if (resolveNext) { const fn = resolveNext; resolveNext = null; fn(); } return null; - }); + }) + .finally(() => { + triggerSettled = true; + if (resolveNext) { + const fn = resolveNext; + resolveNext = null; + fn(); + } + }); const readPromise = (async (): Promise<AssistantMessage | null> => { let final: AssistantMessage | null = null; while (!done) { while (messageQueue.length > 0) { const text = messageQueue.shift(); @@ } if (done) break; + if (triggerSettled && messageQueue.length === 0) { + done = true; + break; + } await new Promise<void>((r) => { resolveNext = r; }); } return final; })();

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness/src/turn-orchestrator/states/assistant-streaming.ts` around lines 149 - 212, The read loop can hang if iii.trigger (triggerPromise) resolves or rejects without a terminal 'done'/'error' event; after creating triggerPromise (the iii.trigger call) attach a finally handler that sets done = true and wakes the reader by calling resolveNext (clearing it) if present so readPromise's loop unblocks; ensure this is added alongside the existing .catch so both success and failure paths will signal completion to the read loop (refer to triggerPromise, readPromise, done, resolveNext, messageQueue, and iii.trigger).

coderabbitai · 2026-05-25T10:03:22Z

+  for (const e of executed) {
+    let result = e.result;
+    const merged = await publishAfter(iii, e.function_call, result);
+    if (
+      merged &&
+      typeof merged === 'object' &&
+      Array.isArray((merged as Record<string, unknown>).content)
+    ) {
+      result = merged as FunctionResult;
+    }
+    if (!result.terminate) all_terminate = false;
+    function_results.push({
+      role: 'function_result',
+      function_call_id: e.function_call.id,
+      function_id: e.function_call.function_id,
+      content: result.content,
+      details: result.details,
+      is_error: e.is_error,
+      timestamp: Date.now(),
+    });


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Keep is_error consistent with the post-hook merged result.

At Line 114, is_error is taken from e.is_error even after Line 99–106 may replace result. If hooks change details to an error/denial shape, is_error can become stale.

Proposed fix

for (const e of executed) { let result = e.result; @@ if ( merged && typeof merged === 'object' && Array.isArray((merged as Record<string, unknown>).content) ) { result = merged as FunctionResult; } + const is_error = isErrorResult(result); if (!result.terminate) all_terminate = false; function_results.push({ role: 'function_result', function_call_id: e.function_call.id, function_id: e.function_call.function_id, content: result.content, details: result.details, - is_error: e.is_error, + is_error, timestamp: Date.now(), }); }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness/src/turn-orchestrator/states/function-execute.ts` around lines 97 - 116, The code uses e.is_error when pushing into function_results but publishAfter (called on e.function_call/result) can replace result with merged, so is_error may be stale; update the logic after obtaining merged (in the loop that iterates executed and calls publishAfter) to derive is_error from the final result/merged (e.g., inspect the merged object's error/denial shape or a boolean property on the resulting FunctionResult) instead of using e.is_error, and set the is_error field in the pushed object from that computed value so function_results reflects post-hook changes.

coderabbitai · 2026-05-25T10:03:22Z

+    case 'steering':
    case 'followup': {
+      const inbox = decision === 'steering' ? steering : followup;
      await emitTurnEndOnce(iii, rec);
      const messages = await persistence.loadMessages(iii, rec.session_id);
-      messages.push(...followup);
+      messages.push(...inbox);
      await persistence.saveMessages(iii, rec.session_id, messages);
      rec.function_results = [];
-      transitionTo(rec, 'awaiting_assistant');
+      transitionTo(rec, 'assistant_streaming');


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Avoid destructive inbox drains before the transition is durable.

This branch consumes messages that were already removed by session-inbox::drain on Lines 101-103. If saveMessages() here or persistence.saveRecord() in harness/src/turn-orchestrator/run-transition.ts fails afterward, the retried step still sees steering_check but those user inputs are gone. Please switch this to a claim/ack flow, or durably stage the drained items before removing them from the inbox.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness/src/turn-orchestrator/states/steering-check.ts` around lines 129 - 137, The steering/followup branch currently drains and appends inbox items destructively (using steering/followup, emitTurnEndOnce, persistence.loadMessages and persistence.saveMessages) before the transition is made durable, risking data loss on failure; instead implement a claim/ack or durable staging step: atomically move or persist the drained items (e.g., write them to a staged buffer on the record like rec.staged_inbox or call a new persistence.stageMessages API) before removing them from the session inbox, only clear/ack the original inbox after persistence.saveMessages and persistence.saveRecord succeed, and keep rec.function_results reset and transitionTo(rec, 'assistant_streaming') only after the durable acknowledgement completes so retries see the staged items.

coderabbitai · 2026-05-25T10:03:22Z

+  const messages: AgentMessage[] = await persistence.loadMessages(iii, rec.session_id);
  await emit(iii, rec.session_id, { type: 'agent_end', messages });
  transitionTo(rec, 'stopped');


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Don't mark the session stopped when agent_end is only best-effort.

emit() swallows stream::set failures, so Line 18 can fail silently and Line 19 still advances the record to stopped. That permanently drops the terminal event for downstream consumers. Please propagate the emit failure, or persist a terminal-emitted/idempotency marker and retry agent_end before finalizing teardown.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness/src/turn-orchestrator/states/tearing-down.ts` around lines 17 - 19, The code calls emit(iii, rec.session_id, { type: 'agent_end', messages }) and immediately calls transitionTo(rec, 'stopped'), but emit() can swallow stream::set failures so the terminal event may never be delivered; update the teardown in tearing-down.ts to either propagate emit's error (i.e., await and bubble up any failure from emit) or implement a durable terminal-emitted/idempotency marker and retry logic: after persistence.loadMessages(...) call emit(...) and if it fails, persist a terminal-marker tied to rec.session_id (or retry emit a configurable number of times with backoff) and only call transitionTo(rec, 'stopped') once emit has succeeded or the durable marker confirms the terminal event was recorded; reference functions/variables: persistence.loadMessages, emit, transitionTo, rec.session_id, and the 'agent_end' message type.

coderabbitai · 2026-05-25T10:03:22Z

+  try {
+    await iii.trigger({
+      function_id: turnFnId(state),
+      payload: { session_id },
+      action: TriggerAction.Enqueue({ queue: TURN_STEP_QUEUE }),
+    });
+  } catch (err) {
+    logger.warn('wakeState failed', { session_id, state, err: String(err) });
+  }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t swallow enqueue failures in wakeState; propagate them after logging.

If enqueue fails and this function only warns, the session can stall indefinitely with no guaranteed follow-up wake. This is a reliability break for the durable FSM path.

Suggested fix

export async function wakeState(iii: ISdk, session_id: string, state: TurnState): Promise<void> { try { await iii.trigger({ function_id: turnFnId(state), payload: { session_id }, action: TriggerAction.Enqueue({ queue: TURN_STEP_QUEUE }), }); } catch (err) { logger.warn('wakeState failed', { session_id, state, err: String(err) }); + throw err; } }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

try {

await iii.trigger({

function_id: turnFnId(state),

payload: { session_id },

action: TriggerAction.Enqueue({ queue: TURN_STEP_QUEUE }),

});

} catch (err) {

logger.warn('wakeState failed', { session_id, state, err: String(err) });

}

try {

await iii.trigger({

function_id: turnFnId(state),

payload: { session_id },

action: TriggerAction.Enqueue({ queue: TURN_STEP_QUEUE }),

});

} catch (err) {

logger.warn('wakeState failed', { session_id, state, err: String(err) });

throw err;

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness/src/turn-orchestrator/wake.ts` around lines 29 - 37, The current wakeState call catches failures from iii.trigger (the enqueue to TURN_STEP_QUEUE) and only logs a warning, which can stall the session; change the catch to log the error using the existing logger (keeping session_id and state) and then rethrow the original error so callers can observe/fail fast; specifically update the try/catch around iii.trigger (the call that uses function_id: turnFnId(state) and action: TriggerAction.Enqueue({ queue: TURN_STEP_QUEUE })) to logger.warn(..., { session_id, state, err }) followed by rethrowing the caught err.

…run-start-simplify # Conflicts: # harness/tests/turn-orchestrator/approval-resume.test.ts

Simplify harness ingress and async run::start path.

b53f7f7

Console calls harness::trigger with a flat payload; harness always starts run::start. Remove run::start_and_wait, cwd state keys, and initial agent::events replay at kickoff so the FSM owns turn lifecycle.

vercel Bot deployed to Preview May 20, 2026 23:01 View deployment

ytallo added 4 commits May 20, 2026 20:08

Simplify turn_state trigger adapters with Zod boundary parsing.

160e87b

Replace manual typeof/Record parsing with shared TurnStateWriteEventSchema and parseTurnStateWrite, matching the run-start/trigger refactor style.

Colocate turn-orchestrator registration with handler modules.

59c79f5

Each state-trigger adapter now exports register() with inline function ids, matching run-start.ts. register.ts only composes modules; exported ID constants removed.

Remove isStepableRecordWrite wrapper and inline parseStepableWrite in…

1bffc94

… the register condition.

Split stepOnStepableWrite from the raw event adapter so parse narrowi…

11c95b1

…ng is type-enforced. Share Zod turn_state write parsing between adapters; add handler tests for direct invokes that bypass the engine condition gate.

vercel Bot deployed to Preview May 20, 2026 23:26 View deployment

vercel Bot deployed to Preview May 21, 2026 20:46 View deployment

ytallo added 2 commits May 22, 2026 19:27

Enqueue turn wakes on the turn-step FIFO queue and gate durable steps.

0cb0c82

Replace turn::step_requested publishes with queue enqueues in approval resume and abort handling, add turn::should_step filtering for unknown or terminal sessions, and tighten get_state with Zod payload parsing.

Split turn-orchestrator into per-state queue handlers.

c983544

Replace the monolithic turn::step subscriber with turn::{state} functions enqueued on turn-step, consolidate UI turn_state_changed emits into persistence saves, and merge assistant/function states for a simpler FSM.

vercel Bot deployed to Preview May 22, 2026 22:28 View deployment

vercel Bot deployed to Preview May 22, 2026 22:47 View deployment

style: apply biome formatting to register and steering test

4467fab

vercel Bot deployed to Preview May 23, 2026 21:20 View deployment

vercel Bot deployed to Preview May 24, 2026 10:03 View deployment

chore: remove pnpm-lock.yaml file

360ed05

- Deleted the pnpm-lock.yaml file from the project, which may indicate a shift in package management strategy or a cleanup of unused files.

vercel Bot deployed to Preview May 24, 2026 10:07 View deployment

vercel Bot deployed to Preview May 24, 2026 19:48 View deployment

delete: remove harness-flow.html file

624dac1

- Deleted the harness-flow.html file, which contained the end-to-end information flow documentation for the project. This may indicate a shift in documentation strategy or a move towards a different format for presenting this information.

vercel Bot deployed to Preview May 24, 2026 19:49 View deployment

ytallo marked this pull request as ready for review May 25, 2026 08:32

chore: ignore local harness-node build output

cdf44e1

vercel Bot deployed to Preview May 25, 2026 08:35 View deployment

style: apply biome 2.4.10 formatting to turn-orchestrator files

92d13c6

CI lints with biome 2.4.10 while the repo pins ^1.9.4; reformat the files whose 2.x layout differed to clear the 7 format errors.

vercel Bot deployed to Preview May 25, 2026 08:41 View deployment

docs(harness): document harness::trigger flat run::start ingress

b93ad52

harness::trigger no longer forwards a client-supplied function_id; it takes a flat payload and always invokes run::start. Update harness.md and the architecture telemetry/ingestion-bridge section to match.

vercel Bot deployed to Preview May 25, 2026 09:43 View deployment

coderabbitai Bot reviewed May 25, 2026

View reviewed changes

ytallo mentioned this pull request May 25, 2026

Turn-orchestrator: leaner durable FSM (reactive approval, finishSession teardown, readability) #185

Merged

5 tasks

andersonleal approved these changes May 25, 2026

View reviewed changes

Merge remote-tracking branch 'origin/main' into feat/harness-trigger-…

608742e

…run-start-simplify # Conflicts: # harness/tests/turn-orchestrator/approval-resume.test.ts

vercel Bot deployed to Preview May 25, 2026 18:14 View deployment

ytallo merged commit b330907 into main May 25, 2026
15 of 16 checks passed

coderabbitai Bot mentioned this pull request May 28, 2026

fix(turn-orchestrator): serialize concurrent approval wakes #198

Merged

3 tasks

Conversation

ytallo commented May 20, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Harness ingress & run::start

Turn-orchestrator restructure

Turn FSM

Previous FSM

Event translation

Config, docs & cleanup

Test plan

Summary by CodeRabbit

Uh oh!

vercel Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

github-actions Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

skill-check — worker

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ytallo commented May 20, 2026 •

edited by coderabbitai Bot

Loading

Harness ingress & `run::start`

vercel Bot commented May 20, 2026 •

edited

Loading

coderabbitai Bot commented May 20, 2026 •

edited

Loading

github-actions Bot commented May 20, 2026 •

edited

Loading