Centaur marks Codex compaction/runtime failure as successful completion #357

@GoonMachine

Description

Summary

A Codex turn hit a remote context compaction/runtime error, but Centaur finalized the execution as completed and used an interim progress message as the final result.

Observed

Execution: exe_f195c63bdaa246eb
Thread key: slack:T03USE6Q064:C083QLFNXPY:1780358702.780989

Relevant event sequence:

item.started contextCompaction
error: Error running remote compact task: unexpected status 502 Bad Gateway: bad gateway, url: https://chatgpt.com/backend-api/codex/responses/compact
turn.done result: "I found four open overlay PRs that touch `tools/github_metadata`: #9, #15, #16, and #5 indirectly through silent trace prompts/tests. I’m pulling their changed files now to compare overlap method-by-method."
execution status: completed
terminal_reason: completed

Centaur also emitted error_observed, but the execution summary still ended as completed.

Expected

If a Codex turn encounters a terminal runtime/compaction error, Centaur should not silently finalize the run as successful with stale or interim assistant text.

The execution should either:

  • be marked failed/degraded with the Codex error preserved, or
  • only be marked completed if a later terminal success clearly supersedes the error.
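The two options above could be sketched roughly as follows. This is a minimal illustration, not Centaur's actual finalization code: the `Event` and `Outcome` types, the `finalize` function, and the `agent_message` event kind are all hypothetical, and the sketch assumes that "a later terminal success" means fresh assistant output arriving after the error. Only `error_observed` and `turn.done` are names taken from the event sequence in this issue.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    # kind is one of "agent_message" (hypothetical), "error_observed", "turn.done"
    kind: str
    text: str = ""


@dataclass
class Outcome:
    status: str              # "completed" or "failed"
    result: Optional[str]    # final assistant text, only set on success
    error: Optional[str]     # preserved error text, only set on failure


def finalize(events: List[Event]) -> Outcome:
    """Decide the terminal state from an ordered event list (assumed to end in turn.done)."""
    last_message: Optional[str] = None
    pending_error: Optional[str] = None
    for ev in events:
        if ev.kind == "error_observed":
            # Remember the error; any earlier assistant text is now stale.
            pending_error = ev.text or "Unknown error"
        elif ev.kind == "agent_message":
            # Assumption: fresh output after an error supersedes that error.
            last_message = ev.text
            pending_error = None
        elif ev.kind == "turn.done":
            break
    if pending_error is not None:
        # Do not reuse the interim text as the final result.
        return Outcome("failed", result=None, error=pending_error)
    return Outcome("completed", result=last_message, error=None)
```

With the sequence observed here (interim message, then the compaction error, then `turn.done`), this sketch would return `failed` with the 502 text preserved instead of `completed` with the interim message.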

Why It Matters

Slack users see a partial progress update as if the task completed. Downstream trace workflows also see a successful terminal state even though the Codex runtime failed.

Possible Root Cause

The Codex error appears to be normalized into an observation, but the later synthetic turn.done does not carry is_error or error.

There may also be error text loss during normalization: the raw event carried the full 502 message, while the projected error_observed had error_chars = 13, which matches the length of the string "Unknown error" (13 characters).
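One way the text could be lost is a normalizer that only looks at one field shape and otherwise falls back to a placeholder. The sketch below shows a more tolerant extraction. It is an assumption-laden illustration: the `extract_error_text` helper, the `error` and `message` keys, and the nested-dict shape are hypothetical and not taken from `normalize.py`.

```python
from collections.abc import Mapping
from typing import Any

FALLBACK = "Unknown error"  # 13 characters, consistent with error_chars = 13


def extract_error_text(raw: Mapping[str, Any]) -> str:
    """Return the most specific error text found in a raw event (hypothetical shapes)."""
    for key in ("error", "message"):
        value = raw.get(key)
        # Shape 1: the error text is a plain string.
        if isinstance(value, str) and value.strip():
            return value.strip()
        # Shape 2: the error is an object with its own "message" field.
        if isinstance(value, Mapping):
            nested = value.get("message")
            if isinstance(nested, str) and nested.strip():
                return nested.strip()
    # Only fall back when no usable text exists anywhere in the event.
    return FALLBACK
```

Logging the raw event whenever the fallback is returned would make this kind of loss easier to spot in traces.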

Relevant code paths to inspect:

  • services/api/api/sandbox/normalize.py
  • services/api/api/agent.py
  • services/api/api/runtime_control.py
