Skip to content

ci: verify managed run replay in workflow#17

Merged
willamhou merged 1 commit into
mainfrom
codex/managed-run-replay-ci
Apr 23, 2026
Merged

ci: verify managed run replay in workflow#17
willamhou merged 1 commit into
mainfrom
codex/managed-run-replay-ci

Conversation

@willamhou

@willamhou willamhou commented Apr 23, 2026

Copy link
Copy Markdown
Owner

Summary

  • extend the managed run workflow to verify a run, replay it, and then verify the replayed run
  • factor the example shell helpers so replay and verification share the same run-selection logic
  • update beta docs and examples to reflect replay as part of the managed runtime surface

Why

This keeps the CI/example wiring in a separate review from the runtime changes, while still exercising the replay loop that the managed beta now supports.

Validation

  • bash -n examples/verify-managed-run.sh examples/replay-managed-run.sh examples/lib/managed-run-helpers.sh
  • bash examples/verify-managed-run.sh --help
  • bash examples/replay-managed-run.sh --help

Notes

  • The runtime correctness fixes from managed: add run replay and durable terminal cleanup #16 are now on main, so this PR only carries the workflow/examples/docs layer.
  • I did not run a live end-to-end gateway + secrets replay flow from this shell; the workflow wiring and local helper behavior were verified instead.

@willamhou willamhou marked this pull request as ready for review April 23, 2026 07:14

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a24c5ffb5f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +137 to +140
terminal_status="$(
wait_for_managed_run_terminal "${RUN_ID}" "${WAIT_TIMEOUT_SECS}" "${POLL_MS}"
)"
echo "Run ${RUN_ID} reached terminal status: ${terminal_status}" >&2

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Fail --wait when managed run does not complete

In --wait mode the script captures terminal_status and logs it, but never enforces that the status is completed before continuing to hermes runs verify --strict. Because strict verification only checks Signet receipts/audit integrity (not run outcome), a replay that ends in failed, cancelled, or timed_out after emitting receipts can still make this step succeed, so the new workflow can report replay verification as green even when the replay execution itself failed.

Useful? React with 👍 / 👎.

@willamhou willamhou changed the base branch from codex/managed-run-replay-cleanup to main April 23, 2026 07:32
@willamhou willamhou force-pushed the codex/managed-run-replay-ci branch from 777fcd6 to 7ad71d0 Compare April 23, 2026 07:34
@willamhou willamhou merged commit 08092f9 into main Apr 23, 2026
2 checks passed
@willamhou willamhou deleted the codex/managed-run-replay-ci branch April 23, 2026 08:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant