fix: include task execution errors in task logs #60

Merged
kaiitunnz merged 1 commit into main from fix/task-error-logs
May 29, 2026

Collaborator

@kaiitunnz kaiitunnz commented May 29, 2026

Purpose

Surface task execution failures in the task log stream that users read via flowmesh logs. Today a failure only reaches the task's error status field; the log stream gets a bare Task <id> failed line with no reason. This is the "Task error logs" item from the 2026 Q2 roadmap RFC (#48) — debugging a failed task, especially one whose spec fails an executor-side validation check, currently requires inspecting task status separately from the logs.

The gap had two causes: the worker's failure handler logged a non-substantive message, and TaskLogEmitter.emit() built its payload from record.getMessage() only, silently discarding record.exc_info so no traceback ever reached the stream.
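The second cause can be reproduced with the standard library alone. The sketch below is not the project's code: `SketchEmitter` is a hypothetical stand-in for `TaskLogEmitter` that collects payloads in a list instead of sending them to a gRPC stream, and the `include_traceback` flag exists only to show the old and new behaviour side by side.

```python
import logging


class SketchEmitter(logging.Handler):
    """Hypothetical stand-in for TaskLogEmitter; stores payloads in a list."""

    # Shared formatter, used here only for its formatException() helper.
    _formatter = logging.Formatter()

    def __init__(self, include_traceback: bool) -> None:
        super().__init__()
        self.include_traceback = include_traceback
        self.payloads: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        # getMessage() only merges msg and args; it never includes exc_info.
        message = record.getMessage()
        if self.include_traceback and record.exc_info:
            # Cache the formatted traceback on the record, as the stdlib
            # Formatter.format() does, so other handlers can reuse it.
            if not record.exc_text:
                record.exc_text = self._formatter.formatException(record.exc_info)
            message = f"{message}\n{record.exc_text}"
        self.payloads.append(message)


def run_demo(include_traceback: bool) -> str:
    """Log one exception through the sketch handler and return the payload."""
    logger = logging.getLogger(f"sketch.emitter.{include_traceback}")
    logger.propagate = False
    logger.handlers.clear()
    handler = SketchEmitter(include_traceback)
    logger.addHandler(handler)
    try:
        raise ValueError("bad spec")
    except ValueError:
        logger.exception("Task %s failed", "t-1")
    return handler.payloads[0]


before = run_demo(include_traceback=False)  # payload built from getMessage() only
after = run_demo(include_traceback=True)    # payload with the traceback appended
```

With `include_traceback=False` the payload is just the bare failure line, which matches the symptom described above; with it enabled, the same `logger.exception(...)` call carries the traceback and the exception text.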

Changes

  • src/worker/utils/logging.py: TaskLogEmitter.emit() now appends the formatted traceback to the emitted message whenever record.exc_info is set, using a shared logging.Formatter and caching into record.exc_text (consistent with stdlib Formatter.format). Any logger.exception(...) under the attached handler now carries its traceback into both the gRPC stream and the JSONL sink.
  • src/worker/runner.py: the task failure handler distinguishes controlled ExecutionError (logs a clean Task <id> failed: <reason> line, no traceback) from unexpected exceptions (logs the full traceback). TaskCancelledError keeps its own separate branch since it terminates as CANCELLED, not FAILED.
  • tests/worker/test_task_log_emitter.py: new unit tests asserting emit() includes the traceback and exception text when exc_info is present, and emits just the formatted message otherwise.
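
For reviewers who want the shape of the runner.py branching without opening the diff, here is a minimal sketch under stated assumptions. The exception classes are redefined locally as placeholders, and `run_task`, the `COMPLETED` status string, the log wording for cancellation, and the `_Capture` handler are all hypothetical names used for illustration; the real handler in src/worker/runner.py may be structured differently.

```python
import logging
from typing import Callable


class ExecutionError(Exception):
    """Placeholder for the controlled failure type raised by an executor."""


class TaskCancelledError(Exception):
    """Placeholder for the cancellation signal; terminates as CANCELLED."""


def run_task(task_id: str, execute: Callable[[], None], logger: logging.Logger) -> str:
    """Run one task and return its terminal status (hypothetical helper)."""
    try:
        execute()
    except TaskCancelledError:
        # Separate branch: a cancellation is not a failure.
        logger.info("Task %s cancelled", task_id)
        return "CANCELLED"
    except ExecutionError as exc:
        # Controlled failure: clean one-line reason, no traceback.
        logger.error("Task %s failed: %s", task_id, exc)
        return "FAILED"
    except Exception:
        # Unexpected failure: logger.exception attaches exc_info, so the
        # emitter can append the full traceback.
        logger.exception("Task %s failed", task_id)
        return "FAILED"
    return "COMPLETED"  # assumed name for the success status


class _Capture(logging.Handler):
    """Test-only handler that keeps the raw records for inspection."""

    def __init__(self) -> None:
        super().__init__()
        self.records: list[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


def bad_spec() -> None:
    raise ExecutionError("invalid spec")


def bug() -> None:
    raise KeyError("missing")


def cancelled() -> None:
    raise TaskCancelledError()


logger = logging.getLogger("sketch.runner")
logger.setLevel(logging.INFO)
logger.propagate = False
capture = _Capture()
logger.addHandler(capture)

statuses = [
    run_task("t-1", bad_spec, logger),
    run_task("t-2", bug, logger),
    run_task("t-3", cancelled, logger),
]
```

Only the second record carries `exc_info`, so only the unexpected `KeyError` produces a traceback in the stream; the controlled `ExecutionError` yields a single readable line with its reason.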

Test Plan

uv run pytest tests/worker/test_task_log_emitter.py tests/worker/test_connector_logging.py
uv run pre-commit run --files src/worker/runner.py src/worker/utils/logging.py tests/worker/test_task_log_emitter.py

Also verified end-to-end against a local stack: brought up the server with CPU workers, submitted a workflow whose spec passes the server parser but fails an executor-side validation check, and confirmed the executor's error reason now appears in both the task log stream (flowmesh task logs show) and the workflow log stream.

Test Result

  • pytest tests/worker/test_task_log_emitter.py tests/worker/test_connector_logging.py — 3 passed.
  • pre-commit on the changed files — gitleaks, isort, black, ruff, codespell, mypy all passed.
  • End-to-end on a local stack — the executor's validation error (echo executor mapping item must contain either 'expr' ...) appeared in both the task and workflow log streams, where before this change the stream showed only a bare Task <id> failed. The task's terminal error status field still reports the dispatcher's generic max_attempts_exceeded, which is exactly why surfacing the real reason in the logs matters.

Executor failures only reached the task's error field, never the task
log stream users read via `flowmesh logs`: the failure handler logged a
bare "Task failed" line and TaskLogEmitter dropped record exc_info. Emit
the formatted traceback for unexpected exceptions and a clean message
for controlled ExecutionError, so the reason is visible in the stream.

Signed-off-by: Noppanat Wadlom <noppanat.wad@gmail.com>
@kaiitunnz kaiitunnz requested a review from timzsu as a code owner May 29, 2026 10:28
@timzsu timzsu mentioned this pull request May 29, 2026
12 tasks
Collaborator

@timzsu timzsu left a comment

LGTM.

@kaiitunnz kaiitunnz merged commit b5174c6 into main May 29, 2026
12 checks passed
@kaiitunnz kaiitunnz deleted the fix/task-error-logs branch May 29, 2026 11:39