fix: summarize tool messages during role reversal in user simulator (Python) by Aryansharma28 · Pull Request #224 · langwatch/scenario

Aryansharma28 · 2026-02-17T15:42:48Z

What this fixes

The user simulator builds its prompt by calling reverse_roles() on the agent's conversation history. Two classes of messages cannot simply have their role flipped:

role: "tool" messages — tool result messages. Relabelling them as role: "user" produces an invalid request that both OpenAI and Anthropic APIs reject outright.
role: "assistant" messages with tool_calls — tool call messages. Same problem: tool_calls is not valid on a user message.

The old code worked around this by keeping tool-call messages as-is and dropping tool-result messages. That caused two separate bugs:

Consecutive assistant roles in the reversed history (the user simulator's context) when the agent made tool calls
Lost tool context — the user simulator had no idea what the agent had just done
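
To make the first bug concrete, here is a minimal sketch of the old behaviour as described above. The function name old_reverse_roles and the sample history are hypothetical, reconstructed from the description rather than copied from the previous implementation.

```python
from typing import Any

Message = dict[str, Any]


def old_reverse_roles(messages: list[Message]) -> list[Message]:
    """Hypothetical reconstruction of the pre-fix behaviour (not the real code)."""
    reversed_messages: list[Message] = []
    for message in messages:
        role = message.get("role")
        if role == "tool":
            # Tool results were dropped entirely.
            continue
        if role == "assistant" and message.get("tool_calls"):
            # Tool-call messages were kept as-is, still labelled "assistant".
            reversed_messages.append(message)
        elif role == "assistant":
            reversed_messages.append({**message, "role": "user"})
        elif role == "user":
            reversed_messages.append({**message, "role": "assistant"})
        else:
            reversed_messages.append(message)
    return reversed_messages


# Sample agent history in the OpenAI chat format (illustrative data only).
history: list[Message] = [
    {"role": "user", "content": "Find me some headphones"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_1",
                "type": "function",
                "function": {
                    "name": "search_products",
                    "arguments": '{"query": "headphones"}',
                },
            }
        ],
    },
    {"role": "tool", "tool_call_id": "call_1", "content": '[{"name": "Acme H1"}]'},
    {"role": "assistant", "content": "I found the Acme H1."},
]

roles = [m["role"] for m in old_reverse_roles(history)]
print(roles)  # ['assistant', 'assistant', 'user']
```

The flipped user turn and the untouched tool-call message end up as two assistant messages in a row, and the tool result is gone.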

What this changes

Both message types are now summarized as plain text and attributed to role: "user" (i.e. the agent's perspective after reversal):

{"role": "assistant", "tool_calls": [...]} → [Called tool search_products with: {"query": "headphones"}]
{"role": "tool", "content": "..."} → [Tool result from search_products: [...]]

This is exactly what the JS messageRoleReversal() already does. This PR brings the Python implementation into alignment.

A bare {"role": "assistant"} message with no content key at all is silently dropped — some models emit these and passing them through would produce an invalid {"role": "user"} message (Anthropic rejects it).
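
The behaviour described in this section could look roughly like the sketch below. The helper names come from the "Files changed" list, but their bodies, their signatures, and the tool_names lookup used to recover a tool's name from its tool_call_id are assumptions made for illustration, not the code in this PR.

```python
import json
from typing import Any

Message = dict[str, Any]


def _stringify_value(value: Any) -> str:
    # Assumption: strings pass through, everything else is JSON-encoded.
    if isinstance(value, str):
        return value
    try:
        return json.dumps(value)
    except (TypeError, ValueError):
        return str(value)


def _has_tool_content(message: Message) -> bool:
    return message.get("role") == "tool" or bool(message.get("tool_calls"))


def _summarize_tool_message(message: Message, tool_names: dict[str, str]) -> str:
    if message.get("role") == "tool":
        name = tool_names.get(message.get("tool_call_id", ""), "unknown")
        return f"[Tool result from {name}: {_stringify_value(message.get('content'))}]"
    # Assistant message with tool_calls: any text content is not included.
    parts = []
    for call in message.get("tool_calls") or []:
        function = call.get("function", {})
        name = function.get("name", "unknown")
        parts.append(
            f"[Called tool {name} with: {_stringify_value(function.get('arguments'))}]"
        )
    return "\n".join(parts)


def reverse_roles(messages: list[Message]) -> list[Message]:
    tool_names: dict[str, str] = {}  # tool_call_id -> tool name (assumed lookup)
    reversed_messages: list[Message] = []
    for message in messages:
        role = message.get("role")
        if _has_tool_content(message):
            for call in message.get("tool_calls") or []:
                tool_names[call.get("id", "")] = call.get("function", {}).get(
                    "name", "unknown"
                )
            summary = _summarize_tool_message(message, tool_names)
            reversed_messages.append({"role": "user", "content": summary})
        elif role == "assistant":
            if "content" not in message:
                continue  # bare {"role": "assistant"} is dropped
            reversed_messages.append({**message, "role": "user"})
        elif role == "user":
            reversed_messages.append({**message, "role": "assistant"})
        else:
            reversed_messages.append(message)
    return reversed_messages


history: list[Message] = [
    {"role": "user", "content": "Find me some headphones"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_1",
                "type": "function",
                "function": {
                    "name": "search_products",
                    "arguments": '{"query": "headphones"}',
                },
            }
        ],
    },
    {"role": "tool", "tool_call_id": "call_1", "content": '[{"name": "Acme H1"}]'},
    {"role": "assistant"},
    {"role": "assistant", "content": "I found the Acme H1."},
]

reversed_history = reverse_roles(history)
for m in reversed_history:
    print(m["role"], "|", m["content"])
```

In this sketch every output message is a plain user or assistant message with text content, so no tool role or tool_calls field reaches the user simulator's request.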

Files changed

python/scenario/_utils/utils.py — rewrites reverse_roles() with three new helper functions (_stringify_value, _has_tool_content, _summarize_tool_message)
python/tests/test_reverse_roles.py — 31 unit tests covering all message shapes and edge cases
python/examples/test_multiturn_tool_calls.py — E2E example: 10-turn shopping conversation where the agent makes tool calls on multiple turns, verifying the user simulator handles the full history correctly

Align Python reverse_roles() with JS messageRoleReversal(). Tool messages are now summarized as plain text instead of being kept as-is or dropped, preventing consecutive assistant roles and lost tool context in the user simulator's conversation history.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

drewdrewthis

@Aryansharma28 tests

Align Python reverse_roles() with JS messageRoleReversal(). Tool messages are now summarized as plain text instead of being kept as-is or dropped, preventing consecutive assistant roles and lost tool context in the user simulator's conversation history.

- Use isinstance() instead of type() == dict
- Add 30 unit tests covering all helper functions and edge cases

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

drewdrewthis

The title of the PR doesn't say that it is summarizing tool calls. This is actually a bit concerning, and I think it needs further justification.

- _stringify_value: remove special None branch; json.dumps(None) = "null" which is JSON-consistent and avoids a Python-specific "None" string
- _summarize_tool_message: document that text content is intentionally dropped when an assistant message has both content and tool_calls
- reverse_roles: restore the guard that drops bare {"role": "assistant"} messages with no content key; Anthropic rejects them if passed through as {"role": "user"} with no content (regression from the original fix)
- test_reverse_roles: update test_none expectation to "null", loosen test_non_serializable_fallback off CPython-specific repr, rename test_message_without_content_preserved to clarify content=None vs missing key, add test_bare_role_only_message_is_dropped for the guard
- test_multiturn_tool_calls: fix mutable default argument in shopping_agent

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Aryansharma28 · 2026-03-05T21:37:09Z

Hi @drewdrewthis — thanks for the review comments, addressing both below.

Re: tests — unit tests are now included in python/tests/test_reverse_roles.py (31 tests covering all message shapes), and there's also an E2E example in python/examples/test_multiturn_tool_calls.py that runs a 10-turn shopping conversation with real tool calls to verify the user simulator handles the full history end-to-end.

Re: the title / summarization concern — totally fair, the original title was vague. Updated the title and PR body to explain this properly, but here's the reasoning:

The user simulator builds its prompt by calling reverse_roles() on the agent's conversation history. The problem is that two message types cannot be re-labelled with a different role — they have to be converted to plain text:

role: "tool" (tool result messages): both OpenAI and Anthropic APIs reject a request in which a tool result has been relabelled as a user message. You can't just flip the role.
role: "assistant" with tool_calls: same issue, since tool_calls is not valid on a user message.

So the options were:

Drop them → user simulator loses all tool context and doesn't know what the agent did
Keep them as-is → consecutive assistant roles in the history, which confuses the user simulator
Summarize as plain text → the user simulator sees [Called tool X with: ...] / [Tool result from X: ...] as ordinary user messages — valid API request, and the simulator has full context

Option 3 is what the JS messageRoleReversal() already does. This PR brings Python into alignment with that.

Aryansharma28 requested a review from rogeriochaves February 17, 2026 15:42

drewdrewthis reviewed Feb 17, 2026

Aryansharma28 and others added 3 commits February 17, 2026 17:12

test: add 10-turn e2e test with tool calls for role reversal

9f09d88

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: resolve pyright type errors in reverse_roles and tests

3043702

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Aryansharma28 requested a review from drewdrewthis February 17, 2026 17:57

drewdrewthis reviewed Feb 18, 2026

Aryansharma28 changed the title from "fix: role reversal for tool messages in user simulator (Python)" to "fix: summarize tool messages during role reversal in user simulator (Python)" on Mar 5, 2026

Aryansharma28 wants to merge 5 commits into main from feat/role-reversal-py
