Skip to content

fix(reflect): unwrap JSON answer envelopes#2345

Open
xmh1011 wants to merge 2 commits into
vectorize-io:mainfrom
xmh1011:fix/issue-2298-reflect-json-envelope
Open

fix(reflect): unwrap JSON answer envelopes#2345
xmh1011 wants to merge 2 commits into
vectorize-io:mainfrom
xmh1011:fix/issue-2298-reflect-json-envelope

Conversation

@xmh1011

@xmh1011 xmh1011 commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Summary

  • unwrap reflect responses when a model returns the done-tool JSON envelope as visible answer text
  • keep normal JSON answers intact by requiring reflect envelope keys and tightening trailing id cleanup
  • strengthen the reflect tool prompt against JSON-envelope answers

Closes #2298

Tests

  • uv run --no-sync pytest -q tests/test_reflect_agent.py::TestCleanAnswerText tests/test_reflect_agent.py::TestCleanDoneAnswer tests/test_reflect_prompt_builder.py
  • uv run --no-sync ruff check hindsight_api/engine/reflect/agent.py hindsight_api/engine/reflect/prompts.py tests/test_reflect_agent.py tests/test_reflect_prompt_builder.py
  • uv run --no-sync ruff format --check hindsight_api/engine/reflect/agent.py hindsight_api/engine/reflect/prompts.py tests/test_reflect_agent.py tests/test_reflect_prompt_builder.py
  • pre-commit hooks completed during git commit

@nicoloboschi nicoloboschi left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to reproduce #2298 against the exact model from the issue (ollama-cloud/gpt-oss:120b-cloud, per @benfrank241's comment) before reviewing, and the investigation changes what the right fix looks like.

Reproduction: 35/35 clean runs, no envelope

I replicated the exact reflect LLM call — real build_system_prompt_for_tools + the real done tool schema (including the directive-compliance field from _build_done_tool_with_directives) — and ran a minimal tool loop against gpt-oss:120b-cloud:

Effort Runs Envelope leaks Clean native done()
low 6 0 6
medium 8 + 15 (heavy "prove per-directive compliance or be REJECTED" directives) 0 23
high 6 0 6

Every run produced a proper native done tool call with clean markdown in answer. I could not trigger the JSON envelope even with aggressive compliance-checklist directives. (Tested through a current ollama; harmony tool-call parsing surfaces the call natively.)

The root-cause framing in #2298 is wrong

directive_compliance is not a field the model invents to "prove" compliance. It's a real, required Hindsight fieldtools_schema.py::_build_done_tool_with_directives() adds it to the done tool whenever directives are present. So the envelope the reporter saw:

{"answer": "...", "directive_compliance": "...", "memory_ids": [...], "mental_model_ids": [], "observation_ids": []}

is literally the done tool-call arguments, serialized as text instead of delivered as a native tool call. This is a tool-call delivery/parsing issue specific to gpt-oss's harmony format (the repo already band-aids this — see the done<|channel|>commentary handling in _normalize_tool_name and the done({...})-as-text regex _DONE_CALL_PATTERN). It is not the prompt-anchoring problem the issue hypothesizes. The leak appears to be environment-specific (older ollama / the ollama.com endpoint where the harmony channel isn't parsed into native tool calls).

What that means for this PR

  • The unwrap post-processing is the genuinely useful part. It recovers the answer from a leaked done-args envelope at _clean_answer_text (the no-native-tool text path) and _clean_done_answer — exactly where a leaked call lands. Including directive_compliance in the allowlist is correct (it's a real schema field, not overfitting), and the subset guard sensibly avoids clobbering legitimate JSON answers. This is the same family of recovery shim the repo already ships for harmony leaks. I'd suggest renaming/recommenting it to reflect the real mechanism — "recover a done tool-call that was rendered as text" — rather than "model-invented envelope," so the next reader isn't misled.

  • The prompt line is the weak half, and slightly self-contradictory. It instructs the model to "never include fields like directive_compliance ... in the answer text," while the done tool schema in the same prompt requires the model to produce directive_compliance. Since the actual failure is at the parsing layer (a tool call that wasn't parsed as one), a generation-side instruction is aimed at the wrong layer — not harmful, but unlikely to be what fixes the reported symptom. I'd drop or soften it.

Suggestion

Land the unwrap shim (it's a reasonable, correctly-shaped band-aid for a provider leak that's outside this codebase to fix properly), but reframe the PR/issue away from "the model invents a JSON envelope to prove compliance." The durable fix is at the harmony tool-call parsing layer / an ollama version floor; worth a note in the issue since the symptom isn't reproducible on a current parser.

@xmh1011

xmh1011 commented Jun 23, 2026

Copy link
Copy Markdown
Contributor Author

@nicoloboschi Addressed in c8cace0.

  • Reframed the recovery helper/tests around leaked done tool-call arguments rendered as JSON text, including directive_compliance as a real done-schema field.
  • Removed the new prompt instruction that discouraged directive_compliance/done fields in answer text, since that conflicted with the required tool schema and the useful fix is the parser-side recovery shim.
  • Re-ran targeted reflect tests and lint.

Validation:

  • uv run --no-sync pytest -q tests/test_reflect_agent.py::TestCleanAnswerText tests/test_reflect_agent.py::TestCleanDoneAnswer tests/test_reflect_prompt_builder.py
  • uv run --no-sync ruff check hindsight_api/engine/reflect/agent.py hindsight_api/engine/reflect/prompts.py tests/test_reflect_agent.py tests/test_reflect_prompt_builder.py
  • uv run --no-sync ruff format --check hindsight_api/engine/reflect/agent.py hindsight_api/engine/reflect/prompts.py tests/test_reflect_agent.py tests/test_reflect_prompt_builder.py
  • ./scripts/hooks/lint.sh

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

reflect: model emits JSON envelope as answer text on 0.8.3

2 participants