
[codex] Preserve cache_control when merging system prompts#1946

Open
yovinchen wants to merge 4 commits into main from codex/issue-1881-cache-control

Conversation

@yovinchen
Collaborator

@yovinchen yovinchen commented Apr 8, 2026

Summary

  • preserve message-level cache_control when fragmented system prompts are merged into a single leading OpenAI chat system message
  • intentionally drop cache_control only when merged system fragments carry conflicting cache hints
  • add regression coverage for both the preserved and conflicting merge cases

Why

PR #1942 added support for strict OpenAI-compatible backends that reject fragmented system prompts, but the merge path rebuilt the combined system message with only role and content. That dropped cache_control from the original system blocks and could disable prompt-prefix caching on cache-aware upstreams.

This follow-up keeps the strict-backend compatibility fix while preserving cache hints whenever they can be inherited unambiguously.
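The merge rule described above can be sketched as a small standalone model. This is an illustrative sketch, not the project's code: the real transform in proxy::providers::transform operates on JSON message values, and the `SystemFragment` type and `merge_system` helper here are hypothetical names modeling `cache_control` as a plain `Option<String>`. It mirrors this PR's initial behavior, where only two differing present values count as a conflict.

```rust
// Simplified model of a system-message fragment; the real code works on
// JSON values, but the inheritance rule is the same.
#[derive(Clone, PartialEq, Debug)]
struct SystemFragment {
    content: String,
    cache_control: Option<String>, // e.g. Some("ephemeral".to_string())
}

// Collapse fragments into one leading system message, inheriting
// cache_control only when every fragment that sets it agrees.
fn merge_system(fragments: &[SystemFragment]) -> SystemFragment {
    let mut inherited: Option<String> = None;
    let mut conflict = false;
    let mut text = String::new();

    for frag in fragments {
        if !text.is_empty() {
            text.push('\n');
        }
        text.push_str(&frag.content);

        if let Some(cc) = &frag.cache_control {
            match &inherited {
                None => inherited = Some(cc.clone()),
                Some(existing) if existing == cc => {} // agreement: keep it
                Some(_) => conflict = true,            // conflict: drop below
            }
        }
    }

    SystemFragment {
        content: text,
        // Conflicting hints are intentionally dropped rather than guessed.
        cache_control: if conflict { None } else { inherited },
    }
}
```

A usage sketch: two fragments that agree on `"ephemeral"` merge into one message carrying that hint, while `"ephemeral"` plus `"persistent"` merges with no `cache_control` at all.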

Validation

  • cargo test proxy::providers::transform --manifest-path src-tauri/Cargo.toml

Closes #1942
Closes #1945

Strict OpenAI-compatible chat backends still need fragmented Claude
system prompts collapsed into one leading system message, but that
normalization should not silently drop stable cache hints. Preserve
message-level cache_control when the merged system fragments agree,
and fall back to omitting it when the fragments conflict.

Constraint: Must keep single-system normalization for Nvidia/Qwen-style chat backends
Rejected: Always copy the first cache_control | could misrepresent conflicting cache boundaries
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If system prompt merging changes again, preserve cache_control whenever the merged metadata is unambiguous
Tested: cargo test proxy::providers::transform --manifest-path src-tauri/Cargo.toml
Not-tested: End-to-end prompt caching behavior against cache-aware OpenAI-compatible upstreams
Related: #1881
@yovinchen
Collaborator Author

@codex review


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b11e36963f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +231 to +235
if let Some(cache_control) = message.get("cache_control") {
    match &inherited_cache_control {
        None => inherited_cache_control = Some(cache_control.clone()),
        Some(existing) if existing == cache_control => {}
        Some(_) => cache_control_conflict = true,


P2: Treat missing cache_control as merge conflict

When system fragments are merged, this branch only marks a conflict if two present cache_control values differ, but it ignores fragments where cache_control is absent. That means one cached fragment plus one uncached fragment will still produce a merged message with cache_control on the entire combined prompt. In mixed-prefix scenarios (static cached prefix + dynamic uncached suffix), this changes cache scope and can cache content that was intentionally left uncached; treating present/absent mixes as ambiguous and dropping cache_control would preserve original intent more safely.
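The stricter rule this review suggests, treating a present/absent mix as ambiguous, can be sketched as a small helper. This is an illustrative sketch under simplified assumptions, not the project's code: `inherit_cache_control` is a hypothetical name, and `cache_control` is modeled as a plain `Option<String>` rather than a JSON value.

```rust
/// Inherit cache_control only when every fragment explicitly carries the
/// same value; any absent or differing value makes the merge ambiguous,
/// so the merged message carries no cache hint at all.
fn inherit_cache_control(fragments: &[Option<String>]) -> Option<String> {
    // No fragments, or a first fragment without a hint: nothing to inherit.
    let first = fragments.first()?.clone()?;
    // Every remaining fragment must carry the identical hint.
    fragments
        .iter()
        .all(|cc| cc.as_deref() == Some(first.as_str()))
        .then_some(first)
}
```

Under this rule, a static cached prefix followed by a deliberately uncached suffix merges with no `cache_control`, preserving the original cache boundary instead of widening it.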


The follow-up cache hint fix still treated mixed present/absent
cache_control across fragmented system prompts as inheritable, which
expanded the cache scope after prompt collapse. Treat that mix as
ambiguous and only preserve cache_control when every merged fragment
explicitly agrees on the same value.

Constraint: Must preserve strict-backend system prompt normalization from #1942
Rejected: Inherit first present cache_control | widens cache scope when later fragments were intentionally uncached
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Any future merged-system cache hint logic should treat missing cache_control as semantically significant
Tested: cargo test proxy::providers::transform --manifest-path src-tauri/Cargo.toml
Not-tested: End-to-end upstream caching behavior against cache-aware relays
Related: #1881
Related: #1946
@yovinchen
Collaborator Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Hooray!


Reflow the two long cache-control regression assertions in transform.rs so the neighboring merge cases stay rustfmt-aligned and easier to scan.

This keeps the preserved code change separate from the untracked Markdown design notes the user did not want committed.

Constraint: Exclude Markdown design files from the commit while preserving the local code change
Rejected: Include docs in the same commit | user explicitly asked to leave Markdown files out
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Treat this as a readability-only test change; do not infer runtime behavior changes from it
Tested: cargo test --manifest-path src-tauri/Cargo.toml test_anthropic_to_openai_drops_ --lib
Tested: cargo check --manifest-path src-tauri/Cargo.toml --tests
Tested: pnpm format:check
Tested: pnpm typecheck
Not-tested: Full application integration and manual flows
@yovinchen yovinchen force-pushed the codex/issue-1881-cache-control branch from 223f976 to 5bccb4f on April 11, 2026 at 15:50


Development

Successfully merging this pull request may close these issues.

[BUG] Claude Code fails with "$.input_tokens" error and disconnects when using local LLM / custom CLAUDE_BASE_URL
