
DRAFT: Decouple LiteLLM extra-headers from Custom LLM via router_settings (alternative to #669)#674

Draft
jpshackelford wants to merge 1 commit into main from extra-headers-router-alt


@jpshackelford
Member

Summary

Alternative implementation of #664, layered on top of #669. Instead of attaching extra_headers to the custom-llm model_list entry (which is gated on llm_provider == "custom" and therefore only helps Custom-LLM users), this renders the static headers into router_settings.default_litellm_params.extra_headers on the bundled LiteLLM Proxy. LiteLLM Router merges those defaults into every completion call via kwargs.setdefault(), so they apply uniformly across all providers (Anthropic, OpenAI, Azure, Bedrock, Vertex, Gemini, Groq, OpenRouter, DeepSeek, Mistral, custom) without per-model template duplication.

Posted against alona/custom-llm-extra-headers so it can be reviewed as a delta on top of #669 — same UX patterns (enable checkbox + regex-validated JSON textarea + format label), same mustFromJson | mustToJson render hygiene, just a different attach point inside the proxy config.

Why this might be worth considering

The current PR ties the new headers feature to Custom LLM. Operators whose enterprise gateway sits in front of, say, Azure or Anthropic native are not helped — they would need to switch to Custom LLM (losing provider-specific niceties) or hand-edit the chart.

LiteLLM Proxy already supports a global default-headers mechanism that #664's design overlooked:

  • Router.__init__ accepts default_litellm_params (litellm/router.py:277). The proxy passes any key from router_settings.* that matches a Router.__init__ kwarg straight through (litellm/proxy/proxy_server.py:4505-4540, using Router.get_valid_args() which is derived from inspect.getfullargspec(Router.__init__)).
  • Defaults are merged into every call via _update_kwargs_with_default_litellm_params (litellm/router.py:3040), called from _update_kwargs_with_deployment for every chat completion and embedding routed through the Router. setdefault semantics mean per-request kwargs win; otherwise the global default is injected.
  • Every major provider's transformation accepts extra_headers in its optional-params allow-list: OpenAI chat (gpt_transformation.py:160), Anthropic (anthropic/chat/transformation.py:444, 1550-1551), Azure (azure/chat/gpt_transformation.py:102), Bedrock Converse (bedrock/chat/converse_transformation.py:507), Vertex/Gemini (vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py:329), Cohere, Perplexity, OCI, Volcengine, SAP, Bytez, BFL.
  • There's an existing LiteLLM test that explicitly verifies this propagation path: tests/router_unit_tests/test_router_embedding_headers.py:106-149 ("This is the main fix - ensuring that headers set in default_litellm_params are included in the embedding request").
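
The precedence rule described in the bullets above can be sketched in a few lines of Python. This is a simplified stand-in for the router's merge step, not LiteLLM's actual code; the function name `apply_default_params`, the model name, and the header values are hypothetical.

```python
from typing import Any


def apply_default_params(
    kwargs: dict[str, Any], default_litellm_params: dict[str, Any]
) -> dict[str, Any]:
    """Fill in router-wide defaults without clobbering per-request values.

    Simplified stand-in for the merge step described above (hypothetical
    name; not LiteLLM's actual implementation).
    """
    merged = dict(kwargs)  # copy so the caller's dict is left untouched
    for key, value in default_litellm_params.items():
        # setdefault only writes when the key is absent, so anything the
        # request already supplied wins over the global default.
        merged.setdefault(key, value)
    return merged


defaults = {"extra_headers": {"Ocp-Apim-Subscription-Key": "example-key"}}

# No per-request headers: the global default is injected.
plain_call = apply_default_params({"model": "gpt-4o"}, defaults)

# Per-request headers present: in this sketch they are kept as-is,
# because setdefault works on the top-level key only.
override_call = apply_default_params(
    {"model": "gpt-4o", "extra_headers": {"X-Tenant-Id": "t-1"}}, defaults
)
```

In this sketch a request that carries its own `extra_headers` replaces the default wholesale rather than being merged key by key.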

What changed vs #669

replicated/config.yaml — three fields renamed and ungated:

| #669 | This PR |
| --- | --- |
| custom_llm_extra_headers_enabled (gated on llm_provider == "custom") | extra_request_headers_enabled (no provider gate) |
| custom_llm_extra_headers | extra_request_headers |
| custom_llm_extra_headers_format | extra_request_headers_format |

  • Help text rewritten to reflect provider-agnostic scope and to call out that provider-managed Authorization / x-api-key headers cannot be overridden through this mechanism (LiteLLM provider transformations construct those in their own validate_environment step).
  • Cross-reference in litellm_forward_client_headers help_text updated to point at "Extra Upstream HTTP Headers" instead of "Custom LLM Extra HTTP Headers".

replicated/openhands.yaml — moved the render site:

  • Removed the extra_headers: line from the custom-llm model_list entry inside the llm_provider == "custom" block.
  • Added under the unconditional litellm-helm.proxy_config block (sibling of general_settings and model_list):
    router_settings:
      default_litellm_params:
        extra_headers: repl{{ if ConfigOptionEquals "extra_request_headers_enabled" "1" }}repl{{ ConfigOption "extra_request_headers" | trim | default "{}" | mustFromJson | mustToJson }}repl{{ else }}{}repl{{ end }}
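
For reviewers less familiar with the template pipeline, here is a rough Python approximation of what that one-line expression is meant to do. It is only a sketch: `render_extra_headers` is a hypothetical helper, and the real rendering is done by the Replicated template functions, whose key ordering and escaping may differ from `json.dumps`.

```python
import json


def render_extra_headers(enabled: str, raw: str) -> str:
    """Approximate the template's render logic for the extra_headers value.

    Hypothetical helper for illustration; not part of the chart.
    """
    if enabled != "1":
        # Checkbox off: render an explicit empty mapping (a no-op).
        return "{}"
    text = raw.strip() or "{}"  # trim | default "{}"
    parsed = json.loads(text)   # mustFromJson: fails loudly on invalid JSON
    # mustToJson: re-serialize so only well-formed, compact JSON reaches the YAML.
    return json.dumps(parsed, separators=(",", ":"), sort_keys=True)


disabled = render_extra_headers("0", '{"X-Tenant-Id": "t-1"}')
blank = render_extra_headers("1", "   ")
normalized = render_extra_headers("1", ' {"X-Tenant-Id": "t-1"} ')
```

The parse-then-reserialize round trip is the point: malformed input stops the render instead of landing in the proxy config as broken YAML.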

What was preserved from #669

  • Automations split out into its own config group.
  • litellm_forward_client_headers moved into Advanced Options with the rewritten, accurate forwarding help text.
  • Two-step enable-checkbox + textarea UX.
  • Validation regex on the textarea, plus the mustFromJson | mustToJson render-time validator with trim and explicit {} when disabled.

Behavior trade-offs

| Aspect | #669 | This PR |
| --- | --- | --- |
| Works with llm_provider=custom | yes | yes |
| Works with Anthropic / OpenAI / Azure / Bedrock / Vertex / Google / Groq / OpenRouter / DeepSeek / Mistral | no | yes |
| Coupled to custom_model field | yes (transitively) | no |
| Template surface | 1 line per provider × 1 provider | 1 line total |
| Per-deployment override semantics | clean (per-model_list entry) | coarse (see caveat below) |
| Bedrock SigV4 caveat | not exposed (Bedrock path not touched) | exposed |

Caveats inherited from LiteLLM

  1. Bedrock SigV4 signing: Bedrock signs requests with AWS SigV4 over a canonical header set. Custom headers added via extra_headers may or may not be included in the signature depending on the HTTP client path; if they're added after signing they'll be on the wire but unsigned. For an APIM-style gateway sitting in front of AWS this is usually fine (the gateway looks at its own header, then forwards the still-signed request to AWS), but it's worth a smoke test before relying on it for Bedrock.
  2. Authorization / x-api-key collision: Provider transformations construct these in validate_environment and will overwrite anything injected here. Custom non-standard names (Ocp-Apim-Subscription-Key, X-Tenant-Id, X-Gateway-Tenant, etc.) survive. Help text in this PR calls this out.
  3. Per-deployment override semantics: _update_kwargs_with_default_litellm_params uses kwargs.setdefault, so per-request kwargs win over the default. But at router.py:2880 the input is built as input_kwargs = {**litellm_params, ..., **kwargs} — kwargs is spread last, so if a deployment had its own litellm_params.extra_headers and the global default was also set, the kwargs version (from defaults) would override the per-deployment one. For the stopgap "one set of static headers applied uniformly" use case this is fine; if anyone wanted per-deployment header sets later, they'd need a different mechanism.
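
Caveat 3 is easier to see with plain dictionaries. The sketch below only mirrors the ordering described there (defaults filled into kwargs first, then kwargs spread last); it is not LiteLLM code, and `build_input_kwargs`, the model name, and the header values are hypothetical.

```python
from typing import Any


def build_input_kwargs(
    deployment_params: dict[str, Any], kwargs: dict[str, Any]
) -> dict[str, Any]:
    """Mirror the spread order described in caveat 3: kwargs goes last."""
    return {**deployment_params, **kwargs}


global_default = {"extra_headers": {"X-Gateway-Tenant": "global"}}
deployment_params = {
    "model": "azure/my-deployment",
    "extra_headers": {"X-Gateway-Tenant": "per-deployment"},
}

# Step 1: the request kwargs carry no extra_headers, so the global
# default is filled in via setdefault.
kwargs: dict[str, Any] = {}
for key, value in global_default.items():
    kwargs.setdefault(key, value)

# Step 2: kwargs is spread last, so its extra_headers shadows the
# deployment's own value.
input_kwargs = build_input_kwargs(deployment_params, kwargs)
```

Under these assumptions the deployment-level headers are silently replaced by the global ones, which is why per-deployment header sets would need a different mechanism.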

Validation done

  • YAML structural parse of both replicated/config.yaml and replicated/openhands.yaml after stripping repl{{...}} / {{repl...}} template directives — both parse cleanly.
  • python3 scripts/check_secret_checksum.py passes (the new fields are bool / textarea / label, not password, so they correctly stay out of the secretsChecksum chain, same as in #669, "[PLTF-2876] Support custom LLM extra headers in Replicated").
  • No stale references to custom_llm_extra_headers remain in replicated/, charts/, or scripts/.
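
The script used for the structural parse is not part of this PR. One way such a directive-stripping step could look is sketched below; the regex and the `strip_directives` helper are assumptions, and the sketch only handles directives that fit on a single line.

```python
import re

# Matches both directive spellings mentioned above:
# repl{{ ... }} and {{repl ... }} (single-line only).
DIRECTIVE = re.compile(r"repl\{\{.*?\}\}|\{\{repl.*?\}\}")


def strip_directives(text: str) -> str:
    """Remove template directives so the remainder can be parsed as YAML."""
    return DIRECTIVE.sub("", text)


line = (
    'extra_headers: repl{{ if ConfigOptionEquals "extra_request_headers_enabled" "1" }}'
    'repl{{ ConfigOption "extra_request_headers" | trim | default "{}" '
    "| mustFromJson | mustToJson }}repl{{ else }}{}repl{{ end }}"
)
stripped = strip_directives(line)
```

The stripped text can then be handed to any YAML parser (for example PyYAML's `yaml.safe_load`) to confirm the surrounding structure is still valid.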

Not validated: I have not deployed this to a Replicated environment. Before merge:

  • Deploy to a replicated-* test cluster and verify the rendered LiteLLM ConfigMap contains the headers under router_settings.default_litellm_params.extra_headers (not under any model_list[*].litellm_params).
  • Repeat the header-capture smoke test from #669 against the Custom LLM path to confirm parity.
  • Run the same smoke test against at least one non-Custom provider (Anthropic or OpenAI) to confirm the global path actually works on a real cluster — this is the whole point of the change.
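
For the first check in that list, a small helper along these lines could be run against the proxy config once it has been extracted from the ConfigMap and parsed into a dict. It is a sketch only: `find_header_attach_points` and the sample config are hypothetical, and it assumes every mapping key is present with a non-null value.

```python
from typing import Any


def find_header_attach_points(proxy_config: dict[str, Any]) -> dict[str, Any]:
    """Report where extra_headers appears in a parsed LiteLLM proxy config."""
    router_headers = (
        proxy_config.get("router_settings", {})
        .get("default_litellm_params", {})
        .get("extra_headers", {})
    )
    # Any model_list entry still carrying its own extra_headers would mean
    # the old per-model attach point is still being rendered.
    per_model = [
        entry.get("model_name", "<unnamed>")
        for entry in proxy_config.get("model_list", [])
        if "extra_headers" in entry.get("litellm_params", {})
    ]
    return {"router_headers": router_headers, "models_with_headers": per_model}


sample_config = {
    "model_list": [
        {"model_name": "custom-llm", "litellm_params": {"model": "openai/example"}},
    ],
    "router_settings": {
        "default_litellm_params": {"extra_headers": {"X-Tenant-Id": "t-1"}},
    },
}
report = find_header_attach_points(sample_config)
```

A passing result has the expected headers under `router_headers` and an empty `models_with_headers` list.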

Open questions for review

  1. Bikeshed name: extra_request_headers vs extra_upstream_headers vs litellm_extra_headers. Open to suggestions.
  2. Keep both? If there's a real use case for per-provider (not just global) header sets, it might make sense to keep the per-custom-llm field from #669 and add this global one. I went with replacement to keep the surface minimal; happy to layer instead.
  3. Should we also touch the secretsChecksum chain? #664 ("Self-hosted: support static extra HTTP headers on Custom LLM provider for gateway-style auth (stopgap)") originally proposed adding the header field to that sha256 chain; #669 didn't, and this PR follows #669. The LiteLLM ConfigMap's own checksum annotation handles the pod re-roll. If you want the openhands pod to also roll when headers change, that's a one-line add.

This PR was created by an AI agent (OpenHands) on behalf of @jpshackelford. Marked DRAFT because it has not been deployed to a real Replicated cluster yet — the LiteLLM source-code reasoning is solid but the rendered-template behavior on a live install needs human verification.


@jpshackelford force-pushed the extra-headers-router-alt branch from c5b5c82 to 0c9ee72 (May 27, 2026 20:35)
Base automatically changed from alona/custom-llm-extra-headers to main May 27, 2026 20:56
…ttings

Alternative to the per-model approach: instead of attaching extra_headers
to the custom-llm model_list entry (which is gated on llm_provider ==
"custom" and therefore only helps Custom-LLM users), render the static
headers into router_settings.default_litellm_params.extra_headers on the
bundled LiteLLM Proxy. LiteLLM Router merges those defaults into every
completion call via kwargs.setdefault(), so they apply uniformly across
all providers (Anthropic, OpenAI, Azure, Bedrock, Vertex, custom, ...)
without per-model template duplication.

Config changes:
- Rename custom_llm_extra_headers{,_enabled,_format} ->
  extra_request_headers{,_enabled,_format} and drop the
  llm_provider == "custom" gate.
- Update the cross-reference in litellm_forward_client_headers help_text.
- Reword help_text to reflect provider-agnostic scope and note that
  provider-managed headers (Authorization, x-api-key) cannot be
  overridden via this mechanism.

Chart changes:
- Remove extra_headers from the custom-llm model_list entry.
- Add router_settings.default_litellm_params.extra_headers under the
  unconditional litellm-helm.proxy_config block, gated on
  extra_request_headers_enabled at render time via
  mustFromJson | mustToJson for validation.

Behavior: empty/disabled value renders as {} (no-op, backwards
compatible). With a value set, every outbound LiteLLM Proxy call to
any configured provider carries the static headers.

Caveats inherited from LiteLLM:
- Bedrock SigV4 signing may not include arbitrary headers in the
  signed canonical headers list; verify before relying on this for
  Bedrock-fronted gateways.
- Provider transformations construct their own Authorization /
  x-api-key headers in validate_environment and will overwrite any
  static values for those names. Custom non-standard names
  (Ocp-Apim-Subscription-Key, X-Tenant-Id, etc.) survive.

Co-authored-by: openhands <openhands@all-hands.dev>
@jpshackelford force-pushed the extra-headers-router-alt branch from 0c9ee72 to 9c38542 (May 28, 2026 11:58)