DRAFT: Decouple LiteLLM extra-headers from Custom LLM via router_settings (alternative to #669) #674
Draft
jpshackelford wants to merge 1 commit into
c5b5c82 to 0c9ee72
…ttings
Alternative to the per-model approach: instead of attaching extra_headers
to the custom-llm model_list entry (which is gated on llm_provider ==
"custom" and therefore only helps Custom-LLM users), render the static
headers into router_settings.default_litellm_params.extra_headers on the
bundled LiteLLM Proxy. LiteLLM Router merges those defaults into every
completion call via kwargs.setdefault(), so they apply uniformly across
all providers (Anthropic, OpenAI, Azure, Bedrock, Vertex, custom, ...)
without per-model template duplication.
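The setdefault merge described above can be illustrated with a small stand-alone sketch. This is a simplified model, not LiteLLM's code: the function name apply_default_params, the model name, and the header values are all hypothetical and exist only for illustration.

```python
from typing import Any


def apply_default_params(kwargs: dict[str, Any], defaults: dict[str, Any]) -> dict[str, Any]:
    """Inject each default only when the caller did not supply that key."""
    merged = dict(kwargs)  # copy so the caller's dict is left untouched
    for key, value in defaults.items():
        merged.setdefault(key, value)
    return merged


# Hypothetical router-level defaults, standing in for
# router_settings.default_litellm_params.
defaults = {"extra_headers": {"X-Tenant-Id": "acme"}}

# No per-request headers: the static default is injected.
plain_call = apply_default_params({"model": "gpt-4o"}, defaults)

# Per-request headers present: setdefault leaves them alone.
override_call = apply_default_params(
    {"model": "gpt-4o", "extra_headers": {"X-Tenant-Id": "other"}}, defaults
)
```

Note that dict.setdefault works per top-level key: a request that supplies its own extra_headers replaces the whole default dict rather than being merged with it header by header.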
Config changes:
- Rename custom_llm_extra_headers{,_enabled,_format} ->
extra_request_headers{,_enabled,_format} and drop the
llm_provider == "custom" gate.
- Update the cross-reference in litellm_forward_client_headers help_text.
- Reword help_text to reflect provider-agnostic scope and note that
provider-managed headers (Authorization, x-api-key) cannot be
overridden via this mechanism.
Chart changes:
- Remove extra_headers from the custom-llm model_list entry.
- Add router_settings.default_litellm_params.extra_headers under the
unconditional litellm-helm.proxy_config block, gated on
extra_request_headers_enabled at render time via
mustFromJson | mustToJson for validation.
Behavior: empty/disabled value renders as {} (no-op, backwards
compatible). With a value set, every outbound LiteLLM Proxy call to
any configured provider carries the static headers.
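As a rough picture of the render-time behavior described above, the following sketch emulates the parse-then-serialize round trip in Python. It is an approximation under stated assumptions, not the chart's template: render_extra_headers and render_router_settings are hypothetical names, and the check that the value is a JSON object is an extra assumption that the commit message does not claim.

```python
import json


def render_extra_headers(enabled: bool, raw: str) -> str:
    """Return the JSON text to place under extra_headers.

    Mirrors the described behavior: "{}" when the feature is disabled or
    the field is empty, otherwise the value round-tripped through a JSON
    parser so that malformed input fails at render time.
    """
    text = raw.strip()
    if not enabled or not text:
        return "{}"
    parsed = json.loads(text)  # raises json.JSONDecodeError on bad input
    if not isinstance(parsed, dict):
        # Assumption for this sketch only: headers must be a JSON object.
        raise ValueError("extra headers must be a JSON object")
    return json.dumps(parsed)


def render_router_settings(enabled: bool, raw: str) -> dict:
    """Build the router_settings fragment as a plain dict."""
    headers = json.loads(render_extra_headers(enabled, raw))
    return {"default_litellm_params": {"extra_headers": headers}}
```

With the feature disabled the fragment carries an empty extra_headers mapping, which is the no-op case the commit message describes as backwards compatible.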
Caveats inherited from LiteLLM:
- Bedrock SigV4 signing may not include arbitrary headers in the
signed canonical headers list; verify before relying on this for
Bedrock-fronted gateways.
- Provider transformations construct their own Authorization /
x-api-key headers in validate_environment and will overwrite any
static values for those names. Custom non-standard names
(Ocp-Apim-Subscription-Key, X-Tenant-Id, etc.) survive.
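The second caveat suggests a simple pre-flight check an operator could run on a header set before pasting it into the config. The sketch below is not part of this PR or of LiteLLM; split_static_headers and the sample values are hypothetical, and the list of provider-managed names is taken only from the two names mentioned above.

```python
# Header names the commit message identifies as provider-managed.
PROVIDER_MANAGED = {"authorization", "x-api-key"}


def split_static_headers(static: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Separate headers expected to survive from ones a provider would overwrite."""
    surviving: dict[str, str] = {}
    overwritten: list[str] = []
    for name, value in static.items():
        # HTTP header names are case-insensitive, so compare in lower case.
        if name.lower() in PROVIDER_MANAGED:
            overwritten.append(name)
        else:
            surviving[name] = value
    return surviving, overwritten


surviving, overwritten = split_static_headers(
    {
        "Ocp-Apim-Subscription-Key": "example-key",
        "X-Tenant-Id": "acme",
        "Authorization": "Bearer ignored",
    }
)
```

Any name reported in overwritten would be replaced by the provider's own value, so a gateway credential should be carried in a custom header name instead.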
Co-authored-by: openhands <openhands@all-hands.dev>
0c9ee72 to 9c38542
Summary
Alternative implementation of #664, layered on top of #669. Instead of attaching extra_headers to the custom-llm model_list entry (which is gated on llm_provider == "custom" and therefore only helps Custom-LLM users), this renders the static headers into router_settings.default_litellm_params.extra_headers on the bundled LiteLLM Proxy. LiteLLM Router merges those defaults into every completion call via kwargs.setdefault(), so they apply uniformly across all providers (Anthropic, OpenAI, Azure, Bedrock, Vertex, Gemini, Groq, OpenRouter, DeepSeek, Mistral, custom) without per-model template duplication.

Posted against alona/custom-llm-extra-headers so it can be reviewed as a delta on top of #669: same UX patterns (enable checkbox + regex-validated JSON textarea + format label), same mustFromJson | mustToJson render hygiene, just a different attach point inside the proxy config.

Why this might be worth considering
The current PR (#669) ties the new headers feature to Custom LLM. Operators whose enterprise gateway sits in front of, say, Azure or native Anthropic are not helped: they would need to switch to Custom LLM (losing provider-specific niceties) or hand-edit the chart.
LiteLLM Proxy already supports a global default-headers mechanism that #664's design overlooked:
- Router.__init__ accepts default_litellm_params (litellm/router.py:277). The proxy passes any key from router_settings.* that matches a Router.__init__ kwarg straight through (litellm/proxy/proxy_server.py:4505-4540, using Router.get_valid_args(), which is derived from inspect.getfullargspec(Router.__init__)).
- The merge happens in _update_kwargs_with_default_litellm_params (litellm/router.py:3040), called from _update_kwargs_with_deployment for every chat completion and embedding routed through the Router. setdefault semantics mean per-request kwargs win; otherwise the global default is injected.
- Provider transformations list extra_headers in their optional-params allow-list: OpenAI chat (gpt_transformation.py:160), Anthropic (anthropic/chat/transformation.py:444, 1550-1551), Azure (azure/chat/gpt_transformation.py:102), Bedrock Converse (bedrock/chat/converse_transformation.py:507), Vertex/Gemini (vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py:329), Cohere, Perplexity, OCI, Volcengine, SAP, Bytez, BFL.
- Covered by tests/router_unit_tests/test_router_embedding_headers.py:106-149 ("This is the main fix - ensuring that headers set in default_litellm_params are included in the embedding request").

What changed vs #669
replicated/config.yaml: three fields renamed and ungated:
- custom_llm_extra_headers_enabled (gated on llm_provider == "custom") -> extra_request_headers_enabled (no provider gate)
- custom_llm_extra_headers -> extra_request_headers
- custom_llm_extra_headers_format -> extra_request_headers_format
- Help text reworded to note that provider-managed Authorization / x-api-key headers cannot be overridden through this mechanism (LiteLLM provider transformations construct those in their own validate_environment step).
- litellm_forward_client_headers help_text updated to point at "Extra Upstream HTTP Headers" instead of "Custom LLM Extra HTTP Headers".

replicated/openhands.yaml: moved the render site:
- Removed the extra_headers: line from the custom-llm model_list entry inside the llm_provider == "custom" block.
- Added router_settings.default_litellm_params.extra_headers under the unconditional litellm-helm.proxy_config block (sibling of general_settings and model_list).

What was preserved from #669
- litellm_forward_client_headers moved into Advanced Options with the rewritten, accurate forwarding help text.
- mustFromJson | mustToJson render-time validator with trim and explicit {} when disabled.

Behavior trade-offs
(Trade-off table not recoverable from the page; surviving cell fragments: llm_provider=custom, custom_model field, model_list entry.)

Caveats inherited from LiteLLM
- Bedrock SigV4 signing: extra_headers may or may not be included in the signature depending on the HTTP client path; if they're added after signing they'll be on the wire but unsigned. For an APIM-style gateway sitting in front of AWS this is usually fine (the gateway looks at its own header, then forwards the still-signed request to AWS), but it's worth a smoke test before relying on it for Bedrock.
- Authorization / x-api-key collision: Provider transformations construct these in validate_environment and will overwrite anything injected here. Custom non-standard names (Ocp-Apim-Subscription-Key, X-Tenant-Id, X-Gateway-Tenant, etc.) survive. Help text in this PR calls this out.
- _update_kwargs_with_default_litellm_params uses kwargs.setdefault, so per-request kwargs win over the default. But at router.py:2880 the input is built as input_kwargs = {**litellm_params, ..., **kwargs}; kwargs is spread last, so if a deployment had its own litellm_params.extra_headers and the global default was also set, the kwargs version (from defaults) would override the per-deployment one. For the stopgap "one set of static headers applied uniformly" use case this is fine; if anyone wanted per-deployment header sets later, they'd need a different mechanism.

Validation done
- Parsed replicated/config.yaml and replicated/openhands.yaml as YAML after stripping repl{{...}} / {{repl...}} template directives; both parse cleanly.
- python3 scripts/check_secret_checksum.py passes (the new fields are bool / textarea / label, not password, so they correctly stay out of the secretsChecksum chain, same as #669, "[PLTF-2876] Support custom LLM extra headers in Replicated").
- No references to custom_llm_extra_headers remain in replicated/, charts/, or scripts/.

Not validated: I have not deployed this to a Replicated environment. Before merge:
- Deploy to a replicated-* test cluster and verify the rendered LiteLLM ConfigMap contains the headers under router_settings.default_litellm_params.extra_headers (not under any model_list[*].litellm_params).

Open questions for review
- Naming: extra_request_headers vs extra_upstream_headers vs litellm_extra_headers. Open to suggestions.
- Replace or layer: the alternative is to keep the custom-llm field from #669 and add this global one. I went with replacement to keep the surface minimal; happy to layer instead.
- Should the headers join the secretsChecksum chain? #664 ("Self-hosted: support static extra HTTP headers on Custom LLM provider for gateway-style auth (stopgap)") originally proposed adding the header field to that sha256 chain; #669 didn't, and this PR follows #669. The LiteLLM ConfigMap's own checksum annotation handles the pod re-roll. If you want the openhands pod to also roll when headers change, that's a one-line add.

This PR was created by an AI agent (OpenHands) on behalf of @jpshackelford. Marked DRAFT because it has not been deployed to a real Replicated cluster yet: the LiteLLM source-code reasoning is solid but the rendered-template behavior on a live install needs human verification.