fix: litellm-compat for dynamiq integration tests (v0.4.8)#13
Conversation
Three drop-in gaps prevented dynamiq's test fixtures from passing against arcllm even though direct API calls worked. Exception positional args: litellm's exception classes take (message, llm_provider, model, ...) positionally. arcllm previously made these keyword-only. Tests construct errors as `RateLimitError(msg, "bedrock", "amazon.titan")` which raised "takes 2 positional arguments but 4 were given". - ArcLLMError: provider/model/status_code now positional after message; llm_provider stays keyword-only as the litellm-name alias - RateLimitError: accepts (message, provider, model) positionally - ProviderAPIError: detects litellm shape (status_code, message, ...) by type — first int positional becomes status_code - BadRequestError (renamed from InvalidRequestError to match the canonical litellm/OpenAI name; InvalidRequestError stays as alias): accepts (message, model, provider) per litellm AND (message, provider, model) per arcllm. Disambiguates by checking SUPPORTED_PROVIDERS — common provider names always resolve correctly. Streaming chunk serialisation: Choice.model_dump() omitted .delta. dynamiq's streaming callback reads chunk["choices"][0]["delta"]["content"] from the serialized dict, so it saw KeyError on every streamed event. token_counter overhead: Counts now follow OpenAI's per-message formula (3 + per-key + 1 for name + 3 priming) so totals match litellm's. Previous sum-of-fields undercount made dynamiq's history-summarisation logic preserve more context than the model could actually accept. ModelResponse defaults: - choices defaults to [Choice()] so fixtures that do ModelResponse()["choices"][0]["message"]["content"] = ... work - stream: bool = False added so ModelResponse(stream=True) is accepted - Choice.delta added so streaming fixtures can set delta on the same Choice class litellm uses for both modes Result: dynamiq main suite goes from 281 integration failures → 0 (1066 integration + 1149 unit, all passing). arcllm's own test suite unchanged (8 pre-existing Ollama integration failures only).
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 6692eeb. Configure here.
| kwargs.setdefault("provider", arg2) | ||
| kwargs.setdefault("model", arg3) | ||
| elif arg2 is not None: | ||
| kwargs.setdefault("provider", arg2) |
There was a problem hiding this comment.
Single-arg BadRequestError skips documented disambiguation heuristic
Low Severity
When only arg2 is provided (without arg3), the code unconditionally treats it as provider via kwargs.setdefault("provider", arg2). However, the class docstring states the heuristic checks SUPPORTED_PROVIDERS to disambiguate — "if the second positional looks like a provider name, treat it as provider; otherwise treat it as model." This means BadRequestError("msg", "gpt-4") (litellm pattern where the second positional is a model) incorrectly assigns provider="gpt-4" instead of model="gpt-4".
Reviewed by Cursor Bugbot for commit 6692eeb. Configure here.


Summary
Three drop-in gaps prevented dynamiq's test fixtures from passing against arcllm even though direct API calls worked. After this PR dynamiq's full suite passes: 1149 unit + 1066 integration, 0 failures (was 281 integration failures).
What was wrong
1. Exception positional args
litellm's exception classes take
(message, llm_provider, model, ...)positionally. arcllm made these keyword-only aftermessage. Tests construct errors asRateLimitError("msg", "bedrock", "amazon.titan")— that raisedtakes 2 positional arguments but 4 were given.ArcLLMError:provider/model/status_codenow positional aftermessage;llm_providerstays keyword-only as the litellm-name alias.RateLimitError: accepts(message, provider, model)positionally.ProviderAPIError: detects litellm shape(status_code, message, ...)by type — first int positional becomesstatus_code.BadRequestError(renamed fromInvalidRequestErrorto match the canonical litellm/OpenAI name;InvalidRequestErrorstays as alias): accepts(message, model, provider)per litellm and(message, provider, model)per arcllm. Disambiguates by checking `SUPPORTED_PROVIDERS`.2. Streaming chunk serialisation
`Choice.model_dump()` omitted `.delta`. dynamiq's streaming callback reads `chunk["choices"][0]["delta"]["content"]` from the serialized dict, so it `KeyError`'d on every streamed event.
3. `token_counter` overhead
Counts now follow OpenAI's per-message formula (3 per message + 1 for `name` + 3 priming) so totals match litellm's. The previous sum-of-fields undercount made dynamiq's history-summarisation logic preserve more context than the model could actually accept.
4. `ModelResponse` shape (rolled in from earlier work)
Test plan
Note
Medium Risk
Medium risk because it changes public exception constructors and token-counting semantics, which can affect downstream error handling and context-window logic; intended to improve litellm/dynamiq compatibility but may alter behavior for existing callers relying on the old signatures/counts.
Overview
Improves LiteLLM/dynamiq drop-in compatibility by relaxing exception constructor signatures (more positional-arg tolerant), renaming
InvalidRequestErrorsemantics toBadRequestError(keeping aliases), and adding litellm-style positional parsing toProviderAPIError/RateLimitError.Fixes streaming fixture compatibility by serializing
Choice.deltainChoice.model_dump()and extending response shapes (Choice.delta,ModelResponse(stream=...)). Updatestoken_counter()to apply OpenAI chat per-message overhead (plus priming/name tokens) so message-based counts match litellm/OpenAI. Also bumps version to0.4.8in__init__andpyproject.toml.Reviewed by Cursor Bugbot for commit 6692eeb. Bugbot is set up for automated code reviews on this repo. Configure here.