Skip to content

fix: litellm-compat for dynamiq integration tests (v0.4.8)#13

Merged
vitalii-dynamiq merged 1 commit into
mainfrom
fix/litellm-error-positional-args
May 10, 2026
Merged

fix: litellm-compat for dynamiq integration tests (v0.4.8)#13
vitalii-dynamiq merged 1 commit into
mainfrom
fix/litellm-error-positional-args

Conversation

@vitalii-dynamiq

@vitalii-dynamiq vitalii-dynamiq commented May 7, 2026

Copy link
Copy Markdown
Contributor

Summary

Three drop-in gaps prevented dynamiq's test fixtures from passing against arcllm even though direct API calls worked. After this PR dynamiq's full suite passes: 1149 unit + 1066 integration, 0 failures (was 281 integration failures).

What was wrong

1. Exception positional args

litellm's exception classes take (message, llm_provider, model, ...) positionally. arcllm made these keyword-only after message. Tests construct errors as RateLimitError("msg", "bedrock", "amazon.titan") — that raised takes 2 positional arguments but 4 were given.

  • ArcLLMError: provider/model/status_code now positional after message; llm_provider stays keyword-only as the litellm-name alias.
  • RateLimitError: accepts (message, provider, model) positionally.
  • ProviderAPIError: detects litellm shape (status_code, message, ...) by type — first int positional becomes status_code.
  • BadRequestError (renamed from InvalidRequestError to match the canonical litellm/OpenAI name; InvalidRequestError stays as alias): accepts (message, model, provider) per litellm and (message, provider, model) per arcllm. Disambiguates by checking `SUPPORTED_PROVIDERS`.

2. Streaming chunk serialisation

`Choice.model_dump()` omitted `.delta`. dynamiq's streaming callback reads `chunk["choices"][0]["delta"]["content"]` from the serialized dict, so it `KeyError`'d on every streamed event.

3. `token_counter` overhead

Counts now follow OpenAI's per-message formula (3 per message + 1 for `name` + 3 priming) so totals match litellm's. The previous sum-of-fields undercount made dynamiq's history-summarisation logic preserve more context than the model could actually accept.

4. `ModelResponse` shape (rolled in from earlier work)

  • `choices` defaults to `[Choice()]` so fixtures doing `ModelResponse()["choices"][0]["message"]["content"] = ...` work.
  • `stream: bool = False` added so `ModelResponse(stream=True)` is accepted (litellm idiom).
  • `Choice.delta` added so streaming fixtures can set delta on the same `Choice` class litellm uses for both chat and stream modes.

Test plan

  • arcllm unit + parity tests: 783 passed (8 pre-existing Ollama integration failures unrelated)
  • `ruff check` + `ruff format --check` + `mypy --strict` + `pyright` on changed files: clean
  • dynamiq unit tests: 1149 passed, 0 failed
  • dynamiq integration tests: 1066 passed, 0 failed (was 281 failed before this PR)
  • Manual exception construction smoke (positional + keyword + litellm-shape) verified for AuthenticationError, RateLimitError, ProviderAPIError, BadRequestError

Note

Medium Risk
Medium risk because it changes public exception constructors and token-counting semantics, which can affect downstream error handling and context-window logic; intended to improve litellm/dynamiq compatibility but may alter behavior for existing callers relying on the old signatures/counts.

Overview
Improves LiteLLM/dynamiq drop-in compatibility by relaxing exception constructor signatures (more positional-arg tolerant), renaming InvalidRequestError semantics to BadRequestError (keeping aliases), and adding litellm-style positional parsing to ProviderAPIError/RateLimitError.

Fixes streaming fixture compatibility by serializing Choice.delta in Choice.model_dump() and extending response shapes (Choice.delta, ModelResponse(stream=...)). Updates token_counter() to apply OpenAI chat per-message overhead (plus priming/name tokens) so message-based counts match litellm/OpenAI. Also bumps version to 0.4.8 in __init__ and pyproject.toml.

Reviewed by Cursor Bugbot for commit 6692eeb. Bugbot is set up for automated code reviews on this repo. Configure here.

Three drop-in gaps prevented dynamiq's test fixtures from passing
against arcllm even though direct API calls worked.

Exception positional args:
litellm's exception classes take (message, llm_provider, model, ...)
positionally. arcllm previously made these keyword-only. Tests
construct errors as `RateLimitError(msg, "bedrock", "amazon.titan")`
which raised "takes 2 positional arguments but 4 were given".

- ArcLLMError: provider/model/status_code now positional after message;
  llm_provider stays keyword-only as the litellm-name alias
- RateLimitError: accepts (message, provider, model) positionally
- ProviderAPIError: detects litellm shape (status_code, message, ...)
  by type — first int positional becomes status_code
- BadRequestError (renamed from InvalidRequestError to match the
  canonical litellm/OpenAI name; InvalidRequestError stays as alias):
  accepts (message, model, provider) per litellm AND
  (message, provider, model) per arcllm. Disambiguates by checking
  SUPPORTED_PROVIDERS — common provider names always resolve correctly.

Streaming chunk serialisation:
Choice.model_dump() omitted .delta. dynamiq's streaming callback reads
chunk["choices"][0]["delta"]["content"] from the serialized dict, so it
saw KeyError on every streamed event.

token_counter overhead:
Counts now follow OpenAI's per-message formula (3 + per-key
+ 1 for name + 3 priming) so totals match litellm's. Previous
sum-of-fields undercount made dynamiq's history-summarisation logic
preserve more context than the model could actually accept.

ModelResponse defaults:
- choices defaults to [Choice()] so fixtures that do
  ModelResponse()["choices"][0]["message"]["content"] = ... work
- stream: bool = False added so ModelResponse(stream=True) is accepted
- Choice.delta added so streaming fixtures can set delta on the same
  Choice class litellm uses for both modes

Result: dynamiq main suite goes from 281 integration failures → 0
(1066 integration + 1149 unit, all passing). arcllm's own test suite
unchanged (8 pre-existing Ollama integration failures only).

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 6692eeb. Configure here.

Comment thread arcllm/exceptions.py
kwargs.setdefault("provider", arg2)
kwargs.setdefault("model", arg3)
elif arg2 is not None:
kwargs.setdefault("provider", arg2)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Single-arg BadRequestError skips documented disambiguation heuristic

Low Severity

When only arg2 is provided (without arg3), the code unconditionally treats it as provider via kwargs.setdefault("provider", arg2). However, the class docstring states the heuristic checks SUPPORTED_PROVIDERS to disambiguate — "if the second positional looks like a provider name, treat it as provider; otherwise treat it as model." This means BadRequestError("msg", "gpt-4") (litellm pattern where the second positional is a model) incorrectly assigns provider="gpt-4" instead of model="gpt-4".

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 6692eeb. Configure here.

@vitalii-dynamiq vitalii-dynamiq merged commit 9502db2 into main May 10, 2026
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant