feat(litellm): Add async callbacks #5969
2 issues
code-review: Found 2 issues (1 high, 1 medium)
High
test_async_exception_handling mocks embeddings client instead of completions client - `tests/integrations/litellm/test_litellm.py:870-872`
The test mocks `client.embeddings._client._client.send` but calls `litellm.acompletion()`, which uses the completions endpoint, not embeddings. The sync version, `test_exception_handling`, correctly mocks `client.completions._client._client`. As a result, the mock is not applied, and the test may hit real endpoints or fail unexpectedly.
Medium
Duplicate span creation when both sync and async callbacks are registered - `sentry_sdk/integrations/litellm.py:163-164`
Both `_input_callback` and `_async_input_callback` are added to `litellm.input_callback` (lines 334-337 in `setup_once`), but `_async_input_callback` simply calls `_input_callback` directly. If litellm invokes both callbacks for the same request, `_input_callback` executes twice and creates duplicate spans for the same operation. This could produce incorrect tracing data and leak spans, since `span.__enter__()` is called multiple times but may not have matching `__exit__()` calls.
Duration: 5m 57s · Tokens: 1.6M in / 19.3k out · Cost: $3.23 (+extraction: $0.00, +merge: $0.00, +fix_gate: $0.00)
Annotations
Check failure on line 872 in tests/integrations/litellm/test_litellm.py
sentry-warden / warden: code-review
test_async_exception_handling mocks embeddings client instead of completions client
The test mocks `client.embeddings._client._client.send` but calls `litellm.acompletion()`, which uses the completions endpoint, not embeddings. The sync version, `test_exception_handling`, correctly mocks `client.completions._client._client`. As a result, the mock is not applied, and the test may hit real endpoints or fail unexpectedly.
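The failure mode described in this annotation can be illustrated with a minimal, standard-library-only sketch. `FakeTransport`, `FakeClient`, `acompletion`, and `run_with_patch` are hypothetical stand-ins, not the real litellm or OpenAI client objects, and the sketch assumes each resource sends through its own transport object. If a real client shares one underlying transport across resources, a patch on either path would apply to both, so this shows the general pitfall rather than the exact behavior of the test in question.

```python
import asyncio
from unittest.mock import AsyncMock, patch


class FakeTransport:
    """Hypothetical stand-in for the HTTP layer a resource sends through."""

    async def send(self, request):
        return f"real response for {request}"


class FakeClient:
    """Hypothetical client. Assumption: each resource has its own transport."""

    def __init__(self):
        self.completions = FakeTransport()
        self.embeddings = FakeTransport()


async def acompletion(client, request):
    # Only the completions transport is exercised on this code path.
    return await client.completions.send(request)


async def run_with_patch(client, resource_name):
    """Patch `send` on one resource, then call the completion path.

    Returns (raised, await_count): whether the injected error surfaced,
    and how many times the mock was actually awaited.
    """
    mock_send = AsyncMock(side_effect=RuntimeError("boom"))
    target = getattr(client, resource_name)
    with patch.object(target, "send", new=mock_send):
        try:
            await acompletion(client, "req")
            raised = False
        except RuntimeError:
            raised = True
    return raised, mock_send.await_count
```

Checking `await_count` (or calling `mock_send.assert_awaited()`) inside a test is one way to make a misplaced patch fail loudly instead of silently letting the real call go through.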
Check warning on line 164 in sentry_sdk/integrations/litellm.py
sentry-warden / warden: code-review
Duplicate span creation when both sync and async callbacks are registered
Both `_input_callback` and `_async_input_callback` are added to `litellm.input_callback` (lines 334-337 in `setup_once`), but `_async_input_callback` simply calls `_input_callback` directly. If litellm invokes both callbacks for the same request, `_input_callback` executes twice and creates duplicate spans for the same operation. This could produce incorrect tracing data and leak spans, since `span.__enter__()` is called multiple times but may not have matching `__exit__()` calls.