Skip to content

⚡️ Speed up function _map_usage by 172% #2197

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

misrasaurabh1
Copy link
Contributor

Saurabh's comments - The type checks are expensive, also this seems to be used very frequently.
This speeds up the test models/test_anthropic.py::test_usage by 25%

📄 172% (1.72x) speedup for _map_usage in pydantic_ai_slim/pydantic_ai/models/anthropic.py

⏱️ Runtime : 1.12 milliseconds 413 microseconds (best of 76 runs)

📝 Explanation and details

Here is an optimized version of the provided program.
Key optimizations with rationale:

  • Avoids repeated/expensive isinstance() checks by merging logic.
  • Avoids unnecessary dictionary comprehensions and allocations if unused.
  • Minimizes details.get() calls and reuse of variables.
  • Uses local variable assignment to reduce attribute lookups.
  • Minimizes creation of empty dicts and Usage() when possible.
  • Uses tuple membership checks for event classes to condense branching ("flat is better than nested").
  • Moves model_dump().items() out of dict comprehension when it's not needed.
  • Handles the common/early-exit (no usage info) case with a constant.

Summary of changes:

  • Single type comparison and attribute fetch per code path; avoids multiple checks and data flows.
  • Uses a static _EMPTY_USAGE for "no details" paths—eliminates unnecessary object allocations.
  • Handles dictionary and token computation as fast/local as possible (no repeated dict-get, minimal fallbacks).
  • Preserves function signature, behavior, and all comment clarifications.

This implementation should provide improved (lower) latency per call, specifically for high-throughput scenarios.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 21 Passed
🌀 Generated Regression Tests 1231 Passed
⏪ Replay Tests 115 Passed
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
⚙️ Existing Unit Tests and Runtime
Test File::Test Function Original ⏱️ Optimized ⏱️ Speedup
models/test_anthropic.py::test_usage 55.6μs 44.3μs ✅25.3%
test_pytest_inlinesnapshotdisable_testsproviderstest_bedrock_py_testsproviderstest_google_gla_py_teststes__replay_test_0.py::test_pydantic_ai_models_anthropic__map_usage 17.8μs 19.8μs ⚠️-9.78%
test_pytest_inlinesnapshotdisable_teststest_messages_py_teststest_mcp_py_teststest_deps_py__replay_test_0.py::test_pydantic_ai_models_anthropic__map_usage 254μs 162μs ✅56.4%
🌀 Generated Regression Tests and Runtime
from types import SimpleNamespace

# imports
import pytest  # used for our unit tests
from pydantic_ai.models.anthropic import _map_usage

# function to test
# --- Mocking the required classes and usage.Usage for testability ---
# In real usage, these would be imported as in the provided code, but for testing, we define mocks.

class Usage:
    """Mock of usage.Usage, stores token counts and details."""
    def __init__(self, request_tokens=None, response_tokens=None, total_tokens=None, details=None):
        self.request_tokens = request_tokens
        self.response_tokens = response_tokens
        self.total_tokens = total_tokens
        self.details = details

    def __eq__(self, other):
        if not isinstance(other, Usage):
            return False
        return (
            self.request_tokens == other.request_tokens and
            self.response_tokens == other.response_tokens and
            self.total_tokens == other.total_tokens and
            self.details == other.details
        )

    def __repr__(self):
        return (f"Usage(request_tokens={self.request_tokens}, "
                f"response_tokens={self.response_tokens}, "
                f"total_tokens={self.total_tokens}, "
                f"details={self.details})")

# Helper for usage-like objects
class UsageData:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)
    def model_dump(self):
        return self.__dict__.copy()

# Mock BetaMessage and event classes
class BetaMessage:
    def __init__(self, usage):
        self.usage = usage

class BetaRawMessageStartEvent:
    def __init__(self, message):
        self.message = message

class BetaRawMessageDeltaEvent:
    def __init__(self, usage):
        self.usage = usage

# Simulate other BetaRawMessageStreamEvent types (not containing usage)
class BetaRawMessageStreamEvent:
    pass
from pydantic_ai.models.anthropic import _map_usage

# ------------------- UNIT TESTS -------------------

# 1. BASIC TEST CASES

def test_basic_beta_message_all_tokens():
    """Test BetaMessage with all token types present."""
    usage_obj = UsageData(
        input_tokens=10,
        output_tokens=20,
        cache_creation_input_tokens=0,
        cache_read_input_tokens=0,
        other_info=123  # Non-int should be ignored
    )
    msg = BetaMessage(usage_obj)
    expected = Usage(
        request_tokens=10,
        response_tokens=20,
        total_tokens=30,
        details={'input_tokens': 10, 'output_tokens': 20, 'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0, 'other_info': 123}
    )
    codeflash_output = _map_usage(msg); result = codeflash_output # 2.96μs -> 567ns (422% faster)

def test_basic_start_event():
    """Test BetaRawMessageStartEvent with usage present."""
    usage_obj = UsageData(
        input_tokens=5,
        output_tokens=7,
        cache_creation_input_tokens=3,
        cache_read_input_tokens=2
    )
    msg = BetaRawMessageStartEvent(BetaMessage(usage_obj))
    expected = Usage(
        request_tokens=5+3+2,
        response_tokens=7,
        total_tokens=5+3+2+7,
        details={'input_tokens': 5, 'output_tokens': 7, 'cache_creation_input_tokens': 3, 'cache_read_input_tokens': 2}
    )
    codeflash_output = _map_usage(msg); result = codeflash_output # 2.65μs -> 552ns (380% faster)

def test_basic_delta_event():
    """Test BetaRawMessageDeltaEvent with minimal usage info."""
    usage_obj = UsageData(
        output_tokens=4
    )
    msg = BetaRawMessageDeltaEvent(usage_obj)
    expected = Usage(
        request_tokens=None,
        response_tokens=4,
        total_tokens=0+4,
        details={'output_tokens': 4}
    )
    codeflash_output = _map_usage(msg); result = codeflash_output # 2.86μs -> 551ns (420% faster)

# 2. EDGE TEST CASES

def test_edge_zero_tokens():
    """Test with all token counts zero."""
    usage_obj = UsageData(
        input_tokens=0,
        output_tokens=0,
        cache_creation_input_tokens=0,
        cache_read_input_tokens=0
    )
    msg = BetaMessage(usage_obj)
    expected = Usage(
        request_tokens=None,
        response_tokens=0,
        total_tokens=0,
        details={'input_tokens': 0, 'output_tokens': 0, 'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0}
    )
    codeflash_output = _map_usage(msg); result = codeflash_output # 2.58μs -> 524ns (391% faster)

def test_edge_missing_input_tokens():
    """Test when input_tokens are missing, but cache tokens present."""
    usage_obj = UsageData(
        output_tokens=9,
        cache_creation_input_tokens=4,
        cache_read_input_tokens=0
    )
    msg = BetaMessage(usage_obj)
    expected = Usage(
        request_tokens=4,
        response_tokens=9,
        total_tokens=13,
        details={'output_tokens': 9, 'cache_creation_input_tokens': 4, 'cache_read_input_tokens': 0}
    )
    codeflash_output = _map_usage(msg); result = codeflash_output # 2.63μs -> 570ns (361% faster)

def test_edge_missing_all_input_tokens():
    """Test when all input token fields are missing."""
    usage_obj = UsageData(
        output_tokens=12,
    )
    msg = BetaMessage(usage_obj)
    expected = Usage(
        request_tokens=None,
        response_tokens=12,
        total_tokens=12,
        details={'output_tokens': 12}
    )
    codeflash_output = _map_usage(msg); result = codeflash_output # 2.69μs -> 536ns (403% faster)

def test_edge_non_integer_fields():
    """Test that non-integer fields are ignored in details."""
    usage_obj = UsageData(
        input_tokens=1,
        output_tokens=2,
        info="not_an_int",
        another=None
    )
    msg = BetaMessage(usage_obj)
    expected = Usage(
        request_tokens=1,
        response_tokens=2,
        total_tokens=3,
        details={'input_tokens': 1, 'output_tokens': 2}
    )
    codeflash_output = _map_usage(msg); result = codeflash_output # 2.55μs -> 539ns (372% faster)

def test_edge_negative_tokens():
    """Test negative token counts (should be included as-is in details)."""
    usage_obj = UsageData(
        input_tokens=-5,
        output_tokens=-10,
        cache_creation_input_tokens=-2,
        cache_read_input_tokens=-3
    )
    msg = BetaMessage(usage_obj)
    expected = Usage(
        request_tokens=-5 + -2 + -3,
        response_tokens=-10,
        total_tokens=-5 + -2 + -3 + -10,
        details={'input_tokens': -5, 'output_tokens': -10, 'cache_creation_input_tokens': -2, 'cache_read_input_tokens': -3}
    )
    codeflash_output = _map_usage(msg); result = codeflash_output # 2.60μs -> 573ns (354% faster)

def test_edge_no_usage_info():
    """Test BetaRawMessageStreamEvent with no usage info (should return empty Usage)."""
    msg = BetaRawMessageStreamEvent()
    expected = Usage()
    codeflash_output = _map_usage(msg); result = codeflash_output # 2.76μs -> 545ns (406% faster)

def test_edge_details_none_when_empty():
    """Test that details is None when there are no int fields."""
    usage_obj = UsageData(
        info="abc",
        data=None
    )
    msg = BetaMessage(usage_obj)
    expected = Usage(
        request_tokens=None,
        response_tokens=None,
        total_tokens=None,
        details=None
    )
    codeflash_output = _map_usage(msg); result = codeflash_output # 2.72μs -> 584ns (365% faster)

def test_edge_output_tokens_zero_but_input_present():
    """Test output_tokens is zero but input tokens are present."""
    usage_obj = UsageData(
        input_tokens=7,
        output_tokens=0
    )
    msg = BetaMessage(usage_obj)
    expected = Usage(
        request_tokens=7,
        response_tokens=0,
        total_tokens=7,
        details={'input_tokens': 7, 'output_tokens': 0}
    )
    codeflash_output = _map_usage(msg); result = codeflash_output # 2.57μs -> 565ns (355% faster)

# 3. LARGE SCALE TEST CASES

def test_large_many_fields():
    """Test with a large number of integer fields in usage."""
    # 100 fields: input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens, plus 96 extras
    usage_kwargs = {
        'input_tokens': 100,
        'output_tokens': 200,
        'cache_creation_input_tokens': 300,
        'cache_read_input_tokens': 400,
    }
    # Add 96 more integer fields
    for i in range(1, 97):
        usage_kwargs[f'extra_field_{i}'] = i
    usage_obj = UsageData(**usage_kwargs)
    msg = BetaMessage(usage_obj)
    expected_details = usage_kwargs.copy()
    expected = Usage(
        request_tokens=100+300+400,
        response_tokens=200,
        total_tokens=100+300+400+200,
        details=expected_details
    )
    codeflash_output = _map_usage(msg); result = codeflash_output # 2.78μs -> 572ns (386% faster)

def test_large_high_token_counts():
    """Test with very high token counts to check for integer overflow or performance issues."""
    big = 10**9
    usage_obj = UsageData(
        input_tokens=big,
        output_tokens=big,
        cache_creation_input_tokens=big,
        cache_read_input_tokens=big
    )
    msg = BetaMessage(usage_obj)
    expected = Usage(
        request_tokens=big*3,
        response_tokens=big,
        total_tokens=big*4,
        details={'input_tokens': big, 'output_tokens': big, 'cache_creation_input_tokens': big, 'cache_read_input_tokens': big}
    )
    codeflash_output = _map_usage(msg); result = codeflash_output # 2.71μs -> 515ns (427% faster)

def test_large_delta_event_no_input_tokens():
    """Test BetaRawMessageDeltaEvent with large output_tokens and no input token fields."""
    usage_obj = UsageData(
        output_tokens=999_999_999
    )
    msg = BetaRawMessageDeltaEvent(usage_obj)
    expected = Usage(
        request_tokens=None,
        response_tokens=999_999_999,
        total_tokens=999_999_999,
        details={'output_tokens': 999_999_999}
    )
    codeflash_output = _map_usage(msg); result = codeflash_output # 2.62μs -> 518ns (407% faster)

def test_large_many_stream_events():
    """Test mapping over a list of 1000 BetaMessage objects."""
    # Only test the function's scalability, not correctness of aggregation
    messages = []
    for i in range(1000):
        usage_obj = UsageData(
            input_tokens=i,
            output_tokens=i*2,
            cache_creation_input_tokens=0,
            cache_read_input_tokens=0
        )
        messages.append(BetaMessage(usage_obj))
    for i, msg in enumerate(messages):
        expected = Usage(
            request_tokens=i,
            response_tokens=i*2,
            total_tokens=i + i*2,
            details={'input_tokens': i, 'output_tokens': i*2, 'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0}
        )
        codeflash_output = _map_usage(msg); result = codeflash_output # 592μs -> 140μs (320% faster)

def test_large_sparse_fields():
    """Test with 1000 fields, only a few are integers."""
    usage_kwargs = {f'field_{i}': 'x' for i in range(1000)}
    usage_kwargs['input_tokens'] = 42
    usage_kwargs['output_tokens'] = 99
    usage_obj = UsageData(**usage_kwargs)
    msg = BetaMessage(usage_obj)
    expected = Usage(
        request_tokens=42,
        response_tokens=99,
        total_tokens=141,
        details={'input_tokens': 42, 'output_tokens': 99}
    )
    codeflash_output = _map_usage(msg); result = codeflash_output # 2.89μs -> 566ns (411% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

import pytest
from pydantic_ai.models.anthropic import _map_usage

# Simulate minimal versions of the required classes and usage.Usage for testing purposes

class Usage:
    def __init__(self, request_tokens=None, response_tokens=None, total_tokens=None, details=None):
        self.request_tokens = request_tokens
        self.response_tokens = response_tokens
        self.total_tokens = total_tokens
        self.details = details

    def __eq__(self, other):
        if not isinstance(other, Usage):
            return False
        return (
            self.request_tokens == other.request_tokens and
            self.response_tokens == other.response_tokens and
            self.total_tokens == other.total_tokens and
            self.details == other.details
        )

    def __repr__(self):
        return (
            f"Usage(request_tokens={self.request_tokens}, "
            f"response_tokens={self.response_tokens}, "
            f"total_tokens={self.total_tokens}, "
            f"details={self.details})"
        )

class DummyUsage:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)
    def model_dump(self):
        return dict(self.__dict__)

# BetaMessage: has .usage
class BetaMessage:
    def __init__(self, usage):
        self.usage = usage

# BetaRawMessageStartEvent: has .message.usage
class BetaRawMessageStartEvent:
    def __init__(self, message):
        self.message = message

# BetaRawMessageDeltaEvent: has .usage
class BetaRawMessageDeltaEvent:
    def __init__(self, usage):
        self.usage = usage

# BetaRawMessageStreamEvent: parent class, not directly used except for type checking
class BetaRawMessageStreamEvent:
    pass

# Other event types (simulate as subclasses of BetaRawMessageStreamEvent)
class RawMessageStopEvent(BetaRawMessageStreamEvent):
    pass
class RawContentBlockStartEvent(BetaRawMessageStreamEvent):
    pass
class RawContentBlockDeltaEvent(BetaRawMessageStreamEvent):
    pass
class RawContentBlockStopEvent(BetaRawMessageStreamEvent):
    pass

# Patch the usage module for test
class usage:
    Usage = Usage
from pydantic_ai.models.anthropic import _map_usage

# -----------------------------
# Unit Tests
# -----------------------------

# 1. Basic Test Cases

def test_basic_betamessage_minimal():
    # Basic BetaMessage with only input_tokens and output_tokens
    usage_obj = DummyUsage(input_tokens=10, output_tokens=5)
    msg = BetaMessage(usage_obj)
    expected = Usage(
        request_tokens=10,
        response_tokens=5,
        total_tokens=15,
        details={'input_tokens': 10, 'output_tokens': 5}
    )
    codeflash_output = _map_usage(msg) # 2.90μs -> 571ns (407% faster)

def test_basic_betarawmessagestartevent():
    # BetaRawMessageStartEvent with nested message.usage
    usage_obj = DummyUsage(input_tokens=20, output_tokens=7)
    msg = BetaRawMessageStartEvent(BetaMessage(usage_obj))
    expected = Usage(
        request_tokens=20,
        response_tokens=7,
        total_tokens=27,
        details={'input_tokens': 20, 'output_tokens': 7}
    )
    codeflash_output = _map_usage(msg) # 2.92μs -> 623ns (368% faster)

def test_basic_betarawmessagedeltaevent():
    # BetaRawMessageDeltaEvent with only output_tokens, no input tokens
    usage_obj = DummyUsage(output_tokens=12)
    msg = BetaRawMessageDeltaEvent(usage_obj)
    expected = Usage(
        request_tokens=None,
        response_tokens=12,
        total_tokens=12,
        details={'output_tokens': 12}
    )
    codeflash_output = _map_usage(msg) # 2.58μs -> 551ns (368% faster)

def test_basic_multiple_token_types():
    # BetaMessage with input_tokens, cache_creation_input_tokens, cache_read_input_tokens, output_tokens
    usage_obj = DummyUsage(
        input_tokens=5,
        cache_creation_input_tokens=7,
        cache_read_input_tokens=3,
        output_tokens=2
    )
    msg = BetaMessage(usage_obj)
    # request_tokens = 5 + 7 + 3 = 15
    expected = Usage(
        request_tokens=15,
        response_tokens=2,
        total_tokens=17,
        details={
            'input_tokens': 5,
            'cache_creation_input_tokens': 7,
            'cache_read_input_tokens': 3,
            'output_tokens': 2
        }
    )
    codeflash_output = _map_usage(msg) # 2.69μs -> 557ns (382% faster)

# 2. Edge Test Cases

def test_edge_zero_tokens():
    # All tokens are zero
    usage_obj = DummyUsage(input_tokens=0, output_tokens=0)
    msg = BetaMessage(usage_obj)
    expected = Usage(
        request_tokens=None,  # 0 or None, but function uses None if 0
        response_tokens=0,
        total_tokens=0,
        details={'input_tokens': 0, 'output_tokens': 0}
    )
    codeflash_output = _map_usage(msg) # 2.68μs -> 636ns (322% faster)


def test_edge_non_integer_usage_values():
    # Details should only keep integer values
    usage_obj = DummyUsage(
        input_tokens=10,
        output_tokens=2,
        foo="bar",
        bar=3.14
    )
    msg = BetaMessage(usage_obj)
    expected = Usage(
        request_tokens=10,
        response_tokens=2,
        total_tokens=12,
        details={'input_tokens': 10, 'output_tokens': 2}
    )
    codeflash_output = _map_usage(msg) # 2.93μs -> 600ns (388% faster)

def test_edge_extra_keys_in_usage():
    # Extra keys in usage, only integers are kept in details
    usage_obj = DummyUsage(
        input_tokens=8,
        output_tokens=4,
        irrelevant=None,
        another='string'
    )
    msg = BetaMessage(usage_obj)
    expected = Usage(
        request_tokens=8,
        response_tokens=4,
        total_tokens=12,
        details={'input_tokens': 8, 'output_tokens': 4}
    )
    codeflash_output = _map_usage(msg) # 2.74μs -> 532ns (415% faster)

def test_edge_no_usage_in_event():
    # Event types with no usage info should return Usage() with all None
    for event_cls in [RawMessageStopEvent, RawContentBlockStartEvent, RawContentBlockDeltaEvent, RawContentBlockStopEvent]:
        msg = event_cls()
        expected = Usage()
        codeflash_output = _map_usage(msg) # 4.91μs -> 1.04μs (371% faster)


def test_edge_negative_tokens():
    # Negative tokens (should be included as-is)
    usage_obj = DummyUsage(input_tokens=-3, output_tokens=-2)
    msg = BetaMessage(usage_obj)
    expected = Usage(
        request_tokens=-3,
        response_tokens=-2,
        total_tokens=-5,
        details={'input_tokens': -3, 'output_tokens': -2}
    )
    codeflash_output = _map_usage(msg) # 2.93μs -> 581ns (405% faster)

def test_edge_large_token_values():
    # Very large token counts
    usage_obj = DummyUsage(input_tokens=10**9, output_tokens=10**9)
    msg = BetaMessage(usage_obj)
    expected = Usage(
        request_tokens=10**9,
        response_tokens=10**9,
        total_tokens=2*10**9,
        details={'input_tokens': 10**9, 'output_tokens': 10**9}
    )
    codeflash_output = _map_usage(msg) # 2.67μs -> 503ns (430% faster)

def test_edge_all_token_types_missing():
    # No input_tokens, no cache tokens, only output_tokens
    usage_obj = DummyUsage(output_tokens=7)
    msg = BetaMessage(usage_obj)
    expected = Usage(
        request_tokens=None,
        response_tokens=7,
        total_tokens=7,
        details={'output_tokens': 7}
    )
    codeflash_output = _map_usage(msg) # 2.78μs -> 517ns (438% faster)

# 3. Large Scale Test Cases

def test_large_scale_many_token_types():
    # Simulate a usage object with many integer keys (simulate up to 1000 keys)
    usage_dict = {f'key_{i}': i for i in range(1000)}
    usage_dict['input_tokens'] = 500
    usage_dict['output_tokens'] = 300
    usage_obj = DummyUsage(**usage_dict)
    msg = BetaMessage(usage_obj)
    # request_tokens = input_tokens + (cache creation/read tokens, which are 0)
    expected_details = {k: v for k, v in usage_dict.items() if isinstance(v, int)}
    expected = Usage(
        request_tokens=500,
        response_tokens=300,
        total_tokens=800,
        details=expected_details
    )
    codeflash_output = _map_usage(msg) # 2.67μs -> 586ns (356% faster)

def test_large_scale_high_token_counts():
    # Simulate high token counts for all three input token types
    usage_obj = DummyUsage(
        input_tokens=300,
        cache_creation_input_tokens=400,
        cache_read_input_tokens=200,
        output_tokens=1000
    )
    msg = BetaMessage(usage_obj)
    expected = Usage(
        request_tokens=300+400+200,
        response_tokens=1000,
        total_tokens=1900,
        details={
            'input_tokens': 300,
            'cache_creation_input_tokens': 400,
            'cache_read_input_tokens': 200,
            'output_tokens': 1000
        }
    )
    codeflash_output = _map_usage(msg) # 2.69μs -> 562ns (379% faster)

def test_large_scale_batch_of_events():
    # Test a batch of 100 different BetaMessages
    for i in range(1, 101):
        usage_obj = DummyUsage(input_tokens=i, output_tokens=2*i)
        msg = BetaMessage(usage_obj)
        expected = Usage(
            request_tokens=i,
            response_tokens=2*i,
            total_tokens=3*i,
            details={'input_tokens': i, 'output_tokens': 2*i}
        )
        codeflash_output = _map_usage(msg) # 62.1μs -> 14.8μs (320% faster)

def test_large_scale_betarawmessagedeltaevent_batch():
    # Test a batch of BetaRawMessageDeltaEvent with only output_tokens
    for i in range(1, 101):
        usage_obj = DummyUsage(output_tokens=i)
        msg = BetaRawMessageDeltaEvent(usage_obj)
        expected = Usage(
            request_tokens=None,
            response_tokens=i,
            total_tokens=i,
            details={'output_tokens': i}
        )
        codeflash_output = _map_usage(msg) # 62.0μs -> 14.6μs (324% faster)
⏪ Replay Tests and Runtime
Test File::Test Function Original ⏱️ Optimized ⏱️ Speedup
models/test_anthropic.py::test_usage 55.6μs 44.3μs ✅25.3%
test_pytest_inlinesnapshotdisable_testsproviderstest_bedrock_py_testsproviderstest_google_gla_py_teststes__replay_test_0.py::test_pydantic_ai_models_anthropic__map_usage 17.8μs 19.8μs ⚠️-9.78%
test_pytest_inlinesnapshotdisable_teststest_messages_py_teststest_mcp_py_teststest_deps_py__replay_test_0.py::test_pydantic_ai_models_anthropic__map_usage 254μs 162μs ✅56.4%

To edit these changes git checkout codeflash/optimize-_map_usage-md0d6dgh and push.

Codeflash

codeflash-ai bot and others added 3 commits July 12, 2025 14:51
Here is an optimized version of the provided program.  
**Key optimizations with rationale:**
- Avoids repeated/expensive `isinstance()` checks by merging logic.
- Avoids unnecessary dictionary comprehensions and allocations if unused.
- Minimizes `details.get()` calls and reuse of variables.
- Uses local variable assignment to reduce attribute lookups.
- Minimizes creation of empty dicts and `Usage()` when possible.
- Uses tuple membership checks for event classes to condense branching ("flat is better than nested").
- Moves `model_dump().items()` out of dict comprehension when it's not needed.
- Handles the common/early-exit (no usage info) case with a constant.



**Summary of changes:**
- Single type comparison and attribute fetch per code path; avoids multiple checks and data flows.
- Uses a static `_EMPTY_USAGE` for "no details" paths—eliminates unnecessary object allocations.
- Handles dictionary and token computation as fast/local as possible (no repeated dict-get, minimal fallbacks).
- Preserves function signature, behavior, and all comment clarifications.

This implementation should provide improved (lower) latency per call, specifically for high-throughput scenarios.
elif msg_type is BetaRawMessageStartEvent:
response_usage = cast(BetaRawMessageStartEvent, message).message.usage
elif msg_type is BetaRawMessageDeltaEvent:
response_usage = cast(BetaRawMessageDeltaEvent, message).usage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the performance improvement metrics in the description still accurate after the latest changes to this PR?

I think this change reduces readability (and consistency with other code) significantly so I'd only want to do it if the performance impact (on real examples, not just the test_usage test) are significant. And if it is significant, should we do this everywhere we currently use isinstance (and don't care about subclasses)? Otherwise it seems odd to do it only in this specific place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants