Conversation

@codeflash-ai codeflash-ai bot commented Oct 29, 2025

📄 685% (6.85x) speedup for IntUniformDistribution._asdict in optuna/distributions.py

⏱️ Runtime : 454 microseconds → 57.8 microseconds (best of 366 runs)

📝 Explanation and details

The optimization replaces copy.deepcopy(self.__dict__) with self.__dict__.copy() in the _asdict method. This change eliminates unnecessary deep copying overhead since all attributes in the dictionary are immutable primitives (integers and booleans).

Key optimization:

  • copy.deepcopy() recursively copies all nested objects, which is overkill for simple int/bool values
  • dict.copy() creates a shallow copy, which is sufficient since the attributes (low, high, step, log) are immutable primitives that don't need recursive copying
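
In code, the change amounts to the following minimal sketch (illustration only; the real method lives on `IntUniformDistribution` in `optuna/distributions.py`, and the `pop("log")` step mirrors the "excluding the `log` field" behavior described below):

```python
import copy


class _Before:
    """Sketch of the original method: a full recursive copy of __dict__."""

    def _asdict(self) -> dict:
        d = copy.deepcopy(self.__dict__)  # recursive copy: wasted work for plain int/bool values
        d.pop("log")                      # the "log" field is excluded from the result
        return d


class _After:
    """Sketch of the optimized method: a shallow copy suffices for immutable attributes."""

    def _asdict(self) -> dict:
        d = self.__dict__.copy()          # shallow copy of low/high/step/log
        d.pop("log")                      # same exclusion of the "log" field
        return d
```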

Performance impact:
The line profiler shows the deep copy operation took 2.84ms (98.7% of total time), versus only 54μs for the shallow copy (48.5% of the new total): roughly a 50x speedup on the bottleneck line.

Test case benefits:
All test cases show substantial speedups, with particularly strong gains in:

  • Basic operations (246-482% faster)
  • Large scale scenarios with many attributes (3335% faster in stress test)
  • Repeated calls (117-565% faster)

The optimization is safe because it preserves the exact same behavior: it creates an independent copy of the dictionary while excluding the "log" field, with no risk of shared references since all values are immutable.
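
For example, a quick check of the behavior (expected output inferred from the tests below; exact key order may differ):

```python
from optuna.distributions import IntUniformDistribution

dist = IntUniformDistribution(low=1, high=10)
print(dist._asdict())  # expected: {'low': 1, 'high': 10, 'step': 1} -- no 'log' key
```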

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 107 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 2 Passed |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import copy

# imports
import pytest
from optuna.distributions import IntUniformDistribution


# function to test
class BaseDistribution:
    pass

def _adjust_int_uniform_high(low, high, step):
    # Helper function: adjusts high so that [low, high] is divisible by step
    if step == 1:
        return high
    # Find the largest value <= high such that (value - low) % step == 0
    k = (high - low) // step
    return low + k * step
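
# Example of the adjustment (illustrative values taken from the test comments further down):
#   _adjust_int_uniform_high(low=2, high=7, step=3) -> 5, since (7 - 2) // 3 == 1 and 2 + 1 * 3 == 5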

class IntDistribution(BaseDistribution):
    def __init__(self, low: int, high: int, log: bool = False, step: int = 1) -> None:
        if log and step != 1:
            raise ValueError(
                "Samplers and other components in Optuna only accept step is 1 "
                "when `log` argument is True."
            )
        if low > high:
            raise ValueError(f"`low <= high` must hold, but got ({low=}, {high=}).")
        if log and low < 1:
            raise ValueError(f"`low >= 1` must hold for `log=True`, but got ({low=}, {high=}).")
        if step <= 0:
            raise ValueError(f"`step > 0` must hold, but got {step=}.")
        self.log = log
        self.step = int(step)
        self.low = int(low)
        high = int(high)
        self.high = _adjust_int_uniform_high(self.low, high, self.step)
from optuna.distributions import IntUniformDistribution

# unit tests

# ----------- Basic Test Cases -----------

def test_asdict_basic_normal():
    # Test basic usage with default step=1
    dist = IntUniformDistribution(low=1, high=10)
    codeflash_output = dist._asdict(); result = codeflash_output # 5.74μs -> 1.66μs (246% faster)

def test_asdict_basic_custom_step():
    # Test with custom step
    dist = IntUniformDistribution(low=0, high=10, step=2)
    codeflash_output = dist._asdict(); result = codeflash_output # 5.54μs -> 1.25μs (344% faster)

def test_asdict_basic_high_not_divisible_by_step():
    # Test where high is not divisible by step
    dist = IntUniformDistribution(low=0, high=9, step=2)
    codeflash_output = dist._asdict(); result = codeflash_output # 5.54μs -> 1.17μs (374% faster)

def test_asdict_basic_low_equals_high():
    # Test where low == high
    dist = IntUniformDistribution(low=5, high=5)
    codeflash_output = dist._asdict(); result = codeflash_output # 5.55μs -> 1.18μs (369% faster)

def test_asdict_basic_negative_range():
    # Test with negative low/high
    dist = IntUniformDistribution(low=-10, high=-1)
    codeflash_output = dist._asdict(); result = codeflash_output # 5.59μs -> 1.05μs (434% faster)

# ----------- Edge Test Cases -----------

def test_asdict_edge_step_larger_than_range():
    # step > (high-low), should adjust high to low
    dist = IntUniformDistribution(low=0, high=3, step=5)
    codeflash_output = dist._asdict(); result = codeflash_output # 5.58μs -> 1.01μs (451% faster)

def test_asdict_edge_step_is_one_and_high_not_divisible():
    # step=1, high should not be adjusted
    dist = IntUniformDistribution(low=3, high=7, step=1)
    codeflash_output = dist._asdict(); result = codeflash_output # 5.56μs -> 954ns (482% faster)

def test_asdict_edge_zero_step_raises():
    # step=0 should raise ValueError
    with pytest.raises(ValueError):
        IntUniformDistribution(low=0, high=10, step=0)

def test_asdict_edge_negative_step_raises():
    # step < 0 should raise ValueError
    with pytest.raises(ValueError):
        IntUniformDistribution(low=0, high=10, step=-2)

def test_asdict_edge_low_greater_than_high_raises():
    # low > high should raise ValueError
    with pytest.raises(ValueError):
        IntUniformDistribution(low=10, high=5)

def test_asdict_edge_high_adjustment_check():
    # Check that high is correctly adjusted for step
    dist = IntUniformDistribution(low=2, high=7, step=3)
    # (7-2)//3 = 1, so high = 2+1*3=5
    codeflash_output = dist._asdict(); result = codeflash_output # 6.17μs -> 1.38μs (347% faster)

def test_asdict_edge_mutation_safety():
    # Changing the original object after asdict should not affect the dict
    dist = IntUniformDistribution(low=1, high=4)
    codeflash_output = dist._asdict(); d = codeflash_output # 5.98μs -> 1.22μs (391% faster)
    dist.low = 99

def test_asdict_edge_dict_is_copy():
    # The returned dict should be a copy, not a reference
    dist = IntUniformDistribution(low=1, high=4)
    codeflash_output = dist._asdict(); d = codeflash_output # 5.72μs -> 1.01μs (469% faster)
    d['low'] = 999

def test_asdict_edge_no_extra_keys():
    # The dict should only contain low, high, step
    dist = IntUniformDistribution(low=1, high=2)
    codeflash_output = dist._asdict(); result = codeflash_output # 5.60μs -> 1.04μs (440% faster)

def test_asdict_edge_step_is_maximal():
    # step = high - low, should adjust high to low
    dist = IntUniformDistribution(low=5, high=10, step=5)
    codeflash_output = dist._asdict(); result = codeflash_output # 5.52μs -> 1.07μs (417% faster)

# ----------- Large Scale Test Cases -----------

def test_asdict_large_scale_many_elements():
    # Test with large range, step=1
    dist = IntUniformDistribution(low=0, high=999)
    codeflash_output = dist._asdict(); result = codeflash_output # 5.52μs -> 1.04μs (429% faster)

def test_asdict_large_scale_large_step():
    # Test with large step
    dist = IntUniformDistribution(low=0, high=999, step=100)
    # (999-0)//100 = 9, so high = 0+9*100=900
    codeflash_output = dist._asdict(); result = codeflash_output # 5.59μs -> 1.09μs (414% faster)

def test_asdict_large_scale_high_not_divisible_by_step():
    # Test with high not divisible by step
    dist = IntUniformDistribution(low=0, high=997, step=100)
    # (997-0)//100 = 9, high = 0+9*100=900
    codeflash_output = dist._asdict(); result = codeflash_output # 5.55μs -> 1.16μs (377% faster)

def test_asdict_large_scale_negative_range():
    # Large negative range
    dist = IntUniformDistribution(low=-1000, high=-1, step=10)
    # (-1-(-1000))//10 = 99, high = -1000+99*10 = -10
    codeflash_output = dist._asdict(); result = codeflash_output # 5.32μs -> 1.04μs (413% faster)

def test_asdict_large_scale_mutation_testing():
    # Mutation: If asdict returns a reference, changing the object should affect the dict
    dist = IntUniformDistribution(low=0, high=999, step=2)
    codeflash_output = dist._asdict(); d = codeflash_output # 5.30μs -> 1.03μs (413% faster)
    dist.high = 123456

def test_asdict_large_scale_step_one_large():
    # step=1, large range
    dist = IntUniformDistribution(low=0, high=999)
    codeflash_output = dist._asdict(); result = codeflash_output # 5.60μs -> 1.07μs (422% faster)

def test_asdict_large_scale_step_max():
    # step is nearly as large as range
    dist = IntUniformDistribution(low=0, high=999, step=999)
    # (999-0)//999 = 1, high = 0+1*999=999
    codeflash_output = dist._asdict(); result = codeflash_output # 5.52μs -> 1.06μs (422% faster)

# ----------- Miscellaneous Robustness Tests -----------

def test_asdict_edge_float_inputs():
    # Inputs are floats, should be cast to int
    dist = IntUniformDistribution(low=1.2, high=5.7, step=2.1)
    codeflash_output = dist._asdict(); result = codeflash_output # 5.61μs -> 1.08μs (419% faster)

def test_asdict_edge_zero_range():
    # low == high == 0
    dist = IntUniformDistribution(low=0, high=0)
    codeflash_output = dist._asdict(); result = codeflash_output # 5.64μs -> 1.01μs (459% faster)

def test_asdict_edge_minimal_positive():
    # low=1, high=1, step=1
    dist = IntUniformDistribution(low=1, high=1, step=1)
    codeflash_output = dist._asdict(); result = codeflash_output # 5.44μs -> 987ns (451% faster)

def test_asdict_edge_high_adjustment_with_large_step():
    # high not divisible by step, large step
    dist = IntUniformDistribution(low=10, high=999, step=111)
    # (999-10)//111 = 8, high = 10+8*111=898
    codeflash_output = dist._asdict(); result = codeflash_output # 5.53μs -> 1.04μs (429% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import copy

# imports
import pytest  # used for our unit tests
from optuna.distributions import IntUniformDistribution


# function to test (copied from above)
class BaseDistribution:
    pass  # Dummy base class for test purposes

class IntDistribution(BaseDistribution):
    def __init__(self, low: int, high: int, log: bool = False, step: int = 1) -> None:
        if log and step != 1:
            raise ValueError(
                "Samplers and other components in Optuna only accept step is 1 "
                "when `log` argument is True."
            )
        if low > high:
            raise ValueError(f"`low <= high` must hold, but got ({low=}, {high=}).")
        if log and low < 1:
            raise ValueError(f"`low >= 1` must hold for `log=True`, but got ({low=}, {high=}).")
        if step <= 0:
            raise ValueError(f"`step > 0` must hold, but got {step=}.")
        self.log = log
        self.step = int(step)
        self.low = int(low)
        high = int(high)
        self.high = self._adjust_int_uniform_high(self.low, high, self.step)

    @staticmethod
    def _adjust_int_uniform_high(low, high, step):
        # Mimic the adjustment logic described in the docstring
        if step == 1:
            return high
        if (high - low) % step == 0:
            return high
        k = (high - low) // step
        return k * step + low
from optuna.distributions import IntUniformDistribution

# unit tests

# 1. Basic Test Cases

def test_asdict_basic_fields():
    # Test that _asdict returns correct dictionary for typical values
    dist = IntUniformDistribution(low=1, high=10, step=2)
    codeflash_output = dist._asdict(); d = codeflash_output # 5.49μs -> 1.04μs (427% faster)

def test_asdict_step_one():
    # Test with step=1 (no adjustment to high)
    dist = IntUniformDistribution(low=2, high=7, step=1)
    codeflash_output = dist._asdict(); d = codeflash_output # 5.49μs -> 1.06μs (416% faster)

def test_asdict_high_equals_low():
    # Test with high == low
    dist = IntUniformDistribution(low=5, high=5, step=1)
    codeflash_output = dist._asdict(); d = codeflash_output # 5.51μs -> 1.16μs (374% faster)

def test_asdict_high_adjustment():
    # Test adjustment of high when not divisible by step
    dist = IntUniformDistribution(low=0, high=10, step=3)
    # (10-0)//3 = 3, 3*3+0 = 9
    codeflash_output = dist._asdict(); d = codeflash_output # 5.64μs -> 1.08μs (420% faster)

def test_asdict_high_no_adjustment():
    # Test when high is divisible by step
    dist = IntUniformDistribution(low=0, high=9, step=3)
    codeflash_output = dist._asdict(); d = codeflash_output # 5.53μs -> 1.10μs (403% faster)

def test_asdict_deepcopy():
    # The returned dict should not be a reference to the original __dict__
    dist = IntUniformDistribution(low=1, high=3, step=1)
    codeflash_output = dist._asdict(); d = codeflash_output # 5.62μs -> 982ns (473% faster)
    d["low"] = 999

# 2. Edge Test Cases

def test_asdict_negative_low():
    # Test with negative low value
    dist = IntUniformDistribution(low=-5, high=5, step=2)
    codeflash_output = dist._asdict(); d = codeflash_output # 5.30μs -> 997ns (431% faster)

def test_asdict_large_step():
    # step greater than (high-low)
    dist = IntUniformDistribution(low=0, high=4, step=10)
    # (4-0)//10 = 0, 0*10+0 = 0
    codeflash_output = dist._asdict(); d = codeflash_output # 5.53μs -> 1.07μs (417% faster)

def test_asdict_step_equals_range():
    # step == (high-low)
    dist = IntUniformDistribution(low=2, high=5, step=3)
    # (5-2)//3 = 1, 1*3+2 = 5
    codeflash_output = dist._asdict(); d = codeflash_output # 5.46μs -> 1.05μs (420% faster)

def test_asdict_step_larger_than_range():
    # step > (high-low)
    dist = IntUniformDistribution(low=10, high=11, step=5)
    # (11-10)//5 = 0, 0*5+10 = 10
    codeflash_output = dist._asdict(); d = codeflash_output # 5.49μs -> 1.09μs (405% faster)

def test_asdict_zero_step_raises():
    # step=0 should raise ValueError
    with pytest.raises(ValueError):
        IntUniformDistribution(low=0, high=10, step=0)

def test_asdict_negative_step_raises():
    # step<0 should raise ValueError
    with pytest.raises(ValueError):
        IntUniformDistribution(low=0, high=10, step=-1)

def test_asdict_high_less_than_low_raises():
    # high < low should raise ValueError
    with pytest.raises(ValueError):
        IntUniformDistribution(low=10, high=5, step=1)

def test_asdict_float_inputs():
    # Test that float inputs are converted to int
    dist = IntUniformDistribution(low=1.9, high=5.1, step=1.1)
    codeflash_output = dist._asdict(); d = codeflash_output # 6.38μs -> 1.23μs (418% faster)

def test_asdict_mutation_resistance():
    # Changing the returned dict should not affect the distribution
    dist = IntUniformDistribution(low=2, high=8, step=2)
    codeflash_output = dist._asdict(); d = codeflash_output # 5.92μs -> 1.24μs (379% faster)
    d["high"] = 100

# 3. Large Scale Test Cases

def test_asdict_large_range_and_step():
    # Test with large range and step
    dist = IntUniformDistribution(low=0, high=999, step=10)
    codeflash_output = dist._asdict(); d = codeflash_output # 5.89μs -> 1.14μs (418% faster)

def test_asdict_many_instances():
    # Create many instances and check their dictionaries
    for i in range(0, 1000, 100):
        dist = IntUniformDistribution(low=i, high=i+99, step=5)
        codeflash_output = dist._asdict(); d = codeflash_output # 30.2μs -> 4.55μs (565% faster)

def test_asdict_large_step_and_high():
    # Test with large step and high
    dist = IntUniformDistribution(low=0, high=1000, step=1000)
    # (1000-0)//1000 = 1, 1*1000+0 = 1000
    codeflash_output = dist._asdict(); d = codeflash_output # 5.50μs -> 1.24μs (344% faster)

def test_asdict_stress_deepcopy():
    # Stress test that deepcopy works with large __dict__
    dist = IntUniformDistribution(low=0, high=999, step=1)
    # Simulate adding many attributes to __dict__
    for i in range(500):
        setattr(dist, f"attr_{i}", i)
    codeflash_output = dist._asdict(); d = codeflash_output # 195μs -> 5.70μs (3335% faster)
    # All added attributes should be present except "log"
    for i in range(500):
        assert d[f"attr_{i}"] == i
    assert "log" not in d

def test_asdict_no_side_effects_large():
    # Ensure that repeated calls do not mutate internal state
    dist = IntUniformDistribution(low=1, high=100, step=2)
    codeflash_output = dist._asdict(); d1 = codeflash_output # 5.43μs -> 1.60μs (239% faster)
    codeflash_output = dist._asdict(); d2 = codeflash_output # 3.16μs -> 1.46μs (117% faster)
    d1["low"] = -999

# Edge: Check that log is always omitted
def test_asdict_log_field_omitted():
    dist = IntUniformDistribution(low=1, high=10, step=1)
    codeflash_output = dist._asdict(); d = codeflash_output # 5.47μs -> 1.56μs (250% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from optuna.distributions import IntUniformDistribution

def test_IntUniformDistribution__asdict():
    IntUniformDistribution._asdict(IntUniformDistribution(0, 0, step=1))
🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
| --- | --- | --- | --- |
| codeflash_concolic_qluqolhr/tmp6nvgspbl/test_concolic_coverage.py::test_IntUniformDistribution__asdict | 6.19μs | 1.87μs | 231% ✅ |

To edit these changes, run `git checkout codeflash/optimize-IntUniformDistribution._asdict-mhbiyvtb` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 29, 2025 04:58
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Oct 29, 2025