@codeflash-ai codeflash-ai bot commented Oct 28, 2025

📄 38% (0.38x) speedup for _check_values_are_feasible in optuna/study/_tell.py

⏱️ Runtime: 472 microseconds -> 343 microseconds (best of 152 runs)
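The headline percentage can be reproduced from the two runtimes. The sketch below assumes the speedup is defined as original time divided by optimized time, minus one. That definition is inferred from the numbers on this page, not stated by Codeflash here, and the function name `speedup` is hypothetical.

```python
def speedup(original_us: float, optimized_us: float) -> float:
    """Relative speedup, assuming the definition original / optimized - 1."""
    return original_us / optimized_us - 1.0


headline = speedup(472, 343)
print(f"{headline:.1%}")  # 37.6%, which rounds to the reported 38% (0.38x)
```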

📝 Explanation and details

The optimized code achieves a roughly 38% speedup (472 μs -> 343 μs) by binding the functions called inside the hot loop to local variables, which removes the repeated global lookups.

Key optimizations:

  1. Local binding of float and math.isnan: The assignments float_cast = float and isnan = math.isnan at the start of the function cache both callables in local variables, so the loop no longer resolves them by name on every iteration.

  2. Performance impact in loops: Each call to float() is resolved through the module globals and then the builtins namespace, and each call to math.isnan() needs a global lookup of math plus an attribute lookup of isnan. Both are slower than reading a local variable. Because the functions are called once per value, the saving grows with the length of the input.
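The PR description does not include the diff, so the following is only a minimal sketch of the local-binding pattern it describes, not the actual Optuna function. The names `float_cast` and `isnan` come from the description; the function name `check_values_sketch`, the `n_directions` parameter, the error messages, and the order of the checks are assumptions made for illustration.

```python
import math
from typing import Optional, Sequence


def check_values_sketch(n_directions: int, values: Sequence[object]) -> Optional[str]:
    """Return an error message for infeasible values, or None if all are fine."""
    # Bind the callables once, outside the loop, so each iteration reads a local.
    float_cast = float
    isnan = math.isnan

    for v in values:
        try:
            as_float = float_cast(v)
        except (ValueError, TypeError):
            return f"The value {v!r} could not be cast to float"  # placeholder message
        if isnan(as_float):
            return f"The value {v!r} is NaN"  # placeholder message

    # Assumed ordering: the length check runs after the loop.
    if n_directions != len(values):
        return "The number of values does not match the number of objectives"
    return None
```

With this sketch, `check_values_sketch(1, [1.0])` returns None, while a NaN, a non-castable value, or a length mismatch returns a message string.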

Why this works:

  • Global namespace lookups involve dictionary operations that are more expensive than direct local variable access
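  • The loop makes two calls per value (one to float(), one to math.isnan()), so the saving applies twice per iteration
  • CPython reads a local variable with LOAD_FAST, which indexes into the frame's local slots, whereas a global or builtin name is read with LOAD_GLOBAL, which may consult the globals and builtins dictionaries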
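The bytecode difference can be inspected with the standard `dis` module. This is a small sketch with hypothetical helper and function names; exact opcode names vary between CPython versions (newer releases may show variants of LOAD_FAST), so the helper only reports which LOAD_* instruction reads a given name.

```python
import dis


def uses_global(values):
    for v in values:
        float(v)  # `float` is resolved by name on every iteration


def uses_local(values):
    float_cast = float  # resolved once, then kept in a local slot
    for v in values:
        float_cast(v)


def load_opcodes(func, name):
    """Return the names of the LOAD_* instructions in `func` that read `name`."""
    found = []
    for ins in dis.get_instructions(func):
        # Some CPython versions fuse two local loads into one instruction,
        # in which case argval is a tuple of names.
        names = ins.argval if isinstance(ins.argval, tuple) else (ins.argval,)
        if ins.opname.startswith("LOAD_") and name in names:
            found.append(ins.opname)
    return found


print(load_opcodes(uses_global, "float"))
print(load_opcodes(uses_local, "float_cast"))
```

On CPython the first line should report LOAD_GLOBAL and the second a LOAD_FAST variant.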

Test case performance patterns:

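  • Small inputs (0-2 values in the generated tests): Results range from 18.2% faster to 35.4% slower; the two extra assignments cost about as much as the lookups they save, and at roughly 1 μs per call the differences are small in absolute terms
  • Large inputs (1000 values): Speedups of 35.9-52.2%, where the loop dominates execution time
  • Early exit cases: With 1000 values and a NaN or non-castable value in the middle, the function returns after half the loop and is 41.5-44.8% faster; a single-NaN input measured 18.2% faster in one generated test and 0.988% slower in the other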

The benefit therefore depends on input length: it is largest when the function validates long sequences of values, and negligible or slightly negative for the short sequences in the small-input tests.
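The dependence on input length can be checked with a small `timeit` comparison. This is a sketch under stated assumptions: the two loop functions are simplified stand-ins rather than the Optuna code, and `best_of` is a hypothetical helper. Absolute timings depend on the machine and Python version, so no expected output is given.

```python
import math
import timeit


def count_nan_global(values):
    count = 0
    for v in values:
        if math.isnan(float(v)):  # two name lookups and one attribute lookup per value
            count += 1
    return count


def count_nan_local(values):
    float_cast = float
    isnan = math.isnan
    count = 0
    for v in values:
        if isnan(float_cast(v)):  # only local reads inside the loop
            count += 1
    return count


def best_of(func, values, repeat=5, number=200):
    """Best per-call time in seconds over `repeat` batches of `number` calls."""
    timings = timeit.repeat(lambda: func(values), repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    for n in (0, 2, 1000):
        values = [float(i) for i in range(n)]
        t_global = best_of(count_nan_global, values)
        t_local = best_of(count_nan_local, values)
        print(f"n={n}: global {t_global * 1e6:.2f} us, local {t_local * 1e6:.2f} us")
```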

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 44 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
from __future__ import annotations

import math
from collections.abc import Sequence

# imports
import pytest
from optuna.study._tell import _check_values_are_feasible


class DummyStudy:
    """A minimal stand-in for optuna.Study, just with a `directions` attribute."""
    def __init__(self, directions):
        self.directions = directions

# unit tests

# ----------------------
# Basic Test Cases
# ----------------------

def test_single_value_match():
    # Single direction, single value, valid float
    study = DummyStudy(['minimize'])
    values = [1.0]
    codeflash_output = _check_values_are_feasible(study, values) # 1.07μs -> 1.14μs (5.90% slower)

def test_multiple_values_match():
    # Multiple directions, matching number of values, all valid floats
    study = DummyStudy(['minimize', 'maximize'])
    values = [0.5, 2.3]
    codeflash_output = _check_values_are_feasible(study, values) # 1.12μs -> 1.16μs (3.18% slower)

def test_integer_values():
    # Integer values should be accepted (can be cast to float)
    study = DummyStudy(['minimize', 'maximize'])
    values = [1, 2]
    codeflash_output = _check_values_are_feasible(study, values) # 1.31μs -> 1.34μs (2.46% slower)


def test_empty_values_and_directions():
    # No objectives, no values should be accepted
    study = DummyStudy([])
    values = []
    codeflash_output = _check_values_are_feasible(study, values) # 817ns -> 1.26μs (35.4% slower)

# ----------------------
# Edge Test Cases
# ----------------------

def test_nan_value():
    # NaN should be rejected
    study = DummyStudy(['minimize'])
    values = [float('nan')]
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 2.83μs -> 2.40μs (18.2% faster)

def test_inf_value():
    # Inf should be accepted (not NaN)
    study = DummyStudy(['minimize'])
    values = [float('inf')]
    codeflash_output = _check_values_are_feasible(study, values) # 998ns -> 1.08μs (7.51% slower)

def test_negative_inf_value():
    # -Inf should be accepted (not NaN)
    study = DummyStudy(['minimize'])
    values = [float('-inf')]
    codeflash_output = _check_values_are_feasible(study, values) # 1.01μs -> 991ns (2.02% faster)

def test_non_float_castable_value():
    # Non-castable string should be rejected
    study = DummyStudy(['minimize'])
    values = ['not_a_number']
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 2.29μs -> 2.58μs (11.2% slower)

def test_none_value():
    # None should be rejected
    study = DummyStudy(['minimize'])
    values = [None]
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 2.33μs -> 2.45μs (5.22% slower)

def test_bool_value():
    # Booleans should be accepted (can be cast to float: True->1.0, False->0.0)
    study = DummyStudy(['minimize', 'maximize'])
    values = [True, False]
    codeflash_output = _check_values_are_feasible(study, values) # 1.50μs -> 1.58μs (4.88% slower)

def test_mismatched_length_fewer_values():
    # Fewer values than directions should be rejected
    study = DummyStudy(['minimize', 'maximize', 'maximize'])
    values = [1.0, 2.0]
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 1.75μs -> 1.78μs (1.85% slower)

def test_mismatched_length_more_values():
    # More values than directions should be rejected
    study = DummyStudy(['minimize'])
    values = [1.0, 2.0]
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 1.55μs -> 1.56μs (0.897% slower)

def test_tuple_values():
    # Tuple as values should work if elements are float-castable
    study = DummyStudy(['minimize', 'maximize'])
    values = (3, 4.5)
    codeflash_output = _check_values_are_feasible(study, values) # 1.33μs -> 1.34μs (0.748% slower)

def test_list_of_lists():
    # List containing non-float-castable list element should be rejected
    study = DummyStudy(['minimize'])
    values = [[1, 2]]
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 3.79μs -> 3.93μs (3.61% slower)

def test_object_value():
    # Custom object should be rejected
    study = DummyStudy(['minimize'])
    class Foo: pass
    values = [Foo()]
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 2.92μs -> 3.41μs (14.2% slower)

def test_empty_values_nonempty_directions():
    # No values, but directions present, should fail
    study = DummyStudy(['minimize'])
    values = []
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 1.13μs -> 1.30μs (12.9% slower)

# ----------------------
# Large Scale Test Cases
# ----------------------

def test_large_valid_values():
    # Large number of directions and values, all valid
    n = 1000
    study = DummyStudy(['minimize'] * n)
    values = [float(i) for i in range(n)]
    codeflash_output = _check_values_are_feasible(study, values) # 44.2μs -> 29.0μs (52.1% faster)

def test_large_nan_in_middle():
    # Large number of values, one NaN in the middle
    n = 1000
    study = DummyStudy(['minimize'] * n)
    values = [float(i) for i in range(n)]
    values[n//2] = float('nan')
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 23.8μs -> 16.4μs (44.8% faster)

def test_large_non_castable_in_middle():
    # Large number of values, one non-castable string in the middle
    n = 1000
    study = DummyStudy(['minimize'] * n)
    values = [float(i) for i in range(n)]
    values[n//2] = "bad"
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 24.0μs -> 17.0μs (41.5% faster)

def test_large_mismatched_length():
    # Large number of directions, fewer values
    n = 1000
    study = DummyStudy(['minimize'] * n)
    values = [float(i) for i in range(n-1)]
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 44.6μs -> 29.9μs (49.1% faster)

def test_large_all_inf_values():
    # Large number of directions, all values are inf
    n = 1000
    study = DummyStudy(['minimize'] * n)
    values = [float('inf')] * n
    codeflash_output = _check_values_are_feasible(study, values) # 43.9μs -> 28.9μs (52.2% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from __future__ import annotations

import math
from collections.abc import Sequence

# imports
import pytest  # used for our unit tests
from optuna.study._tell import _check_values_are_feasible


# Minimal stub for optuna.Study: the function under test only reads `directions`.
# optuna itself must still be installed, since the function is imported from it.
class Study:
    def __init__(self, directions):
        self.directions = directions

# unit tests

# --- Basic Test Cases ---

def test_single_valid_float():
    # Single float, directions match, should pass
    study = Study(["minimize"])
    values = [0.5]
    codeflash_output = _check_values_are_feasible(study, values) # 1.26μs -> 1.22μs (3.78% faster)

def test_multiple_valid_floats():
    # Multiple floats, directions match, should pass
    study = Study(["minimize", "maximize"])
    values = [1.0, -2.5]
    codeflash_output = _check_values_are_feasible(study, values) # 1.14μs -> 1.12μs (1.70% faster)

def test_integer_values():
    # Integers should be castable to float, should pass
    study = Study(["minimize", "maximize"])
    values = [5, 10]
    codeflash_output = _check_values_are_feasible(study, values) # 1.31μs -> 1.35μs (3.25% slower)


def test_zero_value():
    # Zero is a valid float
    study = Study(["minimize"])
    values = [0]
    codeflash_output = _check_values_are_feasible(study, values) # 1.63μs -> 1.77μs (7.74% slower)

# --- Edge Test Cases ---

def test_nan_value():
    # NaN is not acceptable
    study = Study(["minimize"])
    values = [float('nan')]
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 2.20μs -> 2.23μs (0.988% slower)

def test_inf_value():
    # Infinity is a valid float, not rejected by the function
    study = Study(["minimize"])
    values = [float('inf')]
    codeflash_output = _check_values_are_feasible(study, values) # 1.00μs -> 1.05μs (5.02% slower)

def test_negative_inf_value():
    # Negative infinity is a valid float, not rejected by the function
    study = Study(["minimize"])
    values = [float('-inf')]
    codeflash_output = _check_values_are_feasible(study, values) # 959ns -> 1.03μs (7.16% slower)

def test_non_castable_string():
    # String that cannot be cast to float should fail
    study = Study(["minimize"])
    values = ["not_a_float"]
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 2.42μs -> 2.69μs (9.97% slower)

def test_none_value():
    # None cannot be cast to float
    study = Study(["minimize"])
    values = [None]
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 2.25μs -> 2.52μs (10.4% slower)

def test_object_value():
    # Arbitrary object cannot be cast to float
    study = Study(["minimize"])
    values = [object()]
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 2.98μs -> 3.39μs (12.0% slower)

def test_empty_values_and_directions():
    # Both directions and values are empty, should pass
    study = Study([])
    values = []
    codeflash_output = _check_values_are_feasible(study, values) # 612ns -> 835ns (26.7% slower)

def test_mismatched_lengths_more_values():
    # More values than directions
    study = Study(["minimize"])
    values = [1.0, 2.0]
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 1.78μs -> 1.71μs (3.80% faster)

def test_mismatched_lengths_more_directions():
    # More directions than values
    study = Study(["minimize", "maximize"])
    values = [1.0]
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 1.44μs -> 1.45μs (0.622% slower)

def test_tuple_values():
    # Values can be a tuple, not just a list
    study = Study(["minimize", "maximize"])
    values = (1.0, 2.0)
    codeflash_output = _check_values_are_feasible(study, values) # 1.11μs -> 1.06μs (5.49% faster)


def test_multiple_errors_only_first_reported():
    # Only the first error is reported (early return)
    study = Study(["minimize", "maximize"])
    values = ["not_a_float", float('nan')]
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 2.69μs -> 3.15μs (14.6% slower)

def test_multiple_errors_nan_first():
    # If NaN is first, it is reported
    study = Study(["minimize", "maximize"])
    values = [float('nan'), "not_a_float"]
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 2.67μs -> 2.38μs (12.3% faster)

def test_bool_values():
    # Booleans are castable to float (True->1.0, False->0.0)
    study = Study(["minimize", "maximize"])
    values = [True, False]
    codeflash_output = _check_values_are_feasible(study, values) # 1.52μs -> 1.68μs (9.45% slower)

def test_directions_given_as_tuple():
    # directions is a tuple rather than a list; any sequence should work
    study = Study(("minimize", "maximize"))
    values = [1.0, 2.0]
    codeflash_output = _check_values_are_feasible(study, values) # 1.13μs -> 1.16μs (2.51% slower)

# --- Large Scale Test Cases ---

def test_large_number_of_values_and_directions():
    # 1000 directions and 1000 values, all valid
    n = 1000
    study = Study(["minimize"] * n)
    values = [float(i) for i in range(n)]
    codeflash_output = _check_values_are_feasible(study, values) # 43.3μs -> 29.2μs (48.4% faster)

def test_large_number_of_values_with_nan():
    # 999 valid, last is NaN
    n = 1000
    study = Study(["minimize"] * n)
    values = [float(i) for i in range(n-1)] + [float('nan')]
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 44.9μs -> 30.0μs (49.6% faster)

def test_large_number_of_values_with_non_castable():
    # 999 valid, last is non-castable string
    n = 1000
    study = Study(["minimize"] * n)
    values = [float(i) for i in range(n-1)] + ["not_a_float"]
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 45.5μs -> 30.5μs (48.9% faster)

def test_large_mismatched_lengths():
    # 1000 directions, 999 values
    n = 1000
    study = Study(["minimize"] * n)
    values = [float(i) for i in range(n-1)]
    codeflash_output = _check_values_are_feasible(study, values); result = codeflash_output # 44.7μs -> 29.7μs (50.7% faster)


def test_large_all_integers():
    # 1000 directions, 1000 integer values
    n = 1000
    study = Study(["minimize"] * n)
    values = list(range(n))
    codeflash_output = _check_values_are_feasible(study, values) # 55.8μs -> 41.1μs (35.9% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
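The comparison that comment refers to can be pictured as follows. This is only an illustrative sketch, not Codeflash's actual mechanism: `outcome`, `behaves_the_same`, and the two `has_nan_*` functions are hypothetical stand-ins. Each call is reduced to either its return value or the type of exception it raised, and the two implementations must agree on every case.

```python
import math


def outcome(func, *args):
    """Capture what a call did: its return value or the type of exception raised."""
    try:
        return ("returned", func(*args))
    except Exception as exc:
        return ("raised", type(exc).__name__)


def behaves_the_same(original, optimized, cases):
    """True if both callables produce the same outcome for every argument tuple."""
    return all(outcome(original, *case) == outcome(optimized, *case) for case in cases)


# Hypothetical stand-ins for an original and an optimized implementation.
def has_nan_original(values):
    return any(math.isnan(float(v)) for v in values)


def has_nan_optimized(values):
    float_cast, isnan = float, math.isnan
    for v in values:
        if isnan(float_cast(v)):
            return True
    return False


cases = [([1.0, 2.0],), ([float("nan")],), ([],), (["bad"],), ([None],)]
print(behaves_the_same(has_nan_original, has_nan_optimized, cases))
```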

To edit these changes, git checkout codeflash/optimize-_check_values_are_feasible-mhayvoza and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 28, 2025 19:36
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels Oct 28, 2025