⚡️ Speed up function get_max_res_without_distortion by 35%
#98
📄 35% (0.35x) speedup for `get_max_res_without_distortion` in `src/transformers/models/llama4/image_processing_llama4_fast.py`

⏱️ Runtime: 112 microseconds → 83.0 microseconds (best of 202 runs)

📝 Explanation and details
The optimized code replaces expensive floating-point operations with faster integer arithmetic. The key optimization is eliminating the `min(math.floor(original_height * scale_w), target_height)` and `min(math.floor(original_width * scale_h), target_width)` calls.

Specific changes:
- `math.floor(original_height * scale_w)` becomes `(original_height * target_width) // original_width`
- `math.floor(original_width * scale_h)` becomes `(original_width * target_height) // original_height`
- `min()` calls are replaced with conditional expressions using `if-else`

Why this is faster:
- Integer division (`//`) is significantly faster than float multiplication + `math.floor()`
- Working in integers eliminates floating-point precision overhead
- The `math.floor()` and `min()` function calls are removed
- A conditional expression (`a if a < b else b`) is faster than a `min(a, b)` function call

The line profiler shows that the original `min(math.floor(...))` lines consumed 30.7% of total runtime; they are now split into simpler integer operations consuming only 19.8% combined.

Test case performance: The optimization shows a consistent 30-60% speedup across all test cases, with particularly strong gains on:
The mathematical equivalence is preserved: `math.floor(a * (b/c))` equals `(a * b) // c` for positive integers in exact arithmetic, so behavior is maintained while using faster integer arithmetic (the float form can differ only where floating-point rounding intervenes).

✅ Correctness verification report:
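The equivalence claimed above can be spot-checked with a small sketch. The helper names `new_height_float`, `new_height_int`, and `count_mismatches` are hypothetical and are not taken from the PR. They isolate only the height expression quoted in the description, not the full function. Because the float form is subject to rounding, the sketch counts any disagreements over a grid instead of assuming there are none.

```python
import math


def new_height_float(original_height, original_width, target_width, target_height):
    """Original form: float scale factor, math.floor(), then clamp with min()."""
    scale_w = target_width / original_width
    return min(math.floor(original_height * scale_w), target_height)


def new_height_int(original_height, original_width, target_width, target_height):
    """Optimized form: integer floor division, then clamp with a conditional."""
    scaled = (original_height * target_width) // original_width
    return scaled if scaled < target_height else target_height


def count_mismatches(limit, target_height=10**9):
    """Count (height, width, target_width) triples where the two forms disagree.

    target_height is set very large so the clamp never hides a difference.
    """
    mismatches = 0
    for h in range(1, limit):
        for w in range(1, limit):
            for tw in range(1, limit):
                a = new_height_float(h, w, tw, target_height)
                b = new_height_int(h, w, tw, target_height)
                if a != b:
                    mismatches += 1
    return mismatches


if __name__ == "__main__":
    # A typical case: scaling a 800x600 (height x width) image to width 300.
    print(new_height_float(800, 600, 300, 1000), new_height_int(800, 600, 300, 1000))
    # Report how often the two forms disagree on a small grid of positive integers.
    print("mismatches on grid:", count_mismatches(60))
```

Any nonzero mismatch count would come from rounding in the float path, since the integer path computes the exact floor of the quotient.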
🌀 Generated Regression Tests and Runtime
To edit these changes, `git checkout codeflash/optimize-get_max_res_without_distortion-mhjqprdx` and push.