⚡️ Speed up method ImportAnalyzer._fast_generic_visit by 44% in PR #867 (inspect-signature-issue)
#880
⚡️ This pull request contains optimizations for PR #867
If you approve this dependent PR, these changes will be merged into the original PR branch inspect-signature-issue.
📄 44% (0.44x) speedup for ImportAnalyzer._fast_generic_visit in codeflash/discovery/discover_unit_tests.py
⏱️ Runtime: 1.09 milliseconds → 756 microseconds (best of 27 runs)
📝 Explanation and details
The optimization converts the recursive AST traversal from a call-stack based approach to an iterative one using a manual stack, delivering a 44% performance improvement.
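The move from recursion to an explicit stack can be sketched roughly as below. This is a minimal illustration, not the code in this PR: iter_nodes is a hypothetical helper name, and it only enumerates nodes rather than dispatching to visit_* methods.

```python
import ast


def iter_nodes(root):
    """Yield every node under root in depth-first pre-order, without recursion."""
    stack = [root]
    # Cache the bound list methods in locals so the loop avoids repeated
    # attribute lookups on every iteration.
    pop = stack.pop
    extend = stack.extend
    while stack:
        node = pop()
        yield node
        # Push children reversed so the leftmost child is popped first.
        extend(reversed(list(ast.iter_child_nodes(node))))


tree = ast.parse("import os\nx = os.path.join('a', 'b')\n")
names = [type(n).__name__ for n in iter_nodes(tree)]
```

The traversal visits the same set of nodes as ast.walk, only in depth-first rather than breadth-first order, and its depth is bounded by the list on the heap instead of the interpreter's call stack.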
Key optimizations applied:
Stack-based iteration replaces recursion: The original code used recursive calls to _fast_generic_visit() and meth() for AST traversal. The optimized version uses a manual stack with while loop iteration, eliminating function call overhead and stack frame management costs.
Faster method resolution: Replaced getattr(self, "visit_" + classname, None) with type(self).__dict__.get("visit_" + classname), which is significantly faster for method lookup. The class dictionary lookup avoids the more expensive attribute resolution pathway.
Local variable caching: Pre-cached frequently accessed attributes like stack.append, stack.pop, and type(self).__dict__ into local variables to reduce repeated attribute lookups during the tight inner loop.
Why this leads to speedup:
.get() is ~2-3x faster than getattr() for method lookups, especially important since this happens for every AST node visited.
Performance characteristics from test results:
The optimization shows variable performance depending on AST structure:
This optimization is particularly valuable for AST analysis tools that process large codebases, where the cumulative effect of faster traversal becomes significant.
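One consequence of the class-dictionary lookup is worth illustrating. The sketch below is an assumption-laden toy, not the ImportAnalyzer implementation: Analyzer, SubAnalyzer, and run are hypothetical names. It shows that type(self).__dict__.get(...) returns a plain function (so self must be passed explicitly) and that it only sees methods defined directly on the instance's own class, not inherited ones.

```python
import ast


class Analyzer:
    """Toy visitor (hypothetical) that dispatches via the class dict."""

    def __init__(self):
        self.imports = []

    def visit_Import(self, node):
        self.imports.extend(alias.name for alias in node.names)

    def run(self, root):
        # Look up handlers in the class's own __dict__ only; this skips
        # the instance dict and the MRO walk that getattr would perform.
        handlers = type(self).__dict__
        stack = [root]
        while stack:
            node = stack.pop()
            handler = handlers.get("visit_" + type(node).__name__)
            if handler is not None:
                # The dict holds a plain function, so pass self explicitly.
                handler(self, node)
            stack.extend(ast.iter_child_nodes(node))


class SubAnalyzer(Analyzer):
    """Inherits visit_Import, but its own __dict__ does not contain it."""


tree = ast.parse("import os, sys\n")

base = Analyzer()
base.run(tree)

sub = SubAnalyzer()
sub.run(tree)
```

Here base.imports ends up populated while sub.imports stays empty, because the subclass dictionary has no visit_Import entry even though getattr would find the inherited method. The faster lookup is therefore a safe substitute only when the visitor class defines all its visit_* methods itself and is not subclassed.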
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, git checkout codeflash/optimize-pr867-2025-11-05T09.44.48 and push.