@MissionWAR
Owner

Summary

This PR includes improvements across three areas:

Features

  • New --json-stats <path> flag for machine-readable statistics output
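As a rough illustration of what such an export could look like, here is a minimal sketch. The function name `save_stats_json` comes from this PR, but the signature, the payload key names, and the `started` parameter are assumptions made for the example, not the project's actual code.

```python
import json
import time
from datetime import datetime, timezone
from pathlib import Path

VERSION = "1.3.0"  # matches the version bump in this PR


def save_stats_json(stats: dict, path: str, started: float) -> None:
    """Write run statistics plus metadata to `path` as JSON.

    Hypothetical signature: `started` is a time.perf_counter() reading
    taken when the run began.
    """
    payload = {
        "version": VERSION,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "execution_time_seconds": round(time.perf_counter() - started, 3),
        "stats": stats,  # all collected statistics, unchanged
    }
    Path(path).write_text(json.dumps(payload, indent=2), encoding="utf-8")
```

A CI job could then read the file with any JSON tool and fail the build if a counter moves outside an expected range.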

Performance

  • Inline duplicate pruning for other_rules (reduces memory)
  • Use sys.intern() for domain strings
  • Fast-path optimizations (early return for empty modifiers, early exit on empty lines)
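The interning and fast-path ideas can be sketched as follows. `should_prune_by_modifiers`, `EMPTY_FROZENSET`, and `UNSUPPORTED_MODIFIERS` are names from this PR; the function bodies, the modifier values, and the helper `intern_domain` are assumptions for illustration only.

```python
import sys

# Shared empty set, so rules without modifiers do not allocate a new one.
EMPTY_FROZENSET: frozenset = frozenset()

# Hypothetical subset; the project's real list is larger and categorized.
UNSUPPORTED_MODIFIERS: frozenset = frozenset({"redirect", "removeparam"})


def intern_domain(domain: str) -> str:
    """Return the single shared copy of an equal domain string."""
    return sys.intern(domain)


def should_prune_by_modifiers(modifiers: frozenset) -> bool:
    """Assumed semantics: prune when any modifier is unsupported."""
    # Fast path: rules with no modifiers skip the set comparison entirely.
    if not modifiers:
        return False
    return not modifiers.isdisjoint(UNSUPPORTED_MODIFIERS)
```

Interned strings that compare equal are the same object, so repeated domains are stored once and dictionary lookups can often short-circuit on identity.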

Code Quality

  • Type aliases and TypedDict for better type safety
  • Comprehensive docstrings
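For readers unfamiliar with these typing tools, a small sketch follows. `RuleEntry`, `WildcardEntry`, and `LRU_CACHE_SIZE` are names from this PR; their shapes and values, along with `CleanerStats` and `empty_stats`, are hypothetical stand-ins.

```python
from typing import Final, FrozenSet, Tuple, TypedDict

# Illustrative value, not the project's actual cache size.
LRU_CACHE_SIZE: Final[int] = 4096

# Assumed shapes: the PR names these aliases but does not list their fields.
RuleEntry = Tuple[str, FrozenSet[str]]   # (rule text, modifiers)
WildcardEntry = Tuple[str, str]          # (pattern, original rule)


class CleanerStats(TypedDict):
    """Hypothetical stats dictionary with fixed, typed keys."""
    lines_read: int
    rules_kept: int
    duplicates_removed: int


def empty_stats() -> CleanerStats:
    """Return a zeroed stats dictionary that a type checker can verify."""
    return {"lines_read": 0, "rules_kept": 0, "duplicates_removed": 0}
```

With a `TypedDict`, a type checker flags a misspelled key such as `stats["rule_kept"]` before the code runs, which a plain `dict` annotation would not catch.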

- Add type aliases (RuleEntry, WildcardEntry) for complex tuple types
- Add TypedDict for stats dictionaries in cleaner.py and pipeline.py
- Add named constants (LRU_CACHE_SIZE) for better readability
- Add Final type hints for all module-level constants
- Add __all__ exports in scripts/__init__.py
- Add comprehensive docstrings with examples for all public functions
- Reorganize UNSUPPORTED_MODIFIERS by category with inline docs
- Bump version to 1.3.0
- Use sys.intern() for domain strings (memory dedup, faster lookups)
- Add EMPTY_FROZENSET constant to avoid repeated allocations
- Add fast-path in should_prune_by_modifiers() for empty modifiers (~90% of calls)
- Use walrus operator for early exit on empty lines
- Use tuple form for startswith() calls (one call instead of several checks joined with `or`)
- Restructure extract_abp_info() for early return on no modifiers
- New save_stats_json() function to export stats as JSON
- JSON includes version, timestamp, execution time, and all statistics
- Useful for CI/CD integration and monitoring
- Changed other_rules from list to set for O(1) duplicate checks during parse
- Removed separate Phase 4 deduplication step (reduces memory usage)
- Same output, lower peak memory, slightly faster
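The parse-time changes above (set-based deduplication, the walrus early exit, and the tuple form of `startswith()`) can be sketched together. The function `collect_other_rules` and the prefixes in `SKIP_PREFIXES` are hypothetical; only `other_rules` is a name from this PR.

```python
from typing import Iterable, Set

# Hypothetical prefixes for lines to skip (comments and list headers).
SKIP_PREFIXES = ("!", "[")


def collect_other_rules(lines: Iterable[str]) -> Set[str]:
    """Collect unique rules in one pass, with no separate dedup phase."""
    other_rules: Set[str] = set()
    for raw in lines:
        # Walrus operator: strip once and skip empty lines immediately.
        if not (line := raw.strip()):
            continue
        # Tuple form: a single startswith() call tests every prefix.
        if line.startswith(SKIP_PREFIXES):
            continue
        # Adding to a set drops duplicates in O(1) average time.
        other_rules.add(line)
    return other_rules
```

Because duplicates never accumulate, peak memory stays proportional to the number of unique rules. A set does not preserve insertion order, so this sketch assumes the output is sorted or otherwise order-independent later in the pipeline.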
@MissionWAR merged commit 57a0e63 into main on Jan 7, 2026
1 check passed
2 participants