feat(evolution): add IC-based factor deduplication penalty #38
Merged
Add ic_correlation_penalty parameter to compute_fitness() that penalizes strategies whose active factors are highly correlated (measuring the same signal). This complements the existing HHI-based factor_diversity_bonus which only checks weight concentration, not actual factor similarity.

- scoring.py: ic_correlation_penalty multiplier (0.7x–1.15x)
  - corr > 0.7: penalty down to 0.7x at corr=1.0
  - corr < 0.3: bonus up to 1.15x at corr=0.0
  - [0.3, 0.7]: neutral (1.0x)
- auto_evolve.py: lightweight IC correlation computation in evaluate()
  - Samples 20 stocks × 50 dates from pre-computed indicators
  - Cached per generation via gen_seed + active factor set
- tests/test_ic_dedup.py: 36 tests covering penalty math, edge cases, caching, backward compatibility, and Pearson correlation helper

Co-Authored-By: Claude Opus 4.6 <[email protected]>
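The three bands in the commit message can be read as a piecewise multiplier. The sketch below is illustrative only: the function name `ic_correlation_multiplier` is hypothetical, linear interpolation inside the penalty and bonus bands is an assumption (the commit message gives only the endpoints), and the input is assumed to be an average correlation already expressed in [0, 1].

```python
def ic_correlation_multiplier(avg_corr: float) -> float:
    """Hypothetical helper, not the PR's actual code.

    Maps an average factor correlation to a fitness multiplier using the
    endpoints stated in the commit message. Linear interpolation between
    the endpoints is an assumption.
    """
    # Assumption: clamp to [0, 1]; the PR does not say how values outside
    # this range are handled.
    corr = min(max(avg_corr, 0.0), 1.0)

    if corr > 0.7:
        # Penalty band: 1.0x at corr=0.7, falling to 0.7x at corr=1.0.
        return 1.0 - 0.3 * (corr - 0.7) / 0.3
    if corr < 0.3:
        # Bonus band: 1.0x at corr=0.3, rising to 1.15x at corr=0.0.
        return 1.0 + 0.15 * (0.3 - corr) / 0.3
    # Neutral band [0.3, 0.7].
    return 1.0


if __name__ == "__main__":
    for c in (0.0, 0.15, 0.5, 0.85, 1.0):
        print(f"corr={c:.2f} -> multiplier={ic_correlation_multiplier(c):.3f}")
```

Under these assumptions the multiplier is continuous at both band edges (1.0x at corr=0.3 and at corr=0.7), so small changes in correlation never cause a jump in fitness.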
P0: IC-Based Factor Deduplication
Add Information Coefficient (IC) correlation as a fitness adjustment factor to prevent the GA from converging on redundant, highly-correlated factor combinations.
What it does
Why it matters
The existing HHI-based `factor_diversity_bonus` only checks weight concentration (whether weights are spread across many factors). It does not check whether the factors themselves are redundant: two perfectly correlated factors with equal weights would get a diversity bonus despite being functionally identical.

IC dedup addresses this: factors with similar signals get penalized, pushing the GA toward genuinely diverse strategies.
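To make the gap concrete, here is a small sketch contrasting the two measures. All names (`hhi`, `pearson`, `mean_abs_pairwise_corr`, the factor labels, and the sample values) are hypothetical, and the exact series the PR correlates (per-factor IC values or raw factor values) is not specified here, so this only illustrates the idea.

```python
from itertools import combinations
from math import sqrt


def hhi(weights):
    """Herfindahl-Hirschman index of normalized weights (lower = more spread)."""
    total = sum(weights)
    return sum((w / total) ** 2 for w in weights)


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 for a constant series (assumption)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def mean_abs_pairwise_corr(series_by_factor):
    """Average |correlation| over all pairs of factor series."""
    pairs = list(combinations(series_by_factor.values(), 2))
    if not pairs:
        return 0.0
    return sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs)


# Two made-up factors: the second is just the first scaled by 2,
# so they carry the same signal.
factors = {
    "factor_a": [0.1, 0.4, 0.2, 0.5, 0.3],
    "factor_b": [0.2, 0.8, 0.4, 1.0, 0.6],
}
weights = [0.5, 0.5]

print("HHI:", hhi(weights))  # lowest possible for two factors
print("mean |corr|:", mean_abs_pairwise_corr(factors))  # close to 1
```

With equal weights the HHI sits at its two-factor minimum, so a weight-based bonus sees a well-diversified strategy, while the pairwise correlation shows the two factors are effectively the same signal.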
Implementation
- `stratevo/evolution/scoring.py`: New `ic_correlation_penalty` param in `compute_fitness()`
- `stratevo/evolution/auto_evolve.py`: `_compute_ic_correlation()` method with efficient sampling (20 stocks x 50 dates) and per-generation caching
- `tests/test_ic_dedup.py`: 36 comprehensive tests (penalty math, diverse vs redundant, Pearson edge cases, caching, backward compat)

Testing