Challenges in Model Selection and Validation
Currently, successful training (runs that produce models with meaningful trading behaviors) is relatively rare. We therefore retrain repeatedly and search for good runs manually.
Even when we find a model that:
- Achieves good metrics (such as a low p-value) on the training and validation sets, and
- Passes final testing with promising performance,

there is a critical risk: by retraining many times and keeping only the seemingly good runs, we may end up with a model that looks great by sheer luck, even though no real learning occurred.
This is especially true when:
- You repeat training/selection many times.
- You select models based on small p-values or other metrics.
- You don't sufficiently adjust for the number of attempts (multiple comparisons).
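To see how easily this happens, here is a minimal simulation sketch. The retrial count, the noise-only return model, and the one-sided t-test are illustrative assumptions, not the project's actual pipeline: every "model" below produces pure-noise returns, yet picking the best of many runs still yields an impressively small validation p-value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
N_RETRIALS, N_DAYS = 100, 250  # hypothetical retraining attempts / series length

# Under the null, every "model" yields pure-noise daily returns: no edge at all.
val_p = [
    stats.ttest_1samp(rng.normal(0.0, 0.01, N_DAYS), 0.0,
                      alternative="greater").pvalue
    for _ in range(N_RETRIALS)
]
print(f"min validation p-value over {N_RETRIALS} null runs: {min(val_p):.4f}")
# Typically around 0.01: cherry-picking the best of many runs manufactures
# "significant" models even though no real learning occurred.

# An independent test set exposes the fluke: the selected run's returns on
# fresh data are just another noise draw.
p_test = stats.ttest_1samp(rng.normal(0.0, 0.01, N_DAYS), 0.0,
                           alternative="greater").pvalue
print(f"same 'best' model on fresh test data: p = {p_test:.4f}")
```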
To reduce this risk:
- Use train, validation, and test sets properly.
- After selecting models based on train/validation performance, evaluate p-values on the held-out test set.
- Require strong performance on all splits, not just train/validation.
- Limit the number of retrials, or adjust p-values to account for multiple testing (see the correction sketch after this list).
- Explore Bayesian model selection or cross-validation techniques to better quantify uncertainty (a walk-forward cross-validation sketch also follows below).
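As a concrete example of the multiple-testing adjustment mentioned above, a Holm (or Bonferroni) correction can be applied to the validation p-values collected across retrials. This sketch assumes statsmodels is available; the p-values themselves are hypothetical:

```python
from statsmodels.stats.multitest import multipletests

# Hypothetical validation p-values from five retraining attempts.
pvals = [0.004, 0.03, 0.08, 0.20, 0.45]

# Holm's step-down method controls the family-wise error rate across all
# attempts; method="bonferroni" is a simpler, stricter alternative.
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="holm")
for p, pa, r in zip(pvals, p_adj, reject):
    print(f"raw p = {p:.3f} -> adjusted p = {pa:.3f}, significant: {r}")
```

Only runs that stay significant after adjustment deserve further testing.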
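For the cross-validation suggestion, a walk-forward scheme such as scikit-learn's TimeSeriesSplit yields several out-of-sample estimates instead of a single, possibly lucky, split. The data here are random placeholders for illustration:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 8))   # placeholder daily feature matrix
y = rng.normal(size=500)        # placeholder next-day return targets

# Each fold trains only on the past and validates on the window that
# immediately follows, preserving temporal order.
tscv = TimeSeriesSplit(n_splits=5)
for fold, (train_idx, val_idx) in enumerate(tscv.split(X)):
    # Fit and score the candidate model here; the spread of per-fold scores
    # quantifies how unstable the apparent performance is.
    print(f"fold {fold}: train [0..{train_idx[-1]}], val [{val_idx[0]}..{val_idx[-1]}]")
```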
While model selection is necessary given the current instability of learning, we must always bear in mind that the "best" model found might be nothing more than a statistical fluke. Robust validation and conservative interpretation of results are crucial.