This project explores the application of logistic regression to credit scoring, using data from Kaggle’s Home Credit Default Risk competition. The focus is on building a simple and interpretable Probability of Default (PD) model, suitable for classroom instruction and for demonstrating threshold-based tradeoffs in binary classification.
The notebook is structured as a complete teaching module, with well-labeled sections, model interpretation, and visualizations. Code blocks are annotated in an appendix that links back to the main notebook content.
This project is designed for students and early-career analysts interested in:
- Modeling PD using logistic regression with imbalanced classes
- Understanding how threshold choice affects precision and recall
- Using ROC curves and confusion matrices for evaluation
- Practicing real-world model building from public credit data
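
The threshold tradeoff listed above can be illustrated with a minimal sketch. It does not reproduce the notebook's code: the scores and labels are hypothetical values made up for illustration (not drawn from the Home Credit data), and the helper names `confusion_counts` and `precision_recall` are assumptions introduced here. It uses plain Python only and treats any applicant with a PD at or above the threshold as a predicted default.

```python
# Hypothetical predicted PDs and true outcomes (1 = default, 0 = repaid).
# These ten values are invented for illustration only.
pd_scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.45, 0.60, 0.70, 0.90]
defaulted = [0, 0, 0, 0, 1, 0, 0, 1, 0, 1]


def confusion_counts(scores, labels, threshold):
    """Return (tp, fp, fn, tn), flagging PD >= threshold as a predicted default."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label == 1:
            tp += 1
        elif flagged and label == 0:
            fp += 1
        elif not flagged and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(scores, labels, threshold):
    """Return (precision, recall) for the default class at a given threshold."""
    tp, fp, fn, _ = confusion_counts(scores, labels, threshold)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    for threshold in (0.25, 0.50, 0.80):
        p, r = precision_recall(pd_scores, defaulted, threshold)
        print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}")
```

On this toy data, raising the threshold from 0.25 to 0.80 moves precision from 0.50 to 1.00 while recall falls from 1.00 to about 0.33, which is the tradeoff the notebook explores on real data.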
| Section | Description |
|---|---|
| 1 | Introduction and Teaching Context |
| 2 | Dataset Overview and Pedagogical Considerations |
| 3 | Building and Interpreting the Credit Scoring Model |
| 4 | Conclusion and Teaching Takeaways |
| Appendix A | Code blocks annotated by section for modular use |
- Transparent logistic model structure
- Threshold tuning with clear recall/precision tradeoffs
- Probability plots to simulate loan approval cutoff scenarios
- ROC curve and AUC for performance tracking
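
As a rough illustration of the last two items, the sketch below computes AUC from its ranking definition (the chance that a randomly chosen defaulter receives a higher PD than a randomly chosen non-defaulter) and simulates a loan approval cutoff. The data are hypothetical, and the function names `auc_by_ranking` and `approval_summary` are assumptions made for this example, not names from the notebook. In practice a library routine would normally be used for AUC; the pairwise loop here is only meant to make the definition concrete.

```python
# Hypothetical predicted PDs and true outcomes (1 = default, 0 = repaid).
# Invented for illustration; not taken from the Home Credit data.
pd_scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.45, 0.60, 0.70, 0.90]
defaulted = [0, 0, 0, 0, 1, 0, 0, 1, 0, 1]


def auc_by_ranking(scores, labels):
    """AUC as the share of (defaulter, non-defaulter) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def approval_summary(scores, labels, cutoff):
    """Approve applicants with PD < cutoff.

    Returns (approval_rate, default_rate_among_approved).
    """
    approved = [y for s, y in zip(scores, labels) if s < cutoff]
    approval_rate = len(approved) / len(scores)
    default_rate = sum(approved) / len(approved) if approved else 0.0
    return approval_rate, default_rate


if __name__ == "__main__":
    print(f"AUC = {auc_by_ranking(pd_scores, defaulted):.3f}")
    for cutoff in (0.25, 0.50):
        rate, bad = approval_summary(pd_scores, defaulted, cutoff)
        print(f"cutoff={cutoff:.2f}  approved={rate:.0%}  defaults among approved={bad:.1%}")
```

A stricter cutoff approves fewer applicants but lowers the default rate among those approved, which is the business-side view of the same precision/recall tradeoff.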
Sergey Kharitonov. (2018). Home Credit Default Risk. Kaggle.
https://www.kaggle.com/competitions/home-credit-default-risk
This project is released under the MIT License and intended for educational and non-commercial use.
This project was conducted in a personal capacity as an independent researcher. All views, analyses, and interpretations presented here are my own and do not represent the views of any current or former employer.
If you find this notebook helpful, feel free to fork the repository or give it a ⭐️.
Feedback and suggestions are always welcome, and contributions that improve the educational value are especially appreciated.