Welcome to the repository for "Computational Reproducibility in Machine Learning: A Hands-On Workshop." This repository contains the hands-on demonstrations for workshop attendees.
Computational reproducibility, defined as the ability to consistently recreate the results of a computational analysis using the same data, code, and methods, is a cornerstone of reliable and impactful research. In this 2.5-hour workshop, participants explore the principles of computational reproducibility and learn why it is essential for avoiding setbacks in their own projects, accelerating research progress, and enhancing recognition in their field.
The first version of this workshop was given at:
Data Science in Multi-Messenger Astrophysics Program
University of Minnesota
February 25, 2025
By the end of the workshop, participants will be able to:
- Understand the definition, importance, and challenges of computational reproducibility.
- Learn best practices for managing randomness in computational experiments.
- Gain practical experience with tools and techniques for reproducibility.
- Structure projects for reproducibility using Git and GitHub.
- Manage dependencies with virtual environments and containerize workflows with Docker.
- Assign Digital Object Identifiers (DOIs) to make research outputs citable and publicly accessible.
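As a small taste of the "managing randomness" objective, here is a minimal sketch that uses only Python's standard library. The function name `run_experiment` and the seed values are hypothetical and are not taken from the workshop materials; the point is only that seeding a dedicated generator makes a stochastic computation repeatable.

```python
import random


def run_experiment(seed, n_samples=5):
    """Hypothetical stand-in for a stochastic experiment."""
    # Use a dedicated generator instead of the module-level one, so that
    # other code calling random.seed() or random.random() cannot change
    # the numbers this experiment draws.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_samples)]


# Re-running with the same seed reproduces the same draws.
first = run_experiment(seed=2025)
second = run_experiment(seed=2025)

# A different seed gives a different sequence.
different = run_experiment(seed=2026)

print("same seed reproduces results:", first == second)
print("different seed changes results:", first != different)
```

Recording the seed alongside the results (for example, in a config file or a log) is what lets someone else rerun the exact same experiment later. Libraries such as NumPy or PyTorch keep their own random state, which would need to be seeded separately.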
This workshop is based on the paper:
[1] J. Shenouda and W. U. Bajwa, “A guide to computational reproducibility in signal processing and machine learning,” IEEE Signal Processing Magazine, vol. 40, no. 2, pp. 141–151, Mar. 2023, doi: 10.1109/MSP.2022.3217659.