
This repository contains all documents relevant to the Machine Learning courses of the Erasmus+ network.


Introduction

Probability theory and statistical methods play a central role in science. Nowadays we are surrounded by huge amounts of data. For example, there are about one trillion web pages; more than one hour of video is uploaded to YouTube every second, amounting to years of content every day; and the genomes of thousands of people, each more than a billion base pairs long, have been sequenced by various labs. This deluge of data calls for automated methods of data analysis, which is exactly what machine learning aims to provide.

Learning outcomes

This course aims at giving you insights and knowledge about many of the central algorithms used in Data Analysis and Machine Learning. The course is project based and, through various numerical projects (normally three), you will be exposed to fundamental research problems in these fields, with the aim of reproducing state-of-the-art scientific results. Both supervised and unsupervised methods will be covered. You will learn to develop and structure large codes for studying the different systems to which Machine Learning is applied, get acquainted with computing facilities and learn to handle large scientific projects. Good scientific and ethical conduct is emphasized throughout the course. More specifically, after this course you will

  • Learn about basic data analysis, data optimization and machine learning;
  • Be capable of extending the acquired knowledge to other systems and cases;
  • Have an understanding of central algorithms used in data analysis and machine learning;
  • Understand linear methods for regression and classification, from ordinary least squares, via Lasso and Ridge, to Logistic regression (see the sketch following this list);
  • Learn about various neural networks and deep learning methods for supervised and unsupervised learning;
  • Learn about decision trees, random forests and boosting;
  • Learn about support vector machines and kernel transformations;
  • Learn about reduction of data sets;
  • Work on numerical projects to illustrate the theory. The projects play a central role and you are expected to know modern programming languages like Python or C++.
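
To give a concrete flavour of the linear methods mentioned above, here is a minimal sketch that fits ordinary least squares and Ridge regression to synthetic polynomial data. The use of scikit-learn, the data-generating function, the polynomial degree and the regularization strength are illustrative assumptions, not part of the course material.

```python
# Minimal sketch: ordinary least squares versus Ridge regression.
# The synthetic data, polynomial degree and alpha value are illustrative
# assumptions, not taken from the course material.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.preprocessing import PolynomialFeatures
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(2020)
x = np.linspace(0, 1, 100).reshape(-1, 1)
y = 2.0 + 3.0 * x[:, 0] - 5.0 * x[:, 0] ** 2 + 0.1 * rng.standard_normal(100)

# Build a polynomial design matrix from the single input feature
X = PolynomialFeatures(degree=5, include_bias=False).fit_transform(x)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1e-3).fit(X, y)   # alpha is the regularization strength

print("OLS   MSE:", mean_squared_error(y, ols.predict(X)))
print("Ridge MSE:", mean_squared_error(y, ridge.predict(X)))
```

Lasso regression can be tried in the same way by swapping in sklearn.linear_model.Lasso.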

The course has two central parts

  1. Statistical analysis and optimization of data
  2. Machine learning

These topics will be scattered throughout the course and may not necessarily be taught separately. Rather, we will often take an integrated approach during the lectures and project/exercise sessions, where elements from statistical data analysis are mixed with specific Machine Learning algorithms.

Statistical analysis and optimization of data

The following topics will be covered

  • Basic concepts, expectation values, variance, covariance, correlation functions and errors;
  • Simpler models, binomial distribution, the Poisson distribution, simple and multivariate normal distributions;
  • Gradient methods for data optimization
  • Linear methods for regression and classification;
  • Estimation of errors using cross-validation, blocking, bootstrapping and jackknife methods;
  • Practical optimization using singular-value decomposition and least squares for parameterizing data (see the sketch below).
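
As a small taste of the last item, the sketch below parameterizes synthetic data with a least-squares fit computed through the singular-value decomposition in NumPy. The data, the quadratic design matrix and the random seed are illustrative choices, not taken from the course material.

```python
# Minimal sketch: least-squares fit via the singular-value decomposition (SVD).
# The synthetic data and the quadratic design matrix are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(-1, 1, 50)
y = 1.0 - 2.0 * x + 0.5 * x**2 + 0.05 * rng.standard_normal(x.size)

# Design matrix for a second-order polynomial fit
A = np.column_stack([np.ones_like(x), x, x**2])

# Solve min ||A beta - y|| using the economy-size SVD of A
U, s, Vt = np.linalg.svd(A, full_matrices=False)
beta = Vt.T @ ((U.T @ y) / s)

print("fitted coefficients:", beta)          # close to [1.0, -2.0, 0.5]
print("numpy lstsq check:  ", np.linalg.lstsq(A, y, rcond=None)[0])
```

Solving the normal equations directly can be numerically fragile for ill-conditioned design matrices; the SVD route above is the standard, more robust alternative.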

Machine learning

The following topics will be covered

  • Linear Regression and Logistic Regression;
  • Neural networks and deep learning;
  • Decision trees, random forests, boosting and bagging;
  • Support vector machines;
  • Dimensionality reduction, mainly Principal Component Analysis (see the sketch below).
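
To illustrate the last bullet, here is a minimal sketch of Principal Component Analysis with scikit-learn. The iris data set, the standard scaling step and the choice of two components are illustrative assumptions only.

```python
# Minimal sketch: dimensionality reduction with Principal Component Analysis.
# The iris data set and the choice of two components are illustrative only.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)   # PCA is sensitive to feature scales

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)

print("original shape:", X.shape)              # (150, 4)
print("reduced shape: ", X_reduced.shape)      # (150, 2)
print("explained variance ratio:", pca.explained_variance_ratio_)
```

The explained-variance ratios indicate how much of the total variance each retained component captures, which is the usual guide for choosing the number of components.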

Hands-on demonstrations, exercises and projects aim at deepening your understanding of these topics.

Computational aspects play a central role and you are expected to work on numerical examples and projects which illustrate the theory and methods. We strongly recommend forming small project groups of 2-3 participants.

Prerequisites

Basic knowledge of programming and mathematics, with an emphasis on linear algebra. Knowledge of Python and/or C++ as programming languages is strongly recommended, and experience with Jupyter notebooks is recommended as well.

Practicalities

  1. Lectures are in the morning, from 10am-12pm.
  2. Four hours of laboratory sessions for work on computational projects, from 2pm to 6pm;
  3. Lectures and lab sessions will all be at GANIL, starting January 20 at 9am.
  4. Grading scale: Grades are awarded on a scale from A to F, where A is the best grade and F is a fail. We aim at having two projects to be handed in. These will be graded and should be finalized no later than two weeks after the course is over. Each project counts for 50% of the final grade. We plan to make the grades available no later than March 1, and hopefully earlier.

Lecture material

The link https://compphysics.github.io/MLErasmus/doc/web/course.html gives you direct access to the learning material, with lecture slides and Jupyter notebooks. Videos of the lectures will be added.

Possible textbooks

Recommended textbooks:

  • Trevor Hastie, Robert Tibshirani, Jerome H. Friedman, The Elements of Statistical Learning, Springer
  • Aurelien Geron, Hands‑On Machine Learning with Scikit‑Learn and TensorFlow, O'Reilly

General books on statistical analysis:

  • Christian Robert and George Casella, Monte Carlo Statistical Methods, Springer
  • Peter Hoff, A First Course in Bayesian Statistical Methods, Springer

General Machine Learning Books:

  • Kevin Murphy, Machine Learning: A Probabilistic Perspective, MIT Press
  • Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer
  • David J.C. MacKay, Information Theory, Inference, and Learning Algorithms, Cambridge University Press
  • David Barber, Bayesian Reasoning and Machine Learning, Cambridge University Press

Teaching schedule, topics and teachers

Teachers: Stian Bilek (SB), Lucas Charpentier (LC), Morten Hjorth-Jensen (MHJ), and Hanna Svennevik (HS)

Week 4, January 20-24, 2020

Week 5, January 27-31, 2020
