- Introduction
- Lecture 1
- Lecture 2
- Homework 1 posted
- Lecture 3
- First discussion section
- No class for MLK Day
- Introduction to policy gradients
- Further policy gradients and related topics
- Homework 2 posted/discussed
- Project 1 posted/discussed
- Discussed practical implementation of environments in OpenAI Gym
- Lecture on dynamic programming approaches to MDPs (Sutton and Barto)
- Discussed temporal-difference learning (TD(0)) and how Q-learning arises from it
- Extended Homework 2 due date to 19 February 2020
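
For reference, the TD(0) and Q-learning entry in the log above can be illustrated with a minimal tabular sketch. This is not taken from the course materials: the function names, the list-based tables `V` and `Q`, and the default step size `alpha` and discount `gamma` are all assumptions chosen for illustration. It shows the standard textbook updates, where Q-learning reuses the TD(0) idea but bootstraps from the greedy action value at the next state.

```python
def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9, done=False):
    """TD(0): move V[s] toward the bootstrapped target r + gamma * V[s_next]."""
    # On a terminal transition there is no next state to bootstrap from.
    target = r if done else r + gamma * V[s_next]
    V[s] += alpha * (target - V[s])
    return V[s]


def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9, done=False):
    """Q-learning: same update shape, but the target uses max over next actions."""
    target = r if done else r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]


# Hypothetical two-state example: V is a list of state values,
# Q is a list of per-state lists of action values.
V = [0.0, 1.0]
Q = [[0.0, 0.0], [2.0, 4.0]]
new_v = td0_update(V, s=0, r=1.0, s_next=1, alpha=0.5, gamma=0.9)
new_q = q_learning_update(Q, s=0, a=1, r=1.0, s_next=1, alpha=0.5, gamma=0.5)
```

The only difference between the two functions is the target: TD(0) evaluates whatever policy generated the transition, while the `max` in the Q-learning target makes it estimate the value of acting greedily, independent of how the action `a` was chosen.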