Reinforcement learning is a type of machine learning problem in which the learner receives a (possibly delayed) numerical feedback signal about its demonstrated performance. It is arguably the hardest type of machine learning problem to solve, but also the one that best captures the idea of artificial intelligence as a whole. In this course we will define the components that make up a reinforcement learning problem and examine the concepts that matter when solving one, such as state and action values, policies and performance feedback. We will look at the different properties a reinforcement learning problem can have and at how these properties affect solvability. We will discuss value-based techniques as well as direct policy learning, and learn how to implement both. We will study the influence of generalisation on learning performance and see how supervised learning (and specifically deep learning) can help reinforcement learning techniques tackle larger problems. We will also look at the evaluation of learned policies and at how performance develops over time.
Prerequisites
There are no hard prerequisites, but some background in machine learning and/or data mining will be helpful.
Recommended reading
Lecture slides will be uploaded before each lecture. These slides are designed as support during teaching, not as stand-alone study material. They are supplied as a service, but additional note-taking will be necessary to pass the class.
The book “Reinforcement Learning: An Introduction” by Sutton and Barto is freely available at: https://www.andrew.cmu.edu/course/10-703/textbook/BartoSutton.pdf
More information at: https://curriculum.maastrichtuniversity.nl/meta/477745/reinforcement-learning