Reinforcement learning  

Reinforcement learning is a type of machine learning problem in which the learner receives a (possibly delayed) numerical feedback signal about its demonstrated performance. It is arguably the hardest type of machine learning problem to solve, but also the one that best captures the idea of artificial intelligence as a whole.

In this course we will define the components that make up a reinforcement learning problem and examine the concepts that matter when solving one, such as state and action values, policies and performance feedback. We will look at the different properties a reinforcement learning problem can have and at the consequences of these properties for solvability. We will discuss value-based techniques as well as direct policy learning, and learn how to implement these techniques. We will study the influence of generalisation on learning performance and see how supervised learning (specifically deep learning) can help reinforcement learning techniques tackle larger problems. We will also look at the evaluation of learned policies and the development of performance over time.

Prerequisites

There are no hard prerequisites, but some background in Machine Learning and/or Data Mining will be helpful.

Recommended reading

Lecture slides will be uploaded before each lecture. These slides are designed and intended as support during teaching, not as study material by themselves. They are supplied as a service, but additional note-taking will be necessary to pass the class.

The book “Reinforcement Learning: An Introduction” by Sutton and Barto is freely available at: https://www.andrew.cmu.edu/course/10-703/textbook/BartoSutton.pdf

More information at: https://curriculum.maastrichtuniversity.nl/meta/477745/reinforcement-learning
In person
English

Funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or HaDEA. Neither the European Union nor the granting authority can be held responsible for them. The statements made herein do not necessarily have the consent or agreement of the ASTRAIOS Consortium. These represent the opinion and findings of the author(s).