Course Description

This course will discuss the theory and practice of reinforcement learning (RL) with applications to control, robotics, and multi-agent systems. The goal is for students to understand: (1) key theoretical concepts, (2) basic algorithms and their implementation, and (3) when and how RL can be used for research applications. Topics include Markov decision processes (MDPs), value-based methods, policy methods, function approximation, and multi-agent reinforcement learning (MARL).

Prerequisites: Credit or concurrent registration in CS 446 or equivalent; STAT 400 or equivalent; proficiency with Python.

Learning Outcomes

Understand the mathematical fundamentals of key theoretical concepts in RL
Know how to implement common RL algorithms with code from scratch
Gain experience applying RL to open-ended research problems

Course Organization

Course Materials

Links to course materials are provided at the top of the website. Links to assignments will be posted on the Schedule page.

Lectures will draw from the following books, which are currently free in PDF form:

“Reinforcement Learning: An Introduction” by Sutton and Barto (MIT Press, 2018)
“Algorithms for Decision Making” by Kochenderfer, Wheeler, and Wray (MIT Press, 2022)

Additional references that you might find helpful are listed below.

Course Tools

Links to course tools are provided at the top of the website.

Campuswire

All announcements and discussions will be handled on Campuswire. We recommend you set up notifications to keep up with announcements.

GitHub

All assignments will be distributed using GitHub Classroom. If you have not used GitHub, there is a short tutorial available here.

See the Assignments page for more details regarding assignment distribution and submissions.

Gradescope

All assignment submissions and grades will be handled on Gradescope.

Python

All coding assignments will be done in Python. Python is open-source, widely used, and has a very active support community (e.g., Stack Overflow). You are expected to already be proficient in Python.

See the Resources page for resources related to coding (e.g., suggestions on setting up a programming environment for this course).

Assignments

Homeworks: Homeworks may contain a mix of analytical problems (e.g., derivations) and coding problems. You are encouraged to work together on homeworks, but each student should prepare and submit their own work. Homework that is viewed as insufficiently distinct to warrant an independent submission will not be given credit, and, depending on the situation, may be submitted as cheating via the FAIR system.

Projects: This course will involve a final project. The project will be open-ended and aims to offer you an opportunity to implement your choice of methods and apply them to a research problem of interest to you.

Literature Reviews: You will perform two literature reviews during the course. One will be a group assignment where you work with 2-3 others to understand and present the details of a seminal paper on a topic currently being discussed in class. The other will be an individual assignment where you independently identify a paper related to a topic (different from your group literature review) and present a short summary to the class (3-5 minutes). You will sign up for topics of interest; you will only do each type of review once.

Online students: for the group literature review, you will work with others but not be required to present. For the individual review, you will record the 1-slide summary to be presented in class.

Drop Policy: Your lowest homework grade will be dropped.

Late Policy: Late homework and project submissions will be accepted up to 72 hours after the deadline with the following deductions: 10% (within 24 hours of the deadline), 15% (within 48 hours of the deadline), and 20% (within 72 hours of the deadline).
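As a rough illustration, the late-policy tiers above can be written as a small lookup. This is a sketch, not the official grading code: the function name `late_deduction` is hypothetical, and it assumes each 24-hour boundary is inclusive. The policy does not say whether the deduction is taken from the earned score or from the maximum score, so the function only returns the deduction fraction.

```python
from typing import Optional


def late_deduction(hours_late: float) -> Optional[float]:
    """Return the deduction fraction for a late submission.

    Returns None when the submission is more than 72 hours late
    and is therefore no longer accepted.
    Assumption: each 24-hour boundary is inclusive.
    """
    if hours_late <= 0:
        return 0.0   # on time, no deduction
    if hours_late <= 24:
        return 0.10  # within 24 hours of the deadline
    if hours_late <= 48:
        return 0.15  # within 48 hours of the deadline
    if hours_late <= 72:
        return 0.20  # within 72 hours of the deadline
    return None      # past the 72-hour window
```

For example, `late_deduction(30)` falls in the second tier and returns `0.15`.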

Grading

Your final grade will be calculated from homeworks (50%), the project (25%), and literature reviews (15% for group, 10% for individual). The following grading scale will be used:

Grade	Point Range
A	[93, 100]
A-	[90, 93)
B+	[87, 90)
B	[83, 87)
B-	[80, 83)
C+	[77, 80)
C	[73, 77)
C-	[70, 73)
D+	[67, 70)
D	[63, 67)
D-	[60, 63)
F	< 60
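If you want to estimate your standing, the weights and scale above can be sketched in a few lines of Python. This is an unofficial sketch under stated assumptions: all scores are on a 0-100 scale, homeworks are weighted equally, the lowest homework is dropped, and no rounding is applied before the letter grade is assigned. The names `final_score` and `letter_grade` are hypothetical.

```python
# Lower cutoffs for each letter grade, taken from the grading scale.
SCALE = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def final_score(homeworks, project, group_review, individual_review):
    """Weighted final score on a 0-100 scale.

    Assumptions: equal homework weights; the lowest homework is dropped
    whenever there is more than one homework score.
    """
    if not homeworks:
        raise ValueError("at least one homework score is required")
    kept = sorted(homeworks)[1:] if len(homeworks) > 1 else list(homeworks)
    hw_avg = sum(kept) / len(kept)
    return (0.50 * hw_avg + 0.25 * project
            + 0.15 * group_review + 0.10 * individual_review)


def letter_grade(score):
    """Map a numeric score to a letter using the lower cutoffs in SCALE."""
    for cutoff, letter in SCALE:
        if score >= cutoff:
            return letter
    return "F"
```

For instance, homework scores of 100, 80, and 90 with 90 on the project and on both reviews drop the 80, giving a homework average of 95 and a final score of about 92.5, which falls in the A- range.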

Respect and Growth in the Classroom

The effectiveness of our course depends on each of us creating a safe and encouraging learning environment that allows for the open exchange of ideas while also ensuring equitable opportunities and respect for all of us. Everyone is expected to help establish and maintain an environment where students, staff, and faculty can contribute without fear of personal ridicule or of intolerant or offensive language. We ask everyone to be ready to learn and grow in their respect and understanding of others, in addition to their understanding of the course material.

Inclusivity

A feeling of belonging and inclusion is critical to the success and health of our community. The Aerospace Engineering department has a committee called Aero’s Space to Belong. They offer office hours, one-on-one discussion, and a reporting process. If you experience conflict that undermines your or someone else’s feelings of belonging, please consider using these resources: https://aerospace.illinois.edu/diversity/reporting.

Accommodations

Any student with special needs or circumstances requiring accommodation in this course (e.g., disability-related academic adjustments and/or auxiliary aids) is encouraged to contact the instructor and the Disability Resources and Educational Services (DRES) as soon as possible. To contact DRES, you may visit 1207 S. Oak St., Champaign, call 333-4603, e-mail disability@illinois.edu or go to the DRES website. We will ensure these special needs are met.

Additional References

There are many additional references on reinforcement learning. A few are listed below.

“Algorithms for Reinforcement Learning” by Szepesvári (Morgan & Claypool Publishers, 2009)
“Reinforcement Learning: State-of-the-Art” by Wiering and van Otterlo (Springer, 2012)
“Reinforcement Learning and Optimal Control” by Bertsekas (Athena Scientific, 2019)