CS 277: Control and Reinforcement Learning
Winter 2021
Note: This course was previously offered as CS 295.
Course logistics
- When: Tuesdays and Thursdays, 5:00–6:20pm
- Lectures will be recorded and added to this playlist with access for uci.edu accounts.
- Where: Zoom
- Announcements and forum: Piazza
- Important course announcements will be made on the forum.
- Please post all course-related questions on the forum, publicly or privately (no emails, please).
- Assignments: Gradescope
- Published on this page biweekly.
- Instructor: Prof. Roy Fox
- Office hours: Calendly
- Enrolled students are welcome to:
- schedule 15-minute slots (more than once if needed);
- give at least 4 hours' notice;
- attend individually or with friends.
Grading policy
- Assignments: 88%
- The best 4 of 5 assignments count for 22% each.
- Late submissions are not accepted.
- Participation: 5%
- Bonus: 7%
Schedule
Assignments
- Assignment 1; due Friday, January 22, 2021 (Pacific Time). Solution
- Assignment 2; due Friday, January 29, 2021 (Pacific Time). Solution
- Assignment 3; due Tuesday, February 16, 2021 (Pacific Time). Solution
- Assignment 4; due Friday, February 26, 2021 (Pacific Time). Solution
- Assignment 5; due Friday, March 12, 2021 (Pacific Time).
Resources
Courses
- Sergey Levine (Berkeley)
- Pieter Abbeel (Berkeley)
- Dimitri Bertsekas (MIT; also available in book form; also see 2017 book)
- David Silver (UCL)
Books
- Vincent François-Lavet et al., An Introduction to Deep Reinforcement Learning
- Richard Sutton & Andrew Barto, Reinforcement Learning: An Introduction (2nd edition)
- Csaba Szepesvári, Algorithms for Reinforcement Learning
RL libraries
More resources
Further reading
Imitation Learning
- Behavior Cloning with Deep Learning Bojarski et al., CVPR 2016
- DAgger Ross et al., AISTATS 2011
- Goal-conditioned imitation learning Ding et al., NeurIPS 2019
- DART Laskey et al., CoRL 2017
- HVIL Fox et al., Infer2Control @ NeurIPS 2018
Temporal-difference methods
- DQN Mnih et al., Nature 2015
- Double Q-learning van Hasselt et al., NeurIPS 2010
- Double DQN van Hasselt et al., AAAI 2016
- Clipped double Q-learning Fujimoto et al., ICML 2018
- Dueling networks Wang et al., ICML 2016
- NAF Gu et al., ICML 2016
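The papers above all build on the basic temporal-difference (Q-learning) update. A minimal tabular sketch for reference (the state/action encoding and hyperparameter values are illustrative, not from any specific paper):

```python
import numpy as np

def q_learning_step(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """One tabular Q-learning update: move Q(s, a) toward the TD target."""
    # TD target: reward plus discounted value of the greedy next action
    target = r + (0.0 if done else gamma * np.max(Q[s_next]))
    # Step of size alpha toward the target (the TD error is target - Q[s, a])
    Q[s, a] += alpha * (target - Q[s, a])
    return Q
```

DQN and its variants (Double DQN, dueling networks, etc.) replace the table `Q` with a neural network and modify how the TD target is computed.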
Policy-gradient methods
- REINFORCE Williams, ML 1992
- Policy-Gradient Theorem Sutton et al., NeurIPS 2000
- DPG Silver et al., ICML 2014
- DDPG Lillicrap et al., ICLR 2016
- Off-policy policy evaluation Precup et al., ICML 2000
- Off-policy policy gradient Liu et al., UAI 2018
- TRPO Schulman et al., ICML 2015
- PPO Schulman et al., arXiv 2017
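At the core of these methods is the REINFORCE estimator: weight the gradient of the log-policy by the return. A sketch for a tabular softmax policy (the parameterization and variable names are illustrative; practical methods add baselines, critics, or trust regions):

```python
import numpy as np

def reinforce_gradient(theta, states, actions, returns):
    """REINFORCE gradient estimate for a tabular softmax policy.

    theta: (num_states, num_actions) logits.
    Returns the gradient of sum_t G_t * log pi(a_t | s_t) w.r.t. theta.
    """
    grad = np.zeros_like(theta)
    for s, a, G in zip(states, actions, returns):
        # Softmax probabilities for state s (shifted for numerical stability)
        probs = np.exp(theta[s] - theta[s].max())
        probs /= probs.sum()
        # d log pi(a|s) / d theta[s] = one_hot(a) - probs
        g = -probs
        g[a] += 1.0
        grad[s] += G * g
    return grad
```

Ascending this gradient increases the probability of actions that led to high returns, which is exactly the policy-gradient theorem specialized to this parameterization.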
Actor–critic methods
Model-based methods
- MCTS Kocsis and Szepesvári, ECML 2006
- ILQR Li and Todorov, ICINCO 2004
- DDP Mayne, IJC 1966
- Dyna Sutton, ACM SIGART Bulletin, 1991
- E3 Kearns and Singh, ML 2002
- R-max Brafman and Tennenholtz, JMLR 2002
- Local models Levine and Abbeel, NeurIPS 2014
- PBVI Pineau et al., IJCAI 2003
Exploration
- UCB in MAB Auer et al., ML 2002
- Thompson sampling in MAB Agrawal and Goyal, COLT 2012
- Count-based exploration Bellemare et al., NeurIPS 2016
- Thompson sampling in RL Gopalan and Mannor, COLT 2015
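The UCB rule from Auer et al. picks the arm maximizing the empirical mean plus an exploration bonus. A minimal sketch (the exploration constant `c` and function names are illustrative):

```python
import numpy as np

def ucb1_action(counts, values, t, c=2.0):
    """UCB1 arm selection for a multi-armed bandit.

    counts: number of pulls per arm; values: empirical mean reward per arm;
    t: total number of pulls so far.
    """
    counts = np.asarray(counts, dtype=float)
    values = np.asarray(values, dtype=float)
    # Pull any arm that has never been tried
    if (counts == 0).any():
        return int(np.argmin(counts))
    # Bonus shrinks as an arm is pulled more, encouraging exploration
    bonus = np.sqrt(c * np.log(t) / counts)
    return int(np.argmax(values + bonus))
```

Count-based exploration in deep RL generalizes this idea by replacing exact visit counts with learned density models.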
Inverse Reinforcement Learning
- Feature matching Abbeel and Ng, ICML 2004
- MaxEnt IRL Ziebart et al., AAAI 2008
- GAIL Ho and Ermon, NeurIPS 2016
Control as Inference
- LMDP Todorov, NeurIPS 2007
- Full-controllability duality Todorov, CDC 2008
- Information–value tradeoff Rubin et al., Decision Making with Imperfect Decision Makers 2012
- VI duality Levine, arXiv 2018
- Soft Q-learning Fox et al., UAI 2016
- Deep SQL Haarnoja et al., ICML 2017
- SAC Haarnoja et al., ICML 2018
Structured Control
- Options framework Precup et al., AI 1999
- Option–critic method Bacon et al., AAAI 2017
- Skill trees Konidaris et al., NeurIPS 2010
- Spectral method Mahadevan and Maggioni, JMLR 2007
- Exact inference Fox et al., arXiv 2017
- PHP Fox et al., ICLR 2018
- FuN Vezhnevets et al., ICML 2017
Multi-Task Learning
- Fine-tuning SQL Haarnoja et al., ICML 2017
- Learning with pre-trained perceptual features Levine et al., JMLR 2016
- Learning with model transfer Fu et al., IROS 2016
- Domain randomization Peng et al., ICRA 2018
- Domain adaptation Bousmalis et al., ICRA 2018
- Goal generation for curriculum learning Florensa et al., ICML 2018
- Multi-task policy distillation Teh et al., NeurIPS 2017
- Multi-task hierarchical imitation learning Fox et al., CASE 2019
Multi-Agent RL
- MADDPG Lowe et al., NeurIPS 2017
- NFSP Heinrich and Silver, DRL @ NeurIPS 2016
- Double Oracle McMahan et al., ICML 2003
- PSRO Lanctot et al., NeurIPS 2017
Academic honesty
Don’t cheat. Academic honesty is a requirement for passing this class; compromising the academic integrity of this course may result in a failing grade. The work you submit must be your own. Academic dishonesty includes, among other things, copying answers from other students or online resources, allowing other students to copy your answers, communicating exam answers to other students during an exam, or attempting to use notes or other aids during an exam. Doing so violates the UCI Policy on Academic Honesty and the ICS Policy on Academic Honesty. It is your responsibility to read and understand these policies, in light of UCI’s definitions and examples of academic misconduct. Note that any instance of academic dishonesty will be reported to the Academic Integrity Administrative Office for disciplinary action and may result in a failing grade for the course.