CS 277: Control and Reinforcement Learning

Winter 2026

Schedule

Week | Dates              | Monday | Tuesday                  | Thursday                   | Friday
-----+--------------------+--------+--------------------------+----------------------------+-----------
1    | Jan 6, 8           |        | Introduction             | Imitation Learning         |
2    | Jan 12, 13, 15, 16 | Quiz 1 | Temporal Difference      | Deep Q-Learning            | Exercise 1
3    | Jan 19, 20, 22     | Quiz 2 | Policy Gradient          | Advanced Model-Free RL     |
4    | Jan 26, 27, 29, 30 | Quiz 3 | Optimal Control          | Stochastic Optimal Control | Exercise 2
5    | Feb 2, 3, 5        | Quiz 4 | Planning                 | Model-Based RL             |
6    | Feb 9, 10, 12, 13  | Quiz 5 | Exploration              | Partial Observability      | Exercise 3
7    | Feb 16, 17, 19     | Quiz 6 | RL for Language Modeling | Inverse RL                 |
8    | Feb 23, 24, 26, 27 | Quiz 7 | Bounded RL               | Structured Control         | Exercise 4
9    | Mar 2, 3, 5        | Quiz 8 | Offline RL               | Multi-Task RL              |
10   | Mar 9, 10, 12, 13  | Quiz 9 | Multi-Agent RL           | Open Questions             | Exercise 5

Note: the planned schedule is subject to change. Course materials will regularly be added above.

Course logistics

  • When: Tuesdays and Thursdays at 5–6:20pm.
  • Where: ICS 180.
  • Format:
    • Lectures: each class will include a lecture covering topics in control and reinforcement learning. Lectures will be in person, and recordings will be posted above, accessible to enrolled students. Attendance is optional but recommended (see reasons below).
    • Quizzes: every week but the last, there will be a quiz about that week’s topics, due by the following Monday. Quizzes consist of multiple-choice questions intended to encourage you to think more deeply about the topics. They are graded mainly for completion rather than correctness: half the score is for submitting a complete quiz, and half is for doing better than a random guess.
    • Exercises: there will be 5 exercises, due every other week. The best 4 exercises will be averaged for the final grade, and a bonus will be given for scoring at least 50% on all 5 exercises.
    • Class discussions: we will discuss each quiz and exercise in a class following its deadline. There will also be recaps, deep dives, and freeform discussions.
  • Ed Discussion:
    • Please use the forum for course-related discussions.
    • Important course announcements will be posted there as well (not on Canvas).
    • Please use the forum (not email) to privately message course staff about course-related matters.
    • Please note that the identity of anonymous posters is visible to the course staff.
  • Gradescope:
    • Quizzes and exercises will be posted above and submitted on Gradescope.
    • We encourage submitting PDF files, preferably typeset in LaTeX.
  • Instructor: Prof. Roy Fox
    • Office hours: in person or on Zoom.
    • You are welcome to schedule a 15-minute meeting (or two if needed).
    • You are welcome to attend in person (DBH 4064) or by Zoom, individually or with classmates.
    • More times are available by request.
  • Teaching assistant: Kyungmin Kim

Grading policy

  • Exercises: 80% (+5% bonus)
    • Best 4 exercises: 20% each.
    • Score at least 50% on every exercise: 5% bonus.
    • Late submission policy: 5 grace days total for all exercises.
  • Quizzes: 18%
    • 9 quizzes: 2% each.
    • Deadlines on Mondays (end of day). No late submissions allowed.
    • This grading policy may change if we end up having fewer quizzes.
  • Participation: 2% (+2% bonus)
    • Class or forum participation: 2%.
      • To get full points, occasionally participate in class discussions or office hours by asking thoughtful on-topic questions or sharing quiz answers.
      • Alternatively, post on the forum at least a few on-topic (not administrative) questions, answers, thoughts, or useful links.
    • Course evaluation: 2% bonus.
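
As a rough illustration of how the components above add up, here is a minimal Python sketch. The function name final_grade and its arguments are hypothetical, and several details are assumptions rather than stated policy: scores are given as percentages (0 to 100), participation is treated as a single percentage, grace days are not modeled, and the total is not capped at 100%.

```python
def final_grade(exercises, quizzes, participation, evaluation_done):
    """Combine course components into a final percentage (illustrative only).

    exercises:       list of 5 exercise scores, each in [0, 100]
    quizzes:         list of 9 quiz scores, each in [0, 100]
    participation:   participation score in [0, 100] (assumed representation)
    evaluation_done: True if the course evaluation was completed
    """
    # Best 4 of the 5 exercises count, 20% each.
    best_four = sorted(exercises, reverse=True)[:4]
    grade = 0.20 * sum(best_four)

    # 5% bonus for scoring at least 50% on every one of the 5 exercises.
    if len(exercises) == 5 and all(score >= 50 for score in exercises):
        grade += 5.0

    # 9 quizzes, 2% each.
    grade += 0.02 * sum(quizzes)

    # Class or forum participation, 2%.
    grade += 0.02 * participation

    # Course evaluation, 2% bonus.
    if evaluation_done:
        grade += 2.0

    return grade


if __name__ == "__main__":
    example = final_grade(
        exercises=[90, 80, 100, 60, 70],
        quizzes=[100] * 9,
        participation=100,
        evaluation_done=True,
    )
    print(round(example, 2))
```

In this example the best four exercises (100, 90, 80, 70) contribute 68%, the all-exercises bonus adds 5%, the quizzes add 18%, participation adds 2%, and the course evaluation adds 2%, for a total of 95%.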

Compute Resources

Students enrolled in the course have GPU quota on the HPC3 cluster.

RL Resources

Courses
Books
RL libraries
More resources

Further reading

Imitation Learning