General Information
Course description
Prerequisites:
- CS 70 (or CS 62 Pomona)
- HMC Core Math or Probability/Statistics + Multivariable Calculus + Linear Algebra
Reinforcement learning consists of agents learning to act in an environment that provides intermittent rewards. It's used as the basis of many game-playing agents: Backgammon, Atari games, Go, Chess.
Topics covered include:
- Various reinforcement learning algorithms, including Monte Carlo, SARSA, Q-Learning.
- An introduction to deep reinforcement learning
- An in-depth look at AlphaZero, the DeepMind algorithm that learned to play Chess and Go by playing against itself.
- Exploration vs. Exploitation
- Monte Carlo methods
- Temporal Difference methods
The course will include both written homework and programming assignments (PA).
Communication
We will distribute assignments on the course web site, and make all announcements through Piazza.
The course web site has the schedule for the term.
Videos
Each lecture will be captured, with screen contents, blackboard recording, and audio.
Links to the lectures will be posted on the Schedule page within a few days.
You can also subscribe to the course YouTube playlist.
Due to potential technical difficulties, don't view those saved lectures as a replacement for taking good notes.
Grading policy
Grades in this class will be based on:
- homework (lowest 2 are dropped) (10%)
- in-class quizzes (lowest 2 are dropped) (10%)
- programming assignments (lowest 2 are dropped) (30%)
- 2 midterm exams (lowest is dropped) (30%)
- Comprehensive Final exam (20%)
The course is not graded on a curve. The percentage score will be used to determine a final letter grade as follows:
| Percentage score | Grade |
|---|---|
| ≥90% | A |
| ≥80% and <90% | B |
| ≥65% and <80% | C |
| ≥50% and <65% | D |
| <50% | F |
Pluses and minuses will be attached to the letter grades as the instructor deems appropriate.
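As a rough illustration of how the weights and cutoffs above combine, here is a minimal Python sketch. The function and variable names are hypothetical, each category average is assumed to be a percentage from 0 to 100, and the number of dropped scores is passed in as a parameter. Pluses and minuses are left out because they are at the instructor's discretion.

```python
# Category weights as listed in the grading policy (they sum to 100%).
WEIGHTS = {
    "homework": 0.10,
    "quizzes": 0.10,
    "programming_assignments": 0.30,
    "midterms": 0.30,
    "final": 0.20,
}

# Inclusive lower bound of each letter-grade band, highest first.
CUTOFFS = [(90.0, "A"), (80.0, "B"), (65.0, "C"), (50.0, "D")]


def average_after_drops(scores, num_dropped):
    """Average a list of percentage scores after dropping the lowest ones."""
    kept = sorted(scores)[num_dropped:]
    if not kept:
        raise ValueError("no scores left after dropping")
    return sum(kept) / len(kept)


def weighted_percentage(category_averages):
    """Combine per-category averages (0-100) into one percentage score."""
    return sum(WEIGHTS[name] * category_averages[name] for name in WEIGHTS)


def letter_grade(percentage):
    """Map a percentage score to a letter grade (no plus or minus)."""
    for cutoff, letter in CUTOFFS:
        if percentage >= cutoff:
            return letter
    return "F"


# Hypothetical student: the averages below are made-up numbers.
example = {
    "homework": 100.0,
    "quizzes": 80.0,
    "programming_assignments": 90.0,
    "midterms": 85.0,
    "final": 75.0,
}
print(letter_grade(weighted_percentage(example)))
```

For these made-up averages the weighted score is about 85.5%, which falls in the B band.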
You can see all grades that have been assigned, along with statistics for each assignment, at Sakai.
Textbooks
This class relies on the following book:
Course readings (as shown on the schedule) should be read before that class, so that you've been exposed to the topic (and terminology).
After the class, you should re-read thoroughly.
You are responsible for everything covered in the reading, whether or not it is explicitly covered in class, unless explicitly noted in the schedule.
Homework
The homeworks are intended to make you think about the lecture topic and/or get your hands dirty. Homeworks are due at the start of lecture (by 1pm) on the specified due dates. I do not grade your answers for correctness, but merely verify that you put reasonable effort into them.
No late homework will be accepted, but I'll drop the lowest homework score.
Homework will be assigned and submitted via Gradescope. You'll need to create a PDF (either electronically, or by scanning a paper copy).
Further instructions here.
Quizzes
We have short (<10 minute) in-class quizzes almost every week.
These quizzes are based on associated homeworks and on readings for that day's class.
There are no makeups for quizzes (except for a verified medical situation).
The lowest quiz score is dropped to provide you some flexibility for travel, etc.
Programming Assignments
The programming assignments (PAs) are where you get a chance to do hands-on programming on reinforcement learning problems.
We'll be using Google Colab, which provides an online Python Jupyter notebook.
The lowest PA score will be dropped.
You have a total of 72 late hours for the semester for PAs.
Each hour late in excess of 72 hours will penalize your total PA grade by 1%.
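The late-hour arithmetic can be sketched in Python as follows. This is an illustration under stated assumptions, not the official bookkeeping: late hours are assumed to be summed across all PAs, "1%" is read as one percentage point off the overall PA grade, and the names `LATE_HOUR_BUDGET` and `pa_late_penalty` are hypothetical.

```python
LATE_HOUR_BUDGET = 72          # free late hours for the semester
PENALTY_PER_EXCESS_HOUR = 1.0  # assumed: percentage points per excess hour


def pa_late_penalty(late_hours_per_pa):
    """Return the penalty (in percentage points) on the total PA grade.

    late_hours_per_pa: late hours used on each PA, e.g. [24, 30, 10].
    """
    total_late = sum(late_hours_per_pa)
    # Only hours beyond the 72-hour budget are penalized.
    excess = max(0, total_late - LATE_HOUR_BUDGET)
    return excess * PENALTY_PER_EXCESS_HOUR
```

Under these assumptions, using 24, 30, and 10 late hours (64 in total) costs nothing, while using 40 hours on each of two PAs (80 in total) would cost 8 percentage points.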
If you need to use late hours for a PA:
- Submit your PA when it is due, even if not complete. That way, if you don't actually make any progress, you still have something for me to grade.
- Email me telling me that you're planning on submitting late. That way, I don't unnecessarily grade the initial submission.
- When your PA is done, resubmit it.
- Email me when you resubmit, specifying how many late hours you are using (or telling me you just want me to grade the version you submitted at the due date).
- I'll grade your late submission (or version as of the due date if requested) and I'll update the total number of late hours you've used on Sakai.
These late hours are intended for cases where you fall behind due to illness, job interviews, HMC athletic events, deadlines in other classes, etc.
For extensions under extenuating circumstances (e.g., you are sick for a week), I require a letter from one of the student deans.
Collaboration
- All students enrolled in this course are bound by the HMC Honor Code. More information on the HMC Honor Code can be found in the HMC Student Handbook.
- It is your responsibility to determine whether your actions adhere to the HMC Honor Code. If this document does not clarify the legitimacy of a particular action, you should contact the course instructor and request clarification.
- Work you submit for individual assignments should be your own, and you should complete all assignments based on your own understanding of the underlying material. If you work with, or receive help from, another individual on an assignment, provide a written acknowledgement in complete sentences that includes the person's name and the nature of the help.
- When you submit a group assignment, the group should complete all assignments based on your group's understanding of the underlying material. If your group works with, or receives help from, another individual or group on an assignment, provide a written acknowledgement in complete sentences that includes the person's name and the nature of the help. The submitted assignment should be based on joint work among all members of the group, and all group members should understand the entire assignment and submission. If one or more members of the group did not participate in an assignment, provide an explanatory statement as part of the submission.
- This document is not meant to be an exhaustive list of every possible Honor Code violation. Infractions not explicitly mentioned here may still violate the Honor Code.
- Boundaries of Collaboration: verbal collaboration with other students on individual assignments is encouraged. However, all submitted written work should be written by yourself individually, and not a collaborative effort or copied from a common source (e.g., a chalkboard).
- Use of Web Resources: the use of Internet resources to aid in course work is acceptable, as long as it does not substitute for an understanding of the course material. Plagiarism and direct copying from online (or any other) sources is strictly prohibited. In particular, you may not look at source code for labs or homework from the internet or from other students.
- Use of Your Own Work from Previous Semesters: if you have previously attempted this course, you may refer to your work from previous semesters, but may not resubmit it as this semester's coursework.
- Use of Other Course Resources from Previous Semesters: You may reference assignments of this course from previous semesters or tests from previous semesters.
- Retention of Course Resources: assignments and exams from this course may be committed to dorm repositories or otherwise used to help future students.
Class meetings
Lectures will be held on Monday and Wednesday from 1:15pm to 2:30pm in Shanahan 3425.
Staff
Lecturer
Neil Rhodes
Piazza
We'll use Piazza for questions. Use private posts to me only for specific code examples or personal issues. Otherwise, prefer public posts so that everyone can share in the question and answer, and so that other students can answer as well.
Office hours
Monday
- 11:45-12:45 (Zoom office hours URL as posted in Piazza)
- 2:30-3:30 (Same Zoom URL as class lecture)
Tuesday
- 8:30-11:00 (Zoom office hours URL as posted in Piazza)
Wednesday
- 11:45-12:45 (Zoom office hours URL as posted in Piazza)
- 2:30-3:30 (Same Zoom URL as class lecture)
During my office hours, if you need to meet with me privately, let me know, and we'll arrange that.
Appointments outside of the listed office hours can be arranged via email.
Miscellaneous
If you need accommodations for a documented disability, please talk to me or contact Brandon Ice, the HMC Student Accommodation Advisor (bice@hmc.edu). You will find information about disability resources on the college website: https://www.hmc.edu/ability. Students from the other Claremont Colleges should contact their home college's disability officer.
If I learn of a potential violation of the college's gender-based misconduct policy (see https://www.hmc.edu/tix), I am required to report it to Leslie Hughes, the HMC Title IX Coordinator. If you want to speak to someone confidentially, the following resources are available on and off campus: the EmPOWER Center (909-607-2689), the Monsour Counseling Center (909-621-8202), and the McAlister Chaplains (909-621-8685).
CS 181V: Reinforcement Learning
Last updated Mon 30 Mar 2020 11:08:54 AM PDT