CS 159: Spring 2020


Introduction

This course will introduce you to a broad range of topics in natural language processing, including language modeling, part-of-speech tagging, spelling correction, morphology, syntactic parsing, semantics and machine translation. If time permits, we may also cover speech recognition, natural language generation or discourse systems.

Course Goals

By the end of the course you will:

Class Information

Professor: Julie Medero
Office: Olin 1269
Office Hours: Mondays 1:15-3:00 and Fridays 1-2 in Olin 1269

Grutors: Celela Chen and Celine Park
Grutoring hours: Wednesdays 3-5 (LAC computer lab)

Class time: Tuesday and Thursday 9:35-10:50am
Classroom: SHAN 1480

GitHub organization: hmc-cs159-spring2020

Prerequisites:

Textbooks

You should not purchase any textbooks this semester.

Tentative schedule

Readings

There will be reading assigned each week, and you are expected to complete that reading before class each Tuesday. In-class time will involve individual and group problem sessions, discussions, and other activities that depend on you having done the reading. You will be asked to fill out a worksheet each Tuesday; if you’d like to print your own copy and make notes as you read, you can bring it to class with you.

Assignments

Programming assignments will typically be available on Thursdays before class and due on the following Wednesday evening at 10:00pm. In the second half of the semester, your programming work will focus on a large project. There will be several intermediate deadlines.

WEEK  DAY     ANNOUNCEMENTS              TOPIC & READING                               LABS & PROJECTS
1     Jan 21                             Introduction, Regular Expressions, Encodings  Lab 1
                                         Handouts
      Jan 23
2     Jan 28                             Handouts:                                     Lab 2
                                         Tokenization, Normalization, Segmentation
                                         Everyone:
                                         Options:
      Jan 30
3     Feb 04                             Handouts:                                     Lab 3
                                         N-Grams and Smoothing
                                         Everyone:
                                         Options:
      Feb 06
4     Feb 11                             Handouts:                                     Lab 4
                                         Vector Semantics
                                         Everyone:
                                         Options:
      Feb 13
5     Feb 18                             Handouts                                      Lab 5
                                         Word Sense Disambiguation
                                         Everyone:
                                         Options:
                                         Pick a paper to present from last year's
                                           Wordnet Conference
      Feb 20
6     Feb 25                             Part of Speech Tagging                        Lab 6
                                         Everyone:
                                         Options:
      Feb 27
7     Mar 03                                                                           Lab 7
      Mar 05
8     Mar 10                                                                           (TBD)
      Mar 12  First exam
      Mar 17  Spring Break
      Mar 19
9     Mar 24                                                                           Lab 8
      Mar 26
10    Mar 31                                                                           (TBD)
      Apr 02
11    Apr 07                                                                           Final Project
                                                                                       TIRA tutorial
      Apr 09
12    Apr 14                                                                           (TBD)
      Apr 16
13    Apr 21
      Apr 23
14    Apr 28                                                                           (TBD)
      Apr 30  Second exam
      May 08  Final project due at 12pm

Grading

Your final grade will be calculated as follows:

Final grades are calculated according to the grading scale below:

   
Letter grade  Minimum score
A             93
A-            90
B+            87
B             83
B-            80
C+            77
C             73
C-            70
D+            67
D             65

Final grades are truncated, not rounded. For example, an 82.8 will receive a B-.
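
As an illustration of the truncation rule, here is a minimal Python sketch that looks up a letter grade from a numeric score. The cutoffs are copied from the scale above. The function name `letter_grade` and the `below_scale` parameter are hypothetical, and returning "F" for scores under 65 is an assumption, since the scale above does not list a grade below D.

```python
import math

# Minimum score for each letter grade, copied from the grading scale above.
GRADE_SCALE = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"), (67, "D+"), (65, "D"),
]


def letter_grade(score, below_scale="F"):
    """Return the letter grade for a numeric score.

    The score is truncated, never rounded, before the lookup.
    below_scale is an assumption: the scale does not say what
    happens under 65.
    """
    truncated = math.trunc(score)  # 82.8 becomes 82, not 83
    for cutoff, letter in GRADE_SCALE:
        if truncated >= cutoff:
            return letter
    return below_scale


print(letter_grade(82.8))  # B-
```

Because every cutoff is a whole number, truncating first gives the same result as comparing the raw score against each cutoff; the explicit truncation step is there only to mirror the stated rule.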

Programming language

Assignments will presuppose knowledge of Python 3. You will almost certainly end up learning some Perl and bash scripting, but you are not expected to know either in advance.

Please make sure that each program you turn in has:

I expect that you will be using Python 3 for all of your lab assignments, but if you would like to use something different, you are welcome to come talk to me about your plan.

Accessing the CS labs after hours

All students enrolled in CS courses are eligible for 24-hour access to Olin and the CS labs (Beckman 102 and 104). The code for the lab doors will be shared on the course Piazza site. If you are an off-campus student who does not have access to the building, you should visit the HMC Facilities & Maintenance office in the basement of the Platt Campus Center to get access added to your card.

Policies

Assignment Extension Policy

You have three late days that can be used on weekly labs or a final project component at any point during the semester without penalty. You can use all three late days on one assignment, or split them across multiple assignments. To use a late day, you must email me after you have completed the assignment and pushed to your repository.

I encourage you to work together on your assignments for this class. Weekly labs and the final project can be done in groups of up to three. If you work in a group, only one of you needs to use your late day(s).

Illness

If you get sick during the term, please notify me immediately, even if you think that being sick will not affect your ability to complete your assignments. You should also notify me any time that you’re sick enough to miss any classes or find that your performance is below par for any reason.

Classroom Environment

As your instructor, I am committed to creating a classroom environment that welcomes all students, regardless of race, gender, social class, religious beliefs, etc. We all have implicit biases, and I will try to continually examine my judgments, words and actions to keep my biases in check and treat everyone fairly. I expect that you will do the same, and that you will let me know if there is anything I can do to make sure everyone is encouraged to succeed in this class.

Reminder: The Honor Code

All students—even those from other colleges—are expected to understand and comply with Harvey Mudd College’s Honor Code. If you haven’t already done so, you must read, sign, and abide by the computer-science department’s interpretation of the Honor Code to participate in this course. Specifically:

These principles apply to all methods and media of discussion or exchange (voice, writing, email, etc.).

Academic Accommodations

If you anticipate or experience academic barriers based on a disability (including mental health, chronic or temporary medical conditions), please let me know immediately so that we can privately discuss options. Any student with a documented disability who requires reasonable accommodations should contact their home college’s disability officer: