Course Syllabus

Course Overview

A graduate-level course covering technical foundations of modern natural language processing (NLP). The course will cast NLP as an application of machine learning, in particular deep learning, and focus on deriving general mathematical principles that underlie state-of-the-art NLP systems today. 

Course survey (due May 6)

Prerequisites for undergraduates: linear algebra (M250), probability (CS206, or M477/S379), data structures (CS112). Recommended: multivariable calculus (M251), machine learning (533).

Syllabus Page

Project Google Sheet (15-minute meeting scheduling link)

Instructor: Karl Stratos (karl.stratos@rutgers.edu)

Instructor Office Hours: Tuesday 4-5pm (Zoom link, passcode "rutgersnlp") 

Teaching Assistant: Wenyue Hua (wh302@scarletmail.rutgers.edu)

Teaching Assistant Office Hours: Thursday 3-4pm (Zoom link, passcode "nlpta")

Textbooks (for optional reading): Introduction to Natural Language Processing by Jacob Eisenstein (cited as "Eisenstein" in the schedule below)

LaTeX templates:

Course Schedule

Week 1

Tuesday, January 19

Lecture: General introduction (video, slides)

Entrance Quiz: 3:20-4pm (the TA will be available via the office hour Zoom link during this window)

Optional reading: Chapter 1 (Eisenstein); linear algebra review (Kolter)

Week 2

Tuesday, January 26

Lecture: Linear classification (video, slides)

Optional reading: Chapters 2.5 and 2.6 (Eisenstein)

Assignment 1 assigned (due in 3 weeks) 

Jupyter Notebook on projections 
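
For concreteness, here is a minimal sketch of orthogonal projection onto the column space of a matrix (a hypothetical illustration, not taken from the notebook; the matrix and vector are arbitrary):

```python
import numpy as np

# Project a vector b onto the column space of A.
# For full-column-rank A, the projection matrix is P = A (A^T A)^{-1} A^T.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])   # 3x2, rank 2
b = np.array([1.0, 2.0, 3.0])

P = A @ np.linalg.inv(A.T @ A) @ A.T   # projection onto col(A)
p = P @ b                              # projected vector
r = b - p                              # residual

print(p)
print(A.T @ r)  # ~ [0, 0]: the residual is orthogonal to the columns of A
```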

Week 3

Tuesday, February 2

Lecture: Optimization, introduction to deep learning (video, slides)

Optional reading: Notes on feedforward networks (Collins), notes on backpropagation

Jupyter Notebook on separable encodings 

Week 4 

Tuesday, February 9 

Lecture: Feedforward networks, universality, backpropagation (video, slides)

Optional reading: Chapters 3.1-3.3 (Eisenstein), notes on Xavier initialization (Stanford), notes on gradient-based optimization algorithms (Ruder)
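
A minimal sketch of backpropagation, assuming PyTorch (a hypothetical example, not the lecture's code): the output-layer gradient of a one-hidden-layer network is derived by hand via the chain rule and checked against autograd.

```python
import torch

torch.manual_seed(0)
x = torch.randn(4, 3)                       # batch of 4 inputs
y = torch.randn(4, 1)                       # regression targets
W1 = torch.randn(3, 5, requires_grad=True)  # Xavier-style scaling omitted for brevity
W2 = torch.randn(5, 1, requires_grad=True)

h = torch.tanh(x @ W1)                      # hidden layer
loss = ((h @ W2 - y) ** 2).mean()           # mean squared error
loss.backward()                             # autograd fills W1.grad and W2.grad

# Manual gradient for W2 via the chain rule, for comparison:
with torch.no_grad():
    dout = 2 * (h @ W2 - y) / y.numel()     # dloss/d(output)
    dW2_manual = h.T @ dout
    print(torch.allclose(W2.grad, dW2_manual, atol=1e-6))  # True
```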

Thursday, February 11

Quiz 1: 30 minutes (available 1-6pm)

Week 5

Tuesday, February 16  

Lecture: Convolutional, recurrent and attention-based architectures (video, slides)

Optional reading: Chapter 3.4 (Eisenstein), Olah's blog posts on LSTMs and attention, notes on transformers
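
A minimal sketch of scaled dot-product attention, the core operation of the transformer architectures covered in lecture (an assumed PyTorch illustration, not the lecture's code):

```python
import math
import torch

def attention(Q, K, V):
    """Q: (n, d), K: (m, d), V: (m, d_v) -> (n, d_v)."""
    scores = Q @ K.T / math.sqrt(Q.shape[-1])  # (n, m) similarity scores
    weights = torch.softmax(scores, dim=-1)    # each row sums to 1
    return weights @ V                         # weighted average of values

torch.manual_seed(0)
Q, K, V = torch.randn(2, 8), torch.randn(5, 8), torch.randn(5, 8)
print(attention(Q, K, V).shape)  # torch.Size([2, 8])
```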

Assignment 2 assigned

Assignment 1 due

Week 6

Tuesday, February 23

Lecture: Language models, beam search, text generation (video, slides)

Optional reading: RNN LM PyTorch example, generate function in Hugging Face transformers, top-p/top-k sampling implementation
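
In the spirit of the linked sampling implementation, a minimal assumed sketch of top-k and top-p (nucleus) truncation of next-token logits (not the linked code itself):

```python
import torch

# Zero out unlikely tokens by setting their logits to -inf,
# then sample from the renormalized remainder.
def truncate_logits(logits, top_k=0, top_p=1.0):
    logits = logits.clone()
    if top_k > 0:
        kth_best = torch.topk(logits, top_k).values[-1]
        logits[logits < kth_best] = float("-inf")      # keep only the top k
    if top_p < 1.0:
        sorted_logits, sorted_idx = torch.sort(logits, descending=True)
        cum_probs = torch.softmax(sorted_logits, dim=-1).cumsum(dim=-1)
        remove = cum_probs > top_p
        remove[1:] = remove[:-1].clone()               # shift right so the first
        remove[0] = False                              # token crossing top_p stays
        logits[sorted_idx[remove]] = float("-inf")
    return logits

torch.manual_seed(0)
logits = torch.randn(10)                               # fake next-token logits
probs = torch.softmax(truncate_logits(logits, top_k=5, top_p=0.9), dim=-1)
print(torch.multinomial(probs, 1))                     # sampled token id
```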

Week 7 

Tuesday, March 2

Lecture: Conditional language models, machine translation (video, slides)

Optional reading: Chapter 18.1 (Eisenstein), Google NMT and multilingual translation papers, T5 paper  

Thursday, March 4

Quiz 2: 40 minutes (available 1-6pm)

Week 8

Tuesday, March 9

Lecture: Copy mechanism, relation-aware self-attention, hidden Markov models (video, slides)

Optional reading: Gulcehre et al. (2016), Shaw et al. (2018), notes on hidden Markov models (Collins), example of a neural HMM (Chiu and Rush, 2020) 
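
A toy sketch of the forward algorithm for HMMs (all numbers are made up for illustration): it computes the probability of an observation sequence by dynamic programming over hidden states.

```python
import numpy as np

init = np.array([0.6, 0.4])            # p(s_1)
trans = np.array([[0.7, 0.3],          # trans[i, j] = p(s_{t+1}=j | s_t=i)
                  [0.4, 0.6]])
emit = np.array([[0.9, 0.1],           # emit[i, o] = p(x_t=o | s_t=i)
                 [0.2, 0.8]])
obs = [0, 1, 0]                        # observed sequence

alpha = init * emit[:, obs[0]]         # alpha[i] = p(x_1, s_1=i)
for o in obs[1:]:
    alpha = (alpha @ trans) * emit[:, o]   # sum out the previous state, then emit
print(alpha.sum())                     # p(x_1, ..., x_T)
```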

Assignment 3 assigned

Assignment 2 due 


Spring Recess (March 12-20)


Week 9

Tuesday, March 23

Lecture: Marginal decoding, conditional random fields (video, slides)

Optional reading: Chapter 7.5.3 (Eisenstein), Lample et al. (2016), notes on graphical models (Blei), notes on belief propagation
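
As a companion to the CRF reading, an assumed sketch of Viterbi decoding for a linear-chain model (emission and transition scores are random placeholders):

```python
import numpy as np

def viterbi(emissions, transitions):
    """emissions: (T, L) scores; transitions: (L, L) scores. Returns best label path."""
    T, L = emissions.shape
    score = emissions[0].copy()                  # best score ending in each label
    back = np.zeros((T, L), dtype=int)           # backpointers
    for t in range(1, T):
        cand = score[:, None] + transitions      # (prev label, current label)
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + emissions[t]
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):                # follow backpointers
        path.append(int(back[t][path[-1]]))
    return path[::-1]

rng = np.random.default_rng(0)
print(viterbi(rng.normal(size=(5, 3)), rng.normal(size=(3, 3))))
```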

Week 10

Tuesday, March 30

Lecture: Natural language understanding, pretrained language models (video, slides)

Optional reading: The word2vec paper (also a blog post), the ELMo paper, the BERT paper, a paper analyzing commonsense reasoning performance (Trichelair et al., 2019), and a paper on the effects of pretraining scale (Zhang et al., 2020)

Jupyter notebook on how to use BERT 
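
If you want a preview before the notebook, a minimal usage sketch with the Hugging Face transformers library (the model name and sentence are arbitrary choices, not the notebook's):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Encode a sentence with BERT and inspect the contextual embeddings.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Natural language processing is fun.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # (1, num_tokens, 768)
print(outputs.last_hidden_state[0, 0])  # the [CLS] token's embedding
```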

Thursday, April 1

Quiz 3: 30 minutes (available 1-6pm)

Week 11 

Tuesday, April 6

Lecture: More pretrained transformers, latent-variable generative models (video, slides)

Optional reading: The BART paper, Section 1 and Appendix A of this note, additional notes, VAEs applied to text generation and document hashing   
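
A minimal sketch of the reparameterization trick behind VAEs (values are arbitrary; the KL term assumes a diagonal Gaussian posterior and a standard normal prior):

```python
import torch

# Sample z ~ N(mu, sigma^2) so that gradients flow to mu and log_var.
mu = torch.tensor([0.5, -1.0], requires_grad=True)
log_var = torch.tensor([0.0, 0.2], requires_grad=True)

eps = torch.randn(2)                          # noise, independent of the parameters
z = mu + torch.exp(0.5 * log_var) * eps       # differentiable sample (fed to a decoder in a real model)

# Analytic KL(q(z|x) || N(0, I)) term of the ELBO:
kl = 0.5 * (torch.exp(log_var) + mu**2 - 1 - log_var).sum()
kl.backward()
print(mu.grad, log_var.grad)
```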

Project proposal due 

Assignment 3 due 

Week 12 

Tuesday, April 13

Lecture: More variational autoencoders, discrete latent variables (video, slides)

Optional reading: Notes on Gumbel (Appendix A of this note; you may have to refresh the page), Li et al. (2019) 
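
A minimal sketch of the Gumbel-max trick and its softmax relaxation (an assumed illustration; PyTorch also provides torch.nn.functional.gumbel_softmax):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.tensor([1.0, 2.0, 0.5])

g = -torch.log(-torch.log(torch.rand(3)))   # Gumbel(0, 1) noise
print(torch.argmax(logits + g))             # exact sample from softmax(logits)

tau = 0.5                                   # temperature; approaches one-hot as tau -> 0
y = F.softmax((logits + g) / tau, dim=-1)   # differentiable relaxed sample
print(y)
```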

Week 13

Tuesday, April 20

Lecture: Knowledge-intensive language tasks (video, slides)

Optional reading: Notes on noise contrastive estimation, Lee et al. (2019), Cheng et al. (2020), Wu et al. (2020) 
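
An assumed sketch of the in-batch contrastive objective used in the dense retrieval papers above (related in spirit to noise contrastive estimation; the encoders are replaced by random vectors for brevity):

```python
import torch
import torch.nn.functional as F

# Each query's gold passage is the positive; the other passages in the
# batch serve as negatives in a softmax over dot-product scores.
torch.manual_seed(0)
q = torch.randn(4, 16)                 # query encodings (batch of 4)
p = torch.randn(4, 16)                 # encodings of each query's gold passage

scores = q @ p.T                       # scores[i, j] = sim(query i, passage j)
labels = torch.arange(4)               # the diagonal entries are the positives
loss = F.cross_entropy(scores, labels)
print(loss)
```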

Milestone due

Week 14

Tuesday, April 27

Lecture: Coreference resolution, review (video, slides)

Optional reading: Section 4.2 of Marquez et al. (2012), LEA, end-to-end coref (Lee et al., 2017) and its extension, coref with BERT and SpanBERT, CorefQA    


Monday, May 10

Project presentation video and final report due 
