Natural Language Understanding and Computational Semantics

DS-GA 1012/LING-GA 1012, Spring 2023
New York University

This is an advanced master’s level course in natural language processing, with a focus on the representation of meaning at various linguistic levels. Students will learn basic techniques in deep learning, transfer learning, and in-context adaptation while being introduced to natural language understanding tasks such as text classification, syntactic and semantic parsing, and question answering. The course will culminate in an original research project that each student completes in collaboration with two to three classmates.

Course Staff

Instructor

  • Sophie Hao she/her (NYU email: sophie.hao)

Section Leaders

  • Lorena Piedras she/her (NYU email: lp2535)
  • Namrata Mukhija she/her (NYU email: nm3571)

Grader

  • Artie Shen (NYU email: ys1001)

Logistics

All class sessions take place in person in Room G08 of 12 Waverly Place. They will also be live-streamed on Zoom and recorded.

Lectures

Tuesdays, 10:00–11:40, with Sophie (Zoom)

Lab

Thursdays, 11:15–12:05, with Lorena or Namrata (Zoom)

Office Hours

Office hours take place in person at 60 5th Ave.

Tuesdays, 3:00–4:00, with Namrata in Room 763
Thursdays, 12:30–1:30, with Lorena in Room 763
Fridays, 11:00–12:00, with Sophie in Room 700

Prerequisites

The recommended prerequisite for this course is Natural Language Processing with Representation Learning (DS-GA 1011). You will still be able to register for the course if you have not taken DS-GA 1011, but please be aware that this is an advanced course. Students are expected to have seen most of the following concepts before.

Calculus and Linear Algebra

Partial derivatives, gradients, vectors, matrices, matrix multiplication, vector spaces

Probability and Statistics

Probability distributions, conditional probabilities, Bayes’s theorem, linear regression

Machine Learning and Data Science

Features (discrete vs. continuous), optimization, train/dev/test, dimensionality reduction (e.g., PCA)

Python Programming

Basic syntax, iterables/comprehensions, Jupyter notebooks, package managers (e.g., pip), modules, object-oriented programming, data types

Natural Language Processing

Tokenization, vector semantics, language modeling

Since this is a graduate-level course with students from a diverse array of backgrounds (data science, computer science, and linguistics, as well as some undergraduates), many students will be unfamiliar with one or more of the above topics. This is okay, as long as you feel comfortable looking up anything that you don’t understand or asking for help when necessary.