CS 6803: Topics in NLP - Course Webpage

Course Overview

This graduate-level course discusses recent advancements in the field of NLP. It covers different aspects of Large Language Models: their architectural and design challenges, as well as practical considerations for training such models.

Key Details

Instructor: Dr. Maunendra Sankar Desarkar

Office: CS 506

Class Time: Saturdays, 8-11 am

Syllabus & Grading

Course Materials

We will not follow any specific textbook for the course. As most of the topics are contemporary and are areas of active research, we will mostly draw on recent research papers. Details of the papers will be shared during the lectures.

Grading Breakdown (Tentative)

Component	Weight
In-class quizzes (3-4)	10-20%
Project	40-60%
Semester Exam	30-50%

Prerequisites

CS 5803 (Natural Language Processing)

🗓️ Tentative Course Schedule

Serial No.	Topics	Reading
1	Transformer architecture, different types of positional embeddings: Absolute, relative, rotary	Shared during class
2	Attention Mechanisms: MHA, MQA, GQA, MHLA, FlashAttention	Shared during class
3	Pre-tokenization and Tokenization: BPE, WordPiece, SentencePiece, Unigram LM, Pruning, ByteT5	Shared during class
4	Multilinguality: Multilingual Pre-Training, Data Sampling	Shared during class
5	Mechanistic Interpretability: Logit Lens, Patchscopes, Circuits	Shared during class
6	Vision Language Models: CLIP, LLaVA, InstructBLIP	Shared during class
7	State Space Models: S4 Architecture, Mamba	Shared during class
8	Choices during model building: Data Mixing, Mixture of Experts	Shared during class
9	Scaling Laws: Kaplan, Chinchilla	Shared during class
10	Agent Foundation Models	Shared during class

A detailed day-wise schedule is not provided. Depending on coverage, discussion of some topics may span multiple lectures.