CS 6803: Topics in NLP

Aug-Nov 2025 | Department of Computer Science

Course Overview

This graduate-level course covers recent advances in NLP, with a focus on Large Language Models: their architectural and design challenges, as well as practical considerations for training them.

Key Details

Instructor: Dr. Maunendra Sankar Desarkar

Office: CS 506

Class Time: Saturdays, 8-11 am


Syllabus & Grading

Course Materials

Grading Breakdown (Tentative)

Component | Weight
In-class quizzes (3-4) | 10-20%
Project | 40-60%
Semester Exam | 30-50%

Prerequisites

CS 5803 (Natural Language Processing)


🗓️ Tentative Course Schedule

Serial No. | Topics | Reading
1 | Transformer architecture; types of positional embeddings: absolute, relative, rotary | Shared during class
2 | Attention Mechanisms: MHA, MQA, GQA, MHLA, Flash Attention | Shared during class
3 | Pre-tokenization and Tokenization: BPE, WordPiece, SentencePiece, Unigram LM, Pruning, ByteT5 | Shared during class
4 | Multilinguality: Multilingual Pre-Training, Data Sampling | Shared during class
5 | Mechanistic Interpretability: LogitLens, PatchScope, Circuits | Shared during class
6 | Vision Language Models: CLIP, LLaVA, InstructBLIP | Shared during class
7 | State Space Models: S4 Architecture, Mamba | Shared during class
8 | Choices during model building: Data Mixing, Mixture of Experts | Shared during class
9 | Scaling Laws: Kaplan, Chinchilla | Shared during class
10 | Agent Foundation Models | Shared during class

A detailed day-wise schedule is not provided. Depending on coverage, discussion of some topics may span multiple lectures.