EE5350 Error Correcting Codes
Table of Contents
Welcome to EE5350 (Error Correcting Codes). This is an elective course for final-year BTech and MTech/PhD students. A background on linear algebra and probability is essential for taking this course. While not mandatory, it is helpful to have some familiarity with information theory and digital communication. The course will involve a good amount of math and programming in C/C++/python.
Error control codes are the most important building blocks of modern digital communication systems, and were a major contributing factor to the cellular revolution. They also have applications in storage (from early CDs to DNA storage, QR codes and RAID systems), cryptography (secret sharing, post-quantum cryptosystems based on ECCs), quantum computation (Quantum ECCs for fault-tolerant quantum computation), just to name a few. In this course, we will discuss the basic principles of error correction from a combinatorial and algebraic perspective, and then talk about ``modern'' codes based on so-called graphical approaches. We will aim to discuss at least four major families of codes widely used in practice: BCH codes, Reed-Solomon codes, Reed-Muller codes, and LDPC codes. All these have a variety of applications, and LDPC codes are a part of major wireless communication standards.
Classes will be offline, and much of our online interaction (homework submissions, announcements, etc) will be on Google classroom. Invites will be sent to registered students by the first week of August. If you have not received an invite by the second class, please send me an email.
Prerequisites
- Strong foundation in linear algebra/matrices, probability and random processes
- Preferable to have a background on information theory and digital communication.
- Comfort with programming in C/C++/python
1 Assessment (tentative):
Each student will be expected to
- attend classes and solve short in-class quizzes
- participate in class
- solve a mid-term exam
- solve homework assignments, roughly 3 (collaboration is fine as long as it is explicitly declared, but you have to write the solutions in your own words)
- present a paper/do a project/solve a final exam
Homeworks | 30% |
Attendance and class participation | 5% |
In-class quizzes | 10% |
Mid-term exam | 25% |
Final project/paper presentation/final exam | 30% |
2 Instructor:
Name | Dr. Shashank Vatedka |
shashankvatedka@ee.iith.ac.in | |
Office | 446, Academic block C |
3 Class timings:
- Slot P: Mondays 14:30-16:00, Thursdays 16:00-17:50
- Class venue: A-112
4 Primary references:
There is no single textbook for this course. Primarily, the material will be drawn from
- Lecture notes and slides, uploaded here.
- Essential Coding Theory, Venkatesan Guruswami, Atri Rudra and Madhu Sudan (GRS), available online
- Introduction to Coding Theory, Ron Roth (RR), Cambridge University Press, 2006
- "The generalized distributive law", SM Aji and RJ McEliece, IEEE Transactions on Information Theory, 2000
- "Factor graphs and the sum-product algorithm", FR Kshischang, BJ Frey and HA Loeliger, IEEE Transactions on Information Theory, 2001.
- Modern Coding Theory, Tom Richardson and Ruediger Urbanke, Cambridge University Press, 2008. The bible for LDPC codes.
Other references (in no particular order):
- "A mathematical theory of communication", Claude Shannon, Bell systems Tech Journal, 1948. The classic paper that started this field.
- "Channel coding: The road to channel capacity", DJ Costello and GD Forney Jr, Proceedings of the IEEE, 2007. Gives a great historical perspective.
- NPTEL lectures by Prof Vijay Kumar (IISc)
- NPTEL lectures by Prof Andrew Thangaraj (IITM)
- Fundamentals of error correcting codes, WC Huffman and V Pless, Cambridge University Press, 2003
- The theory of error correcting codes, FJ Macwilliams and NJA Sloane, Elsevier/North Holland, 1977
- Error control coding, S Lin and D Costello (2nd ed), Prentice-Hall, 2004
- Fundamentals of convolutional coding, R Johanneson and K Zigangirov, IEEE Press, 1999.
- Information theory, inference, and learning algorithms, David MacKay (soft copy available for free on the author's website). A very nice book, but gives a very different flavour that what will be covered in the course. Suggested for additional reading.
- A student's guide to coding and information theory, Stefan Moser (amazon link).
- Finite fields or Introduction to finite fields and their applications, R Lidl and H Neiderreiter (the first is considered by many as the book on finite fields)
5 Tentative syllabus:
Channel models and coding theory, need for error correction; Basics of fields and vector spaces; Codes, rate, decoding; Bounds on the size of codes: GV, singleton, Plotkin, EB and LP bounds; Basics of finite fields: construction and arithmetic; Algebraic codes: BCH, Reed-Solomon and Reed-Muller codes; Codes on graphs: convolutional codes and Viterbi decoding; Generalized distributive law and factor graphs; LDPC codes and their decoding; Message passing algorithms, belief propagation, density evolution and EXIT charts;
6 Tentative schedule
Some of the recorded classes will be made available in this playlist.
Topic | Date | Notes/links/material | Homeworks |
---|---|---|---|
Introduction to coding theory | Review basic material on linear algebra/probability | ||
- Channels, noise, and need for error correction | Linear Algebra by Gilbert Strang | ||
- Various channel models and error correcting codes | NPTEL course on linear algebra | ||
- Rate, encoding, decoding, complexity | Chapter 1 of RR | ||
- The channel coding theorem for DMCs | Chapter 1 of GRS | ||
- Adversarial models and minimum distance | Class notes 1 | ||
- Recap of linear algebra: fields and vector spaces | Class notes 2 | ||
- Linear codes | Chapter 2 of RR | ||
Bounds on the size of codes | Chapter 4 of RR | ||
- Singleton bound, Hamming bound, sphere packing bound, GV bound | Class notes 3 | ||
- Plotkin bound, EB bound | Chapter 2 of Huffman and Vera Pless, "Fundamentals of ECCs" | ||
- LP bound | |||
- The Hamming code | |||
Finite fields and their arithmetic | Class notes 4 | ||
- Fields and polynomials, Euclidean algorithm | Chapter 3 of RR | ||
- Constructing finite fields | |||
- Subfields, factorization | |||
Algebraic codes | Class notes 5 | ||
- Reed-Solomon codes | Chapter 5 of RR | ||
- BCH codes | Chapter 5 and Chapter 9 of GRS | ||
- Reed-Muller codes | |||
- Cyclic codes | |||
Convolutional codes | Class notes 6 | ||
- Construction and encoding | Lin-Costello | ||
- Viterbi algorithm | |||
- BCJR algorithm and Turbo codes (if time permits) | |||
Generalized distributive law, factor graphs | Aji-McEliece paper | ||
- Marginalization problem of a sum-of-products | |||
- Generalized distributive law | |||
- Factor graphs and message passing algorithms | |||
- Physics interpretation: Free energies and Bethe approximations (if time permits) | |||
Low-Density Parity-Check (LDPC) Codes | Class notes 7 | ||
- Construction and encoding | Richardson-Urbanke (chapters 1-3 essentially cover what we need) | ||
- Constructing the Tanner graph | |||
- Belief propagation decoding of LDPC codes | |||
- Density evolution | |||
- EXIT charts | |||
Other topics | |||
- Based on time and interest |
7 Further reading
Coding theory is a vast topic, and hence there are several topics that we will not be able to cover in class.
Good sources for the latest work on channel coding can be found in the IEEE Transactions on Information Theory, IEEE Transactions on Communications, various IEEE conferences including the annual IEEE International Symposium on Information Theory (ISIT), Information Theory Workshop (ITW). Several good preprints are made available online on the Arxiv preprint server.
The following are representative papers. You can go through papers citing/cited by the following as well.
- Turbo codes, the first class of codes that could practically get close to capacity.
- Convolutional codes and Viterbi decoding (you will need to implement this), precursors to Turbo codes.
- Concatenated codes, another very interesting approach that lets you approach capacity but for very large blocklengths.
- Expander codes are a family of codes that have asymptotically large minimum distance, but do not achieve capacity.
- Spatially coupled LDPC codes are also known to achieve capacity. Other references: ref 1, ref2
- Channel coding in 5G
- List decoding of polar codes
- How to construct polar codes
- Reed Muller codes achieve capacity of the BEC
- Finite blocklength information theory, the paper that was a recent breakthrough.
- List decoding, an interesting variant, also has other algorithmic applications. Also see this.
- Advanced topics on channel coding from these lecture notes, or these are also acceptable.
- Lattice codes achieve the capacity of the AWGN channel
- Physical layer network coding was a hot topic, and lattice codes were a contributing factor
- Coding for DNA storage
- People have used neural networks for channel coding
- Rate splitting for multiple access channels
- Secret sharing
- Codes for distributed storage, also see this and this
- Network coding (also see this and references therein)
- Private information retrieval
- Post-quantum cryptographic schemes
- Codes for the binary deletion channel
8 General comments regarding presentations/reports
In addition to learning cool stuff, you should also learn to present your work. The final presentation is an opportunity to develop your presenting skills. You must be able to convey your ideas and thoughts clearly to another person. Here are some general comments for making good presentations:
- Make sure that you acknowledge all the references/material that you used (papers/code/etc.) While it does not matter much for the purposes of this class, this is very important to keep in mind when you are delivering presentations at bigger venues
- In a presentation, always introduce yourself and your team members/collaborators in the very beginning.
- If much of the presentation is based on on one/two papers/references, explicitly state which references you are using in the first/second slide.
- Avoid saying "We did this/we showed that…" when you are in fact talking about some other authors' work. Unless you have derived something/programmed something yourself, always say "they/the authors did/showed/observed that…"
- A presentation/report is like telling a story: even before preparing the slides/document, imagine that you want to explain the work to a friend in 15-20 mins, identify exactly what you want to convey, and only then proceed.
- Make sure that you explain the problem clearly.
- When reading a paper, you should first understand what the problem is, what makes it challenging, and why it is worthwhile solving the problem. Then, you should identify what is new about the paper, and what the contributions are. To understand the rest of the paper, you should then try to understand the constructions/proof techniques/ideas. Your presentation should also bring out these aspects. See this article and this one on how to read a paper.
- There is no single best rule for presentations, but generally in any presentation, the viewer/audience's attention is maximum in the first 5-10 minutes and then gradually decreases. So it is helpful to get to the punchline in the first few minutes.
- Citations should be treated differently in papers and slides. When you are making a reference to a particular paper/book/website (for e.g., when you say that result A can be found in paper B) in your slides, mention the paper/book/website as a footnote rather than saying [1], and putting the reference in the last slide.
- Do not put too much text on your slides. This is bad practice. Each slide should contain at most 3-5 points. You should absolutely avoid long sentences.
- Do not copy text directly from the paper.
- Do not read out your slides word-to-word.
- Similarly, do not put too many lemmas in a slide. Each slide should convey a single idea. Unless very important, do not put too much math/long derivations in a short presentation. It is more important to convey ideas/intuition. But at the same time, there should be a significant amount of technical content. It is therefore very useful to instead explain using examples.
- Make sure that the fonts/diagrams/labels/plots are clear. Label all the axes on your plots.
- Speak clearly, and make sure that the audio quality is fine. Check your mic volume. Do a couple of dry runs.
- If you are using Google meet to record, then make sure that you "pin" the slides since meet might automatically switch displaying you/your icon instead of the presentation.
- Make sure that you check, and double check the slides and report for typos, technical and grammatical errors.
- If your group has more than one person, then the presentation must be shared equally.