HYDERABAD EECS THEORY SEMINARS (HyTS)

Talk by Dr. Prasad Krishnan

Venue: A-221, IITH
Time: 12-1pm on 18th October, 2024

Title: An Information-theoretic look at storing data on shotgun-sequenced DNA

Abstract: In shotgun sequencing, the input string (typically, a long DNA sequence composed of nucleotide bases) is sequenced as multiple overlapping fragments of much shorter lengths (called reads). Modelling the shotgun sequencing pipeline as a communication channel for DNA data storage, the capacity of this channel was identified in a recent work, assuming that the reads themselves are noiseless substrings of the original sequence. Modern shotgun sequencers however also output quality scores for each base read, indicating the confidence in its identification. Bases with low quality scores can be considered to be erased. Motivated by this, we consider the shotgun sequencing channel with erasures, where each symbol in any read can be independently erased with some probability. We identify achievable rates for this channel, using a random code construction and a decoder that is inspired by typicality arguments in information theory.

Bio: Prasad Krishnan received the B.E. degree from the College of Engineering at Guindy, Anna University, in 2007, and the Ph.D. degree from the Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore, in 2014. Since 2014, he has been with the Signal Processing and Communications Research Centre, International Institute of Information Technology Hyderabad (IIIT Hyderabad) as a faculty member. His current research interests are in Algebraic Coding Theory, Private Information Retrieval, and Codes for DNA data storage.

First HyTS: August 21, 2024

Venue:

  • 9.30am-12.45pm: C-LH6, IITH

  • 1pm-5pm : LH12, Lecture Hall Complex, IITH

The tentative schedule for the workshop is given below:

Time Event
9.30-10.15 Shashank Vatedka: Sphere Packings and List Decoding in Euclidean Space Slides
10.15-11.00 Myna Vajha: Two-way generalization of a burst erasure correcting code Slides
11.00-11.15 Tea Break
11.15-12.00 Lakshmi Prasad Natarajan: Recursive Subproduct Codes (and Berman Codes)
12.00-12.45 Mrinmoy Datta:On the second and third weights of Projective Reed-Muller Codes
12.45-15.30 Break
15:30-16:30 Alexander Barg: Smoothing of codes, uniform distributions, and applications Slides

Speaker Bios and Talk Abstracts:

Shashank Vatedka: Sphere Packings and List Decoding in Euclidean Space

Abstract: What is the maximum number of non-overlapping \(n\)-dimensional spheres that you can pack in a given volume? This is the classical sphere-packing problem for which the solution is unknown except for some special cases. In this talk, I will discuss a generalization of this called the \(L\)-packing problem, wherein we allow at most \(L\) spheres to intersect. When \(L=1\), this reduces to the classical sphere packing problem. I will discuss connections to communications, coding theory, and some bounds on the maximum achievable rate, density of \(L\)-packings when \(n\) is large. Parts of this talk are based on joint work with Yihan Zhang (IST Austria).

Bio: Shashank Vatedka received his MSc (Engg) and PhD from the Indian Institute of Science Bengaluru. After postdoctoral stints at the Institute of Network Coding (The Chinese University of Hong Kong), and Telecom Paris, he joined the Indian Institute of Technology Hyderabad where he is currently an Assistant Professor. His research interests are Information Theory and Coding, with applications to Statistical Inference, Data Compression and Security.

Myna Vajha: Two way generalization of a burst erasure correcting code

Abstract: In this talk, we will present two generalizations of a binary code introduced by Hollmann and Tolhuizen (HT) that can correct any burst erasure including a wrap-around burst. The HT code is constructed by recursively concatenating identity matrices of appropriate sizes, either column-wise or row-wise, depending on the parameters. HT codes were initially introduced in the context of constructing streaming codes capable of recovering from a burst of size \(b\) within a delay constraint of \(\tau\). The first generalization of HT code allows us to construct a streaming code that can recover from either a random erasure of size \(a\) or a burst erasure of size \(b\) within a delay constraint of \(\tau\). The second generalization of the HT code is relevant in the context of a three-node relay setting, where both links experience at most one burst erasure of size \(b\). Burst erasure in the source-relay link can lead to message symbols being transmitted out of order at the relay. This is joint work with Vinayak Ramkumar, Nikhil Krishnan, P. Vijay Kumar.

Bio: Myna Vajha is an Assistant Professor at the EE department, IIT Hyderabad. She received BTech. (2011) and M.Tech. (2013) degrees from IIT Kharagpur and University of Southern California respectively, and the Ph.D. degree from IISc Bangalore in 2020. Post PhD, Myna worked on physical layer algorithms for WiFi at Qualcomm Wireless R&D, Bangalore. Her current research interests are in the areas of coding and information theory and systems design of communication systems.

Lakshmi Prasad Natarajan: Title: Recursive Subproduct Codes (and Berman Codes)

Abstract: We introduce a family of binary linear codes identified as subcodes of m-dimensional product codes which includes RM codes as a special case. This family also includes the dual of a class of codes introduced by Berman in 1967. Recursive subproduct codes provide a wider range of block lengths compared to RM codes while sharing several of its structural properties. The special case of dual Berman codes achieves the capacity of the binary erasure channel (in the sense of vanishing bit error rate).

Bio: Lakshmi Prasad Natarajan received the B.E. degree in electronics and communication from the College of Engineering, Guindy, in 2008, and the Ph.D. degree from the Indian Institute of Science, Bengaluru, in 2013. From 2014 to 2016, he held a post-doctoral position at the Department of Electrical and Computer Systems Engineering, Monash University, Australia. He is a Faculty Member with the Department of Electrical Engineering, Indian Institute of Technology Hyderabad. His primary research interests include coding techniques and information theory for communication systems.

Mrinmoy Datta:On the second and third weights of Projective Reed-Muller Codes

Abstract: It is well-known that the determination of Hamming weights of codewords belonging to the projective Reed-Muller codes PRM(d, m) are equivalent to the determination of the number of \(F_q\)-rational points on a given projective hypersurface of degree \(d\) and dimension \(m-1\). The highest number of such points on hypersurface, and consequently, the minimum weight of PRM (d, m) is known, thanks to Serre. In this talk, we will explore the question of determining the second and third minimum weights of the Projective Reed-Muller codes. We will revisit several results due to Sboui, Rodier, and Geil, among others, to this end. Finally, we will present some recent results concerning these problems.

Bio: Mrinmoy Datta is currently an Assistant Professor at the Department of Mathematics at IIT Hyderabad. He obtained his Masters degree from ISI Banglore and PhD from IIT Bombay. He held post-doctorial positions at Technical University of Denmark (DTU), and at UiT Norges Arktiske Universitet. He works in the area of Algebraic Geometry over Finite Fields with applications to Linear Error-Correcting Codes.

Alexander Barg: Smoothing of codes, uniform distributions, and applications

Abstract: The action of a noise operator on a binary code transforms it into a distribution on the Hamming space . We aim to characterize the cases when the output distribution is close to the uniform distribution on the space, as measured by Renyi divergence of order alpha>1. A version of this question is known as the channel resolvability problem in information theory, and it has implications for security guarantees in wiretap channels, error correction, discrepancy, and other problems. In the first part of the talk, we derive expressions for the minimum rate of codes required to attain asymptotically perfect smoothing and give achievability results for the rate of Reed-Muller codes used on the wiretap channel. In the second part we consider a related problem of linear hashing, proving estimates for the expected p-divergence from the uniform distribution over the ensemble of random linear codes. In the third part we prove some impossibility results for the worst-to-average case reduction in the problem of Learning Parity with Noise using code smoothing. The talk is based on joint works with Madhura Pathegama.

Bio: Alexander Barg (Fellow, IEEE) is currently a Professor with the Department of Electrical and Computer Engineering and the Institute for Systems Research (ISR), University of Maryland, College Park, MD, USA. His research interests include information and coding theory, applied probability, and algebraic combinatorics. He received the 2015 Information Theory Society Paper Award. He was a Plenary Speaker at the 2016 IEEE International Symposium on Information Theory, Barcelona, Spain. From 2018 to 2019, he served as the Editor-in-Chief for this IEEE Transactions. He is currently the Editor-in-Chief of Foundations and Trends in Communications and Information Theory. He is honoured with the 2024 IEEE Richard Hamming medal.