Welcome to the Teaching Page of
Mathukumalli Vidyasagar
Fellow of The Royal Society
Distinguished Professor &
SERB National Science Chair
Indian Institute of Technology Hyderabad
Email: M.Vidyasagar@iith.ac.in
An Overview of Reinforcement Learning
02 August to 23 November 2022
Timings:
Tuesdays and Wednesdays, 12:00 noon to 1:30 PM;
Fallback time: Fridays, 12:00 noon to 1:30 PM.
Class schedule:
PDF
Google Meet:
Link
Contents
Lecture Notes
Slides of Lectures
Python Codes
Links to Recordings
Lecture Notes
(Updated frequently; check the date and ensure you have the latest version.)
PDF
Slides of Lectures
- About the course PDF
- Topic 1: Introduction PDF
- Topic 2: Markov Reward Processes PDF
- Topic 3: Markov Decision Processes PDF
- Topic 4: Review of Probability PDF
- Topic 5: Stochastic Approximation: Preliminaries PDF
- Topic 6: Stochastic Approximation PDF
- Topic 7: Markov Processes with Absorbing States PDF
- Topic 8: Parametric Approximation Methods PDF
- Topic 9: Parametric Approximation Methods -- Simultaneous Value and Policy Approximation PDF
- Topic 10: Zap Q-Learning PDF
- Topic 11: Finite-Time Stochastic Approximation PDF
- Topic 12: Stochastic Approximation Revisited PDF
- Topic 13: Batch Asynchronous Stochastic Approximation PDF
Python codes (for small problems only)
- Read Me file: Text file
- Computing the average reward: Code
- Computing the hitting time and hitting probabilities: Code
- Computing the hitting time and hitting probabilities for a randomly generated transition matrix: Code
- Computing the hitting time and hitting probabilities for the Snakes and Ladders game: Code
- Computing the optimal policy using randomly generated data: Code
- Value iteration for the Snakes and Ladders game: Code
- Computing the stationary distribution of a Markov chain: Code
- Computing the stationary distribution of a Markov chain (example): Code
- Value iteration: Code
- Value iteration, called by the policy iteration routine: Code
Some Data Files
- For a generic Markov process example: Transition matrix Reward
- For the Snakes and Ladders game: Transition matrix Reward
Link to the Recordings of Lectures
Note that the recording of Lecture 1 is not available.
To request access to the folder, please send me an email.
The recordings of the lectures are available on my Google Drive:
https://drive.google.com/drive/folders/1tjbdSbs8qXSHpyhEAQEY2mu9dEuMRiE3u23oIRgMI9cqpI1jVPF4FN_dcmI3uWLIpPyzHTqE?usp=sharing