Concentration of Contractive Stochastic Approximation and Reinforcement Learning
Journal
Stochastic Systems
Date Issued
2022-12-01
Author(s)
Chandak, Siddharth
Borkar, Vivek S.
Dodhia, Parth
Abstract
Using a martingale concentration inequality, concentration bounds “from time n₀ on” are derived for stochastic approximation algorithms with contractive maps and both martingale difference and Markov noises. These are applied to reinforcement learning algorithms, in particular to asynchronous Q-learning and TD(0).
Subjects