Zap Stochastic Approximation and Reinforcement Learning

Speaker: François Durand
Nokia Bell Labs France
Date: 07/10/2020
Time: 11:00 am - 12:00 pm
Location: Paris-Rennes Room (EIT Digital)


Many reinforcement learning problems can be viewed as stochastic approximation problems. Unfortunately, classic stochastic approximation algorithms, such as Robbins-Monro, may have an infinite asymptotic variance. The class of “zap” algorithms aims to solve that problem. We then examine the application of zap algorithms to reinforcement learning, using zap Q-learning as an example.