| Speaker: | Matthew Andrews (Nokia Bell Labs) |
| Date: | 18/10/2024 |
| Time: | 2:00 pm - 3:00 pm |
| Location: | Amphi 6 |
Abstract
In many Reinforcement Learning (RL) environments the state is represented by an image. In such cases, if the RL agent doesn’t perform well, is the problem that the image processing fails to recognize the salient features of the image (i.e. there is a recognition issue)? Or is the problem that the RL can’t learn the correct action even though it correctly recognizes the state (i.e. there is a decision issue)? Or is it a bit of both?
In this talk we’ll discuss two ways to formalize these questions. In the first, we examine how well an agent can learn the “Q-value” of an image when it is given explicit examples in a training set, and we then compare an agent trained in this way with an agent trained using standard RL techniques. In the second, we decompose the regret of an RL agent into two terms that separately capture the recognition error and the decision error.
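As a rough illustration of the first formalization, the sketch below (an assumption-laden example, not the speaker's code) fits a small convolutional Q-network by plain supervised regression on explicit (image, Q-value) pairs; such a network could then be compared against one trained with standard RL. The architecture, shapes, and hyperparameters are placeholders.

```python
# Minimal sketch: learn Q-values of images by supervised regression on
# explicit (image, Q-value) examples. All shapes/hyperparameters are
# illustrative assumptions, not the speaker's setup.
import torch
import torch.nn as nn

class ImageQNet(nn.Module):
    """Small CNN mapping an image observation to one Q-value per action."""
    def __init__(self, n_actions: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.LazyLinear(n_actions)  # input size inferred on first call

    def forward(self, x):
        return self.head(self.features(x))

def supervised_q_training(images, q_targets, n_actions, epochs=10, lr=1e-3):
    """Fit Q(s, .) by regression on explicit examples from a training set."""
    net = ImageQNet(n_actions)
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(net(images), q_targets)  # recognition quality shows up here
        loss.backward()
        opt.step()
    return net

if __name__ == "__main__":
    # Toy data standing in for (image, Q-value) pairs from a training set.
    images = torch.rand(64, 3, 32, 32)   # 64 RGB observations, 32x32
    q_targets = torch.rand(64, 4)        # target Q-values for 4 actions
    net = supervised_q_training(images, q_targets, n_actions=4)
    print(net(images[:1]))               # predicted Q-values for one image
```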
We illustrate our techniques using standard RL environments such as Minigrid and Pong.
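For concreteness, one common way to obtain image observations of this kind is via the open-source `gymnasium` and `minigrid` packages; the environment id and wrapper below are standard examples, and exact names are version-dependent.

```python
# Minimal example of an image-observation environment, assuming the
# `gymnasium` and `minigrid` packages are installed.
import gymnasium as gym
import minigrid  # registers the MiniGrid-* environments
from minigrid.wrappers import RGBImgObsWrapper

env = RGBImgObsWrapper(gym.make("MiniGrid-Empty-8x8-v0"))
obs, info = env.reset(seed=0)
print(obs["image"].shape)  # (H, W, 3): the image state the agent must recognize
```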
Joint work with Alihan Huyuk, Xueyuan She, Atefeh Mohajeri, and Ryo Koblitz.