Exploiting Partial System Knowledge in Reinforcement Learning for Admission Control and Electricity Storage Optimization

When

17/01/2025

2:00 pm-5:00 pm

Lucas Weber

Inria

Where

Inria Paris
48 rue Barrault, Paris, 75013

Event Type

PhD Defense

This thesis exploits partial system knowledge to design more efficient reinforcement learning (RL) algorithms for three problems: (1) admission control, (2) electricity storage optimization, and (3) accelerating the computation of the bias function.

For (1), the system is modeled as an M/M/c/S queue with m job classes. We propose a model-based algorithm, named UCRL-AC, with a finite-time regret bound dominated by O(S\log T + \sqrt{mT \log T}), where T is the total running time. UCRL-AC exploits the queuing structure by learning the arrival rates.
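
As a rough illustration of what learning the arrival rates can look like, the sketch below estimates per-class Poisson arrival rates by dividing arrival counts by the observation time (the maximum-likelihood estimate). It is not UCRL-AC: the function names, the example rates, and the horizon are assumptions chosen for illustration, and the confidence bounds that an optimistic algorithm would add are omitted.

```python
import random


def simulate_arrival_counts(rates, horizon, seed=0):
    """Count Poisson arrivals per job class over [0, horizon].

    `rates` and `horizon` are hypothetical illustration values,
    not parameters taken from the thesis.
    """
    rng = random.Random(seed)
    counts = []
    for rate in rates:
        # Inter-arrival times of a Poisson process are exponential.
        t, n = rng.expovariate(rate), 0
        while t <= horizon:
            n += 1
            t += rng.expovariate(rate)
        counts.append(n)
    return counts


def estimate_rates(counts, horizon):
    """Maximum-likelihood estimate of each rate: arrivals / time."""
    return [n / horizon for n in counts]


true_rates = [0.5, 1.0, 2.0]   # assumed rates for m = 3 job classes
horizon = 5000.0               # assumed observation time
counts = simulate_arrival_counts(true_rates, horizon)
estimates = estimate_rates(counts, horizon)
```

Only m numbers are estimated here, one per job class, rather than a full transition kernel over all queue states.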

For (2), we design an RL algorithm that minimizes energy and demand charges by controlling a battery. Knowledge of the battery dynamics allows efficient offline exploration, which enables fast training with minimal data. The algorithm is tested on real-world data.

For (3), we show that, for a fixed policy, the computation of the bias function can be accelerated using knowledge of the eigenvalues of the transition probability matrix.
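
For context, when the chain induced by a fixed policy is aperiodic and unichain, the bias function h can be written as the series h = sum over t >= 0 of P^t (r - g 1), where P is the transition matrix, r the reward vector, and g the gain. The terms of this series shrink roughly at the rate of the second-largest eigenvalue modulus of P, which is one way eigenvalues enter the computation. The sketch below implements only this standard baseline, not the accelerated method of the thesis; the two-state chain, the rewards, and all function names are assumptions.

```python
def mat_vec(P, v):
    """Return the product P v for a row-stochastic matrix P."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]


def vec_mat(v, P):
    """Return the product v P (row vector times matrix)."""
    n = len(P)
    return [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]


def stationary(P, iters=2000):
    """Stationary distribution by power iteration (aperiodic unichain)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = vec_mat(pi, P)
    return pi


def gain_and_bias(P, r, iters=2000):
    """Gain g = pi . r and bias h = sum_t P^t (r - g 1), truncated."""
    pi = stationary(P)
    g = sum(p * x for p, x in zip(pi, r))
    term = [x - g for x in r]          # t = 0 term of the series
    h = [0.0] * len(r)
    for _ in range(iters):
        h = [hi + ti for hi, ti in zip(h, term)]
        term = mat_vec(P, term)        # next term: P^(t+1) (r - g 1)
    return g, h


# Hypothetical two-state chain; its second eigenvalue is 0.7.
P = [[0.9, 0.1],
     [0.2, 0.8]]
r = [1.0, 0.0]
g, h = gain_and_bias(P, r)
```

For this example the centered reward r - g 1 is an eigenvector of P with eigenvalue 0.7, so the series is geometric and h = (r - g 1) / (1 - 0.7), that is, g = 2/3 and h = (10/9, -20/9).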

Jury composition:

Urtzi Ayesta, IRIT (reviewer)
Giovanni Neglia, Centre Inria d’Université Côte d’Azur (reviewer)
Johanne Cohen, LISN – Université Paris-Saclay (examiner)
Bruno Gaujal, Inria Grenoble (examiner)
Alain Jean-Marie, Inria Montpellier (examiner)
Lorenzo Maggi, NVIDIA (examiner)
Ana Bušić, Inria Paris (thesis advisor)
Jiamin Zhu, IFP Energies Nouvelles (thesis co-supervisor)
Tristan Charrier, AMIAD (invited member, DGA supervisor)