Comparing the modelling powers of RNN and HMM

Speaker: Achille Salaün
Date: 11/03/2020
Time: 2:00 pm - 3:00 pm
Location: Paris-Rennes Room (EIT Digital)


Recurrent Neural Networks (RNN) and Hidden Markov Models (HMM) are popular models for processing sequential data and have found many applications, such as speech recognition, time series prediction and machine translation. Although both models have been extended in several ways (e.g. Long Short-Term Memory and Gated Recurrent Unit architectures, Variational RNN, partially observed Markov models), their theoretical understanding remains partially open. In this context, our approach consists in classifying both models from an information geometry point of view. More precisely, both models can be used for modeling the distribution of a sequence of random observations from a set of latent variables; however, in an RNN the latent variable is deterministically deduced from the current observation and the previous latent variable, while in an HMM the set of (random) latent variables is a Markov chain. In this paper, we first embed these two generative models into a generative unified model (GUM). We next consider the subclass of GUM models which yield a stationary Gaussian probability distribution function (pdf) for the observations. Such pdfs are characterized by their covariance sequence; we show that the GUM can produce any stationary Gaussian distribution with a geometrical covariance structure. We finally discuss the modeling power of the HMM and RNN submodels via their associated observation pdfs: some observation pdfs can be modeled by an RNN but not by an HMM, and vice versa; some can be produced by both structures, up to a reparameterization.
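The structural difference between the two latent mechanisms can be illustrated with a minimal simulation. This is only a sketch under assumptions that are not taken from the talk: a scalar, linear, Gaussian setting, with hypothetical function names (`sample_hmm`, `sample_rnn`) and hypothetical parameters (`a`, `b`, `c`, `sigma_h`, `sigma_x`). It does not reproduce the GUM parameterization of the paper; it only shows a random Markov latent chain on one side and a latent variable computed deterministically from the previous latent variable and the current observation on the other.

```python
import numpy as np


def sample_hmm(T, a=0.8, b=1.0, sigma_h=0.5, sigma_x=0.3, seed=0):
    """HMM-like sketch: the latent h_t is a random Markov chain.

    Assumed (illustrative) model:
        h_t = a * h_{t-1} + sigma_h * noise
        x_t = b * h_t     + sigma_x * noise
    """
    rng = np.random.default_rng(seed)
    h = np.zeros(T)
    x = np.zeros(T)
    h_prev = 0.0
    for t in range(T):
        h[t] = a * h_prev + sigma_h * rng.standard_normal()  # random transition
        x[t] = b * h[t] + sigma_x * rng.standard_normal()    # emission from h_t
        h_prev = h[t]
    return h, x


def sample_rnn(T, a=0.5, c=0.4, b=1.0, sigma_x=0.3, seed=0):
    """RNN-like sketch: the latent h_t is a deterministic function of
    the previous latent h_{t-1} and the current observation x_t.

    Assumed (illustrative) model:
        x_t = b * h_{t-1} + sigma_x * noise
        h_t = a * h_{t-1} + c * x_t        (no noise in the latent update)
    """
    rng = np.random.default_rng(seed)
    h = np.zeros(T)
    x = np.zeros(T)
    h_prev = 0.0
    for t in range(T):
        x[t] = b * h_prev + sigma_x * rng.standard_normal()  # observation given past latent
        h[t] = a * h_prev + c * x[t]                         # deterministic latent update
        h_prev = h[t]
    return h, x


if __name__ == "__main__":
    h_hmm, x_hmm = sample_hmm(1000)
    h_rnn, x_rnn = sample_rnn(1000)
    # Empirical lag-0 and lag-1 covariances of the observations in each sketch.
    for name, x in (("HMM", x_hmm), ("RNN", x_rnn)):
        xc = x - x.mean()
        print(name, np.mean(xc * xc), np.mean(xc[1:] * xc[:-1]))
```

In the RNN-like sketch all randomness enters through the observations, whereas in the HMM-like sketch the latent chain carries its own noise; comparing the empirical covariance sequences of the two observation series gives an informal feel for the kind of question the abstract addresses.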