Transformer models in Artificial Intelligence for Natural Language Processing

Speaker: Léo Laugier
Télécom Paris
Date: 09/10/2019
Time: 10:30 am - 12:00 pm
Location: Télécom ParisTech, I304 (3rd floor)

Abstract

We’ll explore recent Deep Learning models for Natural Language Processing based on the (“post-Recurrent Neural Network”) Transformer architecture described in Attention Is All You Need (Vaswani et al., 2017).
We’ll build intuition for this architecture and see how it was applied to supervised Seq2Seq tasks.
Then we’ll see how Devlin et al. (2018) pre-trained a deep bidirectional Transformer called BERT, a model producing state-of-the-art results on several Natural Language Understanding tasks.
If time permits, we’ll dig into models derived from the Transformer, such as OpenAI GPT-2 (Radford et al., 2019). GPT-2 achieves astounding results in Natural Language Generation and has been described in the press as “the text generator performing too well to be released”…

Slides for the presentation