Thesis defense: “Algorithmic and Software Contributions to Graph Mining”

Speaker : Nathan De Lara
Télécom-Paris
Date: 22/09/2020
Time: 9:00 am - 1:00 pm
Location: LINCS Seminars room

Abstract

Since the introduction of Google’s Page Rank method for Web searches in the late 1990s, graph algorithms have been part of our daily lives. In the mid 2000s, the arrival of social networks has amplified this phenomenon, creating new use-cases for these algorithms. Relationships between entities can be of multiple types: user-user symmetric relationships for Facebook or LinkedIn, follower-followee asymmetric ones for Twitter or even user-content bipartite ones for Netflix, Deezer or Amazon. They all come with their own challenges and the applications are numerous: centrality calculus for influence measurement, node clustering for knowledge discovery, node classification for recommendation or embedding for link prediction, to name a few. In the meantime, the context in which graph algorithms are applied has rapidly become more constrained. On the one hand, the increasing size of the datasets with millions of entities, and sometimes billions of relationships, bounds the asymptotic complexity of the algorithms for industrial applications. On the other hand, as these algorithms affect our daily lives, there is a growing demand for explanability and fairness in the domain of artificial intelligence in general. Graph mining is no exception. For example, the European Union has published a set of ethics guidelines for trustworthy AI. This calls for further analysis of the current models and even new ones. This thesis provides specific answers via a novel analysis of not only standard, but also extensions, variants, and original graph algorithms. Scalability is taken into account every step of the way. Following what the Scikit-learn project does for standard machine learning, we deem it important to make these algorithms available to as many people as possible and participate in graph mining popularization. Therefore, we have developed an open-source software, Scikit-network, which implements and documents the algorithms in a simple and efficient way. With this tool, we cover several areas of graph mining such as graph embedding, clustering, and semi-supervised node classification.

Participer à la réunion Zoom
https://telecom-paris.zoom.us/j/99270581285?pwd=VUR5Z3gxN2tqamhJc0xnOXZYcWxUQT09

ID de réunion : 992 7058 1285
Code secret : 143438