Combating toxicity in online conversations

Speaker: Ioannis Pavlopoulos
Athens University of Economics and Business
Date: 17/04/2019
Time: 10:30 am - 11:30 am
Location: Paris-Rennes Room (EIT Digital)

Abstract

Abusive behaviour on online social media platforms (a.k.a. hate speech, toxicity, cyberbullying) has forced major companies to hire hundreds of moderators, or even to acquire entire companies, to deal with the problem. Toxicity detection (i.e., classifying a post as toxic or not) has recently gained popularity, with two very successful workshops being organized and an international challenge attracting thousands of systems. Previous work showed that Recurrent Neural Networks (RNNs) achieve state-of-the-art performance on automatic toxicity detection (Pavlopoulos et al. 2017a), while incorporating a classification-specific attention mechanism and user embeddings further improved the overall performance of the RNNs (Pavlopoulos et al. 2017b). Interestingly, the classification-specific attention mechanism highlights suspicious words for free, without any highlighted words in the training data (Pavlopoulos et al. 2017c).

Despite this recent work, toxicity detection systems suffer from two major shortcomings. First, although most abusive posts appear within conversations (e.g., as replies to previously posted utterances), the structure of the conversation is currently ignored: systems base their decisions solely on the text of each post, in isolation. Second, systems only detect abusive posts; current technology does not help users modify their posts to avoid being abusive. Such systems are primarily used to censor online conversations on a platform, which often results in abusive users migrating to another platform (Chandrasekharan et al. 2018). Instead, it has been argued in both academic (Zhang et al. 2018) and industrial circles that a more fruitful approach is to help users improve their posts in online conversations (e.g., by suggesting non-abusive rewrites). To address the first problem, current research investigates (a) the compilation of taxonomies of context-aware toxicity and (b) the creation of context-aware datasets of toxic utterances. To address the second problem, we investigate the creation of datasets with word-level annotations, in order to study in greater depth the various forms of toxicity and the difficulty of rephrasing it.
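
To make the attention idea concrete, the sketch below is a minimal, illustrative PyTorch model, not the speaker's actual implementation; all layer sizes, names, and hyperparameters are assumptions. It shows a bidirectional RNN classifier whose attention layer scores each word, so the same weights that drive the post-level toxic/non-toxic decision can be inspected to highlight suspicious words without any word-level supervision.

    # Illustrative sketch only (assumed architecture, not the published model).
    import torch
    import torch.nn as nn

    class AttentionRNNClassifier(nn.Module):
        def __init__(self, vocab_size, emb_dim=128, hidden_dim=128):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
            self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
            # Attention scores each token state; a softmax turns the scores
            # into per-word weights that can be inspected as highlights.
            self.attention = nn.Linear(2 * hidden_dim, 1)
            self.classifier = nn.Linear(2 * hidden_dim, 1)  # toxic vs. not

        def forward(self, token_ids):
            states, _ = self.rnn(self.embedding(token_ids))   # (B, T, 2H)
            scores = self.attention(states).squeeze(-1)       # (B, T)
            weights = torch.softmax(scores, dim=-1)           # per-word weights
            pooled = (weights.unsqueeze(-1) * states).sum(1)  # (B, 2H)
            logits = self.classifier(pooled).squeeze(-1)      # (B,)
            return logits, weights  # weights highlight words "for free"

    # Toy usage: a batch of 2 posts, 5 token ids each (ids are placeholders).
    model = AttentionRNNClassifier(vocab_size=10_000)
    logits, weights = model(torch.randint(1, 10_000, (2, 5)))
    probs = torch.sigmoid(logits)  # probability each post is toxic

Note that in this setup the attention weights are trained only with the post-level classification signal, which is why the per-word highlights come "for free": no highlighted words are needed in the training data.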