Tommaso Caselli

50 papers · 2010–2026 · 8 conferences · across top CS/AI conferences

Achievements

🏃 Academic Marathon (15) 🐝 Cross-Pollinator (8) 🧭 Keyword Pioneer 🌍 Conference Polyglot (8) 🌈 Renaissance Researcher (7)

🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (65) 🔬 Deep Specialist (19) 🏆 Keyword Champion (2) 🧬 Topic Evolution 💎 Century Club (49) 🗃️ Keyword Collector (161) 🔥 Unstoppable (6) ⚡ Prolific Year (8) ❓ The Questioner (3)

Conferences

ACL (11) EMNLP (9) SEMEVAL (9) COLING (7) IJCNLP (7) NAACL (3) AACL (2) EACL (2)

Top co-authors

Ali Hürriyetoğlu (5) Malvina Nissim (5) Michael Granitzer (4) Valerio Basile (4) Jelena Mitrović (4) Chiara Zanchi (3) Nelleke Oostdijk (3) Hylke van der Veen (3) Fiona Anting Tan (3) Hansi Hettiarachchi (3)

Research topics

Applications (1) Linguistics (1) Digital Humanities (1)

Keywords

text classification (17) transfer learning (6) social media (5) hate speech detection (5) multilingual model (4) abusive language detection (4) large language model (4) offensive language detection (4) multilingual nlp (3) bias mitigation (3) sentiment analysis (3) event extraction (3) dutch language (3) pretrained language model (2) few-shot learning (2) domain adaptation (2) pre-trained language model (2) misogyny detection (2) named entity recognition (2) multi-label classification (2)

Papers

Lexical Popularity: Quantifying the Impact of Pre-training for LLM Performance EACL 2026
Simulating Identity, Propagating Bias: Abstraction and Stereotypes in LLM-Generated Text EMNLP 2025
TEXT-CAKE: Challenging Language Models on Local Text Coherence COLING 2025
Learning from Disagreement: Entropy-Guided Few-Shot Selection for Toxic Language Detection ACL 2025
HODIAT: A Dataset for Detecting Homotransphobic Hate Speech in Italian with Aggressiveness and Target Annotation ACL 2025
The “r” in “woman” stands for rights. Auditing LLMs in Uncovering Social Dynamics in Implicit Misogyny EMNLP 2025
Language is Scary when Over-Analyzed: Unpacking Implied Misogynistic Reasoning with Argumentation Theory-Driven Prompts EMNLP 2024
RECESS: Resource for Extracting Cause, Effect, and Signal Spans AACL 2023
SKAM at SemEval-2023 Task 10: Linguistic Feature Integration and Continuous Pretraining for Online Sexism Detection and Classification SEMEVAL 2023
RECESS: Resource for Extracting Cause, Effect, and Signal Spans IJCNLP 2023
Dynamic Stance: Modeling Discussions by Labeling the Interactions EMNLP 2023
WikiBio: a Semantic Resource for the Intersectional Analysis of Biographical Events ACL 2023
SKAM at SemEval-2023 Task 10: Linguistic Feature Integration and Continuous Pretraining for Online Sexism Detection and Classification ACL 2023
Benchmarking Offensive and Abusive Language in Dutch Tweets ACL 2023
Dead or Murdered? Predicting Responsibility Perception in Femicide News Reports AACL 2022
SocioFillmore: A Tool for Discovering Perspectives ACL 2022
How about Time? Probing a Multilingual Language Model for Temporal Relations COLING 2022
Event Causality Identification with Causal News Corpus - Shared Task 3, CASE 2022 EMNLP 2022
Dead or Murdered? Predicting Responsibility Perception in Femicide News Reports IJCNLP 2022
RUG-1-Pegasussers at SemEval-2022 Task 3: Data Generation Methods to Improve Recognizing Appropriate Taxonomic Word Relations NAACL 2022
“Zo Grof !”: A Comprehensive Corpus for Offensive and Abusive Language in Dutch NAACL 2022
RUG-1-Pegasussers at SemEval-2022 Task 3: Data Generation Methods to Improve Recognizing Appropriate Taxonomic Word Relations SEMEVAL 2022
Fighting the COVID-19 Infodemic: Modeling the Perspective of Journalists, Fact-Checkers, Social Media Platforms, Policy Makers, and the Society EMNLP 2021
A Multilingual Approach to Identify and Classify Exceptional Measures against COVID-19 EMNLP 2021
MultiLexNorm: A Shared Task on Multilingual Lexical Normalization EMNLP 2021
DALC: the Dutch Abusive Language Corpus ACL 2021
HateBERT: Retraining BERT for Abusive Language Detection in English ACL 2021
PROTEST-ER: Retraining BERT for Protest Event Extraction IJCNLP 2021
The Corpora They Are a-Changing: a Case Study in Italian Newspapers IJCNLP 2021
Guiding Principles for Participatory Design-inspired Natural Language Processing IJCNLP 2021
HateBERT: Retraining BERT for Abusive Language Detection in English IJCNLP 2021
DALC: the Dutch Abusive Language Corpus IJCNLP 2021
Guiding Principles for Participatory Design-inspired Natural Language Processing ACL 2021
Fighting the COVID-19 Infodemic with a Holistic BERT Ensemble NAACL 2021
PROTEST-ER: Retraining BERT for Protest Event Extraction ACL 2021
The Corpora They Are a-Changing: a Case Study in Italian Newspapers ACL 2021
GruPaTo at SemEval-2020 Task 12: Retraining mBERT on Social Media and Fine-tuned Offensive Language Models COLING 2020
Topic and Emotion Development among Dutch COVID-19 Twitter Communities in the early Pandemic COLING 2020
GruPaTo at SemEval-2020 Task 12: Retraining mBERT on Social Media and Fine-tuned Offensive Language Models SEMEVAL 2020
Crowdsourcing StoryLines: Harnessing the Crowd for Causal Relation Annotation COLING 2018
Proceedings of the Workshop Events and Stories in the News 2018 COLING 2018
The Content Types Dataset: a New Resource to Explore Semantic and Functional Characteristics of Texts EACL 2017
VUACLTL at SemEval 2016 Task 12: A CRF Pipeline to Clinical TempEval SEMEVAL 2016
SPINOZA_VU: An NLP Pipeline for Cross Document TimeLines SEMEVAL 2015
SemEval-2015 Task 9: CLIPEval Implicit Polarity of Events SEMEVAL 2015
FBK-TR: SVM for Semantic Relatedeness and Corpus Patterns for RTE SEMEVAL 2014
Automatic Domain Assignment for Word Sense Alignment EMNLP 2014
FBK-TR: Applying SVM with Multiple Linguistic Features for Cross-Level Semantic Similarity SEMEVAL 2014
Sourcing the Crowd for a Few Good Ones: Event Type Detection COLING 2012
SemEval-2010 Task 13: TempEval-2 SEMEVAL 2010