conftrace_

Bertie Vidgen

30 papers · 2019–2025 · 10 top CS/AI conferences

Achievements

🌈 Renaissance Researcher (6) 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (6) 🌍 Conference Polyglot (10) 🗺️ Taxonomy Completionist (36) 🐝 Cross-Pollinator (15) 🔬 Deep Specialist (10) 👥 Mega-Team (71) 🤝 Dynamic Duo (13) 🧬 Topic Evolution 🏆 Keyword Champion (2) ❓ The Questioner 💎 Century Club (30) 🗃️ Keyword Collector (112) ⚡ Prolific Year (11) 🔥 Unstoppable (7)

Conferences

ACL (7) EMNLP (6) NAACL (6) IJCNLP (4) ICML (2) AAAI (1) COLING (1) EACL (1) NIPS (1) SEMEVAL (1)

Papers

LMUNIT: Fine-grained Evaluation with Natural Language Unit Tests · EMNLP 2025
SafetyPrompts: A Systematic Review of Open Datasets for Evaluating and Improving Large Language Model Safety · AAAI 2025
The PRISM Alignment Dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models · NIPS 2024
XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models · NAACL 2024
Position: TrustLLM: Trustworthiness in Large Language Models · ICML 2024
Position: Near to Mid-term Risks and Opportunities of Open-Source Generative AI · ICML 2024
SemEval-2023 Task 10: Explainable Detection of Online Sexism · SEMEVAL 2023
Improving the Detection of Multilingual Online Attacks with Rich Social Media Data from Singapore · ACL 2023
SemEval-2023 Task 10: Explainable Detection of Online Sexism · ACL 2023
The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values · EMNLP 2023
Two Contrasting Data Annotation Paradigms for Subjective NLP Tasks · NAACL 2022
Is More Data Better? Re-thinking the Importance of Efficiency in Abusive Language Detection with Transformers-Based Active Learning · COLING 2022
Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models · NAACL 2022
Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-Based Hate · NAACL 2022
Handling and Presenting Harmful Text in NLP Research · EMNLP 2022
An Expert Annotated Dataset for the Detection of Online Misogyny · EACL 2021
Deciphering Implicit Hate: Evaluating Automated Detection Algorithms for Multimodal Hate · IJCNLP 2021
Findings of the WOAH 5 Shared Task on Fine Grained Hateful Memes Detection · IJCNLP 2021
Introducing CAD: the Contextual Abuse Dataset · NAACL 2021
Dynabench: Rethinking Benchmarking in NLP · NAACL 2021
Learning from the Worst: Dynamically Generated Datasets to Improve Online Hate Detection · IJCNLP 2021
Findings of the WOAH 5 Shared Task on Fine Grained Hateful Memes Detection · ACL 2021
Deciphering Implicit Hate: Evaluating Automated Detection Algorithms for Multimodal Hate · ACL 2021
Learning from the Worst: Dynamically Generated Datasets to Improve Online Hate Detection · ACL 2021
HateCheck: Functional Tests for Hate Speech Detection Models · IJCNLP 2021
HateCheck: Functional Tests for Hate Speech Detection Models · ACL 2021
Recalibrating classifiers for interpretable abusive content detection · EMNLP 2020
Detecting East Asian Prejudice on Social Media · EMNLP 2020
Online Abuse and Human Rights: WOAH Satellite Session at RightsCon 2020 · EMNLP 2020
Challenges and frontiers in abusive content detection · ACL 2019