Marcos Zampieri

109 papers · 2014–2026 · 10 top CS/AI conferences

Achievements

🌍 Conference Polyglot (10) 🐣 Hot Topic Early Bird 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (11) 🤝 Dynamic Duo (26) 👥 Mega-Team (36) 🔬 Deep Specialist (28) 🏆 Keyword Champion (12) 🚀 Conference Pioneer ⚡ Prolific Year (21) 🗃️ Keyword Collector (243) 💎 Century Club (103) 🔥 Unstoppable (8) ❓ The Questioner

Conferences

COLING (19) NAACL (19) EMNLP (16) ACL (13) SEMEVAL (12) IJCNLP (11) EACL (9) AACL (7) AAAI (2) IJCAI (1)

Top co-authors

Tharindu Ranasinghe (27) Kai North (20) Shervin Malmasi (17) Dhiman Goswami (16) Nishat Raihan (15) Preslav Nakov (10) Matthew Shardlow (10) Sadiya Sayara Chowdhury Puspo (9) Antonios Anastasopoulos (9) Md Nishat Raihan (8)

Research topics

Linguistics (2) Education (1)

Keywords

text classification (27) multilingual nlp (16) large language model (13) offensive language detection (12) social media (10) low-resource language (9) natural language processing (9) machine translation (9) transformer model (9) language identification (8) neural machine translation (8) lexical simplification (8) dialect identification (7) shared task (6) lexical complexity (6) social media analysis (6) complexity prediction (6) code generation (5) hate speech detection (5) sentiment analysis (5)

Papers

CodeGuard: Improving LLM Guardrails in CS Education EACL 2026
Grammatical Error Correction for Low-Resource Languages: The Case of Zarma EACL 2026
Large Language Models for Mental Health: A Multilingual Evaluation EACL 2026
Claim Verification in the Age of Large Language Models: A Survey ACL 2026
A Survey on Multilingual Mental Disorders Detection from Social Media Data EACL 2026
Does Machine Translation Impact Offensive Language Identification? The Case of Indo-Aryan Languages COLING 2025
Tracing L1 Interference in English Learner Writing: A Longitudinal Corpus with Error Annotations EMNLP 2025
Exploring the Performance of Large Language Models on Subjective Span Identification Tasks AACL 2025
Datasets for Depression Modeling in Social Media: An Overview NAACL 2025
MojoBench: Language Modeling and Benchmarks for Mojo NAACL 2025
ARTICLE: Annotator Reliability Through In-Context Learning (Student Abstract) AAAI 2025
Overview of BLP-2025 Task 2: Code Generation in Bangla AACL 2025
TigerLLM - A Family of Bangla Large Language Models ACL 2025
Subasa - Adapting Language Models for Low-resourced Offensive Language Detection in Sinhala NAACL 2025
Multilingual Native Language Identification with Large Language Models NAACL 2025
Bayelemabaga: Creating Resources for Bambara NLP NAACL 2025
mHumanEval - A Multilingual Benchmark to Evaluate Large Language Models for Code Generation NAACL 2025
Overview of BLP-2025 Task 2: Code Generation in Bangla IJCNLP 2025
Exploring the Performance of Large Language Models on Subjective Span Identification Tasks IJCNLP 2025
GMU-MU at the Financial Misinformation Detection Challenge Task: Exploring LLMs for Financial Claim Verification COLING 2025
ARTICLE: Annotator Reliability Through In-Context Learning AAAI 2025
MentalHelp: A Multi-Task Dataset for Mental Health in Social Media COLING 2024
Multilingual Resources for Lexical Complexity Prediction: A Review COLING 2024
An Extensible Massively Multilingual Lexical Simplification Pipeline Dataset using the MultiLS Framework COLING 2024
MasonTigers at SemEval-2024 Task 1: An Ensemble Approach for Semantic Textual Relatedness NAACL 2024
MasonTigers at SemEval-2024 Task 9: Solving Puzzles with an Ensemble of Chain-of-Thought Prompts NAACL 2024
Countering Hateful and Offensive Speech Online - Open Challenges EMNLP 2024
Rater Cohesion and Quality from a Vicarious Perspective EMNLP 2024
MultiLS: An End-to-End Lexical Simplification Framework EMNLP 2024
MasonPerplexity at Multimodal Hate Speech Event Detection 2024: Hate Speech and Target Detection Using Transformer Ensembles EACL 2024
MasonTigers at SemEval-2024 Task 1: An Ensemble Approach for Semantic Textual Relatedness SEMEVAL 2024
MasonTigers at SemEval-2024 Task 9: Solving Puzzles with an Ensemble of Chain-of-Thought Prompts SEMEVAL 2024
GMU at MLSP 2024: Multilingual Lexical Simplification with Transformer Models NAACL 2024
The BEA 2024 Shared Task on the Multilingual Lexical Simplification Pipeline NAACL 2024
Native Language Identification in Texts: A Survey NAACL 2024
A Survey of Multimodal Sarcasm Detection IJCAI 2024
A Federated Learning Approach to Privacy Preserving Offensive Language Identification COLING 2024
EmoMix-3L: A Code-Mixed Dataset for Bangla-English-Hindi for Emotion Detection COLING 2024
Language Variety Identification with True Labels COLING 2024
A Text-to-Text Model for Multilingual Offensive Language Identification IJCNLP 2023
A Text-to-Text Model for Multilingual Offensive Language Identification AACL 2023
SentMix-3L: A Novel Code-Mixed Test Dataset in Bangla-English-Hindi for Sentiment Analysis AACL 2023
OffMix-3L: A Novel Code-Mixed Test Dataset in Bangla-English-Hindi for Offensive Language Identification AACL 2023
Target-Based Offensive Language Identification ACL 2023
Teacher and Student Models of Offensive Language in Social Media ACL 2023
ALEXSIS+: Improving Substitute Generation and Selection for Lexical Simplification with Information Retrieval ACL 2023
Findings of the VarDial Evaluation Campaign 2023 EACL 2023
Vicarious Offense and Noise Audit of Offensive Speech Classifiers: Unifying Human and Machine Disagreement on What is Offensive EMNLP 2023
Offensive Language Identification in Transliterated and Code-Mixed Bangla EMNLP 2023
nlpBDpatriots at BLP-2023 Task 1: Two-Step Classification for Violence Inciting Text Detection in Bangla - Leveraging Back-Translation and Multilinguality EMNLP 2023
nlpBDpatriots at BLP-2023 Task 2: A Transfer Learning Approach towards Bangla Sentiment Analysis EMNLP 2023
SentMix-3L: A Novel Code-Mixed Test Dataset in Bangla-English-Hindi for Sentiment Analysis IJCNLP 2023
OffMix-3L: A Novel Code-Mixed Test Dataset in Bangla-English-Hindi for Offensive Language Identification IJCNLP 2023
ALEXSIS-PT: A New Resource for Portuguese Lexical Simplification COLING 2022
Findings of the TSAR-2022 Shared Task on Multilingual Lexical Simplification EMNLP 2022
GMU-WLV at TSAR-2022 Shared Task: Evaluating Lexical Simplification Models EMNLP 2022
An Evaluation of Binary Comparative Lexical Complexity Models NAACL 2022
Handling Extreme Class Imbalance in Technical Logbook Datasets ACL 2021
Comparing Approaches to Dravidian Language Identification EACL 2021
Findings of the VarDial Evaluation Campaign 2021 EACL 2021
Handling Extreme Class Imbalance in Technical Logbook Datasets IJCNLP 2021
SOLID: A Large-Scale Semi-Supervised Dataset for Offensive Language Identification IJCNLP 2021
An Exploratory Analysis of the Relation between Offensive Language and Mental Health IJCNLP 2021
SemEval-2021 Task 1: Lexical Complexity Prediction IJCNLP 2021
LCP-RIT at SemEval-2021 Task 1: Exploring Linguistic Features for Lexical Complexity Prediction IJCNLP 2021
WLV-RIT at SemEval-2021 Task 5: A Neural Transformer Framework for Detecting Toxic Spans IJCNLP 2021
MUDES: Multilingual Detection of Offensive Spans NAACL 2021
WLV-RIT at SemEval-2021 Task 5: A Neural Transformer Framework for Detecting Toxic Spans ACL 2021
LCP-RIT at SemEval-2021 Task 1: Exploring Linguistic Features for Lexical Complexity Prediction ACL 2021
SemEval-2021 Task 1: Lexical Complexity Prediction SEMEVAL 2021
LCP-RIT at SemEval-2021 Task 1: Exploring Linguistic Features for Lexical Complexity Prediction SEMEVAL 2021
WLV-RIT at SemEval-2021 Task 5: A Neural Transformer Framework for Detecting Toxic Spans SEMEVAL 2021
SemEval-2021 Task 1: Lexical Complexity Prediction ACL 2021
An Exploratory Analysis of the Relation between Offensive Language and Mental Health ACL 2021
fBERT: A Neural Transformer for Identifying Offensive Content EMNLP 2021
A Computational Exploration of Pejorative Language in Social Media EMNLP 2021
Findings of the 2021 Conference on Machine Translation (WMT21) EMNLP 2021
SOLID: A Large-Scale Semi-Supervised Dataset for Offensive Language Identification ACL 2021
Multilingual Offensive Language Identification with Cross-lingual Embeddings EMNLP 2020
Neural Machine Translation for Similar Languages: The Case of Indo-Aryan Languages EMNLP 2020
SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020) COLING 2020
MaintNet: A Collaborative Open-Source Library for Predictive Maintenance Language Resources COLING 2020
SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020) SEMEVAL 2020
A Report on the VarDial Evaluation Campaign 2020 COLING 2020
Findings of the 2020 Conference on Machine Translation (WMT20) EMNLP 2020
NLP Tools for Predictive Maintenance Records in MaintNet AACL 2020
Neural Machine Translation for Extremely Low-Resource African Languages: A Case Study on Bambara AACL 2020
Experiments in Cuneiform Language Identification NAACL 2019
Findings of the 2019 Conference on Machine Translation (WMT19) ACL 2019
UDS–DFKI Submission to the WMT2019 Czech–Polish Similar Language Translation Shared Task ACL 2019
SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval) SEMEVAL 2019
UTFPR at SemEval-2019 Task 5: Hate Speech Identification with Recurrent Neural Networks SEMEVAL 2019
Predicting the Type and Target of Offensive Posts in Social Media NAACL 2019
Proceedings of the Sixth Workshop on NLP for Similar Languages, Varieties and Dialects NAACL 2019
A Report on the Third VarDial Evaluation Campaign NAACL 2019
Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-2018) COLING 2018
Discriminating between Indo-Aryan Languages Using SVM Ensembles COLING 2018
A Neural Approach to Language Variety Translation COLING 2018
A Report on the Complex Word Identification Shared Task 2018 NAACL 2018
Benchmarking Aggression Identification in Social Media COLING 2018
Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018) COLING 2018
Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign COLING 2018
A Portuguese Native Language Identification Dataset NAACL 2018
LTG at SemEval-2016 Task 11: Complex Word Identification with Classifier Ensembles SEMEVAL 2016
CATaLog Online: A Web-based CAT Tool for Distributed Translation with Data Capture for APE and Translation Process Research COLING 2016
MAZA at SemEval-2016 Task 11: Detecting Lexical Complexity Using a Decision Stump Meta-Classifier SEMEVAL 2016
MacSaar at SemEval-2016 Task 11: Zipfian and Character Features for ComplexWord Identification SEMEVAL 2016
AMBRA: A Ranking Approach to Temporal Text Classification SEMEVAL 2015
Temporal Text Ranking and Automatic Dating of Texts EACL 2014