Carolina Scarton

50 papers · 2010–2025 · 8 top CS/AI conferences

Achievements

🌍 Conference Polyglot (8) 🐣 Hot Topic Early Bird 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (15) 🌟 Keyword Trendsetter Combo (4) 🤝 Dynamic Duo (15) 🔬 Deep Specialist (10) 🧬 Topic Evolution 🏆 Keyword Champion (2) ⚡ Prolific Year (7) 💎 Century Club (50) 🗃️ Keyword Collector (169) ❓ The Questioner 📈 Trend Setter 🔥 Unstoppable (11) 🚀 Conference Pioneer

Conferences

ACL (14) EMNLP (12) SEMEVAL (6) COLING (5) IJCNLP (5) NAACL (4) AACL (2) EACL (2)

Top co-authors

Lucia Specia (15) Kalina Bontcheva (9) Aline Villavicencio (9) Yue Li (7) Marco Idiart (6) Harish Tayyar Madabushi (5) Fernando Alva-Manchego (5) Chenghua Lin (5) Tomas Goldsack (5) Edward Gow-Smith (5)

Keywords

text simplification (8) machine translation (7) text classification (6) zero-shot learning (5) quality estimation (5) large language model (5) multilingual model (4) sentence simplification (4) biomedical text (4) lay summarisation (3) neural machine translation (3) noun compound (3) vector space model (3) transformer encoder (3) evaluation metric (3) multilingual nlp (3) natural language inference (2) stance detection (2) domain adaptation (2) low-resource language (2)

Papers

It’s All About In-Context Learning! Teaching Extremely Low-Resource Languages to LLMs · EMNLP 2025
GateNLP at SemEval-2025 Task 10: Hierarchical Three-Step Prompting for Multilingual Narrative Classification · SEMEVAL 2025
GateNLP at SemEval-2025 Task 10: Hierarchical Three-Step Prompting for Multilingual Narrative Classification · ACL 2025
Label Set Optimization via Activation Distribution Kurtosis for Zero-Shot Classification with Generative Models · EMNLP 2025
Can We Identify Stance without Target Arguments? A Study for Rumour Stance Classification · COLING 2024
Navigating Prompt Complexity for Zero-Shot Classification: A Study of Large Language Models in Computational Social Science · COLING 2024
ATLAS: Improving Lay Summarisation with Attribute-based Control · ACL 2024
Enhancing Idiomatic Representation in Multiple Languages via an Adaptive Contrastive Triplet Loss · ACL 2024
Overview of the BioLaySumm 2024 Shared Task on the Lay Summarization of Biomedical Research Articles · ACL 2024
Word Boundary Information Isn’t Useful for Encoder Language Models · ACL 2024
Reference-less Analysis of Context Specificity in Translation with Personalised Language Models · COLING 2024
SheffieldVeraAI at SemEval-2023 Task 3: Mono and Multilingual Approaches for News Genre, Topic and Persuasion Technique Classification · SEMEVAL 2023
Analysing State-Backed Propaganda Websites: a New Dataset and Linguistic Study · EMNLP 2023
Enhancing Biomedical Lay Summarisation with External Knowledge Graphs · EMNLP 2023
Don’t waste a single annotation: improving single-label classifiers through soft labels · EMNLP 2023
MTCue: Learning Zero-Shot Control of Extra-Textual Attributes by Leveraging Unstructured Context in Neural Machine Translation · ACL 2023
Overview of the BioLaySumm 2023 Shared Task on Lay Summarization of Biomedical Research Articles · ACL 2023
SheffieldVeraAI at SemEval-2023 Task 3: Mono and Multilingual Approaches for News Genre, Topic and Persuasion Technique Classification · ACL 2023
GateNLP-UShef at SemEval-2022 Task 8: Entity-Enriched Siamese Transformer for Multilingual News Article Similarity · NAACL 2022
Controlling Formality in Low-Resource NMT with Domain Adaptation and Re-Ranking: SLT-CDT-UoS at IWSLT2022 · ACL 2022
Making Science Simple: Corpora for the Lay Summarisation of Scientific Literature · EMNLP 2022
Improving Tokenisation by Alternative Treatment of Spaces · EMNLP 2022
SemEval-2022 Task 2: Multilingual Idiomaticity Detection and Sentence Embedding · NAACL 2022
SemEval-2022 Task 2: Multilingual Idiomaticity Detection and Sentence Embedding · SEMEVAL 2022
GateNLP-UShef at SemEval-2022 Task 8: Entity-Enriched Siamese Transformer for Multilingual News Article Similarity · SEMEVAL 2022
Assessing the Representations of Idiomaticity in Vector Models with a Noun Compound Dataset Labeled at Type and Token Levels · IJCNLP 2021
AStitchInLanguageModels: Dataset and Methods for the Exploration of Idiomaticity in Pre-Trained Language Models · EMNLP 2021
Assessing the Representations of Idiomaticity in Vector Models with a Noun Compound Dataset Labeled at Type and Token Levels · ACL 2021
Probing for idiomaticity in vector space models · EACL 2021
Measuring What Counts: The Case of Rumour Stance Classification · AACL 2020
Revisiting Rumour Stance Classification: Dealing with Imbalanced Data · COLING 2020
Toxic Language Detection in Social Media for Brazilian Portuguese: New Dataset and Multilingual Analysis · AACL 2020
ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations · ACL 2020
Cross-Sentence Transformations in Text Simplification · ACL 2019
EASSE: Easier Automatic Sentence Simplification Evaluation · EMNLP 2019
EASSE: Easier Automatic Sentence Simplification Evaluation · IJCNLP 2019
Sheffield Submissions for the WMT18 Quality Estimation Shared Task · EMNLP 2018
Learning Simplifications for Specific Target Audiences · ACL 2018
Exploring gap filling as a cheaper alternative to reading comprehension questionnaires when evaluating machine translation for gisting · EMNLP 2018
Sheffield Submissions for WMT18 Multimodal Translation Shared Task · EMNLP 2018
Learning How to Simplify From Explicit Labeling of Complex-Simplified Text Pairs · IJCNLP 2017
Improving Evaluation of Document-level Machine Translation Quality Estimation · EACL 2017
MUSST: A Multilingual Syntactic Simplification Tool · IJCNLP 2017
SAARSHEFF at SemEval-2016 Task 1: Semantic Textual Similarity with Machine Translation Evaluation Metrics and (eXtreme) Boosted Tree Ensembles · SEMEVAL 2016
Quality Estimation for Language Output Applications · COLING 2016
Multi-level Translation Quality Prediction with QuEst++ · ACL 2015
Discourse and Document-level Information for Evaluating Language Output Tasks · NAACL 2015
Multi-level Translation Quality Prediction with QuEst++ · IJCNLP 2015
USAAR-SHEFFIELD: Semantic Textual Similarity with Deep Regression and Machine Translation Evaluation Metrics · SEMEVAL 2015
SIMPLIFICA: a tool for authoring simplified texts in Brazilian Portuguese guided by readability assessments · NAACL 2010