Ona de Gibert
15 papers · 2018–2025 · 6 conferences · across top CS/AI conferences
Achievements
🐝 Cross-Pollinator (14)
🌉 Interdisciplinary Bridge
🏃 Academic Marathon (7)
🌍 Conference Polyglot (6)
🌈 Renaissance Researcher (5)
🗺️ Taxonomy Completionist (36)
🧭 Keyword Pioneer
🤝 Dynamic Duo (11)
🔬 Deep Specialist (11)
👥 Mega-Team (35)
⚡ Prolific Year (8)
🗃️ Keyword Collector (64)
💎 Century Club (15)
Conferences
EMNLP (6)
ACL (4)
NAACL (2)
COLING (1)
EACL (1)
SEMEVAL (1)
Keywords
machine translation (10)
low-resource language (6)
large language model (4)
text classification (3)
parallel corpus (3)
knowledge distillation (3)
overgeneration mistake (2)
multilingual corpus (2)
data augmentation (2)
shared task (2)
multilingual model (2)
multilingual translation (2)
indigenous language (2)
language model (2)
benchmark evaluation (2)
hallucination detection (2)
multilingual nlp (2)
instruction-tuned model (2)
language modeling (1)
model evaluation (1)
Papers
Scaling Low-Resource MT via Synthetic Data Generation with LLMs
EMNLP 2025
GlotEval: A Test Suite for Massively Multilingual Evaluation of Large Language Models
EMNLP 2025
DocHPLT: A Massively Multilingual Document-Level Translation Dataset
EMNLP 2025
SemEval-2025 Task 3: Mu-SHROOM, the Multilingual Shared-task on Hallucinations and Related Observable Overgeneration Mistakes
ACL 2025
SemEval-2025 Task 3: Mu-SHROOM, the Multilingual Shared-task on Hallucinations and Related Observable Overgeneration Mistakes
SEMEVAL 2025
An Expanded Massive Multilingual Dataset for High-Performance Language Technologies (HPLT)
ACL 2025
Findings of the AmericasNLP 2025 Shared Tasks on Machine Translation, Creation of Educational Material, and Translation Metrics for Indigenous Languages of the Americas
NAACL 2025
EdinHelsOW WMT 2025 CreoleMT System Description: Improving Lusophone Creole Translation through Data Augmentation, Model Merging and LLM Post-editing
EMNLP 2025
Hybrid Distillation from RBMT and NMT: Helsinki-NLP’s Submission to the Shared Task on Translation into Low-Resource Languages of Spain
EMNLP 2024
A New Massive Multilingual Dataset for High-Performance Language Technologies
COLING 2024
MAMMOTH: Massively Multilingual Modular Open Translation @ Helsinki
EACL 2024
Findings of the AmericasNLP 2024 Shared Task on Machine Translation into Indigenous Languages
NAACL 2024
Four Approaches to Low-Resource Multilingual NMT: The Helsinki Submission to the AmericasNLP 2023 Shared Task
ACL 2023
The OPUS-MT Dashboard – A Toolkit for a Systematic Evaluation of Open Machine Translation Models
ACL 2023
Hate Speech Dataset from a White Supremacy Forum
EMNLP 2018