Alon Lavie

35 papers · 2003–2026 · 7 conferences · across top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (6) 🌍 Conference Polyglot (7) 🏃 Academic Marathon (22) 🗺️ Taxonomy Completionist (31)

🧭 Keyword Pioneer 🐝 Cross-Pollinator (15) 🔬 Deep Specialist (12) 🤝 Dynamic Duo (12) 🧬 Topic Evolution 🔥 Unstoppable (7) 🗃️ Keyword Collector (73) 💎 Century Club (34) ❓ The Questioner (2)

Conferences

EMNLP (16) ACL (9) NAACL (5) EACL (2) COLING (1) CONLL (1) IJCNLP (1)

Top co-authors

Ricardo Rei (12) Craig Stewart (9) Ana C Farinha (8) Chrysoula Zerva (6) Chi-kiu Lo (5) Markus Freitag (5) Chris Dyer (5) Luísa Coheur (5) Tom Kocmi (4) John Mendonça (4)

Keywords

machine translation (12) quality estimation (6) machine translation evaluation (4) translation evaluation (4) human evaluation (4) large language model (3) evaluation metric (3) neural network (3) metric correlation (2) quality metric (2) quality assessment (2) automatic metric (2) conversational ai (2) automatic evaluation (2) ensemble learning (2) statistical significance (2) dialogue evaluation (2) reference-free evaluation (2) neural metric (2) segment-level analysis (2)

Papers

MEDAL: A Framework for Benchmarking LLMs as Multilingual Open-Domain Dialogue Evaluators · EACL 2026
Findings of the WMT25 Shared Task on Automated Translation Evaluation Systems: Linguistic Diversity is Challenging and References Still Help · EMNLP 2025
CUNI and Phrase at WMT25 MT Evaluation Task · EMNLP 2025
On the Benchmarking of LLMs for Open-Domain Dialogue Evaluation · ACL 2024
Are LLMs Breaking MT Metrics? Results of the WMT24 Metrics Shared Task · EMNLP 2024
Soda-Eval: Open-Domain Dialogue Evaluation in the age of LLMs · EMNLP 2024
Dialogue Quality and Emotion Annotations for Customer Support Conversations · EMNLP 2023
The Inside Story: Towards Better Understanding of Machine Translation Neural Evaluation Metrics · ACL 2023
Results of WMT23 Metrics Shared Task: Metrics Might Be Guilty but References Are Not Innocent · EMNLP 2023
CometKiwi: IST-Unbabel 2022 Submission for the Quality Estimation Shared Task · EMNLP 2022
Results of WMT22 Metrics Shared Task: Stop Using BLEU – Neural Metrics Are Better and More Robust · EMNLP 2022
COMET-22: Unbabel-IST 2022 Submission for the Metrics Shared Task · EMNLP 2022
Results of the WMT21 Metrics Shared Task: Evaluating Metrics with Expert-based Human Evaluations on TED and News Domain · EMNLP 2021
MT-Telescope: An interactive platform for contrastive evaluation of MT systems · ACL 2021
Are References Really Needed? Unbabel-IST 2021 Submission for the Metrics Shared Task · EMNLP 2021
MT-Telescope: An interactive platform for contrastive evaluation of MT systems · IJCNLP 2021
Unbabel’s Participation in the WMT20 Metrics Shared Task · EMNLP 2020
COMET: A Neural Framework for MT Evaluation · EMNLP 2020
Synthesizing Compound Words for Machine Translation · ACL 2016
Humor Recognition and Humor Anchor Extraction · EMNLP 2015
Learning from Post-Editing: Online Model Adaptation for Statistical Machine Translation · EACL 2014
Improving Syntax-Augmented Machine Translation by Coarsening the Label Set · NAACL 2013
Grouping Language Model Boundary Words to Speed K–Best Extraction from Hypergraphs · NAACL 2013
Language Model Rest Costs and Space-Efficient Storage · EMNLP 2012
Language Model Rest Costs and Space-Efficient Storage · CONLL 2012
Unsupervised Word Alignment with Arbitrary Features · ACL 2011
Better Hypothesis Testing for Statistical Machine Translation: Controlling for Optimizer Instability · ACL 2011
Extending the METEOR Machine Translation Evaluation Metric to the Phrase Level · NAACL 2010
A Best-First Probabilistic Shift-Reduce Parser · ACL 2006
Parser Combination by Reparsing · NAACL 2006
A Best-First Probabilistic Shift-Reduce Parser · COLING 2006
Multi-Engine Machine Translation Guided by Explicit Word Matching · ACL 2005
Automatic Measurement of Syntactic Development in Child Language · ACL 2005
BLANC: Learning Evaluation Metrics for MT · EMNLP 2005
Speechalator: Two-Way Speech-to-Speech Translation in Your Hand · NAACL 2003