Pinzhen Chen

35 papers · 2020–2026 · 8 top CS/AI conferences

Achievements

🐝 Cross-Pollinator (10) 🌍 Conference Polyglot (8) 🏃 Academic Marathon (5) 🧭 Keyword Pioneer 🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🤝 Dynamic Duo (12) 🏆 Keyword Champion (2) 👥 Mega-Team (35) 🔬 Deep Specialist (18) ❓ The Questioner (4) 💎 Century Club (34) ⚡ Prolific Year (5) 🗃️ Keyword Collector (147) 🔥 Unstoppable (6)

Conferences

EMNLP (16) ACL (5) NAACL (5) COLING (3) EACL (3) AACL (1) IJCNLP (1) SEMEVAL (1)

Top co-authors

Barry Haddow (12) Nikolay Bogoychev (8) Alexandra Birch (7) Zheng Zhao (6) Vivek Iyer (5) Kenneth Heafield (5) Bhavitvya Malik (4) Vilém Zouhar (3) Laurie Burchell (3) Pavel Stepachev (3)

Keywords

machine translation (13) large language model (10) neural machine translation (7) multi-task learning (5) multilingual model (4) reverse dictionary (4) cross-lingual transfer (3) multilingual language model (3) constrained decoding (3) instruction tuning (3) word embedding (3) parallel corpus (3) low-resource translation (3) domain adaptation (3) definition modeling (3) definition generation (3) terminology translation (2) parallel datum (2) translation quality (2) pre-trained language model (2)

Papers

When Flores Bloomz Wrong: Cross-Direction Contamination in Machine Translation Evaluation (EACL 2026)
How Many Languages Make Good Multilingual Instruction Tuning? A Case Study on BLOOM (COLING 2025)
AveniBench: Accessible and Versatile Evaluation of Finance Intelligence (COLING 2025)
Findings of the WMT25 Terminology Translation Task: Terminology is Useful Especially for Good MTs (EMNLP 2025)
An Expanded Massive Multilingual Dataset for High-Performance Language Technologies (HPLT) (ACL 2025)
DocHPLT: A Massively Multilingual Document-Level Translation Dataset (EMNLP 2025)
Fine-Tuning Large Language Models with Sequential Instructions (NAACL 2025)
XL-Suite: Cross-Lingual Synthetic Training and Evaluation Data for Open-Ended Generation (EMNLP 2025)
Findings of the WMT25 Multilingual Instruction Shared Task: Persistent Hurdles in Reasoning, Generation, and Evaluation (EMNLP 2025)
Fine-Tuning Large Language Models to Translate: Will a Touch of Noisy Data in Misaligned Languages Suffice? (EMNLP 2024)
The Ups and Downs of Large Language Model Inference with Vocabulary Trimming by Language Heuristics (NAACL 2024)
Is It Good Data for Multilingual Instruction Tuning or Just Bad Multilingual Evaluation for Large Language Models? (EMNLP 2024)
Pitfalls and Outlooks in Using COMET (EMNLP 2024)
Quality or Quantity? On Data Scale and Diversity in Adapting Large Language Models for Low-Resource Translation (EMNLP 2024)
Cher at KSAA-CAD 2024: Compressing Words and Definitions into the Same Space for Arabic Reverse Dictionary (ACL 2024)
EEE-QA: Exploring Effective and Efficient Question-Answer Representations (COLING 2024)
Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca (EACL 2024)
Exploring Very Low-Resource Translation with LLMs: The University of Edinburgh’s Submission to AmericasNLP 2024 Translation Task (NAACL 2024)
UniArk: Improving Generalisation and Consistency for Factual Knowledge Extraction through Debiasing (NAACL 2024)
Towards Effective Disambiguation for Machine Translation with Large Language Models (EMNLP 2023)
Exploring Data Augmentation for Code Generation Tasks (EACL 2023)
PMIndiaSum: Multilingual and Cross-lingual Headline Summarization for Languages in India (EMNLP 2023)
Terminology-Aware Translation with Constrained Decoding and Large Language Model Prompting (EMNLP 2023)
Edinburgh at SemEval-2022 Task 1: Jointly Fishing for Word Embeddings and Definitions (NAACL 2022)
A Unified Model for Reverse Dictionary and Definition Modelling (AACL 2022)
The University of Edinburgh’s Submission to the WMT22 Code-Mixing Shared Task (MixMT) (EMNLP 2022)
Edinburgh at SemEval-2022 Task 1: Jointly Fishing for Word Embeddings and Definitions (SEMEVAL 2022)
A Unified Model for Reverse Dictionary and Definition Modelling (IJCNLP 2022)
The University of Edinburgh’s English-German and English-Hausa Submissions to the WMT21 News Translation Task (EMNLP 2021)
Efficient Machine Translation with Model Pruning and Quantization (EMNLP 2021)
The University of Edinburgh’s Bengali-Hindi Submissions to the WMT21 News Translation Task (EMNLP 2021)
The Highs and Lows of Simple Lexical Domain Adaptation Approaches for Neural Machine Translation (EMNLP 2021)
ParaCrawl: Web-Scale Acquisition of Parallel Corpora (ACL 2020)
Character Mapping and Ad-hoc Adaptation: Edinburgh’s IWSLT 2020 Open Domain Translation System (ACL 2020)
Parallel Sentence Mining by Constrained Decoding (ACL 2020)