Philipp Koehn

149 papers · 2001–2025 · 10 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (16) 🌈 Renaissance Researcher (5) 🐣 Hot Topic Early Bird

🐣 Hot Topic Early Bird 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌟 Keyword Trendsetter Combo (3) 🏠 Conference Loyalist (67) 🐺 Lone Wolf (6) 👥 Mega-Team (36) 🏆 Keyword Champion (12) 🤝 Dynamic Duo (17) 🔬 Deep Specialist (11) 🔥 Unstoppable (12) ❓ The Questioner (3) 📈 Trend Setter 🗃️ Keyword Collector (59) 💎 Century Club (149) 🚀 Conference Pioneer ⚡ Prolific Year (9)

Conferences

EMNLP (67) ACL (37) NAACL (13) IJCNLP (10) EACL (9) CONLL (6) ICLR (3) COLING (2) AACL (1) NIPS (1)

Top co-authors

Barry Haddow (17) Christian Federmann (15) Ondřej Bojar (14) Matt Post (14) Christof Monz (13) Kevin Duh (13) Francisco Guzmán (13) Vishrav Chaudhary (12) Huda Khayrallah (12) Yvette Graham (12)

Research topics

Keywords

machine translation (45) neural machine translation (28) low-resource language (17) parallel corpus (12) domain adaptation (10) transfer learning (9) translation quality (9) human evaluation (8) continued training (6) parallel corpus filtering (6) document-level translation (5) low-resource translation (5) corpus filtering (5) large language model (5) translation evaluation (4) sentence alignment (4) multilingual translation (4) news translation (4) transformer architecture (3) multilingual model (3)

Papers

Streaming Sequence Transduction through Dynamic Compression ACL 2025
X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale ICLR 2025
Findings of the WMT 2025 Shared Task of the Open Language Data Initiative EMNLP 2025
Findings of the WMT25 Multilingual Instruction Shared Task: Persistent Hurdles in Reasoning, Generation, and Evaluation EMNLP 2025
Findings of the WMT25 General Machine Translation Shared Task: Time to Stop Evaluating on Easy Test Sets EMNLP 2025
HiMATE: A Hierarchical Multi-Agent Framework for Machine Translation Evaluation EMNLP 2025
Seeing is Believing: Emotion-Aware Audio-Visual Language Modeling for Expressive Speech Generation EMNLP 2025
Speech Vecalign: an Embedding-based Method for Aligning Parallel Speech Documents EMNLP 2025
Learn and Unlearn: Addressing Misinformation in Multilingual LLMs EMNLP 2025
DiffNorm: Self-Supervised Normalization for Non-autoregressive Speech-to-speech Translation NIPS 2024
The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts ACL 2024
Recovering document annotations for sentence-level bitext ACL 2024
Speech Data from Radio Broadcasts for Low Resource Languages ACL 2024
Findings of the WMT24 General Machine Translation Shared Task: The LLM Era Is Here but MT Is Not Solved Yet EMNLP 2024
Findings of the WMT 2024 Shared Task of the Open Language Data Initiative EMNLP 2024
Findings of the WMT 2024 Shared Task on Discourse-Level Literary Translation EMNLP 2024
Benchmarking Visually-Situated Translation of Text in Natural Images EMNLP 2024
Neural Methods for Aligning Large-Scale Parallel Corpora from the Web for South and East Asian Languages EMNLP 2024
Error Norm Truncation: Robust Training in the Presence of Data Noise for Text Generation Models ICLR 2024
Where are you from? Geolocating Speech and Applications to Language Identification NAACL 2024
Narrowing the Gap between Zero- and Few-shot Machine Translation by Matching Styles NAACL 2024
Pointer-Generator Networks for Low-Resource Machine Translation: Don’t Copy That! NAACL 2024
Findings of the Word-Level AutoCompletion Shared Task in WMT 2023 EMNLP 2023
Findings of the WMT 2023 Shared Task on Parallel Data Curation EMNLP 2023
Findings of the WMT 2023 Shared Task on Discourse-Level Literary Translation: A Fresh Orb in the Cosmos of LLMs EMNLP 2023
Findings of the 2023 Conference on Machine Translation (WMT23): LLMs Are Here but Not Quite There Yet EMNLP 2023
Multilingual Representation Distillation with Contrastive Learning EACL 2023
Small Data, Big Impact: Leveraging Minimal Data for Effective Machine Translation ACL 2023
Condensing Multilingual Knowledge with Lightweight Language-Specific Modules EMNLP 2023
Multilingual Pixel Representations for Translation and Effective Cross-lingual Transfer EMNLP 2023
Machine Translation with Large Language Models: Prompting, Few-shot Learning, and Fine-tuning with QLoRA EMNLP 2023
Bilingual Lexicon Induction for Low-Resource Languages using Graph Matching via Optimal Transport EMNLP 2022
Alternative Input Signals Ease Transfer in Multilingual Machine Translation ACL 2022
Learn To Remember: Transformer with Recurrent Memory for Document-Level Machine Translation NAACL 2022
Contrastive Clustering to Mine Pseudo Parallel Data for Unsupervised Translation ICLR 2022
The Importance of Being Parameters: An Intra-Distillation Method for Serious Gains EMNLP 2022
Toward the Limitation of Code-Switching in Cross-Lingual Transfer EMNLP 2022
IsoVec: Controlling the Relative Isomorphism of Word Embedding Spaces EMNLP 2022
Data Selection Curriculum for Neural Machine Translation EMNLP 2022
Findings of the 2022 Conference on Machine Translation (WMT22) EMNLP 2022
Findings of the Word-Level AutoCompletion Shared Task in WMT 2022 EMNLP 2022
Findings of the 2021 Conference on Machine Translation (WMT21) EMNLP 2021
Adapting High-resource NMT Models to Translate Low-resource Related Languages without Parallel Data ACL 2021
Adapting High-resource NMT Models to Translate Low-resource Related Languages without Parallel Data IJCNLP 2021
Evaluating Saliency Methods for Neural Language Models NAACL 2021
Zero-Shot Cross-Lingual Dependency Parsing through Contextual Embedding Transformation EACL 2021
Learning Feature Weights using Reward Modeling for Denoising Parallel Corpora EMNLP 2021
The JHU-Microsoft Submission for WMT21 Quality Estimation Shared Task EMNLP 2021
Findings of the WMT Shared Task on Machine Translation Using Terminologies EMNLP 2021
Facebook AI’s WMT21 News Translation Task Submission EMNLP 2021
An Analysis of Euclidean vs. Graph-Based Framing for Bilingual Lexicon Induction from Word Embedding Spaces EMNLP 2021
XLEnt: Mining a Large Cross-lingual Entity Dataset with Lexical-Semantic-Phonetic Word Alignment EMNLP 2021
Levenshtein Training for Word-level Quality Estimation EMNLP 2021
Findings of the WMT 2020 Shared Task on Machine Translation Robustness EMNLP 2020
An exploratory approach to the Parallel Corpus Filtering shared task WMT20 EMNLP 2020
When Does Unsupervised Machine Translation Work? EMNLP 2020
Dual Conditional Cross Entropy Scores and LASER Similarity Scores for the WMT20 Parallel Corpus Filtering Shared Task EMNLP 2020
Findings of the WMT 2020 Shared Task on Parallel Corpus Filtering and Alignment EMNLP 2020
SimulMT to SimulST: Adapting Simultaneous Text Translation to End-to-End Simultaneous Speech Translation AACL 2020
Statistical Power and Translationese in Machine Translation Evaluation EMNLP 2020
ParaCrawl: Web-Scale Acquisition of Parallel Corpora ACL 2020
Simulated multiple reference training improves low-resource machine translation EMNLP 2020
CCAligned: A Massive Collection of Cross-Lingual Web-Document Pairs EMNLP 2020
Exploiting Sentence Order in Document Alignment EMNLP 2020
TICO-19: the Translation Initiative for COvid-19 EMNLP 2020
Findings of the 2020 Conference on Machine Translation (WMT20) EMNLP 2020
Simple Construction of Mixed-Language Texts for Vocabulary Learning ACL 2019
Saliency-driven Word Alignment Interpretation for Neural Machine Translation ACL 2019
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1) ACL 2019
Findings of the 2019 Conference on Machine Translation (WMT19) ACL 2019
Findings of the First Shared Task on Machine Translation Robustness ACL 2019
Johns Hopkins University Submission for WMT News Translation Task ACL 2019
Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2) ACL 2019
Findings of the WMT 2019 Shared Task on Parallel Corpus Filtering for Low-Resource Conditions ACL 2019
Low-Resource Corpus Filtering Using Multilingual Sentence Embeddings ACL 2019
Vecalign: Improved Sentence Alignment in Linear Time and Space EMNLP 2019
HABLex: Human Annotated Bilingual Lexicons for Experiments in Machine Translation EMNLP 2019
The FLORES Evaluation Datasets for Low-Resource Machine Translation: Nepali–English and Sinhala–English EMNLP 2019
Spelling-Aware Construction of Macaronic Texts for Teaching Foreign-Language Vocabulary EMNLP 2019
Parallelizable Stack Long Short-Term Memory NAACL 2019
Overcoming Catastrophic Forgetting During Domain Adaptation of Neural Machine Translation NAACL 2019
Vecalign: Improved Sentence Alignment in Linear Time and Space IJCNLP 2019
HABLex: Human Annotated Bilingual Lexicons for Experiments in Machine Translation IJCNLP 2019
The FLORES Evaluation Datasets for Low-Resource Machine Translation: Nepali–English and Sinhala–English IJCNLP 2019
Spelling-Aware Construction of Macaronic Texts for Teaching Foreign-Language Vocabulary IJCNLP 2019
De-Mixing Sentiment from Code-Mixed Text ACL 2019
Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers) ACL 2019
The JHU Machine Translation Systems for WMT 2018 EMNLP 2018
Document-Level Adaptation for Neural Machine Translation ACL 2018
Regularized Training Objective for Continued Training for Domain Adaptation in Neural Machine Translation ACL 2018
Iterative Back-Translation for Neural Machine Translation ACL 2018
Proceedings of the Third Conference on Machine Translation: Research Papers EMNLP 2018
Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation EMNLP 2018
On the Impact of Various Types of Noise on Neural Machine Translation ACL 2018
Context and Copying in Neural Machine Translation EMNLP 2018
The JHU Parallel Corpus Filtering Systems for WMT 2018 EMNLP 2018
Findings of the WMT 2018 Shared Task on Parallel Corpus Filtering EMNLP 2018
Proceedings of the Third Conference on Machine Translation: Shared Task Papers EMNLP 2018
Findings of the 2018 Conference on Machine Translation (WMT18) EMNLP 2018
Neural Lattice Search for Domain Adaptation in Machine Translation IJCNLP 2017
CADET: Computer Assisted Discovery Extraction and Translation IJCNLP 2017
Knowledge Tracing in Sequential Learning of Inflected Vocabulary CONLL 2017
Zipporah: a Fast and Scalable Data Cleaning System for Noisy Web-Crawled Parallel Corpora EMNLP 2017
Analyzing Learner Understanding of Novel L2 Vocabulary CONLL 2016
User Modeling in Language Learning with Macaronic Texts ACL 2016
Creating Interactive Macaronic Interfaces for Language Learning ACL 2016
Computer Aided Translation ACL 2016
Syntax-Based Statistical Machine Translation EMNLP 2014
The MateCat Tool COLING 2014
Integrating an Unsupervised Transliteration Model into Statistical Machine Translation EACL 2014
Investigating the Usefulness of Generalized Word Representations in SMT COLING 2014
Dynamic Topic Adaptation for Phrase-based MT EACL 2014
CASMACAT: A Computer-assisted Translation Workbench EACL 2014
Refinements to Interactive Translation Prediction Based on Search Graphs ACL 2014
Can Markov Models Over Minimal Translation Units Help Phrase-Based SMT? ACL 2013
Dirt Cheap Web-Scale Parallel Text from the Common Crawl ACL 2013
Scalable Modified Kneser-Ney Language Model Estimation ACL 2013
Grouping Language Model Boundary Words to Speed K–Best Extraction from Hypergraphs NAACL 2013
Learning to Prune: Context-Sensitive Pruning for Syntactic MT ACL 2013
Language Model Rest Costs and Space-Efficient Storage CONLL 2012
Language Model Rest Costs and Space-Efficient Storage EMNLP 2012
Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation EMNLP 2011
Enabling Monolingual Translators: Post-Editing vs. Options NAACL 2010
A Web-Based Interactive Computer Aided Translation Tool ACL 2009
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing EMNLP 2009
Word Lattices for Multi-Source Translation EACL 2009
Improving Mid-Range Re-Ordering Using Templates of Factors EACL 2009
Monte Carlo inference and maximization for phrase-based translation CONLL 2009
Topics in Statistical Machine Translation ACL 2009
A Web-Based Interactive Computer Aided Translation Tool IJCNLP 2009
Topics in Statistical Machine Translation IJCNLP 2009
Large and Diverse Language Models for Statistical Machine Translation IJCNLP 2008
Enriching Morphologically Poor Languages for Statistical Machine Translation ACL 2008
Predicting Success in Machine Translation EMNLP 2008
Factored Translation Models EMNLP 2007
Chinese Syntactic Reordering for Statistical Machine Translation EMNLP 2007
Factored Translation Models CONLL 2007
Chinese Syntactic Reordering for Statistical Machine Translation CONLL 2007
Moses: Open Source Toolkit for Statistical Machine Translation ACL 2007
Improved Statistical Machine Translation Using Paraphrases NAACL 2006
Re-evaluating the Role of Bleu in Machine Translation Research EACL 2006
Clause Restructuring for Statistical Machine Translation ACL 2005
Statistical Significance Tests for Machine Translation Evaluation EMNLP 2004
Feature-Rich Statistical Translation of Noun Phrases ACL 2003
Statistical Phrase-Based Translation NAACL 2003
Desparately Seeking Cebuano NAACL 2003
What’s New in Statistical Machine Translation NAACL 2003
Empirical Methods for Compound Splitting EACL 2003
Knowledge Sources for Word-Level Translation Models EMNLP 2001