Holger Schwenk
46 papers · 2005–2025 · 13 conferences · across top CS/AI conferences
Achievements
Renaissance Researcher (7)
Interdisciplinary Bridge
Conference Polyglot (13)
Academic Marathon (20)
Taxonomy Completionist (55)
Keyword Pioneer
Hot Topic Early Bird
Deep Specialist (14)
Topic Evolution
Keyword Champion (6)
Century Club (46)
Prolific Year (7)
Keyword Collector (133)
Unstoppable (9)
Trend Setter
Conference Pioneer
Conferences
ACL (14)
EMNLP (9)
EACL (5)
IJCNLP (5)
COLING (3)
NAACL (3)
AAAI (1)
CONLL (1)
CVPR (1)
ICLR (1)
INTERSPEECH (1)
JMLR (1)
NIPS (1)
Keywords
machine translation (11)
low-resource language (7)
multilingual sentence embedding (6)
bitext mining (6)
speech-to-speech translation (6)
multilingual translation (4)
transfer learning (4)
sentence embedding (3)
speech translation (3)
sentence encoder (3)
neural machine translation (3)
cross-lingual retrieval (3)
sentence representation (2)
cross-lingual transfer (2)
natural language inference (2)
cosine similarity (2)
sentence alignment (2)
knowledge distillation (2)
zero-shot learning (2)
end-to-end translation (2)
Papers
LCFO: Long Context and Long Form Output Dataset and Benchmarking
ACL 2025
Aligning Speech Segments Beyond Pure Semantics
ACL 2024
Modular Speech-to-Text Translation for Zero-Shot Cross-Modal Transfer
INTERSPEECH 2023
BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric
ACL 2023
SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations
ACL 2023
xSIM++: An Improved Proxy to Bitext Mining Performance for Low-Resource Languages
ACL 2023
Speech-to-Speech Translation for a Real-world Unwritten Language
ACL 2023
Multilingual Representation Distillation with Contrastive Learning
EACL 2023
DiffEdit: Diffusion-based semantic image editing with mask guidance
ICLR 2023
Textless Speech-to-Speech Translation on Real Data
NAACL 2022
Findings of the WMT'22 Shared Task on Large-Scale Machine Translation Evaluation for African Languages
EMNLP 2022
Bitext Mining Using Distilled Sentence Representations for Low-Resource Languages
EMNLP 2022
stopes - Modular Machine Translation Pipelines
EMNLP 2022
T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation
EMNLP 2022
FlexIT: Towards Flexible Semantic Image Translation
CVPR 2022
Multimodal and Multilingual Embeddings for Large-Scale Speech Mining
NIPS 2021
CCMatrix: Mining Billions of High-Quality Parallel Sentences on the Web
ACL 2021
FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task
ACL 2021
WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia
EACL 2021
CCMatrix: Mining Billions of High-Quality Parallel Sentences on the Web
IJCNLP 2021
FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task
IJCNLP 2021
Beyond English-Centric Multilingual Machine Translation
JMLR 2021
MLQA: Evaluating Cross-lingual Extractive Question Answering
ACL 2020
Margin-based Parallel Corpus Mining with Multilingual Sentence Embeddings
ACL 2019
Low-Resource Corpus Filtering Using Multilingual Sentence Embeddings
ACL 2019
Analysis of Joint Multilingual Sentence Representations and Semantic K-Nearest Neighbor Graphs
AAAI 2019
XNLI: Evaluating Cross-lingual Sentence Representations
EMNLP 2018
Filtering and Mining Parallel Data in a Joint Multilingual Space
ACL 2018
Supervised Learning of Universal Sentence Representations from Natural Language Inference Data
EMNLP 2017
Very Deep Convolutional Networks for Text Classification
EACL 2017
Continuous Adaptation to User Feedback for Statistical Machine Translation
NAACL 2015
Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation
EMNLP 2014
The MateCat Tool
COLING 2014
A Multi-Domain Translation Model Framework for Statistical Machine Translation
ACL 2013
Multimodal Comparable Corpora as Resources for Extracting Parallel Data: Parallel Phrases Extraction
IJCNLP 2013
Collaborative Machine Translation Service for Scientific texts
EACL 2012
Large, Pruned or Continuous Space Language Models on a GPU for Statistical Machine Translation
NAACL 2012
Continuous Space Translation Models for Phrase-Based Statistical Machine Translation
COLING 2012
Parametric Weighting of Parallel Data for Statistical Machine Translation
IJCNLP 2011
On the Use of Comparable Corpora to Improve SMT performance
EACL 2009
Large and Diverse Language Models for Statistical Machine Translation
IJCNLP 2008
Smooth Bilingual N-Gram Translation
EMNLP 2007
Smooth Bilingual N-Gram Translation
CONLL 2007
Continuous Space Language Models for Statistical Machine Translation
ACL 2006
Continuous Space Language Models for Statistical Machine Translation
COLING 2006
Training Neural Network Language Models on Very Large Corpora
EMNLP 2005