François Yvon
85 papers · 2005–2026 · 8 top CS/AI conferences
Achievements
🌍 Conference Polyglot (8)
🏃 Academic Marathon (20)
🧭 Keyword Pioneer
🌉 Interdisciplinary Bridge
🐝 Cross-Pollinator (13)
🌈 Renaissance Researcher (9)
🌟 Keyword Trendsetter Combo (4)
🏠 Conference Loyalist (29)
👥 Mega-Team (20)
🤝 Dynamic Duo (14)
🔬 Deep Specialist (24)
🧬 Topic Evolution
🏆 Keyword Champion (2)
📈 Trend Setter
⚡ Prolific Year (5)
🚀 Conference Pioneer
❓ The Questioner (3)
🔥 Unstoppable (12)
💎 Century Club (82)
🗃️ Keyword Collector (294)
Conferences
EMNLP (29)
ACL (17)
COLING (14)
NAACL (11)
EACL (7)
CONLL (3)
INTERSPEECH (3)
NIPS (1)
Keywords
machine translation (16)
neural machine translation (10)
large language model (9)
low-resource language (9)
dependency parsing (6)
multilingual nlp (6)
word alignment (5)
word segmentation (4)
language documentation (4)
domain adaptation (4)
dynamic oracle (3)
part-of-speech tagging (3)
language identification (3)
sequence labeling (3)
conditional random field (3)
in-context learning (3)
text generation (3)
representation learning (3)
parallel corpus (3)
retrieval-augmented generation (3)
Papers
AdaptBPE: From General Purpose to Specialized Tokenizers
EACL 2026
The GDN-CC Dataset: Automatic Corpus Clarification for AI-enhanced Democratic Citizen Consultations
ACL 2026
Polyglots or Multitudes? Multilingual LLM Answers to Value-laden Multiple-Choice Questions
EACL 2026
How Sampling Affects the Detectability of Machine-written texts: A Comprehensive Study
EMNLP 2025
Tracing Multilingual Factual Knowledge Acquisition in Pretraining
EMNLP 2025
An Interdisciplinary Approach to Human-Centered Machine Translation
EMNLP 2025
On Relation-Specific Neurons in Large Language Models
EMNLP 2025
MOSAIC at GENAI Detection Task 3 : Zero-Shot Detection Using an Ensemble of Models
COLING 2025
Understanding In-Context Machine Translation for Low-Resource Languages: A Case Study on Manchu
ACL 2025
MockConf: A Student Interpretation Dataset: Analysis, Word- and Span-level Alignment and Baselines
ACL 2025
MOSAIC: Multiple Observers Spotting AI Content
ACL 2025
How Programming Concepts and Neurons Are Shared in Code Language Models
ACL 2025
MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment
ACL 2025
Prompting LLMs: Length Control for Isometric Machine Translation
ACL 2025
How Transliterations Improve Crosslingual Alignment
COLING 2025
Unlike “Likely”, “Unlike” is Unlikely: BPE-based Segmentation hurts Morphological Derivations in LLMs
COLING 2025
Towards the Machine Translation of Scientific Neologisms
COLING 2025
Self-Retrieval from Distant Contexts for Document-Level Machine Translation
EMNLP 2025
GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages
NIPS 2024
MaskLID: Code-Switching Language Identification through Iterative Masking
ACL 2024
GlotScript: A Resource and Tool for Low Resource Writing System Identification
COLING 2024
Invited Talk: The Way Towards Massively Multilingual Language Models
COLING 2024
Retrieving Examples from Memory for Retrieval Augmented Neural Machine Translation: A Systematic Comparison
NAACL 2024
Towards Multilingual Interlinear Morphological Glossing
EMNLP 2023
Towards Example-Based NMT with Multi-Levenshtein Transformers
EMNLP 2023
Structural generalization in COGS: Supertagging is (almost) all you need
EMNLP 2023
Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages
ACL 2023
Integrating Translation Memories into Non-Autoregressive Machine Translation
EACL 2023
Joint Word and Morpheme Segmentation with Bayesian Non-Parametric Models
EACL 2023
BiSync: A Bilingual Editor for Synchronized Monolingual Texts
ACL 2023
LISN @ SIGMORPHON 2023 Shared Task on Interlinear Glossing
ACL 2023
Assessing Word Importance Using Models Trained for Semantic Tasks
ACL 2023
GlotLID: Language Identification for Low-Resource Languages
EMNLP 2023
Bilingual Synchronization: Restoring Translational Relationships with Editing Operations
EMNLP 2022
Latent Group Dropout for Multilingual and Multidomain Machine Translation
NAACL 2022
Weakly Supervised Word Segmentation for Computational Language Documentation
ACL 2022
Graph Neural Networks for Multiparallel Word Alignment
ACL 2022
Joint Generation of Captions and Subtitles with Dual Decoding
ACL 2022
Analyzing Gender Translation Errors to Identify Information Flows between the Encoder and Decoder of a NMT System
EMNLP 2022
Graph-Based Multilingual Label Propagation for Low-Resource Part-of-Speech Tagging
EMNLP 2022
Screening Gender Transfer in Neural Machine Translation
EMNLP 2021
Graph Algorithms for Multiparallel Word Alignment
EMNLP 2021
One Source, Two Targets: Challenges and Rewards of Dual Decoding
EMNLP 2021
LISN @ WMT 2021
EMNLP 2021
Toward Genre Adapted Closed Captioning
INTERSPEECH 2021
Can You Traducir This? Machine Translation for Code-Switched Input
NAACL 2021
SimAlign: High Quality Word Alignments Without Parallel Training Data Using Static and Contextualized Embeddings
EMNLP 2020
Priming Neural Machine Translation
EMNLP 2020
A Study of Residual Adapters for Multi-Domain Neural Machine Translation
EMNLP 2020
LIMSI @ WMT 2020
EMNLP 2020
How Bad are PoS Tagger in Cross-Corpora Settings? Evaluating Annotation Divergence in the UD Project.
NAACL 2019
Measuring text readability with machine comprehension: a pilot study
ACL 2019
Using Monolingual Data in Neural Machine Translation: a Systematic Study
EMNLP 2018
The WMT’18 Morpheval test suites for English-Czech, English-German, English-Finnish and Turkish-English
EMNLP 2018
Quantifying training challenges of dependency parsers
COLING 2018
Unsupervised Word Segmentation from Speech with Attention
INTERSPEECH 2018
Automatically Selecting the Best Dependency Annotation Design with Dynamic Oracles
NAACL 2018
Exploiting Dynamic Oracles to Train Projective Dependency Parsers on Non-Projective Trees
NAACL 2018
Fixing Translation Divergences in Parallel Corpora for Neural MT
EMNLP 2018
Adaptor Grammars for the Linguist: Word Segmentation Experiments for Very Low-Resource Languages
EMNLP 2018
Don’t Stop Me Now! Using Global Dynamic Oracles to Correct Training Biases of Transition-Based Dependency Parsers
EACL 2017
LIMSI@CoNLL’17: UD Shared Task
CONLL 2017
Learning the Structure of Variable-Order CRFs: a finite-state perspective
EMNLP 2017
TransRead: Designing a Bilingual Reading Experience with Machine Translation Technologies
NAACL 2016
Preliminary Experiments on Unsupervised Word Discovery in Mboshi
INTERSPEECH 2016
Parallel Sentence Compression
COLING 2016
Zero-resource Dependency Parsing: Boosting Delexicalized Cross-lingual Transfer with Linguistic Knowledge
COLING 2016
Frustratingly Easy Cross-Lingual Transfer for Transition-Based Dependency Parsing
NAACL 2016
A Discriminative Training Procedure for Continuous Translation Models
EMNLP 2015
Cross-Lingual Part-of-Speech Tagging through Ambiguous Learning
EMNLP 2014
Computing Lattice BLEU Oracle Scores for Machine Translation
EACL 2012
Measuring the Influence of Long Range Dependencies with Neural Network Language Models
NAACL 2012
Continuous Space Translation Models with Neural Networks
NAACL 2012
Aligning Bilingual Literary Works: a Pilot Study
NAACL 2012
Local lexical adaptation in Machine Translation through triangulation: SMT helping SMT
COLING 2010
Improving Reordering with Linguistically Informed Bilingual n-grams
COLING 2010
Training Continuous Space Language Models: Some Practical Issues
EMNLP 2010
Assessing Phrase-Based Translation Models with Oracle Decoding
EMNLP 2010
Practical Very Large Scale CRFs
ACL 2010
Improvements in Analogical Learning: Application to Translating Multi-Terms of the Medical Domain
EACL 2009
Normalizing SMS: are Two Metaphors Better than One ?
COLING 2008
Robust Similarity Measures for Named Entities Matching
COLING 2008
Using LDA to detect semantically incoherent documents
CONLL 2008
Scaling up Analogical Learning
COLING 2008
An Analogical Learner for Morphological Analysis
CONLL 2005