Menan Velayuthan
5 papers · 2024–2025 · across 4 top CS/AI conferences
Achievements
Conference Polyglot (4)
Interdisciplinary Bridge
Taxonomy Completionist (11)
Keyword Pioneer
Cross-Pollinator (15)
Conferences
EMNLP (2)
COLING (1)
EACL (1)
NAACL (1)
Top co-authors
Keywords
neural machine translation (4)
low-resource language (3)
parallel corpus (2)
data curation (2)
language identification (1)
language model (1)
jensen-shannon divergence (1)
low-resource translation (1)
corpus quality (1)
multilingual language model (1)
corpus filtering (1)
sequence-level training (1)
byte pair encoding (1)
sequence-level distillation (1)
encoder alignment (1)
data deduplication (1)
corpus mining (1)
statistical method (1)
statistical filtration (1)
jensen shannon divergence (1)
Papers
Egalitarian Language Representation in Language Models: It All Begins with Tokenizers
COLING 2025
Improving the Quality of Web-mined Parallel Corpora of Low-Resource Languages using Debiasing Heuristics
EMNLP 2025
Encoder-Aware Sequence-Level Knowledge Distillation for Low-Resource Neural Machine Translation
NAACL 2025
Quality Does Matter: A Detailed Look at the Quality and Utility of Web-Mined Parallel Corpora
EACL 2024
Back to the Stats: Rescuing Low Resource Neural Machine Translation with Statistical Methods
EMNLP 2024