Aloka Fernando
4 papers · 2024–2025 · 3 conferences · across top CS/AI conferences
Achievements
Conference Polyglot (3)
Interdisciplinary Bridge
Taxonomy Completionist (10)
Keyword Pioneer
Cross-Pollinator (14)
Conferences
EMNLP (2)
ACL (1)
EACL (1)
Top co-authors
Keywords
low-resource language (3)
neural machine translation (3)
data curation (2)
parallel corpus (2)
multilingual language model (1)
corpus filtering (1)
open science (1)
nlp research (1)
data deduplication (1)
corpus mining (1)
statistical method (1)
statistical filtration (1)
jensen shannon divergence (1)
constrained training (1)
statistical filtering (1)
artifact reuse (1)
data filtration (1)
debiasing heuristics (1)
language identification (1)
jensen-shannon divergence (1)
Papers
Improving the Quality of Web-mined Parallel Corpora of Low-Resource Languages using Debiasing Heuristics
EMNLP 2025
Shoulders of Giants: A Look at the Degree and Utility of Openness in NLP Research
ACL 2024
Quality Does Matter: A Detailed Look at the Quality and Utility of Web-Mined Parallel Corpora
EACL 2024
Back to the Stats: Rescuing Low Resource Neural Machine Translation with Statistical Methods
EMNLP 2024