Matthew Theodore Roque
5 papers · 2024–2025 · 2 conferences · across top CS/AI conferences
Achievements
Renaissance Researcher (6)
Interdisciplinary Bridge
Keyword Pioneer
Conference Polyglot (2)
Taxonomy Completionist (16)
Cross-Pollinator (14)
Mega-Team (92)
The Questioner
Conferences
EMNLP (4)
ACL (1)
Keywords
model scaling (2)
machine translation (2)
sample efficiency (1)
transfer learning (1)
curriculum learning (1)
dataset creation (1)
data augmentation (1)
multimodal learning (1)
text complexity (1)
multilingual pretraining (1)
text simplification (1)
low-resource language (1)
vision language model (1)
vision-language model (1)
data synthesis (1)
sequence-to-sequence model (1)
sequence to sequence (1)
multilingual model (1)
low-resource translation (1)
back translation (1)
Papers
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia
ACL 2025
Rethinking the Role of Text Complexity in Language Model Pretraining
EMNLP 2025
Beyond Repetition: Text Simplification and Curriculum Learning for Data-Constrained Pretraining
EMNLP 2025
Scaling, Simplification, and Adaptation: Lessons from Pretraining on Machine-Translated Text
EMNLP 2025
Samsung R&D Institute Philippines @ WMT 2024 Indic MT Task
EMNLP 2024