Samuel Cahyawijaya
69 papers · 2020–2026 · 11 conferences · across top CS/AI conferences
Achievements
Academic Marathon (5)
Conference Polyglot (11)
Keyword Pioneer
Interdisciplinary Bridge
Cross-Pollinator (5)
Renaissance Researcher (10)
Keyword Trendsetter Combo (3)
Dynamic Duo (42)
Mega-Team (92)
Deep Specialist (21)
Keyword Champion (2)
Prolific Year (13)
Conference Pioneer
The Questioner (7)
Keyword Collector (257)
Unstoppable (6)
Century Club (68)
Conferences
ACL (17)
EMNLP (17)
AACL (9)
IJCNLP (7)
NAACL (6)
COLING (4)
AAAI (2)
EACL (2)
INTERSPEECH (2)
NIPS (2)
ICML (1)
Research topics
Keywords
large language model (12)
low-resource language (9)
cross-lingual transfer (7)
multilingual nlp (7)
zero-shot learning (5)
machine translation (5)
multilingual model (5)
few-shot learning (5)
multilingual language model (5)
dialogue system (4)
multimodal learning (4)
named entity recognition (4)
representation learning (3)
transfer learning (3)
reinforcement learning (3)
dialogue generation (3)
visual question answering (3)
text generation (3)
natural language understanding (3)
cross-lingual alignment (3)
Papers
Lost in Simulation: LLM-Simulated Users are Unreliable Proxies for Human Users in Agentic Evaluations
ACL 2026
Shortcut Learning in Safety: The Impact of Keyword Bias in Safeguards
ACL 2025
High-Dimensional Interlingual Representations of Large Language Models
ACL 2025
Towards Efficient and Robust VQA-NLE Data Generation with Large Vision-Language Models
COLING 2025
Linguistic Minimal Pairs Elicit Linguistic Similarity in Large Language Models
COLING 2025
NusaDialogue: Dialogue Summarization and Generation for Underrepresented and Extremely Low-Resource Languages
COLING 2025
Thank You, Stingray: Multilingual Large Language Models Can Not (Yet) Disambiguate Cross-Lingual Word Senses
NAACL 2025
High-Dimension Human Value Representation in Large Language Models
NAACL 2025
Towards Better Understanding of Program-of-Thought Reasoning in Cross-Lingual and Multilingual Environments
ACL 2025
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines
NAACL 2025
Subobject-level Image Tokenization
ICML 2025
Command-A-Translate: Raising the Bar of Machine Translation with Difficulty Filtering
EMNLP 2025
Language Surgery in Multilingual Large Language Models
EMNLP 2025
Entropy2Vec: Crosslingual Language Modeling Entropy as End-to-End Learnable Language Representations
EMNLP 2025
What Makes for Good Image Captions?
EMNLP 2025
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia
ACL 2025
What Causes Knowledge Loss in Multilingual Language Models?
ACL 2025
Negative Object Presence Evaluation (NOPE) to Measure Object Hallucination in Vision-Language Models
ACL 2024
Belief Revision: The Adaptability of Large Language Models Reasoning
EMNLP 2024
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages
EMNLP 2024
LLM Internal States Reveal Hallucination Risk Faced With a Query
EMNLP 2024
LLMs Are Few-Shot In-Context Low-Resource Language Learners
NAACL 2024
LinguAlchemy: Fusing Typological and Geographical Elements for Unseen Language Generalization
EMNLP 2024
CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark
NIPS 2024
Re-Evaluating Evaluation for Multilingual Summarization
EMNLP 2024
Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian Languages
ACL 2024
Prompting Multilingual Large Language Models to Generate Code-Mixed Texts: The Case of South East Asian Languages
EMNLP 2023
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity
AACL 2023
NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages
AACL 2023
PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems
AACL 2023
InstructTODS: Large Language Models for End-to-End Task-Oriented Dialogue Systems
AACL 2023
InstructAlign: High-and-Low Resource Language Alignment via Continual Crosslingual Instruction Tuning
AACL 2023
IndoToD: A Multi-Domain Indonesian Benchmark For End-to-End Task-Oriented Dialogue Systems
AACL 2023
Multi-lingual and Multi-cultural Figurative Language Understanding
ACL 2023
NusaCrowd: Open Source Initiative for Indonesian NLP Resources
ACL 2023
NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages
EACL 2023
Which One Are You Referring To? Multimodal Object Identification in Situated Dialogue
EACL 2023
Multilingual Large Language Models Are Not (Yet) Code-Switchers
EMNLP 2023
GlobalBench: A Benchmark for Global Progress in Natural Language Processing
EMNLP 2023
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity
IJCNLP 2023
NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages
IJCNLP 2023
PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems
IJCNLP 2023
InstructTODS: Large Language Models for End-to-End Task-Oriented Dialogue Systems
IJCNLP 2023
InstructAlign: High-and-Low Resource Language Alignment via Continual Crosslingual Instruction Tuning
IJCNLP 2023
IndoToD: A Multi-Domain Indonesian Benchmark For End-to-End Task-Oriented Dialogue Systems
IJCNLP 2023
Cross-Lingual Cross-Age Adaptation for Low-Resource Elderly Speech Emotion Recognition
INTERSPEECH 2023
Every picture tells a story: Image-grounded controllable stylistic story generation
COLING 2022
Can Question Rewriting Help Conversational Question Answering?
ACL 2022
Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters
ACL 2022
SNP2Vec: Scalable Self-Supervised Pre-Training for Genome-Wide Association Study
ACL 2022
Integrating Question Rewrites in Conversational Question Answering: A Reinforcement Learning Approach
ACL 2022
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia
ACL 2022
VScript: Controllable Script Generation with Visual Presentation
IJCNLP 2022
IndoRobusta: Towards Robustness Against Diverse Code-Mixed Indonesian Local Languages
AACL 2022
VScript: Controllable Script Generation with Visual Presentation
AACL 2022
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code
EMNLP 2022
How Long Is Enough? Exploring the Optimal Intervals of Long-Range Clinical Note Language Modeling
EMNLP 2022
BigBio: A Framework for Data-Centric Biomedical Natural Language Processing
NIPS 2022
Clozer: Adaptable Data Augmentation for Cloze-style Reading Comprehension
ACL 2022
On the Importance of Word Order Information in Cross-lingual Sequence Labeling
AAAI 2021
XPersona: Evaluating Multilingual Personalized Chatbot
EMNLP 2021
IndoNLG: Benchmark and Resources for Evaluating Indonesian Natural Language Generation
EMNLP 2021
Multimodal End-to-End Sparse Model for Emotion Recognition
NAACL 2021
Are Multilingual Models Effective in Code-Switching?
NAACL 2021
CrossNER: Evaluating Cross-Domain Named Entity Recognition
AAAI 2021
Learning Knowledge Bases with Parameters for Task-Oriented Dialogue Systems
EMNLP 2020
Meta-Transfer Learning for Code-Switched Speech Recognition
ACL 2020
IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding
AACL 2020
Learning Fast Adaptation on Cross-Accented Speech Recognition
INTERSPEECH 2020