conftrace_

Samuel Cahyawijaya

69 papers · 2020–2026 · 11 top CS/AI conferences

Achievements



🏃 Academic Marathon (5) 🌍 Conference Polyglot (11) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (5)

🧭 Keyword Pioneer 🌈 Renaissance Researcher (10) 🌍 Conference Polyglot (11) 🌟 Keyword Trendsetter Combo (3) 🤝 Dynamic Duo (42) 👥 Mega-Team (92) 🔬 Deep Specialist (21) 🏆 Keyword Champion (2) ⚡ Prolific Year (13) 🚀 Conference Pioneer ❓ The Questioner (7) 🗃️ Keyword Collector (257) 🔥 Unstoppable (6) 💎 Century Club (68)

Conferences

ACL (17) EMNLP (17) AACL (9) IJCNLP (7) NAACL (6) COLING (4) AAAI (2) EACL (2) INTERSPEECH (2) NIPS (2) ICML (1)

Top co-authors

Pascale Fung (42) Holy Lovenia (25) Genta Indra Winata (23) Bryan Wilie (23) Alham Fikri Aji (20) Ayu Purwarianti (15) Yan Xu (13) Willy Chung (11) Zihan Liu (11) Fajri Koto (11)

Research topics

Digital Humanities (1)

Keywords

large language model (12) low-resource language (9) cross-lingual transfer (7) multilingual nlp (7) zero-shot learning (5) machine translation (5) multilingual model (5) few-shot learning (5) multilingual language model (5) dialogue system (4) multimodal learning (4) named entity recognition (4) representation learning (3) transfer learning (3) reinforcement learning (3) dialogue generation (3) visual question answering (3) text generation (3) natural language understanding (3) cross-lingual alignment (3)

Papers

Lost in Simulation: LLM-Simulated Users are Unreliable Proxies for Human Users in Agentic Evaluations · ACL 2026
Shortcut Learning in Safety: The Impact of Keyword Bias in Safeguards · ACL 2025
High-Dimensional Interlingual Representations of Large Language Models · ACL 2025
Towards Efficient and Robust VQA-NLE Data Generation with Large Vision-Language Models · COLING 2025
Linguistic Minimal Pairs Elicit Linguistic Similarity in Large Language Models · COLING 2025
NusaDialogue: Dialogue Summarization and Generation for Underrepresented and Extremely Low-Resource Languages · COLING 2025
Thank You, Stingray: Multilingual Large Language Models Can Not (Yet) Disambiguate Cross-Lingual Word Senses · NAACL 2025
High-Dimension Human Value Representation in Large Language Models · NAACL 2025
Towards Better Understanding of Program-of-Thought Reasoning in Cross-Lingual and Multilingual Environments · ACL 2025
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines · NAACL 2025
Subobject-level Image Tokenization · ICML 2025
Command-A-Translate: Raising the Bar of Machine Translation with Difficulty Filtering · EMNLP 2025
Language Surgery in Multilingual Large Language Models · EMNLP 2025
Entropy2Vec: Crosslingual Language Modeling Entropy as End-to-End Learnable Language Representations · EMNLP 2025
What Makes for Good Image Captions? · EMNLP 2025
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia · ACL 2025
What Causes Knowledge Loss in Multilingual Language Models? · ACL 2025
Negative Object Presence Evaluation (NOPE) to Measure Object Hallucination in Vision-Language Models · ACL 2024
Belief Revision: The Adaptability of Large Language Models Reasoning · EMNLP 2024
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages · EMNLP 2024
LLM Internal States Reveal Hallucination Risk Faced With a Query · EMNLP 2024
LLMs Are Few-Shot In-Context Low-Resource Language Learners · NAACL 2024
LinguAlchemy: Fusing Typological and Geographical Elements for Unseen Language Generalization · EMNLP 2024
CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark · NIPS 2024
Re-Evaluating Evaluation for Multilingual Summarization · EMNLP 2024
Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian Languages · ACL 2024
Prompting Multilingual Large Language Models to Generate Code-Mixed Texts: The Case of South East Asian Languages · EMNLP 2023
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity · AACL 2023
NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages · AACL 2023
PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems · AACL 2023
InstructTODS: Large Language Models for End-to-End Task-Oriented Dialogue Systems · AACL 2023
InstructAlign: High-and-Low Resource Language Alignment via Continual Crosslingual Instruction Tuning · AACL 2023
IndoToD: A Multi-Domain Indonesian Benchmark For End-to-End Task-Oriented Dialogue Systems · AACL 2023
Multi-lingual and Multi-cultural Figurative Language Understanding · ACL 2023
NusaCrowd: Open Source Initiative for Indonesian NLP Resources · ACL 2023
NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages · EACL 2023
Which One Are You Referring To? Multimodal Object Identification in Situated Dialogue · EACL 2023
Multilingual Large Language Models Are Not (Yet) Code-Switchers · EMNLP 2023
GlobalBench: A Benchmark for Global Progress in Natural Language Processing · EMNLP 2023
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity · IJCNLP 2023
NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages · IJCNLP 2023
PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems · IJCNLP 2023
InstructTODS: Large Language Models for End-to-End Task-Oriented Dialogue Systems · IJCNLP 2023
InstructAlign: High-and-Low Resource Language Alignment via Continual Crosslingual Instruction Tuning · IJCNLP 2023
IndoToD: A Multi-Domain Indonesian Benchmark For End-to-End Task-Oriented Dialogue Systems · IJCNLP 2023
Cross-Lingual Cross-Age Adaptation for Low-Resource Elderly Speech Emotion Recognition · INTERSPEECH 2023
Every picture tells a story: Image-grounded controllable stylistic story generation · COLING 2022
Can Question Rewriting Help Conversational Question Answering? · ACL 2022
Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters · ACL 2022
SNP2Vec: Scalable Self-Supervised Pre-Training for Genome-Wide Association Study · ACL 2022
Integrating Question Rewrites in Conversational Question Answering: A Reinforcement Learning Approach · ACL 2022
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia · ACL 2022
VScript: Controllable Script Generation with Visual Presentation · IJCNLP 2022
IndoRobusta: Towards Robustness Against Diverse Code-Mixed Indonesian Local Languages · AACL 2022
VScript: Controllable Script Generation with Visual Presentation · AACL 2022
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code · EMNLP 2022
How Long Is Enough? Exploring the Optimal Intervals of Long-Range Clinical Note Language Modeling · EMNLP 2022
BigBio: A Framework for Data-Centric Biomedical Natural Language Processing · NIPS 2022
Clozer: Adaptable Data Augmentation for Cloze-style Reading Comprehension · ACL 2022
On the Importance of Word Order Information in Cross-lingual Sequence Labeling · AAAI 2021
XPersona: Evaluating Multilingual Personalized Chatbot · EMNLP 2021
IndoNLG: Benchmark and Resources for Evaluating Indonesian Natural Language Generation · EMNLP 2021
Multimodal End-to-End Sparse Model for Emotion Recognition · NAACL 2021
Are Multilingual Models Effective in Code-Switching? · NAACL 2021
CrossNER: Evaluating Cross-Domain Named Entity Recognition · AAAI 2021
Learning Knowledge Bases with Parameters for Task-Oriented Dialogue Systems · EMNLP 2020
Meta-Transfer Learning for Code-Switched Speech Recognition · ACL 2020
IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding · AACL 2020
Learning Fast Adaptation on Cross-Accented Speech Recognition · INTERSPEECH 2020