Alice Oh

76 papers · 2012–2026 · 12 conferences · across top CS/AI conferences

Achievements

🐣 Hot Topic Early Bird 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🌍 Conference Polyglot (11) 🏠 Conference Loyalist (23) 🤝 Dynamic Duo (11) 👥 Mega-Team (51) 🔬 Deep Specialist (17) 🧬 Topic Evolution 🏆 Keyword Champion (5) ⚡ Prolific Year (11) ❓ The Questioner (3) 📈 Trend Setter 💎 Century Club (69) 🚀 Conference Pioneer 🔥 Unstoppable (9) 🗃️ Keyword Collector (295)

Conferences

ACL (25) EMNLP (23) NAACL (8) IJCNLP (4) COLING (3) EACL (3) AACL (2) ICLR (2) ICML (2) NIPS (2) IJCAI (1) SEMEVAL (1)

Top co-authors

Eunsu Kim (13) Haneul Yoo (12) Jiho Jin (11) Yeon Seonwoo (11) JinYeong Bak (10) Sungjoon Park (9) Juhyun Oh (8) Junho Myung (8) Dongkwan Kim (7) Seyoung Song (6)

Research topics

Education (2) Linguistics (1)

Keywords

large language model (22) language model (8) korean language (6) multilingual nlp (6) low-resource language (6) benchmark evaluation (6) cultural knowledge (5) question answering (5) machine translation (5) named entity recognition (4) benchmark dataset (4) historical document (4) variational inference (3) response generation (3) zero-shot learning (3) bias evaluation (3) cross-lingual transfer (3) text classification (3) representation learning (3) semantic similarity (3)

Papers

Are they lovers or friends? Evaluating LLMs’ Social Reasoning in English and Korean Dialogues · ACL 2026
Investigating Counterfactual Unfairness in LLMs towards Identities through Humor · ACL 2026
LoCar: Localization-Aware Evaluation of In-Vehicle Assistants through Fine-Grained Sociolinguistic Control · ACL 2026
FINEST: Improving LLM Responses to Sensitive Topics Through Fine-Grained Evaluation · EACL 2026
OLA: Output Language Alignment in Code-Switched LLM Interactions · ACL 2026
MUG-Eval: A Proxy Evaluation Framework for Multilingual Generation Capabilities in Any Language · EMNLP 2025
Culture is Everywhere: A Call for Intentionally Cultural Evaluation · EMNLP 2025
BLUCK: A Benchmark Dataset for Bengali Linguistic Understanding and Cultural Knowledge · AACL 2025
LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation · ACL 2025
PapersPlease: A Benchmark for Evaluating Motivational Values of Large Language Models Based on ERG Theory · ACL 2025
Team ACK at SemEval-2025 Task 2: Beyond Word-for-Word Machine Translation for English-Korean Pairs · ACL 2025
Team ACK at SemEval-2025 Task 2: Beyond Word-for-Word Machine Translation for English-Korean Pairs · SEMEVAL 2025
WHEN TOM EATS KIMCHI: Evaluating Cultural Awareness of Multimodal Large Language Models in Cultural Mixture Contexts · NAACL 2025
LLM-C3MOD: A Human-LLM Collaborative System for Cross-Cultural Hate Speech Moderation · NAACL 2025
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines · NAACL 2025
BLUCK: A Benchmark Dataset for Bengali Linguistic Understanding and Cultural Knowledge · IJCNLP 2025
Shared Heritage, Distinct Writing: Rethinking Resource Selection for East Asian Historical Documents · IJCNLP 2025
Generalizing Weisfeiler-Lehman Kernels to Subgraphs · ICLR 2025
Uncovering Factor-Level Preference to Improve Human-Model Alignment · EMNLP 2025
Shared Heritage, Distinct Writing: Rethinking Resource Selection for East Asian Historical Documents · AACL 2025
DREsS: Dataset for Rubric-based Essay Scoring on EFL Writing · ACL 2025
Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation · ACL 2025
XDAC: XAI-Driven Detection and Attribution of LLM-Generated News Comments in Korean · ACL 2025
Diffusion Models Through a Global Lens: Are They Culturally Inclusive? · ACL 2025
Code-Switching Curriculum Learning for Multilingual Transfer in LLMs · ACL 2025
Social Bias Benchmark for Generation: A Comparison of Generation and QA-Based Evaluations · ACL 2025
Spotting Out-of-Character Behavior: Atomic-Level Evaluation of Persona Fidelity in Open-Ended Generation · ACL 2025
BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages · NIPS 2024
LLM-as-a-tutor in EFL Writing Education: Focusing on Evaluation of Student-LLM Interaction · EMNLP 2024
Perceptions to Beliefs: Exploring Precursory Inferences for Theory of Mind in Large Language Models · EMNLP 2024
BEnQA: A Question Answering Benchmark for Bengali and English · ACL 2024
Multi-hop Database Reasoning with Virtual Knowledge Graph · ACL 2024
The Generative AI Paradox in Evaluation: “What It Can Solve, It May Not Evaluate” · EACL 2024
Exploring Cross-Cultural Differences in English Hate Speech Annotations: From Dataset Construction to Analysis · NAACL 2024
Translating Subgraphs to Nodes Makes Simple GNNs Strong and Efficient for Subgraph Representation Learning · ICML 2024
Can LLM Generate Culturally Relevant Commonsense QA Data? Case Study in Indonesian and Sundanese · EMNLP 2024
CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean · COLING 2024
RECIPE4U: Student-ChatGPT Interaction Dataset in EFL Writing Education · COLING 2024
Rethinking Annotation: Can Language Learners Contribute? · ACL 2023
SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created through Human-Machine Collaboration · ACL 2023
Towards standardizing Korean Grammatical Error Correction: Datasets and Annotation · ACL 2023
Ranking-Enhanced Unsupervised Sentence Representation Learning · ACL 2023
Hate Speech Classifiers are Culturally Insensitive · EACL 2023
Time-Aware Representation Learning for Time-Sensitive Question Answering · EMNLP 2023
Translating Hanja Historical Documents to Contemporary Korean and English · EMNLP 2022
Why Knowledge Distillation Amplifies Gender Bias and How to Mitigate from the Perspective of DistilBERT · NAACL 2022
HUE: Pretrained Model and Dataset for Understanding Hanja Documents of Ancient Korea · NAACL 2022
KOLD: Korean Offensive Language Dataset · EMNLP 2022
Virtual Knowledge Graph Construction for Zero-Shot Domain-Specific Document Retrieval · COLING 2022
CS1QA: A Dataset for Assisting Code-based Question Answering in an Introductory Programming Course · NAACL 2022
IDK-MRC: Unanswerable Questions for Indonesian Machine Reading Comprehension · EMNLP 2022
Two-Step Question Retrieval for Open-Domain QA · ACL 2022
Weakly Supervised Pre-Training for Multi-Hop Retriever · ACL 2021
Knowledge-Enhanced Evidence Retrieval for Counterargument Generation · EMNLP 2021
Emergent Communication under Varying Sizes and Connectivities · NIPS 2021
Learning Bill Similarity with Annotated and Augmented Corpora of Bills · EMNLP 2021
Dimensional Emotion Detection from Categorical Emotion · EMNLP 2021
Efficient Contrastive Learning via Novel Data Augmentation and Curriculum Learning · EMNLP 2021
How to Find Your Friendly Neighborhood: Graph Attention Design with Self-Supervision · ICLR 2021
Mitigating Language-Dependent Ethnic Bias in BERT · EMNLP 2021
Weakly Supervised Pre-Training for Multi-Hop Retriever · IJCNLP 2021
Context-Aware Answer Extraction in Question Answering · EMNLP 2020
Speaker Sensitive Response Evaluation Model · ACL 2020
Suicidal Risk Detection for Military Personnel · EMNLP 2020
Conversation Model Fine-Tuning for Classifying Client Utterances in Counseling Dialogues · NAACL 2019
Variational Hierarchical User-based Conversation Model · IJCNLP 2019
Additive Compositionality of Word Vectors · EMNLP 2019
Variational Hierarchical User-based Conversation Model · EMNLP 2019
Subword-level Word Vector Representations for Korean · ACL 2018
Conversational Decision-Making Model for Predicting the King’s Decision in the Annals of the Joseon Dynasty · EMNLP 2018
Hierarchical Dirichlet Gaussian Marked Hawkes Process for Narrative Reconstruction in Continuous Time Domain · EMNLP 2018
Rotated Word Vector Representations and their Interpretability · EMNLP 2017
Self-disclosure topic model for classifying and analyzing Twitter conversations · EMNLP 2014
Hierarchical Dirichlet Scaling Process · ICML 2014
Context-Dependent Conceptualization · IJCAI 2013
Self-Disclosure and Relationship Strength in Twitter Conversations · ACL 2012