Alice Oh
76 papers · 2012–2026 · 12 conferences · across top CS/AI conferences
Achievements
Hot Topic Early Bird
Keyword Pioneer
Interdisciplinary Bridge
Renaissance Researcher (5)
Conference Polyglot (11)
Conference Loyalist (23)
Dynamic Duo (11)
Mega-Team (51)
Deep Specialist (17)
Topic Evolution
Keyword Champion (5)
Prolific Year (11)
The Questioner (3)
Trend Setter
Century Club (69)
Conference Pioneer
Unstoppable (9)
Keyword Collector (295)
Conferences
ACL (25)
EMNLP (23)
NAACL (8)
IJCNLP (4)
COLING (3)
EACL (3)
AACL (2)
ICLR (2)
ICML (2)
NIPS (2)
IJCAI (1)
SEMEVAL (1)
Research topics
Keywords
large language model (22)
language model (8)
korean language (6)
multilingual nlp (6)
low-resource language (6)
benchmark evaluation (6)
cultural knowledge (5)
question answering (5)
machine translation (5)
named entity recognition (4)
benchmark dataset (4)
historical document (4)
variational inference (3)
response generation (3)
zero-shot learning (3)
bias evaluation (3)
cross-lingual transfer (3)
text classification (3)
representation learning (3)
semantic similarity (3)
Papers
Are they lovers or friends? Evaluating LLMs' Social Reasoning in English and Korean Dialogues
ACL 2026
Investigating Counterfactual Unfairness in LLMs towards Identities through Humor
ACL 2026
LoCar: Localization-Aware Evaluation of In-Vehicle Assistants through Fine-Grained Sociolinguistic Control
ACL 2026
FINEST: Improving LLM Responses to Sensitive Topics Through Fine-Grained Evaluation
EACL 2026
OLA: Output Language Alignment in Code-Switched LLM Interactions
ACL 2026
MUG-Eval: A Proxy Evaluation Framework for Multilingual Generation Capabilities in Any Language
EMNLP 2025
Culture is Everywhere: A Call for Intentionally Cultural Evaluation
EMNLP 2025
BLUCK: A Benchmark Dataset for Bengali Linguistic Understanding and Cultural Knowledge
AACL 2025
LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation
ACL 2025
PapersPlease: A Benchmark for Evaluating Motivational Values of Large Language Models Based on ERG Theory
ACL 2025
Team ACK at SemEval-2025 Task 2: Beyond Word-for-Word Machine Translation for English-Korean Pairs
ACL 2025
Team ACK at SemEval-2025 Task 2: Beyond Word-for-Word Machine Translation for English-Korean Pairs
SEMEVAL 2025
WHEN TOM EATS KIMCHI: Evaluating Cultural Awareness of Multimodal Large Language Models in Cultural Mixture Contexts
NAACL 2025
LLM-C3MOD: A Human-LLM Collaborative System for Cross-Cultural Hate Speech Moderation
NAACL 2025
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines
NAACL 2025
BLUCK: A Benchmark Dataset for Bengali Linguistic Understanding and Cultural Knowledge
IJCNLP 2025
Shared Heritage, Distinct Writing: Rethinking Resource Selection for East Asian Historical Documents
IJCNLP 2025
Generalizing Weisfeiler-Lehman Kernels to Subgraphs
ICLR 2025
Uncovering Factor-Level Preference to Improve Human-Model Alignment
EMNLP 2025
Shared Heritage, Distinct Writing: Rethinking Resource Selection for East Asian Historical Documents
AACL 2025
DREsS: Dataset for Rubric-based Essay Scoring on EFL Writing
ACL 2025
Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation
ACL 2025
XDAC: XAI-Driven Detection and Attribution of LLM-Generated News Comments in Korean
ACL 2025
Diffusion Models Through a Global Lens: Are They Culturally Inclusive?
ACL 2025
Code-Switching Curriculum Learning for Multilingual Transfer in LLMs
ACL 2025
Social Bias Benchmark for Generation: A Comparison of Generation and QA-Based Evaluations
ACL 2025
Spotting Out-of-Character Behavior: Atomic-Level Evaluation of Persona Fidelity in Open-Ended Generation
ACL 2025
BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages
NIPS 2024
LLM-as-a-tutor in EFL Writing Education: Focusing on Evaluation of Student-LLM Interaction
EMNLP 2024
Perceptions to Beliefs: Exploring Precursory Inferences for Theory of Mind in Large Language Models
EMNLP 2024
BEnQA: A Question Answering Benchmark for Bengali and English
ACL 2024
Multi-hop Database Reasoning with Virtual Knowledge Graph
ACL 2024
The Generative AI Paradox in Evaluation: "What It Can Solve, It May Not Evaluate"
EACL 2024
Exploring Cross-Cultural Differences in English Hate Speech Annotations: From Dataset Construction to Analysis
NAACL 2024
Translating Subgraphs to Nodes Makes Simple GNNs Strong and Efficient for Subgraph Representation Learning
ICML 2024
Can LLM Generate Culturally Relevant Commonsense QA Data? Case Study in Indonesian and Sundanese
EMNLP 2024
CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean
COLING 2024
RECIPE4U: Student-ChatGPT Interaction Dataset in EFL Writing Education
COLING 2024
Rethinking Annotation: Can Language Learners Contribute?
ACL 2023
SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created through Human-Machine Collaboration
ACL 2023
Towards standardizing Korean Grammatical Error Correction: Datasets and Annotation
ACL 2023
Ranking-Enhanced Unsupervised Sentence Representation Learning
ACL 2023
Hate Speech Classifiers are Culturally Insensitive
EACL 2023
Time-Aware Representation Learning for Time-Sensitive Question Answering
EMNLP 2023
Translating Hanja Historical Documents to Contemporary Korean and English
EMNLP 2022
Why Knowledge Distillation Amplifies Gender Bias and How to Mitigate from the Perspective of DistilBERT
NAACL 2022
HUE: Pretrained Model and Dataset for Understanding Hanja Documents of Ancient Korea
NAACL 2022
KOLD: Korean Offensive Language Dataset
EMNLP 2022
Virtual Knowledge Graph Construction for Zero-Shot Domain-Specific Document Retrieval
COLING 2022
CS1QA: A Dataset for Assisting Code-based Question Answering in an Introductory Programming Course
NAACL 2022
IDK-MRC: Unanswerable Questions for Indonesian Machine Reading Comprehension
EMNLP 2022
Two-Step Question Retrieval for Open-Domain QA
ACL 2022
Weakly Supervised Pre-Training for Multi-Hop Retriever
ACL 2021
Knowledge-Enhanced Evidence Retrieval for Counterargument Generation
EMNLP 2021
Emergent Communication under Varying Sizes and Connectivities
NIPS 2021
Learning Bill Similarity with Annotated and Augmented Corpora of Bills
EMNLP 2021
Dimensional Emotion Detection from Categorical Emotion
EMNLP 2021
Efficient Contrastive Learning via Novel Data Augmentation and Curriculum Learning
EMNLP 2021
How to Find Your Friendly Neighborhood: Graph Attention Design with Self-Supervision
ICLR 2021
Mitigating Language-Dependent Ethnic Bias in BERT
EMNLP 2021
Weakly Supervised Pre-Training for Multi-Hop Retriever
IJCNLP 2021
Context-Aware Answer Extraction in Question Answering
EMNLP 2020
Speaker Sensitive Response Evaluation Model
ACL 2020
Suicidal Risk Detection for Military Personnel
EMNLP 2020
Conversation Model Fine-Tuning for Classifying Client Utterances in Counseling Dialogues
NAACL 2019
Variational Hierarchical User-based Conversation Model
IJCNLP 2019
Additive Compositionality of Word Vectors
EMNLP 2019
Variational Hierarchical User-based Conversation Model
EMNLP 2019
Subword-level Word Vector Representations for Korean
ACL 2018
Conversational Decision-Making Model for Predicting the King's Decision in the Annals of the Joseon Dynasty
EMNLP 2018
Hierarchical Dirichlet Gaussian Marked Hawkes Process for Narrative Reconstruction in Continuous Time Domain
EMNLP 2018
Rotated Word Vector Representations and their Interpretability
EMNLP 2017
Self-disclosure topic model for classifying and analyzing Twitter conversations
EMNLP 2014
Hierarchical Dirichlet Scaling Process
ICML 2014
Context-Dependent Conceptualization
IJCAI 2013
Self-Disclosure and Relationship Strength in Twitter Conversations
ACL 2012