Roy Ka-Wei Lee

39 papers · 2020–2026 · 10 top CS/AI conferences

Achievements

🏃 Academic Marathon (5) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (8) 🐝 Cross-Pollinator (12)

🐣 Hot Topic Early Bird 🧬 Topic Evolution 🔥 Unstoppable (6) ⚡ Prolific Year (12) 💎 Century Club (31) ❓ The Questioner 📈 Trend Setter 🗃️ Keyword Collector (160)

Conferences

EMNLP (14) ACL (8) AAAI (4) IJCAI (4) NAACL (3) EACL (2) COLING (1) ICLR (1) ICML (1) IJCNLP (1)

Top co-authors

Ming Shan Hee (7) Zhengyuan Liu (6) Bryan Chen Zhengyu Tan (5) Yujia Hu (5) Kenny Tsu Wei Choo (5) Nancy F. Chen (5) Rui Cao (4) Zhiqiang Hu (4) Nirmalendu Prakash (4) Lei Wang (3)

Keywords

large language model (13) hate speech detection (9) multimodal learning (8) content moderation (4) low-resource language (3) text classification (3) machine translation (3) multilingual nlp (3) orthogonality constraint (2) numerical stability (2) retrieval-augmented generation (2) sparse autoencoder (2) few-shot learning (2) refusal behavior (2) large multimodal model (2) cross-lingual transfer (2) offensive language detection (2) domain adaptation (2) in-context learning (2) vision-language model (2)

Papers

AdaMCoT: Rethinking Cross-Lingual Factual Reasoning Through Adaptive Multilingual Chain-of-Thought · AAAI 2026
BLEnD-Vis: Benchmarking Multimodal Cultural Understanding in Vision Language Models · EACL 2026
HateXScore: A Metric Suite for Evaluating Reasoning Quality in Hate Speech Explanations · EACL 2026
SafeLens: Segment-Level Hate Speech Detection in Online Videos · AAAI 2026
Can Persona-Prompted LLMs Emulate Subgroup Values? An Empirical Analysis of Generalisability and Fairness in Cultural Alignment · ACL 2026
MMAC: A Multilingual, Multimodal Alignment Framework for Cultural Grounding Evaluation · ACL 2026
Multi-Agent VLMs Guided Self-Training with PNU Loss for Low-Resource Offensive Content Detection · AAAI 2026
Beyond I’m Sorry, I Can’t: Dissecting Large-Language-Model Refusal · AAAI 2026
Understanding Refusal in Language Models with Sparse Autoencoders · EMNLP 2025
Finding the Sweet Spot: Preference Data Construction for Scaling Preference Optimization · ACL 2025
Is LLM an Overconfident Judge? Unveiling the Capabilities of LLMs in Detecting Offensive Language with Annotation Disagreement · ACL 2025
Persuasion Dynamics in LLMs: Investigating Robustness and Adaptability in Knowledge and Safety with DuET-PD · EMNLP 2025
Toxicity Red-Teaming: Benchmarking LLM Safety in Singapore’s Low-Resource Languages · EMNLP 2025
LionGuard 2: Building Lightweight, Data-Efficient & Localised Multilingual Content Moderators · EMNLP 2025
CCL-XCoT: An Efficient Cross-Lingual Knowledge Transfer Method for Mitigating Hallucination Generation · EMNLP 2025
Humor in Pixels: Benchmarking Large Multimodal Models Understanding of Online Comics · EMNLP 2025
LongGenBench: Benchmarking Long-Form Generation in Long Context LLMs · ICLR 2025
A Closer Look at Backdoor Attacks on CLIP · ICML 2025
Resolving Conflicting Evidence in Automated Fact-Checking: A Study on Retrieval-Augmented LLMs · IJCAI 2025
Unmasking Implicit Bias: Evaluating Persona-Prompted LLM Responses in Power-Disparate Social Scenarios · NAACL 2025
Recent Advances in Online Hate Speech Moderation: Multimodality and the Role of Large Models · EMNLP 2024
Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models · EMNLP 2024
Improving Covert Toxicity Detection by Retrieving and Generating References · NAACL 2024
SGHateCheck: Functional Tests for Detecting Hate Speech in Low-Resource Languages of Singapore · NAACL 2024
ToxiCloakCN: Evaluating Robustness of Offensive Language Detection in Chinese with Cloaking Perturbations · EMNLP 2024
Bridging Modalities: Enhancing Cross-Modality Hate Speech Detection with Few-Shot In-Context Learning · EMNLP 2024
Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models · ACL 2023
Layered Bias: Interpreting Bias in Pretrained Large Language Models · EMNLP 2023
Improving the Detection of Multilingual Online Attacks with Rich Social Media Data from Singapore · ACL 2023
Decoding the Underlying Meaning of Multimodal Hateful Memes · IJCAI 2023
Evaluating GPT-3 Generated Explanations for Hateful Content Moderation · IJCAI 2023
Prompting for Multimodal Hateful Meme Classification · EMNLP 2022
On Orthogonality Constraints for Transformers · ACL 2021
Improving Text Auto-Completion with Next Phrase Prediction · EMNLP 2021
On Orthogonality Constraints for Transformers · IJCNLP 2021
Graph-to-Tree Learning for Solving Math Word Problems · ACL 2020
HateGAN: Adversarial Generative-Based Data Augmentation for Hate Speech Detection · COLING 2020
Teacher-Student Networks with Multiple Decoders for Solving Math Word Problem · IJCAI 2020
I miss you babe: Analyzing Emotion Dynamics During COVID-19 Pandemic · EMNLP 2020