Jey Han Lau

81 papers · 2010–2026 · 11 top CS/AI conferences

Achievements

🌍 Conference Polyglot (11) 🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (12) 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (15) 🏠 Conference Loyalist (23) 🏆 Keyword Champion (3) 🤝 Dynamic Duo (48) 📈 Trend Setter 🔥 Unstoppable (16) 🚀 Conference Pioneer ⚡ Prolific Year (11) 🗃️ Keyword Collector (269) ❓ The Questioner (3) 💎 Century Club (75)

Conferences

ACL (25) COLING (12) EACL (11) NAACL (11) EMNLP (9) IJCNLP (5) AACL (2) IJCAI (2) SEMEVAL (2) AAAI (1) CONLL (1)

Top co-authors

Timothy Baldwin (50) Fajri Koto (14) Yulia Otmakhova (7) David Newman (6) Karin Verspoor (6) Trevor Cohn (6) Paul Cook (5) Kemal Kurniawan (4) Jianzhong Qi (4) Shraey Bhatia (4)

Research topics

Privacy (1)

Keywords

language model (8) text summarization (7) text classification (6) topic model (5) pretrained language model (5) sentiment analysis (4) large language model (4) multi-document summarization (4) low-resource language (4) unsupervised learning (4) neural network (4) natural language processing (3) task-oriented dialogue (3) rumour detection (3) multimodal learning (3) text generation (3) word embedding (3) dialogue generation (3) abstractive summarization (3) zero-shot learning (2)

Papers

On the Interplay between Human Label Variation and Model Fairness EACL 2026
Context Volume Drives Performance: Tackling Domain Shift in Extremely Low-Resource Translation via RAG EACL 2026
Controlling Distributional Bias in Multi-Round LLM Generation via KL-Optimized Fine-Tuning ACL 2026
FLUKE: A Linguistically-Driven and Task-Agnostic Framework for Robustness Evaluation EACL 2026
COMMUNITYNOTES: A Dataset for Exploring the Helpfulness of Fact-Checking Explanations EACL 2026
CIG: Measuring Conversational Information Gain in Deliberative Dialogues with Semantic Memory Dynamics ACL 2026
Beyond Seen Data: Improving KBQA Generalization Through Schema-Guided Logical Form Generation EMNLP 2025
Beyond Perception: Evaluating Abstract Visual Reasoning through Multi-Stage Task ACL 2025
Interaction Matters: An Evaluation Framework for Interactive Dialogue Assessment on English Second Language Conversations COLING 2025
Factual Dialogue Summarization via Learning from Large Language Models COLING 2025
Decomposed Opinion Summarization with Verified Aspect-Aware Modules ACL 2025
Moderation Matters: Measuring Conversational Moderation Impact in English as a Second Language Group Discussion ACL 2025
Can LLMs Simulate L2-English Dialogue? An Information-Theoretic Analysis of L1-Dependent Biases ACL 2025
Evaluating Evidence Attribution in Generated Fact Checking Explanations NAACL 2025
WHoW: A Cross-domain Approach for Analysing Conversation Moderation NAACL 2025
An Interpretable and Crosslingual Method for Evaluating Second-Language Dialogues NAACL 2025
WET: Overcoming Paraphrasing Vulnerabilities in Embeddings-as-a-Service with Linear Transformation Watermarks ACL 2025
To Aggregate or Not to Aggregate. That is the Question: A Case Study on Annotation Subjectivity in Span Prediction ACL 2024
KALE: An Artwork Image Captioning System Augmented with Heterogeneous Graph IJCAI 2024
CMA-R: Causal Mediation Analysis for Explaining Rumour Detection EACL 2024
A Sentiment Consolidation Framework for Meta-Review Generation ACL 2024
Compressed Heterogeneous Graph for Abstractive Multi-Document Summarization AAAI 2023
Annotating and Detecting Fine-grained Factual Errors for Dialogue Summarization ACL 2023
Unsupervised Paraphrasing of Multiword Expressions ACL 2023
NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages EACL 2023
Improving Visual-Semantic Embedding with Adaptive Pooling and Optimization Objective EACL 2023
Unsupervised Lexical Simplification with Context Augmentation EMNLP 2023
Cross-linguistic Comparison of Linguistic Feature Encoding in BERT Models for Typologically Different Languages NAACL 2022
Not another Negation Benchmark: The NaN-NLI Test Suite for Sub-clausal Negation AACL 2022
DUCK: Rumour Detection on Social Media by Modelling User and Comment Propagation Networks NAACL 2022
Not another Negation Benchmark: The NaN-NLI Test Suite for Sub-clausal Negation IJCNLP 2022
Cloze Evaluation for Deeper Understanding of Commonsense Stories in Indonesian ACL 2022
LipKey: A Large-Scale News Dataset for Absent Keyphrases Generation and Abstractive Summarization COLING 2022
Unsupervised Lexical Substitution with Decontextualised Embeddings COLING 2022
Easy-First Bottom-Up Discourse Parsing via Sequence Labelling COLING 2022
LED down the rabbit hole: exploring the potential of global attention for biomedical multi-document summarisation COLING 2022
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia ACL 2022
The patient is more dead than alive: exploring the current state of the multi-document summarisation of the biomedical literature ACL 2022
M3: Multi-level dataset for Multi-document summarisation of Medical studies EMNLP 2022
Can Pretrained Language Models Generate Persuasive, Faithful, and Informative Ad Text for Product Descriptions? ACL 2022
Robust Task-Oriented Dialogue Generation with Contrastive Pre-training and Adversarial Filtering EMNLP 2022
An Interpretable Neuro-Symbolic Reasoning Framework for Task-Oriented Dialogue Generation ACL 2022
Discourse Probing of Pretrained Language Models NAACL 2021
Top-down Discourse Parsing via Sequence Labelling EACL 2021
IndoBERTweet: A Pretrained Language Model for Indonesian Twitter with Effective Domain-Specific Vocabulary Initialization EMNLP 2021
Learning Contextualised Cross-lingual Word Embeddings and Alignments for Extremely Low-Resource Languages Using Parallel Corpora EMNLP 2021
Semi-automatic Triage of Requests for Free Legal Assistance EMNLP 2021
Evaluating the Efficacy of Summarization Evaluation across Languages ACL 2021
UniMF: A Unified Framework to Incorporate Multimodal Knowledge Bases into End-to-End Task-Oriented Dialogue Systems IJCAI 2021
Evaluating the Efficacy of Summarization Evaluation across Languages IJCNLP 2021
Automatic Classification of Neutralization Techniques in the Narrative of Climate Change Scepticism NAACL 2021
Grey-box Adversarial Attack And Defence For Sentiment Classification NAACL 2021
Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis? ACL 2020
IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP COLING 2020
Liputan6: A Large-scale Indonesian Dataset for Text Summarization AACL 2020
Early Rumour Detection NAACL 2019
Preferred Answer Selection in Stack Overflow: Better Text Representations ... and Metadata, Metadata, Metadata EMNLP 2018
Deep-speare: A joint neural model of poetic language, meter and rhyme ACL 2018
The Influence of Context on Sentence Acceptability Judgements ACL 2018
Topic Intrusion for Automatic Topic Model Evaluation EMNLP 2018
Multimodal Topic Labelling EACL 2017
An Automatic Approach for Document-level Topic Model Evaluation CONLL 2017
End-to-end Network for Twitter Geolocation Prediction and Hashing IJCNLP 2017
Topically Driven Neural Language Model ACL 2017
The Sensitivity of Topic Coherence Evaluation to Topic Cardinality NAACL 2016
Automatic Labelling of Topics with Neural Embeddings COLING 2016
LexSemTm: A Semantic Dataset Based on All-words Unsupervised Sense Distribution Learning ACL 2016
Unsupervised Prediction of Acceptability Judgements ACL 2015
Unsupervised Prediction of Acceptability Judgements IJCNLP 2015
Learning Word Sense Distributions, Detecting Unattested Senses and Identifying Novel Senses Using Topic Models ACL 2014
Novel Word-sense Identification COLING 2014
Machine Reading Tea Leaves: Automatically Evaluating Topic Coherence and Topic Model Quality EACL 2014
unimelb: Topic Modelling-based Word Sense Induction SEMEVAL 2013
Unsupervised Word Class Induction for Under-resourced Languages: A Case Study on Indonesian IJCNLP 2013
unimelb: Topic Modelling-based Word Sense Induction for Web Snippet Clustering SEMEVAL 2013
On-line Trend Analysis with Topic Models: #twitter Trends Detection Topic Model Online COLING 2012
Word Sense Induction for Novel Sense Detection EACL 2012
Bayesian Text Segmentation for Index Term Identification and Keyphrase Extraction COLING 2012
Automatic Labelling of Topic Models ACL 2011
Automatic Evaluation of Topic Coherence NAACL 2010
Best Topic Word Selection for Topic Labelling COLING 2010