Research Explorer

RiddleBench: A New Generative Reasoning Benchmark for LLMs

Deepon Halder, Alan Saji, Thanmay Jayakumar et al.

2026 EACL

ExpressivityBench: Can LLMs Communicate Implicitly?

Joshua Tint, Som Sagar, Aditya Taparia et al.

2026 EACL

SpARK: An Embarrassingly Simple Sparse Watermarking in LLMs with Enhanced Text Quality

Duy Cao Hoang, Thanh Quoc Hung Le, Rui Chu et al.

2026 EACL

UniToolBench: A Benchmark for Tool-Augmented LLMs in Cross-Domain, Universal Task Automation

Xiaojie Guo, Yang Zhang, Bing Zhang et al.

2026 EACL

Thunder-NUBench: A Benchmark for LLMs’ Sentence-Level Negation Understanding

Yeonkyoung So, Gyuseong Lee, Sungmok Jung et al.

2026 EACL

What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance

William Watson, Nicole Cho, Sumitra Ganesh et al.

2026 EACL

Program-of-Thought Reveals LLM Abstraction Ceilings

Mike Zhou, Fenil Bardoliya, Vivek Gupta et al.

2026 EACL

Show or Tell? Modeling the evolution of request-making in Human-LLM conversations

Shengqi Zhu, Jeffrey Rzeszotarski, David Mimno

2026 EACL

Sparse Brains are Also Adaptive Brains: Cognitive-Load-Aware Dynamic Activation for LLMs

Yiheng Yang, Yujie Wang, Chi Ma et al.

2026 EACL

DuFFin: A Dual-Level Fingerprinting Framework for LLMs IP Protection

Yuliang Yan, Haochun Tang, Shuo Yan et al.

2026 EACL

Let’s Simplify Step by Step: Guiding LLM Towards Multilingual Unsupervised Proficiency-Controlled Sentence Simplification

Jingshen Zhang, Xin Ying Qiu, Lifang Lu et al.

2026 EACL

LogToP: Logic Tree-of-Program with Table Instruction-tuned LLMs for Controlled Logical Table-to-Text Generation

Yupian Lin, Guangya Yu, Cheng Yuan et al.

2026 EACL

DFPE: A Diverse Fingerprint Ensemble for Enhancing LLM Performance

Seffi Cohen, Nurit Cohen Inger, Niv Goldshlager et al.

2026 EACL

Ranking Human and LLM Texts Using Locality Statistics

Yiyang Wang, Chen Ding, Hangfeng He

2026 EACL

Improving the OOD Performance of Closed-Source LLMs on NLI Through Strategic Data Selection

Joe Stacey, Lisa Alazraki, Aran Ubhi et al.

2026 EACL

Do LLMs model human linguistic variation? A case study in Hindi-English Verb code-mixing

Mukund Choudhary, Madhur Jindal, Gaurja Aeron et al.

2026 EACL

FactSelfCheck: Fact-Level Black-Box Hallucination Detection for LLMs

Albert Sawczyn, Jakub Binkowski, Denis Janiak et al.

2026 EACL

What Matters to an LLM? Behavioral and Computational Evidences from Summarization

Yongxin Zhou, Changshun Wu, Philippe Mulhem et al.

2026 EACL

Better Call CLAUSE: A Discrepancy Benchmark for Auditing LLMs Legal Reasoning Capabilities

Manan Roy Choudhury, Adithya Chandramouli, Mannan Anand et al.

2026 EACL

Beyond a Single Extractor: Re-thinking HTML-to-Text Extraction for LLM Pre-training

Jeffrey Li, Joshua P Gardner, Doug Kang et al.

2026 EACL

Can Models Help Us Create Better Models? Evaluating LLMs as Data Scientists

Michał Pietruszka, Łukasz Borchmann, Aleksander Jędrosz et al.

2026 EACL

Argument-Based Consistency in Toxicity Explanations of LLMs

Ramaravind Kommiya Mothilal, Joanna Roy, Syed Ishtiaque Ahmed et al.

2026 EACL

Quantifying Data Contamination in Psychometric Evaluations of LLMs

Jongwook Han, Woojung Song, Jonggeun Lee et al.

2026 EACL

How to Contextualize Empirical Data for Risk Analysis with LLMs: A Case Study of Power Outages

Haiyun Huang, Yukun Li, Marco A Pretell et al.

2026 EACL

Thinking Beyond the Local: Multi-View Instructed Adaptive Reasoning in KG-Enhanced LLMs

Minghan Zhang, Shu Zhao, Zhen Yang et al.

2026 EACL
