Research Explorer

Teams of LLM Agents can Exploit Zero-Day Vulnerabilities

Yuxuan Zhu, Antony Kellermann, Akul Gupta et al.

2026 EACL

Do Political Opinions Transfer Between Western Languages? An Analysis of Unaligned and Aligned Multilingual LLMs

Franziska Weeber, Tanise Ceron, Sebastian Padó

2026 EACL

H-MEM: Hierarchical Memory for High-Efficiency Long-Term Reasoning in LLM Agents

Haoran Sun, Shaoning Zeng, Bob Zhang

2026 EACL

Cetvel: A Unified Benchmark for Evaluating Language Understanding, Generation and Cultural Capacity of LLMs for Turkish

Yakup Abrek Er, Ilker Kesen, Gözde Gül Şahin et al.

2026 EACL

Persona Prompting as a Lens on LLM Social Reasoning

Jing Yang, Moritz Hechtbauer, Elisabeth Khalilov et al.

2026 EACL

Elena Sofia Ruzzetti, Fabio Massimo Zanzotto, Tommaso Caselli

2026 EACL

Uncovering Hidden Correctness in LLM Causal Reasoning via Symbolic Verification

Paul He, Yinya Huang, Mrinmaya Sachan et al.

2026 EACL

CORE: Measuring Multi-Agent LLM Interaction Quality under Game-Theoretic Pressures

Punya Syon Pandey, Yongjin Yang, Jiarui Liu et al.

2026 EACL

Aleph-Alpha-GermanWeb: Improving German-language LLM pre-training with model-based data curation and synthetic data generation

Thomas F Burns, Letitia Parcalabescu, Stephan Waeldchen et al.

2026 EACL

Attacker’s Noise Can Manipulate Your Audio-based LLM in the Real World

Vinu Sankar Sadasivan, Soheil Feizi, Rajiv Mathews et al.

2026 EACL

Say It Another Way: Auditing LLMs with a User-Grounded Automated Paraphrasing Framework

Clea Chataigner, Rebecca Ma, Prakhar Ganesh et al.

2026 EACL

AutoBool: Reinforcement-Learned LLM for Effective Automatic Systematic Reviews Boolean Query Generation

Shuai Wang, Harrisen Scells, Bevan Koopman et al.

2026 EACL

Improving LLM Domain Certification with Pretrained Guide Models

Jiaqian Zhang, Zhaozhi Qian, Faroq AL-Tam et al.

2026 EACL

Coordinates from Context: Using LLMs to Ground Complex Location References

Tessa Masis, Brendan O'Connor

2026 EACL

SearchLLM: Detecting LLM Paraphrased Text by Measuring the Similarity with Regeneration of the Candidate Source via Search Engine

Hoang-Quoc Nguyen-Son, Minh-Son Dao, Koji Zettsu

2026 EACL

Unraveling LLM Jailbreaks Through Safety Knowledge Neurons

Chongwen Zhao, Yutong Ke, Kaizhu Huang

2026 EACL

Knowledge Extraction on Semi-Structured Content: Does It Remain Relevant for Question Answering in the Era of LLMs?

Kai Sun, Yin Huang, Srishti Mehra et al.

2026 EACL

Don’t Judge a Book by its Cover: Testing LLMs’ Robustness Under Logical Obfuscation

Abhilekh Borah, Shubhra Ghosh, Kedar Joshi et al.

2026 EACL

Mind the Gap: Benchmarking LLM Uncertainty and Calibration with Specialty-Aware Clinical QA and Reasoning-Based Behavioural Features

Alberto Testoni, Iacer Calixto

2026 EACL

Reasoning or Knowledge: Stratified Evaluation of Biomedical LLMs

Rahul Thapa, Qingyang Wu, Kevin Wu et al.

2026 EACL

AfriVox: Probing Multilingual and Accent Robustness of Speech LLMs

Busayo Awobade, Mardhiyah Sanni, Tassallah Abdullahi et al.

2026 EACL

PTEB: Towards Robust Text Embedding Evaluation via Stochastic Paraphrasing at Evaluation Time with LLMs

Manuel Frank, Haithem Afli

2026 EACL

How Good Are LLMs at Processing Tool Outputs?

Kiran Kate, Yara Rizk, Poulami Ghosh et al.

2026 EACL

Tug-of-war between idioms’ figurative and literal interpretations in LLMs

Soyoung Oh, Xinting Huang, Mathis Pink et al.

2026 EACL

Do LLM hallucination detectors suffer from low-resource effect?

Debtanu Datta, Mohan Kishore Chilukuri, Yash Kumar et al.

2026 EACL
