Research Explorer

Reward-Weighted Sampling: Enhancing Non-Autoregressive Characteristics in Masked Diffusion LLMs

Daehoon Gwak, Minseo Jung, Junwoo Park et al.

2025 EMNLP

AI Argues Differently: Distinct Argumentative and Linguistic Patterns of LLMs in Persuasive Contexts

Esra Dönmez, Maximilian Maurer, Gabriella Lapesa et al.

2025 EMNLP

The Illusion of Progress: Re-evaluating Hallucination Detection in LLMs

Denis Janiak, Jakub Binkowski, Albert Sawczyn et al.

2025 EMNLP

Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification

Boyang Zhang, Yicong Tan, Yun Shen et al.

2025 EMNLP

Trojsten Benchmark: Evaluating LLM Problem-Solving in Slovak STEM Competition Problems

Adam Zahradník, Marek Suppa

2025 EMNLP

A Simple Yet Effective Method for Non-Refusing Context Relevant Fine-grained Safety Steering in LLMs

Shaona Ghosh, Amrita Bhattacharjee, Yftah Ziser et al.

2025 EMNLP

so much depends / upon / a whitespace: Why Whitespace Matters for Poets and LLMs

Sriharsh Bhyravajjula, Melanie Walsh, Anna Preus et al.

2025 EMNLP

Certified Mitigation of Worst-Case LLM Copyright Infringement

Jingyu Zhang, Jiacan Yu, Marc Marone et al.

2025 EMNLP

CourtReasoner: Can LLM Agents Reason Like Judges?

Sophia Simeng Han, Yoshiki Takashima, Shannon Zejiang Shen et al.

2025 EMNLP

Retracing the Past: LLMs Emit Training Data When They Get Lost

Myeongseob Ko, Nikhil Reddy Billa, Adam Nguyen et al.

2025 EMNLP

Table-LLM-Specialist: Language Model Specialists for Tables using Iterative Fine-tuning

Junjie Xing, Yeye He, Mengyu Zhou et al.

2025 EMNLP

Mind the Blind Spots: A Focus-Level Evaluation Framework for LLM Reviews

Hyungyu Shin, Jingyu Tang, Yoonjoo Lee et al.

2025 EMNLP

A Head to Predict and a Head to Question: Pre-trained Uncertainty Quantification Heads for Hallucination Detection in LLM Outputs

Artem Shelmanov, Ekaterina Fadeeva, Akim Tsvigun et al.

2025 EMNLP

AgentDiagnose: An Open Toolkit for Diagnosing LLM Agent Trajectories

Tianyue Ou, Wanyao Guo, Apurva Gandhi et al.

2025 EMNLP

MedTutor: A Retrieval-Augmented LLM System for Case-Based Medical Education

Dongsuk Jang, Ziyao Shangguan, Kyle Tegtmeyer et al.

2025 EMNLP

LLM×MapReduce-V3: Enabling Interactive In-Depth Survey Generation through a MCP-Driven Hierarchically Modular Agent System

Yu Chao, Siyu Lin, Xiaorong Wang et al.

2025 EMNLP

TruthTorchLM: A Comprehensive Library for Predicting Truthfulness in LLM Outputs

Duygu Nur Yaldiz, Yavuz Faruk Bakman, Sungmin Kang et al.

2025 EMNLP

Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents

Ziyang Miao, Qiyu Sun, Jingyuan Wang et al.

2025 EMNLP

SAGE: A Generic Framework for LLM Safety Evaluation

Madhur Jindal, Hari Shrawgi, Parag Agrawal et al.

2025 EMNLP

CRAB: A Benchmark for Evaluating Curation of Retrieval-Augmented LLMs in Biomedicine

Hanmeng Zhong, Linqing Chen, Wentao Wu et al.

2025 EMNLP

Aligning LLMs for Multilingual Consistency in Enterprise Applications

Amit Agarwal, Hansa Meghwani, Hitesh Laxmichand Patel et al.

2025 EMNLP

Mirror in the Model: Ad Banner Image Generation via Reflective Multi-LLM and Multi-modal Agents

Zhao Wang, Bowen Chen, Yotaro Shimose et al.

2025 EMNLP

ECom-Bench: Can LLM Agent Resolve Real-World E-commerce Customer Support Issues?

Haoxin Wang, Xianhan Peng, Huang Cheng et al.

2025 EMNLP

ProCut: LLM Prompt Compression via Attribution Estimation

Zhentao Xu, Fengyi Li, Albert C. Chen et al.

2025 EMNLP

Select-then-Route : Taxonomy guided Routing for LLMs

Soham Shah, Kumar Shridhar

2025 EMNLP
