Research Explorer

LLM as a Broken Telephone: Iterative Generation Distorts Information

Amr Mohamed, Mingmeng Geng, Michalis Vazirgiannis et al.

2025 ACL

Enough Coin Flips Can Make LLMs Act Bayesian

Ritwik Gupta, Rodolfo Corona, Jiaxin Ge et al.

2025 ACL

GAMEBoT: Transparent Assessment of LLM Reasoning in Games

Wenye Lin, Jonathan Roberts, Yunhan Yang et al.

2025 ACL

A Text is Worth Several Tokens: Text Embedding from LLMs Secretly Aligns Well with The Key Tokens

Zhijie Nie, Richong Zhang, Zhanyu Wu

2025 ACL

CER: Confidence Enhanced Reasoning in LLMs

Ali Razghandi, Seyed Mohammad Hadi Hosseini, Mahdieh Soleymani Baghshah

2025 ACL

SynthesizeMe! Inducing Persona-Guided Prompts for Personalized Reward Models in LLMs

Michael J. Ryan, Omar Shaikh, Aditri Bhagirath et al.

2025 ACL

Self-Error-Instruct: Generalizing from Errors for LLMs Mathematical Reasoning

Erxin Yu, Jing Li, Ming Liao et al.

2025 ACL

MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents

Kunlun Zhu, Hongyi Du, Zhaochen Hong et al.

2025 ACL

LLMs can Perform Multi-Dimensional Analytic Writing Assessments: A Case Study of L2 Graduate-Level Academic English Writing

Zhengxiang Wang, Veronika Makarova, Zhi Li et al.

2025 ACL

SEUF: Is Unlearning One Expert Enough for Mixture-of-Experts LLMs?

Haomin Zhuang, Yihua Zhang, Kehan Guo et al.

2025 ACL

LocAgent: Graph-Guided LLM Agents for Code Localization

Zhaoling Chen, Robert Tang, Gangda Deng et al.

2025 ACL

CKnowEdit: A New Chinese Knowledge Editing Dataset for Linguistics, Facts, and Logic Error Correction in LLMs

Jizhan Fang, Tianhe Lu, Yunzhi Yao et al.

2025 ACL

SkillVerse: Assessing and Enhancing LLMs with Tree Evaluation

Yufei Tian, Jiao Sun, Nanyun Peng et al.

2025 ACL

CypherBench: Towards Precise Retrieval over Full-scale Modern Knowledge Graphs in the LLM Era

Yanlin Feng, Simone Papicchio, Sajjadur Rahman

2025 ACL

Are LLMs effective psychological assessors? Leveraging adaptive RAG for interpretable mental health screening through psychometric practice

Federico Ravenda, Seyed Ali Bahrainian, Andrea Raballo et al.

2025 ACL

Improving Preference Extraction In LLMs By Identifying Latent Knowledge Through Classifying Probes

Sharan Maiya, Yinhong Liu, Ramit Debnath et al.

2025 ACL

White Men Lead, Black Women Help? Benchmarking and Mitigating Language Agency Social Biases in LLMs

Yixin Wan, Kai-Wei Chang

2025 ACL

AIMSCheck: Leveraging LLMs for AI-Assisted Review of Modern Slavery Statements Across Jurisdictions

Adriana Eufrosina Bora, Akshatha Arodi, Duoyi Zhang et al.

2025 ACL

Bitnet.cpp: Efficient Edge Inference for Ternary LLMs

Jinheng Wang, Hansong Zhou, Ting Song et al.

2025 ACL

PIG: Privacy Jailbreak Attack on LLMs via Gradient-based Iterative In-Context Optimization

Yidan Wang, Yanan Cao, Yubing Ren et al.

2025 ACL

Agents Under Siege: Breaking Pragmatic Multi-Agent LLM Systems with Optimized Prompt Attacks

Rana Shahroz, Zhen Tan, Sukwon Yun et al.

2025 ACL

On the Risk of Evidence Pollution for Malicious Social Text Detection in the Era of LLMs

Herun Wan, Minnan Luo, Zhixiong Su et al.

2025 ACL

Polishing Every Facet of the GEM: Testing Linguistic Competence of LLMs and Humans in Korean

SungHo Kim, Nayeon Kim, Taehee Jeon et al.

2025 ACL

NexusSum: Hierarchical LLM Agents for Long-Form Narrative Summarization

Hyuntak Kim, Byung-Hak Kim

2025 ACL

Exploring the Potential of LLMs as Personalized Assistants: Dataset, Evaluation, and Analysis

Jisoo Mok, Ik-hwan Kim, Sangkwon Park et al.

2025 ACL
