Research Explorer

Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View

Xuan Liu, Jie ZHANG, HaoYang Shang et al.

2025 ICLR

BadRobot: Jailbreaking Embodied LLM Agents in the Physical World

Hangtao Zhang, Chenyu Zhu, Xianlong Wang et al.

2025 ICLR

Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance

Yaxi Lu, Shenzhi Yang, Cheng Qian et al.

2025 ICLR

EMOS: Embodiment-aware Heterogeneous Multi-robot Operating System with LLM Agents

Junting Chen, Checheng Yu, Xunzhe Zhou et al.

2025 ICLR

Unsupervised Feature Transformation via In-context Generation, Generator-critic LLM Agents, and Duet-play Teaming

Nanxu Gong, Xinyuan Wang, Wangyang Ying et al.

2025 IJCAI

EDGE: Efficient Data Selection for LLM Agents via Guideline Effectiveness

Yunxiao Zhang, Guanming Xiong, Haochen Li et al.

2025 IJCAI

AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents

Luca Gioacchini, Giuseppe Siracusano, Davide Sanvito et al.

2024 NAACL

On Evaluating the Integration of Reasoning and Action in LLM Agents with Database Question Answering

Linyong Nan, Ellen Zhang, Weijin Zou et al.

2024 NAACL

CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments

Kung-Hsiang Huang, Akshara Prabhakar, Sidharth Dhawan et al.

2025 NAACL

AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents

Zhe Su, Xuhui Zhou, Sanketh Rangreji et al.

2025 NAACL

CSR-Bench: Benchmarking LLM Agents in Deployment of Computer Science Research Repositories

Yijia Xiao, Runhui Wang, Luyang Kong et al.

2025 NAACL

Adapting LLM Agents with Universal Communication Feedback

Kuan Wang, Yadong Lu, Michael Santacroce et al.

2025 NAACL

Adaptive Attacks Break Defenses Against Indirect Prompt Injection Attacks on LLM Agents

Qiusi Zhan, Richard Fang, Henil Shalin Panchal et al.

2025 NAACL

Hypothesis Generation for Materials Discovery and Design Using Goal-Driven and Constraint-Guided LLM Agents

Shrinidhi Kumbhar, Venkatesh Mishra, Kevin Coutinho et al.

2025 NAACL

Self Knowledge-Tracing for Tool Use (SKT-Tool): Helping LLM Agents Understand Their Capabilities in Tool Use

Joshua Vigel, Renpei Cai, Eleanor Chen et al.

2025 NAACL

TableWise at SemEval-2025 Task 8: LLM Agents for TabQA

Harsh Bansal, Aman Raj, Akshit Sharma et al.

2025 SEMEVAL

QleverAnswering-PUCRS at SemEval-2025 Task 8: Exploring LLM agents, code generation and correction for Table Question Answering

André Bergmann Lisboa, Lucas Cardoso Azevedo, Lucas Rafael Costella Pessutto

2025 SEMEVAL

Teams of LLM Agents can Exploit Zero-Day Vulnerabilities

Yuxuan Zhu, Antony Kellermann, Akul Gupta et al.

2026 EACL

H-MEM: Hierarchical Memory for High-Efficiency Long-Term Reasoning in LLM Agents

Haoran Sun, Shaoning Zeng, Bob Zhang

2026 EACL

Automating Android Build Repair: Bridging the Reasoning-Execution Gap in LLM Agents with Domain-Specific Tools

Ha Min Son, Huan Ren, Xin Liu et al.

2026 EACL

Beyond Blind Following: Evaluating Robustness of LLM Agents under Imperfect Guidance

Yao Fu, Ran Qiu, Xinhe Wang et al.

2026 EACL

Communication Enables Cooperation in LLM Agents: A Comparison with Curriculum-Based Approaches

Hachem Madmoun, Salem Lahlou

2026 EACL

PersonaTrace: Synthesizing Realistic Digital Footprints with LLM Agents

Minjia Wang, Yunfeng Wang, Xiao Ma et al.

2026 EACL

Beyond IVR: Benchmarking Customer Support LLM Agents for Business-Adherence

Sumanth Balaji, Piyush Mishra, Aashraya Sachdeva et al.

2026 EACL

SIRAJ: Diverse and Efficient Red-Teaming for LLM Agents via Distilled Structured Reasoning

Kaiwen Zhou, Ahmed Elgohary, A S M Iftekhar et al.

2026 EACL
