Papers
219 papers found
Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View
Xuan Liu, Jie ZHANG, HaoYang Shang et al.
BadRobot: Jailbreaking Embodied LLM Agents in the Physical World
Hangtao Zhang, Chenyu Zhu, Xianlong Wang et al.
Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance
Yaxi Lu, Shenzhi Yang, Cheng Qian et al.
EMOS: Embodiment-aware Heterogeneous Multi-robot Operating System with LLM Agents
Junting Chen, Checheng Yu, Xunzhe Zhou et al.
Unsupervised Feature Transformation via In-context Generation, Generator-critic LLM Agents, and Duet-play Teaming
Nanxu Gong, Xinyuan Wang, Wangyang Ying et al.
EDGE: Efficient Data Selection for LLM Agents via Guideline Effectiveness
Yunxiao Zhang, Guanming Xiong, Haochen Li et al.
AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents
Luca Gioacchini, Giuseppe Siracusano, Davide Sanvito et al.
On Evaluating the Integration of Reasoning and Action in LLM Agents with Database Question Answering
Linyong Nan, Ellen Zhang, Weijin Zou et al.
CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments
Kung-Hsiang Huang, Akshara Prabhakar, Sidharth Dhawan et al.
AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents
Zhe Su, Xuhui Zhou, Sanketh Rangreji et al.
CSR-Bench: Benchmarking LLM Agents in Deployment of Computer Science Research Repositories
Yijia Xiao, Runhui Wang, Luyang Kong et al.
Adapting LLM Agents with Universal Communication Feedback
Kuan Wang, Yadong Lu, Michael Santacroce et al.
Adaptive Attacks Break Defenses Against Indirect Prompt Injection Attacks on LLM Agents
Qiusi Zhan, Richard Fang, Henil Shalin Panchal et al.
Hypothesis Generation for Materials Discovery and Design Using Goal-Driven and Constraint-Guided LLM Agents
Shrinidhi Kumbhar, Venkatesh Mishra, Kevin Coutinho et al.
Self Knowledge-Tracing for Tool Use (SKT-Tool): Helping LLM Agents Understand Their Capabilities in Tool Use
Joshua Vigel, Renpei Cai, Eleanor Chen et al.
TableWise at SemEval-2025 Task 8: LLM Agents for TabQA
Harsh Bansal, Aman Raj, Akshit Sharma et al.
QleverAnswering-PUCRS at SemEval-2025 Task 8: Exploring LLM agents, code generation and correction for Table Question Answering
André Bergmann Lisboa, Lucas Cardoso Azevedo, Lucas Rafael Costella Pessutto
Teams of LLM Agents can Exploit Zero-Day Vulnerabilities
Yuxuan Zhu, Antony Kellermann, Akul Gupta et al.
H-MEM: Hierarchical Memory for High-Efficiency Long-Term Reasoning in LLM Agents
Haoran Sun, Shaoning Zeng, Bob Zhang
Automating Android Build Repair: Bridging the Reasoning-Execution Gap in LLM Agents with Domain-Specific Tools
Ha Min Son, Huan Ren, Xin Liu et al.
Beyond Blind Following: Evaluating Robustness of LLM Agents under Imperfect Guidance
Yao Fu, Ran Qiu, Xinhe Wang et al.
Communication Enables Cooperation in LLM Agents: A Comparison with Curriculum-Based Approaches
Hachem Madmoun, Salem Lahlou
PersonaTrace: Synthesizing Realistic Digital Footprints with LLM Agents
Minjia Wang, Yunfeng Wang, Xiao Ma et al.
Beyond IVR: Benchmarking Customer Support LLM Agents for Business-Adherence
Sumanth Balaji, Piyush Mishra, Aashraya Sachdeva et al.
SIRAJ: Diverse and Efficient Red-Teaming for LLM Agents via Distilled Structured Reasoning
Kaiwen Zhou, Ahmed Elgohary, A S M Iftekhar et al.