Papers

219 papers found
2025 ICLR
BadRobot: Jailbreaking Embodied LLM Agents in the Physical World
Hangtao Zhang, Chenyu Zhu, Xianlong Wang et al.
2025 ICLR
EDGE: Efficient Data Selection for LLM Agents via Guideline Effectiveness
Yunxiao Zhang, Guanming Xiong, Haochen Li et al.
2025 IJCAI
AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents
Luca Gioacchini, Giuseppe Siracusano, Davide Sanvito et al.
2024 NAACL
2025 NAACL
Adapting LLM Agents with Universal Communication Feedback
Kuan Wang, Yadong Lu, Michael Santacroce et al.
2025 NAACL
TableWise at SemEval-2025 Task 8: LLM Agents for TabQA
Harsh Bansal, Aman Raj, Akshit Sharma et al.
2025 SEMEVAL
Teams of LLM Agents can Exploit Zero-Day Vulnerabilities
Yuxuan Zhu, Antony Kellermann, Akul Gupta et al.
2026 EACL
Beyond IVR: Benchmarking Customer Support LLM Agents for Business-Adherence
Sumanth Balaji, Piyush Mishra, Aashraya Sachdeva et al.
2026 EACL