Papers

5,479 papers found
2025 ICLR
LLM-based Typed Hyperresolution for Commonsense Reasoning with Knowledge Bases
Armin Toroghi, Ali Pesaranghader, Tanmana Sadhu et al.
2025 ICLR
AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents
Ke Yang, Yao Liu, Sapana Chaudhary et al.
2025 ICLR
BadJudge: Backdoor Vulnerabilities of LLM-As-A-Judge
Terry Tong, Fei Wang, Zhe Zhao et al.
2025 ICLR
Learning LLM-as-a-Judge for Preference Alignment
Ziyi Ye, Xiangsheng Li, Qiuchi Li et al.
2025 ICLR
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References
Qiyuan Zhang, Yufei Wang, Tiezheng YU et al.
2025 ICLR
How efficient is LLM-generated code? A rigorous & high-standard benchmark
Ruizhong Qiu, Weiliang Will Zeng, James Ezick et al.
2025 ICLR
Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge
Jiayi Ye, Yanbo Wang, Yue Huang et al.
2025 ICLR
JudgeBench: A Benchmark for Evaluating LLM-Based Judges
Sijun Tan, Siyuan Zhuang, Kyle Montgomery et al.
2025 ICLR
Improving Data Efficiency via Curating LLM-Driven Rating Systems
Jinlong Pang, Jiaheng Wei, Ankit Shah et al.
2025 ICLR
LLM-SR: Scientific Equation Discovery via Programming with Large Language Models
Parshin Shojaee, Kazem Meidani, Shashank Gupta et al.
2025 ICLR