Research Explorer

Reliable and Diverse Evaluation of LLM Medical Knowledge Mastery

Yuxuan Zhou, Xien Liu, Chen Ning et al.

2025 ICLR

How new data permeates LLM knowledge and how to dilute it

Chen Sun, Renat Aksitov, Andrey Zhmoginov et al.

2025 ICLR

Understanding and Enhancing Safety Mechanisms of LLMs via Safety-Specific Neuron

Yiran Zhao, Wenxuan Zhang, Yuxi Xie et al.

2025 ICLR

Searching for Optimal Solutions with LLMs via Bayesian Optimization

Dhruv Agarwal, Manoj Ghuhan Arivazhagan, Rajarshi Das et al.

2025 ICLR

Semantic Loss Guided Data Efficient Supervised Fine Tuning for Safe Responses in LLMs

Yuxiao Lu, Arunesh Sinha, Pradeep Varakantham

2025 ICLR

Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning

Amrith Setlur, Chirag Nagpal, Adam Fisch et al.

2025 ICLR

Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse

Maojia Song, Shang Hong Sim, Rishabh Bhardwaj et al.

2025 ICLR

Unveiling the Secret Recipe: A Guide For Supervised Fine-Tuning Small LLMs

Aldo Pareja, Nikhil Shivakumar Nayak, Hao Wang et al.

2025 ICLR

Compute-Optimal LLMs Provably Generalize Better with Scale

Marc Anton Finzi, Sanyam Kapoor, Diego Granziol et al.

2025 ICLR

Towards Federated RLHF with Aggregated Client Preference for LLMs

Feijie Wu, Xiaoze Liu, Haoyu Wang et al.

2025 ICLR

RouteLLM: Learning to Route LLMs from Preference Data

Isaac Ong, Amjad Almahairi, Vincent Wu et al.

2025 ICLR

Strategist: Self-improvement of LLM Decision Making via Bi-Level Tree Search

Jonathan Light, Min Cai, Weiqin Chen et al.

2025 ICLR

Spread Preference Annotation: Direct Preference Judgment for Efficient LLM Alignment

Dongyoung Kim, Kimin Lee, Jinwoo Shin et al.

2025 ICLR

PEARL: Towards Permutation-Resilient LLMs

Liang Chen, Li Shen, Yang Deng et al.

2025 ICLR

DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life

Yu Ying Chiu, Liwei Jiang, Yejin Choi

2025 ICLR

Collab: Controlled Decoding using Mixture of Agents for LLM Alignment

Souradip Chakraborty, Sujay Bhatt, Udari Madhushani Sehwag et al.

2025 ICLR

SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators

Rasoul Shafipour, David Harrison, Maxwell Horton et al.

2025 ICLR

ACC-Collab: An Actor-Critic Approach to Multi-Agent LLM Collaboration

Andrew Estornell, Jean-Francois Ton, Yuanshun Yao et al.

2025 ICLR

Can Watermarks be Used to Detect LLM IP Infringement For Free?

Zhengyue Zhao, Xiaogeng Liu, Somesh Jha et al.

2025 ICLR

Humanizing the Machine: Proxy Attacks to Mislead LLM Detectors

Tianchun Wang, Yuanzhou Chen, Zichuan Liu et al.

2025 ICLR

Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing

Qi Le, Enmao Diao, Ziyan Wang et al.

2025 ICLR

Human-inspired Episodic Memory for Infinite Context LLMs

Zafeirios Fountas, Martin Benfeghoul, Adnan Oomerjee et al.

2025 ICLR

BingoGuard: LLM Content Moderation Tools with Risk Levels

Fan Yin, Philippe Laban, Xiangyu Peng et al.

2025 ICLR

SqueezeAttention: 2D Management of KV-Cache in LLM Inference via Layer-wise Optimal Budget

Zihao Wang, Bin Cui, Shaoduo Gan

2025 ICLR

SFS: Smarter Code Space Search improves LLM Inference Scaling

Jonathan Light, Yue Wu, Yiyou Sun et al.

2025 ICLR
