conftrace_

Papers

11,951 papers found
2025 ICLR
Round and Round We Go! What makes Rotary Positional Encodings useful?
Federico Barbero, Alex Vitvitskyi, Christos Perivolaropoulos et al.
2025 ICLR
RouteLLM: Learning to Route LLMs from Preference Data
Isaac Ong, Amjad Almahairi, Vincent Wu et al.
2025 ICLR
RRM: Robust Reward Model Training Mitigates Reward Hacking
Tianqi Liu, Wei Xiong, Jie Ren et al.
2025 ICLR
R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference
Zhenyu Zhang, Zechun Liu, Yuandong Tian et al.
2025 ICLR
SafeDiffuser: Safe Planning with Diffusion Probabilistic Models
Wei Xiao, Tsun-Hsuan Wang, Chuang Gan et al.
2025 ICLR
Safety Alignment Should be Made More Than Just a Few Tokens Deep
Xiangyu Qi, Ashwinee Panda, Kaifeng Lyu et al.
2025 ICLR
Safety-Prioritizing Curricula for Constrained Reinforcement Learning
Cevahir Koprulu, Thiago D. Simão, Nils Jansen et al.
2025 ICLR
Safety Representations for Safer Policy Learning
Kaustubh Mani, Vincent Mai, Charlie Gauthier et al.
2025 ICLR
SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation
Mingjie Li, Wai Man Si, Michael Backes et al.
2025 ICLR