Papers
RULEBREAKERS: Challenging LLMs at the Crossroads between Formal Logic and Human-like Reasoning
Jason Chan, Robert J. Gaizauskas, Zhixue Zhao
Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions
Yik Siu Chan, Narutatsu Ri, Yuxin Xiao et al.
Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time
Mohamad Fares El Hajj Chehade, Soumya Suvra Ghosal, Souradip Chakraborty et al.
Generalists vs. Specialists: Evaluating LLMs on Highly-Constrained Biophysical Sequence Optimization Tasks
Angelica Chen, Samuel Don Stanton, Frances Ding et al.
Tool Unlearning for Tool-Augmented LLMs
Jiali Cheng, Hadi Amiri
Copilot Arena: A Platform for Code LLM Evaluation in the Wild
Wayne Chi, Valerie Chen, Anastasios Nikolas Angelopoulos et al.
Learning to Route LLMs with Confidence Tokens
Yu-Neng Chuang, Prathusha Kameswara Sarma, Parikshit Gopalan et al.
Two Tickets are Better than One: Fair and Accurate Hiring Under Strategic LLM Manipulations
Lee Cohen, Connie Hong, Jack Hsieh et al.
Are LLMs Prescient? A Continuous Evaluation using Daily News as the Oracle
Hui Dai, Ryan Teehan, Mengye Ren
A Unified Approach to Routing and Cascading for LLMs
Jasper Dekoninck, Maximilian Baader, Martin Vechev
BEST-Route: Adaptive LLM Routing with Test-Time Optimal Compute
Dujian Ding, Ankur Mallick, Shaokun Zhang et al.
STP: Self-play LLM Theorem Provers with Iterative Conjecturing and Proving
Kefan Dong, Tengyu Ma
TypyBench: Evaluating LLM Type Inference for Untyped Python Repositories
Honghua Dong, Jiacheng Yang, Xun Deng et al.
Emergent Response Planning in LLMs
Zhichen Dong, Zhanhui Zhou, Zhixuan Liu et al.
Long-Short Alignment for Effective Long-Context Modeling in LLMs
Tianqi Du, Haotian Huang, Yifei Wang et al.
Unnatural Languages Are Not Bugs but Features for LLMs
Keyu Duan, Yiran Zhao, Zhili Feng et al.
any4: Learned 4-bit Numeric Representation for LLMs
Mostafa Elhoushi, Jeff Johnson
Grammar-Forced Translation of Natural Language to Temporal Logic using LLMs
William H English, Dominic Simon, Sumit Kumar Jha et al.
Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond
Chongyu Fan, Jinghan Jia, Yihua Zhang et al.
Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence
Shangbin Feng, Zifeng Wang, Yike Wang et al.
Product of Experts with LLMs: Boosting Performance on ARC Is a Matter of Perspective
Daniel Franzen, Jan Disselhoff, David Hartmann
Prompt-to-Leaderboard: Prompt-Adaptive LLM Evaluations
Evan Frick, Connor Chen, Joseph Tennyson et al.
LLM Enhancers for GNNs: An Analysis from the Perspective of Causal Mechanism Identification
Hang Gao, Huang Wenxuan, Fengge Wu et al.
MAGELLAN: Metacognitive predictions of learning progress guide autotelic LLM agents in large goal spaces
Loris Gaven, Thomas Carta, Clément Romac et al.
RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning
Jonas Gehring, Kunhao Zheng, Jade Copet et al.