Papers
Hidden in Plain Sight: Reasoning in Underspecified and Misspecified Scenarios for Multimodal LLMs
Qianqi Yan, Hongquan Li, Shan Jiang et al.
Code Execution as Grounded Supervision for LLM Reasoning
Dongwon Jung, Wenxuan Zhou, Muhao Chen
Subjective Behaviors and Preferences in LLM: Language of Browsing
Sai Sundaresan, Harshita Chopra, Atanu R. Sinha et al.
TactfulToM: Do LLMs have the Theory of Mind ability to understand White Lies?
Yiwei Liu, Emma Jane Pretty, Jiahao Huang et al.
Analyzing values about gendered language reform in LLMs’ revisions
Jules Watson, Xi Wang, Raymond Liu et al.
Stepwise Informativeness Search for Improving LLM Reasoning
Siyuan Wang, Enda Zhao, Xiang Ren
Harmful Prompt Laundering: Jailbreaking LLMs with Abductive Styles and Symbolic Encoding
Seongho Joo, Hyukhun Koh, Kyomin Jung
Amulet: Putting Complex Multi-Turn Conversations on the Stand with LLM Juries
Sahana Ramnath, Anurag Mudgil, Brihi Joshi et al.
CMedCalc-Bench: A Fine-Grained Benchmark for Chinese Medical Calculations in LLM
Yunyan Zhang, Zhihong Zhu, Xian Wu
Subtle Risks, Critical Failures: A Framework for Diagnosing Physical Safety of LLMs for Embodied Decision Making
Yejin Son, Minseo Kim, Sungwoong Kim et al.
HELENE: Hessian Layer-wise Clipping and Gradient Annealing for Accelerating Fine-tuning LLM with Zeroth-order Optimization
Huaqin Zhao, Jiaxi Li, Yi Pan et al.
From Parameters to Performance: A Data-Driven Study on LLM Structure and Development
Suqing Wang, Zuchao Li, Shi Luohe et al.
Speculating LLMs’ Chinese Training Data Pollution from Their Tokens
Qingjie Zhang, Di Wang, Haoting Qian et al.
The Stepwise Deception: Simulating the Evolution from True News to Fake News with LLM Agents
Yuhan Liu, Zirui Song, Juntian Zhang et al.
CopySpec: Accelerating LLMs with Speculative Copy-and-Paste
Razvan-Gabriel Dumitru, Minglai Yang, Vikas Yadav et al.
Are LLMs Better than Reported? Detecting Label Errors and Mitigating Their Effect on Model Performance
Omer Nahum, Nitay Calderon, Orgad Keller et al.
M-Wanda: Improving One-Shot Pruning for Multilingual LLMs
Rochelle Choenni, Ivan Titov
Beyond Online Sampling: Bridging Offline-to-Online Alignment via Dynamic Data Transformation for LLMs
Zhang Zhang, Guhao Feng, Jian Guan et al.
Enhancing LLM Language Adaption through Cross-lingual In-Context Pre-training
Linjuan Wu, Hao-Ran Wei, Huan Lin et al.
Order Doesn’t Matter, But Reasoning Does: Training LLMs with Order-Centric Augmentation
Qianxi He, Qianyu He, Jiaqing Liang et al.
CogDual: Enhancing Dual Cognition of LLMs via Reinforcement Learning with Implicit Rule-Based Rewards
Cheng Liu, Yifei Lu, Fanghua Ye et al.
MrGuard: A Multilingual Reasoning Guardrail for Universal LLM Safety
Yahan Yang, Soham Dan, Shuo Li et al.
When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs
Ammar Khairi, Daniel D’souza, Ye Shen et al.
Scaling Low-Resource MT via Synthetic Data Generation with LLMs
Ona de Gibert, Joseph Attieh, Teemu Vahtola et al.
Morables: A Benchmark for Assessing Abstract Moral Reasoning in LLMs with Fables
Matteo Marcuzzo, Alessandro Zangari, Andrea Albarelli et al.