Papers
Flaw or Artifact? Rethinking Prompt Sensitivity in Evaluating LLMs
Andong Hua, Kenan Tang, Chenhe Gu et al.
Membership and Memorization in LLM Knowledge Distillation
Ziqi Zhang, Ali Shahin Shamsabadi, Hanxiao Lu et al.
Think Globally, Group Locally: Evaluating LLMs Using Multi-Lingual Word Grouping Games
César Guerra-Solano, Zhuochun Li, Xiang Lorraine Li
MobiZO: Enabling Efficient LLM Fine-Tuning at the Edge via Inference Engines
Lei Gao, Amir Ziashahabi, Yue Niu et al.
Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs
Wafa Al Ghallabi, Ritesh Thawkar, Sara Ghaboura et al.
Reading Between the Prompts: How Stereotypes Shape LLM’s Implicit Personalization
Vera Neplenbroek, Arianna Bisazza, Raquel Fernández
DiCoRe: Enhancing Zero-shot Event Detection via Divergent-Convergent LLM Reasoning
Tanmay Parekh, Kartik Mehta, Ninareh Mehrabi et al.
LogiDynamics: Unraveling the Dynamics of Inductive, Abductive and Deductive Logical Inferences in LLM Reasoning
Tianshi Zheng, Cheng Jiayang, Chunyang Li et al.
Can Prompts Rewind Time for LLMs? Evaluating the Effectiveness of Prompted Knowledge Cutoffs
Xin Gao, Ruiyi Zhang, Daniel Du et al.
Tool Preferences in Agentic LLMs are Unreliable
Kazem Faghih, Wenxiao Wang, Yize Cheng et al.
Understanding and Mitigating Overrefusal in LLMs from an Unveiling Perspective of Safety Decision Boundary
Licheng Pan, Yongqi Tong, Xin Zhang et al.
Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance
Shehzeen Samarah Hussain, Paarth Neekhara, Xuesong Yang et al.
Mixing Inference-time Experts for Enhancing LLM Reasoning
Soumya Sanyal, Tianyi Xiao, Xiang Ren
TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection
Wei Wu, Zhuoshi Pan, Kun Fu et al.
Interpretation Meets Safety: A Survey on Interpretation Methods and Tools for Improving LLM Safety
Seongmin Lee, Aeree Cho, Grace C. Kim et al.
Improving Task Diversity in Label Efficient Supervised Finetuning of LLMs
Abhinav Arabelly, Jagrut Nemade, Robert D Nowak et al.
Memorization or Reasoning? Exploring the Idiom Understanding of LLMs
Jisu Kim, Youngwoo Shin, Uiji Hwang et al.
Learning to Ask: When LLM Agents Meet Unclear Instruction
Wenxuan Wang, Shi Juluan, Zixuan Ling et al.
StepSearch: Igniting LLMs Search Ability via Step-Wise Proximal Policy Optimization
Xuhui Zheng, Kang An, Ziliang Wang et al.
Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs
Jun Bai, Minghao Tong, Yang Liu et al.
Data-Efficient Selection via Grammatical Complexity in Continual Pre-training of Domain-Specific LLMs
Yizhou Ying, Geng Zhang, Cui Danxin et al.
DiffusionAttacker: Diffusion-Driven Prompt Manipulation for LLM Jailbreak
Hao Wang, Hao Li, Junda Zhu et al.
Massive Supervised Fine-tuning Experiments Reveal How Data, Layer, and Training Factors Shape LLM Alignment Quality
Yuto Harada, Yusuke Yamauchi, Yusuke Oda et al.
Internal Chain-of-Thought: Empirical Evidence for Layer-wise Subtask Scheduling in LLMs
Zhipeng Yang, Junzhuo Li, Siyu Xia et al.
Debiasing Multilingual LLMs in Cross-lingual Latent Space
Qiwei Peng, Guimin Hu, Yekun Chai et al.