Papers
2,781 papers found
Can LLMs Generate and Solve Linguistic Olympiad Puzzles?
Neh Majmudar, Elena Filatova
Shared Path: Unraveling Memorization in Multilingual LLMs through Language Similarities
Xiaoyu Luo, Yiyi Chen, Johannes Bjerva et al.
3DS: Medical Domain Adaptation of LLMs via Decomposed Difficulty-based Data Selection
Hongxin Ding, Yue Fang, Runchuan Zhu et al.
Mind the Gap: A Closer Look at Tokenization for Multiple-Choice Question Answering with LLMs
Mario Sanz-Guerrero, Minh Duc Bui, Katharina von der Wense
VocalNet: Speech LLMs with Multi-Token Prediction for Faster and High-Quality Generation
Yuhao Wang, Heyang Liu, Ziyang Cheng et al.
Beyond the Score: Uncertainty-Calibrated LLMs for Automated Essay Assessment
Ahmed Karim, Qiao Wang, Zheng Yuan
DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs
Minxuan Lv, Zhenpeng Su, Leiyu Pan et al.
Rescorla-Wagner Steering of LLMs for Undesired Behaviors over Disproportionate Inappropriate Context
Rushi Wang, Jiateng Liu, Cheng Qian et al.
Flaw or Artifact? Rethinking Prompt Sensitivity in Evaluating LLMs
Andong Hua, Kenan Tang, Chenhe Gu et al.
Think Globally, Group Locally: Evaluating LLMs Using Multi-Lingual Word Grouping Games
César Guerra-Solano, Zhuochun Li, Xiang Lorraine Li
Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs
Wafa Al Ghallabi, Ritesh Thawkar, Sara Ghaboura et al.
Can Prompts Rewind Time for LLMs? Evaluating the Effectiveness of Prompted Knowledge Cutoffs
Xin Gao, Ruiyi Zhang, Daniel Du et al.
Tool Preferences in Agentic LLMs are Unreliable
Kazem Faghih, Wenxiao Wang, Yize Cheng et al.
Understanding and Mitigating Overrefusal in LLMs from an Unveiling Perspective of Safety Decision Boundary
Licheng Pan, Yongqi Tong, Xin Zhang et al.
TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection
Wei Wu, Zhuoshi Pan, Kun Fu et al.
Improving Task Diversity in Label Efficient Supervised Finetuning of LLMs
Abhinav Arabelly, Jagrut Nemade, Robert D Nowak et al.
Memorization or Reasoning? Exploring the Idiom Understanding of LLMs
Jisu Kim, Youngwoo Shin, Uiji Hwang et al.
StepSearch: Igniting LLMs Search Ability via Step-Wise Proximal Policy Optimization
Xuhui Zheng, Kang An, Ziliang Wang et al.
Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs
Jun Bai, Minghao Tong, Yang Liu et al.
Data-Efficient Selection via Grammatical Complexity in Continual Pre-training of Domain-Specific LLMs
Yizhou Ying, Geng Zhang, Cui Danxin et al.
Internal Chain-of-Thought: Empirical Evidence for Layer‐wise Subtask Scheduling in LLMs
Zhipeng Yang, Junzhuo Li, Siyu Xia et al.
Debiasing Multilingual LLMs in Cross-lingual Latent Space
Qiwei Peng, Guimin Hu, Yekun Chai et al.
Persona-Augmented Benchmarking: Evaluating LLMs Across Diverse Writing Styles
Kimberly Truong, Riccardo Fogliato, Hoda Heidari et al.
Job Unfair: An Investigation of Gender and Occupational Bias in Free-Form Text Completions by LLMs
Camilla Casula, Sebastiano Vecellio Salto, Elisa Leonardelli et al.
Understanding LLMs’ Cross-Lingual Context Retrieval: How Good It Is And Where It Comes From
Changjiang Gao, Hankun Lin, Xin Huang et al.