Papers
Benchmarking LLMs’ Mathematical Reasoning with Unseen Random Variables Questions
Zijin Hong, Hao Wu, Su Dong et al.
LiteLong: Resource-Efficient Long-Context Data Synthesis for LLMs
Junlong Jia, Xing Wu, Chaochen Gao et al.
EduGuardBench: A Holistic Benchmark for Evaluating the Pedagogical Fidelity and Adversarial Safety of LLMs as Simulated Teachers
Yilin Jiang, Mingzi Zhang, Xuanyu Yin et al.
Difficulty Is Not Enough: Curriculum Learning for LLMs Fine-tuning Must Consider Utility
Zishang Jiang, Jinyi Han, Tingyun Li et al.
Do Not Merge My Model! Safeguarding Open-Source LLMs Against Unauthorized Model Merging
Qinfeng Li, Miao Pan, Jintao Chen et al.
OSVBench: Benchmarking LLMs on Specification Generation Tasks for Operating System Verification
Shangyu Li, Juyong Jiang, Tiancheng Zhao et al.
CoFact: Dynamic Coordination of Attention Heads for Improving Factual Consistency in LLMs
Shike Li, Xiaokai Wang, Xiaofeng Liu et al.
Semantic Volume: Quantifying and Detecting Both External and Internal Uncertainty in LLMs
Xiaomin Li, Zhou Yu, Ziji Zhang et al.
LoopLLM: Transferable Energy-Latency Attacks in LLMs via Repetitive Generation
Xingyu Li, Xiaolei Liu, Cheng Liu et al.
Do LLMs Feel? Teaching Emotion Recognition with Prompts, Retrieval, and Curriculum Learning
Xinran Li, Yu Liu, Jiaqi Qiao et al.
Hidden in the Noise: Unveiling Backdoors in Audio LLMs Alignment Through Latent Acoustic Pattern Triggers
Liang Lin, Miao Yu, Kaiwen Luo et al.
Format as a Prior: Quantifying and Analyzing Bias in LLMs for Heterogeneous Data
Jiacheng Liu, Mayi Xu, Qiankun Pi et al.
Easy for Children, Hard for AI: The Limits of Multimodal LLMs in Early Childhood Learning
Jingping Liu, Xueyan Wu, Hanxuan Chen et al.
ARBench: Algorithmic Reasoner or API Alchemist? Evaluating LLMs Beyond API Calls
Ren-Biao Liu, Chao-Zeng Ma, Anqi Li et al.
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
Xiaoran Liu, Yuerong Song, Zhigeng Liu et al.
QueryAligner: Customizing User Query to Match LLMs Preferences for Better Intent Recognition
Yunlong Ma, Bo Wang, Yihong Tang et al.
PoeTone: A Framework for Constrained Generation of Structured Chinese Songci with LLMs
Zhan Qu, Shuzhou Yuan, Michael Färber
Assessing the Capabilities of LLMs in Humor: A Multi-dimensional Analysis of Oogiri Generation and Evaluation
Ritsu Sakabe, Hwichan Kim, Tosho Hirasawa et al.
Positional Cognitive Specialization: Where Do LLMs Learn to Comprehend and Speak Your Language?
Luis Frentzen Salim, Lun-Wei Ku, Hsing-Kuo Kenneth Pao
AntiDote: Bi-level Adversarial Training for Tamper-Resistant LLMs
Debdeep Sanyal, Manodeep Ray, Murari Mandal
From Solver to Tutor: Evaluating the Pedagogical Intelligence of LLMs with KMP-Bench
Weikang Shi, Houxing Ren, Junting Pan et al.
Fine-Tuned LLMs Know They Don’t Know: A Parameter-Efficient Approach to Recovering Honesty
Zeyu Shi, Ziming Wang, Tianyu Chen et al.
qa-FLoRA: Data-free query-adaptive Fusion of LoRAs for LLMs
Shreya Shukla, Aditya Sriram, Milinda Kuppur Narayanaswamy et al.
Sparse-dLLM: Accelerating Diffusion LLMs with Dynamic Cache Eviction
Yuerong Song, Xiaoran Liu, Ruixiao Li et al.