Papers
Exploring the Hidden Capacity of LLMs for One-Step Text Generation
Gleb Mezentsev, Ivan Oseledets
DCR: Quantifying Data Contamination in LLMs Evaluation
Cheng Xu, Nan Yan, Shuhao Guan et al.
Building Trust in Clinical LLMs: Bias Analysis and Dataset Transparency
Svetlana Maslenkova, Clement Christophe, Marco AF Pimentel et al.
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Yang Wang, Chenghao Xiao, Chia-Yi Hsiao et al.
InterIDEAS: Philosophical Intertextuality via LLMs
Yue Yang, Yinzhi Xu, Chenghao Huang et al.
RAcQUEt: Unveiling the Dangers of Overlooked Referential Ambiguity in Visual LLMs
Alberto Testoni, Barbara Plank, Raquel Fernández
Easy as PIE? Identifying Multi-Word Expressions with LLMs
Kai Golan Hashiloni, Ofri Hefetz, Kfir Bar
Graph-R1: Incentivizing the Zero-Shot Graph Learning Capability in LLMs via Explicit Reasoning
Yicong Wu, Guangyue Lu, Yuan Zuo et al.
LLMs Don’t Know Their Own Decision Boundaries: The Unreliability of Self-Generated Counterfactual Explanations
Harry Mayne, Ryan Othniel Kearns, Yushi Yang et al.
Grounding Multilingual Multimodal LLMs With Cultural Knowledge
Jean De Dieu Nyandwi, Yueqi Song, Simran Khanuja et al.
From Language to Cognition: How LLMs Outgrow the Human Language Network
Badr AlKhamissi, Greta Tuckute, Yingtian Tang et al.
Hallucination Detection in LLMs Using Spectral Features of Attention Maps
Jakub Binkowski, Denis Janiak, Albert Sawczyn et al.
Towards a Holistic and Automated Evaluation Framework for Multi-Level Comprehension of LLMs in Book-Length Contexts
Yuho Lee, Jiaqi Deng, Nicole Hee-Yeon Kim et al.
From Word to World: Evaluate and Mitigate Culture Bias in LLMs via Word Association Test
Xunlian Dai, Li Zhou, Benyou Wang et al.
TFDP: Token-Efficient Disparity Audits for Autoregressive LLMs via Single-Token Masked Evaluation
Inderjeet Singh, Ramya Srinivasan, Roman Vainshtein et al.
Hidden in Plain Sight: Reasoning in Underspecified and Misspecified Scenarios for Multimodal LLMs
Qianqi Yan, Hongquan Li, Shan Jiang et al.
TactfulToM: Do LLMs have the Theory of Mind ability to understand White Lies?
Yiwei Liu, Emma Jane Pretty, Jiahao Huang et al.
Analyzing values about gendered language reform in LLMs’ revisions
Jules Watson, Xi Wang, Raymond Liu et al.
Harmful Prompt Laundering: Jailbreaking LLMs with Abductive Styles and Symbolic Encoding
Seongho Joo, Hyukhun Koh, Kyomin Jung
Subtle Risks, Critical Failures: A Framework for Diagnosing Physical Safety of LLMs for Embodied Decision Making
Yejin Son, Minseo Kim, Sungwoong Kim et al.
Speculating LLMs’ Chinese Training Data Pollution from Their Tokens
Qingjie Zhang, Di Wang, Haoting Qian et al.
CopySpec: Accelerating LLMs with Speculative Copy-and-Paste
Razvan-Gabriel Dumitru, Minglai Yang, Vikas Yadav et al.
Are LLMs Better than Reported? Detecting Label Errors and Mitigating Their Effect on Model Performance
Omer Nahum, Nitay Calderon, Orgad Keller et al.
M-Wanda: Improving One-Shot Pruning for Multilingual LLMs
Rochelle Choenni, Ivan Titov
Beyond Online Sampling: Bridging Offline-to-Online Alignment via Dynamic Data Transformation for LLMs
Zhang Zhang, Guhao Feng, Jian Guan et al.