Papers
Do Audio LLMs Really LISTEN, or Just Transcribe? Measuring Lexical vs. Acoustic Emotion Cues Reliance
Jingyi Chen, Zhimeng Guo, Jiyun Chun et al.
MERLIN: Multi-Stage Curriculum Alignment for Multilingual Encoder-LLM Integration in Cross-Lingual Reasoning
Kosei Uemura, David Guzmán, Quang Phuoc Nguyen et al.
Mary, the Cheeseburger-Eating Vegetarian: Do LLMs Recognize Incoherence in Narratives?
Karin de Langis, Püren Öncel, Ryan Peters et al.
Strong Memory, Weak Control: An Empirical Study of Executive Functioning in LLMs
Karin de Langis, Jong Inn Park, Bin Hu et al.
Can LLMs reason over extended multilingual contexts? Towards long-context evaluation beyond retrieval over haystacks
Amey Hengle, Prasoon Bajpai, Soham Dan et al.
Knowing When to Abstain: Medical LLMs Under Clinical Uncertainty
Sravanthi Machcha, Sushrita Yerra, Sahil Gupta et al.
MedQA-CS: Objective Structured Clinical Examination (OSCE)-Style Benchmark for Evaluating LLM Clinical Skills
Zonghai Yao, Zihao Zhang, Chaolong Tang et al.
LLMs as Cultural Archives: Cultural Commonsense Knowledge Graph Extraction
Junior Cedric Tonga, Chen Cecilia Liu, Iryna Gurevych et al.
Activation-Space Personality Steering: Hybrid Layer Selection for Stable Trait Control in LLMs
Pranav Bhandari, Nicolas Fay, Sanjeevan Selvaganapathy et al.
KG-CRAFT: Knowledge Graph-based Contrastive Reasoning with LLMs for Enhancing Automated Fact-checking
Vítor Lourenço, Aline Paes, Tillman Weyde et al.
Elections go bananas: A First Large-scale Multilingual Study of Pluralia Tantum using LLMs
Elena Spaziani, Kamyar Zeinalipour, Pierluigi Cassotti et al.
Beyond Blind Following: Evaluating Robustness of LLM Agents under Imperfect Guidance
Yao Fu, Ran Qiu, Xinhe Wang et al.
How Do LLMs Generate Contrastive Sentiments? A Mechanistic Perspective
Van Bach Nguyen, Jörg Schlötterer, Christin Seifert
H3Fusion: Helpful, Harmless, Honest Fusion of Aligned LLMs
Selim Furkan Tekin, Fatih Ilhan, Sihao Hu et al.
BLUR: A Bi-Level Optimization Approach for LLM Unlearning
Hadi Reisizadeh, Jinghan Jia, Zhiqi Bu et al.
Evidential Semantic Entropy for LLM Uncertainty Quantification
Lucie Kunitomo-Jacquin, Edison Marrese-Taylor, Ken Fukuda et al.
A Reinforcement Learning Framework for Robust and Secure LLM Watermarking
Li An, Yujian Liu, Yepeng Liu et al.
Beyond Understanding: Evaluating the Pragmatic Gap in LLMs’ Cultural Processing of Figurative Language
Mena Attia, Aashiq Muhamed, Mai Alkhamissi et al.
Do You See Me: A Multidimensional Benchmark for Evaluating Visual Perception in Multimodal LLMs
Aditya Sanjiv Kanade, Tanuja Ganu
A Review of Incorporating Psychological Theories in LLMs
Zizhou Liu, Ziwei Gong, Lin Ai et al.
Tokenizer-Aware Cross-Lingual Adaptation of Decoder-Only LLMs through Embedding Relearning and Swapping
Fan Jiang, Honglin Yu, Grace Y Chung et al.
Active Generalized Category Discovery with Diverse LLM Feedback
Henry Peng Zou, Siffi Singh, Yi Nian et al.
RAFFLES: Reasoning-based Attribution of Faults for LLM Systems
Chenyang Zhu, Spencer Hong, Jingyu Wu et al.
Jailbreaks as Inference-Time Alignment: A Framework for Understanding Safety Failures in LLMs
James Beetham, Souradip Chakraborty, Mengdi Wang et al.
Are All Prompt Components Value-Neutral? Understanding the Heterogeneous Adversarial Robustness of Dissected Prompt in LLMs
Yujia Zheng, Tianhao Li, Haotian Huang et al.