Papers
5,479 papers found
KG-CRAFT: Knowledge Graph-based Contrastive Reasoning with LLMs for Enhancing Automated Fact-checking
Vítor Lourenço, Aline Paes, Tillman Weyde et al.
Elections go bananas: A First Large-scale Multilingual Study of Pluralia Tantum using LLMs
Elena Spaziani, Kamyar Zeinalipour, Pierluigi Cassotti et al.
Beyond Blind Following: Evaluating Robustness of LLM Agents under Imperfect Guidance
Yao Fu, Ran Qiu, Xinhe Wang et al.
How Do LLMs Generate Contrastive Sentiments? A Mechanistic Perspective
Van Bach Nguyen, Jörg Schlötterer, Christin Seifert
H3Fusion: Helpful, Harmless, Honest Fusion of Aligned LLMs
Selim Furkan Tekin, Fatih Ilhan, Sihao Hu et al.
BLUR: A Bi-Level Optimization Approach for LLM Unlearning
Hadi Reisizadeh, Jinghan Jia, Zhiqi Bu et al.
Evidential Semantic Entropy for LLM Uncertainty Quantification
Lucie Kunitomo-Jacquin, Edison Marrese-Taylor, Ken Fukuda et al.
A Reinforcement Learning Framework for Robust and Secure LLM Watermarking
Li An, Yujian Liu, Yepeng Liu et al.
Beyond Understanding: Evaluating the Pragmatic Gap in LLMs’ Cultural Processing of Figurative Language
Mena Attia, Aashiq Muhamed, Mai Alkhamissi et al.
Do You See Me: A Multidimensional Benchmark for Evaluating Visual Perception in Multimodal LLMs
Aditya Sanjiv Kanade, Tanuja Ganu
A Review of Incorporating Psychological Theories in LLMs
Zizhou Liu, Ziwei Gong, Lin Ai et al.
Tokenizer-Aware Cross-Lingual Adaptation of Decoder-Only LLMs through Embedding Relearning and Swapping
Fan Jiang, Honglin Yu, Grace Y Chung et al.
Active Generalized Category Discovery with Diverse LLM Feedback
Henry Peng Zou, Siffi Singh, Yi Nian et al.
RAFFLES: Reasoning-based Attribution of Faults for LLM Systems
Chenyang Zhu, Spencer Hong, Jingyu Wu et al.
Jailbreaks as Inference-Time Alignment: A Framework for Understanding Safety Failures in LLMs
James Beetham, Souradip Chakraborty, Mengdi Wang et al.
Are All Prompt Components Value-Neutral? Understanding the Heterogeneous Adversarial Robustness of Dissected Prompt in LLMs
Yujia Zheng, Tianhao Li, Haotian Huang et al.
What Does Infect Mean to Cardio? Investigating the Role of Clinical Specialty Data in Medical LLMs
Xinlan Yan, Di Wu, Yibin Lei et al.
Redefining Retrieval Evaluation in the Era of LLMs
Giovanni Trappolini, Florin Cuconasu, Simone Filice et al.
Debate, Deliberate, Decide (D3): A Cost-Aware Adversarial Framework for Reliable and Interpretable LLM Evaluation
Abir Harrasse, Chaithanya Bandi, Hari Bandi
Korean Canonical Legal Benchmark: Toward Knowledge-Independent Evaluation of LLMs’ Legal Reasoning Capabilities
Hongseok Oh, Wonseok Hwang, Kyoung-Woon On
Measuring Linguistic Competence of LLMs on Indigenous Languages of the Americas
Justin Vasselli, Arturo Mp, Frederikus Hudi et al.
Communication Enables Cooperation in LLM Agents: A Comparison with Curriculum-Based Approaches
Hachem Madmoun, Salem Lahlou
Beyond Tokens: Concept-Level Training Objectives for LLMs
Laya Iyer, Pranav Somani, Alice Guo et al.
Persuasion Tokens for Editing Factual Knowledge in LLMs
Paul Youssef, Christin Seifert, Jörg Schlötterer