Papers
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
Zorik Gekhman, Gal Yona, Roee Aharoni et al.
Pelican: Correcting Hallucination in Vision-LLMs via Claim Decomposition and Program of Thought Verification
Pritish Sahu, Karan Sikka, Ajay Divakaran
Unsupervised End-to-End Task-Oriented Dialogue with LLMs: The Power of the Noisy Channel
Brendan King, Jeffrey Flanigan
Humans or LLMs as the Judge? A Study on Judgement Bias
Guiming Hardy Chen, Shunian Chen, Ziche Liu et al.
Knowledge Conflicts for LLMs: A Survey
Rongwu Xu, Zehan Qi, Zhijiang Guo et al.
A Thorough Examination of Decoding Methods in the Era of LLMs
Chufan Shi, Haoran Yang, Deng Cai et al.
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents
Liyan Tang, Philippe Laban, Greg Durrett
Learning to Correct for QA Reasoning with Black-box LLMs
Jaehyung Kim, Dongyoung Kim, Yiming Yang
Token Erasure as a Footprint of Implicit Vocabulary Items in LLMs
Sheridan Feucht, David Atkinson, Byron C Wallace et al.
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems
Philippe Laban, Alexander Fabbri, Caiming Xiong et al.
ARM: An Alignment-and-Replacement Module for Chinese Spelling Check Based on LLMs
Changchun Liu, Kai Zhang, Junzhe Jiang et al.
LLMs Are Prone to Fallacies in Causal Inference
Nitish Joshi, Abulhair Saparov, Yixin Wang et al.
Do LLMs suffer from Multi-Party Hangover? A Diagnostic Approach to Addressee Recognition and Response Selection in Conversations
Nicolò Penzo, Maryam Sajedinia, Bruno Lepri et al.
Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs
Haritz Puerto, Martin Tutek, Somak Aditya et al.
PrExMe! Large Scale Prompt Exploration of Open Source LLMs for Machine Translation and Summarization Evaluation
Christoph Leiter, Steffen Eger
Reasoning or a Semblance of it? A Diagnostic Study of Transitive Reasoning in LLMs
Houman Mehrafarin, Arash Eshghi, Ioannis Konstas
Unraveling Babel: Exploring Multilingual Activation Patterns of LLMs and Their Applications
Weize Liu, Yinlong Xu, Hongxia Xu et al.
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts
Ruida Wang, Jipeng Zhang, Yizhen Jia et al.
Subword Segmentation in LLMs: Looking at Inflection and Consistency
Marion Di Marco, Alexander Fraser
Do LLMs Overcome Shortcut Learning? An Evaluation of Shortcut Challenges in Large Language Models
Yu Yuan, Lili Zhao, Kai Zhang et al.
Subjective Topic meets LLMs: Unleashing Comprehensive, Reflective and Creative Thinking through the Negation of Negation
Fangrui Lv, Kaixiong Gong, Jian Liang et al.
Why Does New Knowledge Create Messy Ripple Effects in LLMs?
Jiaxin Qin, Zixuan Zhang, Chi Han et al.
“Global is Good, Local is Bad?”: Understanding Brand Bias in LLMs
Mahammed Kamruzzaman, Hieu Minh Nguyen, Gene Louis Kim
RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs
John Dang, Arash Ahmadian, Kelly Marchisio et al.