Papers
Soda-Eval: Open-Domain Dialogue Evaluation in the age of LLMs
John Mendonça, Isabel Trancoso, Alon Lavie
Augmenting Reasoning Capabilities of LLMs with Graph Structures in Knowledge Base Question Answering
Yuhang Tian, Dandan Song, Zhijing Wu et al.
Towards Robust Evaluation of Unlearning in LLMs via Data Transformations
Abhinav Joshi, Shaswati Saha, Divyaksh Shukla et al.
Fast Matrix Multiplications for Lookup Table-Quantized LLMs
Han Guo, William Brandon, Radostin Cholakov et al.
Can We Instruct LLMs to Compensate for Position Bias?
Meiru Zhang, Zaiqiao Meng, Nigel Collier
Cognitive Bias in Decision-Making with LLMs
Jessica Maria Echterhoff, Yao Liu, Abeer Alessa et al.
InternalInspector I²: Robust Confidence Estimation in LLMs through Internal States
Mohammad Beigi, Ying Shen, Runing Yang et al.
Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?
Tannon Kew, Florian Schottmann, Rico Sennrich
“What is the value of templates?” Rethinking Document Information Extraction Datasets for LLMs
Ran Zmigrod, Pranav Shetty, Mathieu Sibue et al.
Can LLMs Understand the Implication of Emphasized Sentences in Dialogue?
Guan-Ting Lin, Hung-yi Lee
LLMs as Collaborator: Demands-Guided Collaborative Retrieval-Augmented Generation for Commonsense Knowledge-Grounded Open-Domain Dialogue Systems
Jiong Yu, Sixing Wu, Jiahao Chen et al.
Regression Aware Inference with LLMs
Michal Lukasik, Harikrishna Narasimhan, Aditya Krishna Menon et al.
Aligners: Decoupling LLMs and Alignment
Lilian Ngweta, Mayank Agarwal, Subha Maity et al.
QEFT: Quantization for Efficient Fine-Tuning of LLMs
Changhun Lee, Jun-gyu Jin, YoungHyun Cho et al.
DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLMs Jailbreakers
Xirui Li, Ruochen Wang, Minhao Cheng et al.
Can LLMs Replace Clinical Doctors? Exploring Bias in Disease Diagnosis by Large Language Models
Yutian Zhao, Huimin Wang, Yuqi Liu et al.
LLMs to Replace Crowdsourcing For Parallel Data Creation? The Case of Text Detoxification
Daniil Moskovskiy, Sergey Pletenev, Alexander Panchenko
“Seeing the Big through the Small”: Can LLMs Approximate Human Judgment Distributions on NLI from a Few Explanations?
Beiduo Chen, Xinpeng Wang, Siyao Peng et al.
LLMs for Generating and Evaluating Counterfactuals: A Comprehensive Study
Van Bach Nguyen, Paul Youssef, Christin Seifert et al.
AXCEL: Automated eXplainable Consistency Evaluation using LLMs
P Aditya Sreekar, Sahil Verma, Suransh Chopra et al.
Student Data Paradox and Curious Case of Single Student-Tutor Model: Regressive Side Effects of Training LLMs for Personalized Learning
Shashank Sonkar, Naiming Liu, Richard Baraniuk
To Ask LLMs about English Grammaticality, Prompt Them in a Different Language
Shabnam Behzad, Amir Zeldes, Nathan Schneider
Cost-Performance Optimization for Processing Low-Resource Language Tasks Using Commercial LLMs
Arijit Nag, Animesh Mukherjee, Niloy Ganguly et al.
Evaluating Gender Bias of LLMs in Making Morality Judgements
Divij Bajaj, Yuanyuan Lei, Jonathan Tong et al.
How Does Quantization Affect Multilingual LLMs?
Kelly Marchisio, Saurabh Dash, Hongyu Chen et al.