Papers
2,781 papers found
Quantifying Generalizations: Exploring the Divide Between Human and LLMs’ Sensitivity to Quantification
Claudia Collacciani, Giulia Rambelli, Marianna Bolognesi
LLMs Learn Task Heuristics from Demonstrations: A Heuristic-Driven Prompting Strategy for Document-Level Event Argument Extraction
Hanzhang Zhou, Junlang Qian, Zijian Feng et al.
Beyond Traditional Benchmarks: Analyzing Behaviors of Open LLMs on Data-to-Text Generation
Zdeněk Kasner, Ondrej Dusek
Don’t Go To Extremes: Revealing the Excessive Sensitivity and Calibration Limitations of LLMs in Implicit Hate Speech Detection
Min Zhang, Jianfeng He, Taoran Ji et al.
One Prompt To Rule Them All: LLMs for Opinion Summary Evaluation
Tejpalsingh Siledar, Swaroop Nath, Sankara Muddu et al.
LANDeRMT: Dectecting and Routing Language-Aware Neurons for Selectively Finetuning LLMs to Machine Translation
Shaolin Zhu, Leiyu Pan, Bo Li et al.
Back to Basics: Revisiting REINFORCE-Style Optimization for Learning from Human Feedback in LLMs
Arash Ahmadian, Chris Cremer, Matthias Gallé et al.
Silent Signals, Loud Impact: LLMs for Word-Sense Disambiguation of Coded Dog Whistles
Julia Kruk, Michela Marchini, Rijul Magu et al.
Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs
Zhiwei Cao, Qian Cao, Yu Lu et al.
API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs
Kinjal Basu, Ibrahim Abdelaziz, Subhajit Chaudhury et al.
HOLMES: Hyper-Relational Knowledge Graphs for Multi-hop Question Answering using LLMs
Pranoy Panda, Ankush Agarwal, Chaitanya Devaguptapu et al.
Meta-Tuning LLMs to Leverage Lexical Knowledge for Generalizable Language Style Understanding
Ruohao Guo, Wei Xu, Alan Ritter
Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs
Bilgehan Sel, Priya Shanmugasundaram, Mohammad Kachuee et al.
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs
Fengqing Jiang, Zhangchen Xu, Luyao Niu et al.
DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Long and Specialized Documents
Yilun Zhao, Yitao Long, Hongjun Liu et al.
The Earth is Flat because...: Investigating LLMs’ Belief towards Misinformation via Persuasive Conversation
Rongwu Xu, Brian Lin, Shujian Yang et al.
Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Non-Literal Intent Resolution in LLMs
Akhila Yerukola, Saujas Vaduguru, Daniel Fried et al.
Naming, Describing, and Quantifying Visual Objects in Humans and LLMs
Alberto Testoni, Juell Sprott, Sandro Pezzelle
Are LLMs classical or nonmonotonic reasoners? Lessons from generics
Alina Leidinger, Robert Van Rooij, Ekaterina Shutova
Cross-Modal Projection in Multimodal LLMs Doesn’t Really Project Visual Attributes to Textual Space
Gaurav Verma, Minje Choi, Kartik Sharma et al.
OpenEval: Benchmarking Chinese LLMs across Capability, Alignment and Safety
Chuang Liu, Linhao Yu, Jiaxuan Li et al.
UltraEval: A Lightweight Platform for Flexible and Comprehensive Evaluation for LLMs
Chaoqun He, Renjie Luo, Shengding Hu et al.
ITAKE: Interactive Unstructured Text Annotation and Knowledge Extraction System with LLMs and ModelOps
Jiahe Song, Hongxin Ding, Zhiyuan Wang et al.
ELLA: Empowering LLMs for Interpretable, Accurate and Informative Legal Advice
Yutong Hu, Kangcheng Luo, Yansong Feng
Pragmatic inference of scalar implicature by LLMs
Ye-eun Cho, Seong mook Kim