Papers
16,749 papers found
ESF: Efficient Sensitive Fingerprinting for Black-Box Tamper Detection of Large Language Models
Xiaofan Bai, Pingyi Hu, Xiaojing Ma et al.
EssayJudge: A Multi-Granular Benchmark for Assessing Automated Essay Scoring Capabilities of Multimodal Large Language Models
Jiamin Su, Yibo Yan, Fangteng Fu et al.
Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis
Kejian Zhu, Shangqing Tu, Zhuoran Jin et al.
Estimating Privacy Leakage of Augmented Contextual Knowledge in Language Models
James Flemings, Bo Jiang, Wanrong Zhang et al.
Estimation of Text Difficulty in the Context of Language Learning
Anisia Katinskaia, Anh-Duc Vu, Jue Hou et al.
Eta-WavLM: Efficient Speaker Identity Removal in Self-Supervised Speech Representations Using a Simple Linear Equation
Giuseppe Ruggiero, Matteo Testa, Jurgen Van De Walle et al.
ETF: An Entity Tracing Framework for Hallucination Detection in Code Summaries
Kishan Maharaj, Vitobha Munigala, Srikanth G. Tamilselvam et al.
EtiCor++: Towards Understanding Etiquettical Bias in LLMs
Ashutosh Dwivedi, Siddhant Shivdutt Singh, Ashutosh Modi
ETRQA: A Comprehensive Benchmark for Evaluating Event Temporal Reasoning Abilities of Large Language Models
Sigang Luo, Yinan Liu, Dongying Lin et al.
EuroVerdict: A Multilingual Dataset for Verdict Generation Against Misinformation
Daniel Russo, Fariba Sadeghi, Stefano Menini et al.
Evading Toxicity Detection with ASCII-art: A Benchmark of Spatial Attacks on Moderation Systems
Sergey Berezin, Reza Farahbakhsh, Noel Crespi
Evaluating Credibility and Political Bias in LLMs for News Outlets in Bangladesh
Tabia Tanzin Prama, Md. Saiful Islam
Evaluating Design Decisions for Dual Encoder-based Entity Disambiguation
Susanna Rücker, Alan Akbik
Evaluating Implicit Bias in Large Language Models by Attacking From a Psychometric Perspective
Yuchen Wen, Keping Bi, Wei Chen et al.
Evaluating Instructively Generated Statement by Large Language Models for Directional Event Causality Identification
Wei Xiang, Chuanhong Zhan, Qing Zhang et al.
Evaluating Intermediate Reasoning of Code-Assisted Large Language Models for Mathematics
Zena Al-Khalili, Nick Howell, Dietrich Klakow
Evaluating Language Models as Synthetic Data Generators
Seungone Kim, Juyoung Suk, Xiang Yue et al.
Evaluating Large Language Models for Confidence-based Check Set Selection
Jane Arleth dela Cruz, Iris Hendrickx, Martha Larson
Evaluating Lexical Proficiency in Neural Language Models
Cristiano Ciaccio, Alessio Miaschi, Felice Dell’Orletta
Evaluating LLMs’ Assessment of Mixed-Context Hallucination Through the Lens of Summarization
Siya Qi, Rui Cao, Yulan He et al.
Evaluating LLMs for Portuguese Sentence Simplification with Linguistic Insights
Arthur Mariano Rocha De Azevedo Scalercio, Elvis A. De Souza, Maria José Bocorny Finatto et al.
Evaluating LLMs’ Mathematical and Coding Competency through Ontology-guided Interventions
Pengfei Hong, Navonil Majumder, Deepanway Ghosal et al.
Evaluating LLMs with Multiple Problems at once
Zhengxiang Wang, Jordan Kodner, Owen Rambow
Evaluating Multimodal Language Models as Visual Assistants for Visually Impaired Users
Antonia Karamolegkou, Malvina Nikandrou, Georgios Pantazopoulos et al.
Evaluating Multimodal Large Language Models on Video Captioning via Monte Carlo Tree Search
Linhao Yu, Xingguang Ji, Yahui Liu et al.