Papers
2,781 papers found
Does Data Contamination Detection Work (Well) for LLMs? A Survey and Evaluation on Detection Assumptions
Yujuan Fu, Ozlem Uzuner, Meliha Yetisgen et al.
From Single to Multi: How LLMs Hallucinate in Multi-Document Summarization
Catarina G Belém, Pouya Pezeshkpour, Hayate Iso et al.
LLMs are Biased Teachers: Evaluating LLM Bias in Personalized Education
Iain Weissburg, Sathvika Anand, Sharon Levy et al.
RankAdaptor: Hierarchical Rank Allocation for Efficient Fine-Tuning Pruned LLMs via Performance Model
Changhai Zhou, Shijie Han, Lining Yang et al.
Rationale Behind Essay Scores: Enhancing S-LLM’s Multi-Trait Essay Scoring with Rationale Generated by LLMs
SeongYeub Chu, Jong Woo Kim, Bryan Wong et al.
MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty
Yongjin Yang, Haneul Yoo, Hwaran Lee
SeaExam and SeaBench: Benchmarking LLMs with Local Multilingual Questions in Southeast Asia
Chaoqun Liu, Wenxuan Zhang, Jiahao Ying et al.
SOLID: Self-seeding and Multi-intent Self-instructing LLMs for Generating Intent-aware Information-Seeking Dialogs
Arian Askari, Roxana Petcu, Chuan Meng et al.
Text Annotation via Inductive Coding: Comparing Human Experts to LLMs in Qualitative Data Analysis
Angelina Parfenova, Andreas Marfurt, Jürgen Pfeffer et al.
Optimizing LLMs for Italian: Reducing Token Fertility and Enhancing Efficiency Through Vocabulary Adaptation
Luca Moroni, Giovanni Puccetti, Pere-Lluís Huguet Cabot et al.
LLMs for Extremely Low-Resource Finno-Ugric Languages
Taido Purason, Hele-Andra Kuulmets, Mark Fishel
AutoBreach: Universal and Adaptive Jailbreaking with Efficient Wordplay-Guided Optimization via Multi-LLMs
Jiawei Chen, Xiao Yang, Zhengwei Fang et al.
HEISIR: Hierarchical Expansion of Inverted Semantic Indexing for Training-free Retrieval of Conversational Data using LLMs
Sangyeop Kim, Hangyeul Lee, Yohan Lee
Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning
Venkatesh Mishra, Bimsara Pathiraja, Mihir Parmar et al.
DHP Benchmark: Are LLMs Good NLG Evaluators?
Yicheng Wang, Jiayi Yuan, Yu-Neng Chuang et al.
Exploring the Application of 7B LLMs for Named Entity Recognition in Chinese Ancient Texts
Chenrui Zheng, Yicheng Zhu, Han Bi
Finetuning LLMs for EvaCun 2025 token prediction shared task
Josef Jon, Ondřej Bojar
Beyond Base Predictors: Using LLMs to Resolve Ambiguities in Akkadian Lemmatization
Frederick Riemenschneider
EvaCun 2025 Shared Task: Lemmatization and Token Prediction in Akkadian and Sumerian using LLMs
Shai Gordin, Aleksi Sahala, Shahar Spencer et al.
Tongue-Tied: Breaking LLMs Safety Through New Language Learning
Bibek Upadhayay, Vahid Behzadan
Mining Social Media for Barriers to Opioid Recovery with LLMs
Vinu Ekanayake, Md Sultan Al Nahian, Ramakanth Kavuluru
Using LLMs to improve RL policies in personalized health adaptive interventions
Karine Karine, Benjamin Marlin
The Emotional Spectrum of LLMs: Leveraging Empathy and Emotion-Based Markers for Mental Health Support
Alessandro De Grandi, Federico Ravenda, Andrea Raballo et al.
Bigger But Not Better: Small Neural Language Models Outperform LLMs in Detection of Thought Disorder
Changye Li, Weizhe Xu, Serguei Pakhomov et al.
Using LLMs to Aid Annotation and Collection of Clinically-Enriched Data in Bipolar Disorder and Schizophrenia
Ankit Aich, Avery Quynh, Pamela Osseyi et al.