Papers
2,781 papers found
Improving Automatic Evaluation of Large Language Models (LLMs) in Biomedical Relation Extraction via LLMs-as-the-Judge
Md Tahmid Rahman Laskar, Israt Jahan, Elham Dolatabadi et al.
DOGE: LLMs-Enhanced Hyper-Knowledge Graph Recommender for Multimodal Recommendation
Fanshen Meng, Zhenhua Meng, Ru Jin et al.
Hide and Seek in Noise Labels: Noise-Robust Collaborative Active Learning with LLMs-Powered Assistance
Bo Yuan, Yulin Chen, Yin Zhang et al.
From Model-centered to Human-Centered: Revision Distance as a Metric for Text Evaluation in LLMs-based Applications
Yongqiang Ma, Lizhi Qing, Jiawei Liu et al.
On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey
Lin Long, Rui Wang, Ruixuan Xiao et al.
CalibraEval: Calibrating Prediction Distribution to Mitigate Selection Bias in LLMs-as-Judges
Haitao Li, Junjie Chen, Qingyao Ai et al.
Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges
Aman Singh Thakur, Kartik Choudhary, Venkat Srinik Ramayapally et al.
Submodular-based In-context Example Selection for LLMs-based Machine Translation
Baijun Ji, Xiangyu Duan, Zhenyu Qiu et al.
Towards Boosting LLMs-driven Relevance Modeling with Progressive Retrieved Behavior-augmented Prompting
Zeyuan Chen, Haiyan Wu, Kaixin Wu et al.
LLMsAgainstHate@NLU of Devanagari Script Languages 2025: Hate Speech Detection and Target Identification in Devanagari Languages via Parameter Efficient Fine-Tuning of LLMs
Rushendra Sidibomma, Pransh Patwa, Parth Patwa et al.
LTGC: Long-tail Recognition via Leveraging LLMs-driven Generated Content
Qihao Zhao, Yalun Dai, Hao Li et al.
A Survey on Detection of LLMs-Generated Content
Xianjun Yang, Liangming Pan, Xuandong Zhao et al.
LLMs-as-Instructors: Learning from Errors Toward Automating Model Improvement
Jiahao Ying, Mingbao Lin, Yixin Cao et al.
CondenseLM: LLMs-driven Text Dataset Condensation via Reward Matching
Cheng Shen, Yew-Soon Ong, Joey Tianyi Zhou
Benchmarking the Detection of LLMs-Generated Modern Chinese Poetry
Shanshan Wang, Junchao Wu, Fengying Ye et al.
Reference-Guided Verdict: LLMs-as-Judges in Automatic Evaluation of Free-Form QA
Sher Badshah, Hassan Sajjad
CLLMRec: Contrastive Learning with LLMs-based View Augmentation for Sequential Recommendation
Fan Lu, Xiaolong Xu, Haolong Xiang et al.
Uncovering Latent Arguments in Social Media Messaging by Employing LLMs-in-the-Loop Strategy
Tunazzina Islam, Dan Goldwasser
Evaluation of LLMs-based Hidden States as Author Representations for Psychological Human-Centered NLP Tasks
Nikita Soni, Pranav Chitale, Khushboo Singh et al.
Can LLMs Help Uncover Insights about LLMs? A Large-Scale, Evolving Literature Analysis of Frontier LLMs
Jungsoo Park, Junmo Kang, Gabriel Stanovsky et al.
On scalable oversight with weak LLMs judging strong LLMs
Zachary Kenton, Noah Y. Siegel, János Kramár et al.
LLM Stinger: Jailbreaking LLMs Using RL Fine-Tuned LLMs (Student Abstract)
Piyush Jha, Arnav Arora, Vijay Ganesh
Can LLMs Learn from Previous Mistakes? Investigating LLMs’ Errors to Boost for Reasoning
Yongqi Tong, Dawei Li, Sizhe Wang et al.
Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs
Siyuan Wang, Zhongyu Wei, Yejin Choi et al.
How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs
Yi Zeng, Hongpeng Lin, Jingwen Zhang et al.