Papers
GuessArena: Guess Who I Am? A Self-Adaptive Framework for Evaluating LLMs in Domain-Specific Knowledge and Reasoning
Qingchen Yu, Zifan Zheng, Ding Chen et al.
Beware of Your Po! Measuring and Mitigating AI Safety Risks in Role-Play Fine-Tuning of LLMs
Weixiang Zhao, Yulin Hu, Yang Deng et al.
VMLU Benchmarks: A comprehensive benchmark toolkit for Vietnamese LLMs
Cuc Thi Bui, Nguyen Truong Son, Truong Van Trang et al.
Scaling up the State Size of RNN LLMs for Long-Context Scenarios
Kai Liu, Jianfei Gao, Kai Chen
HyKGE: A Hypothesis Knowledge Graph Enhanced RAG Framework for Accurate and Reliable Medical LLMs Responses
Xinke Jiang, Ruizhe Zhang, Yongxin Xu et al.
UniLR: Unleashing the Power of LLMs on Multiple Legal Tasks with a Unified Legal Retriever
Ang Li, Yiquan Wu, Yifei Liu et al.
DebateCoder: Towards Collective Intelligence of LLMs via Test Case Driven LLM Debate for Code Generation
Jizheng Chen, Kounianhua Du, Xinyi Dai et al.
HomeBench: Evaluating LLMs in Smart Homes with Valid and Invalid Instructions Across Single and Multiple Devices
Silin Li, Yuhang Guo, Jiashu Yao et al.
Making LLMs Better Many-to-Many Speech-to-Text Translators with Curriculum Learning
Yexing Du, Youcheng Pan, Ziyang Ma et al.
Nudging: Inference-time Alignment of LLMs via Guided Decoding
Yu Fei, Yasaman Razeghi, Sameer Singh
Lost in Literalism: How Supervised Training Shapes Translationese in LLMs
Yafu Li, Ronghao Zhang, Zhilin Wang et al.
Exploring Compositional Generalization of Multimodal LLMs for Medical Imaging
Zhenyang Cai, Junying Chen, Rongsheng Wang et al.
Innovative Image Fraud Detection with Cross-Sample Anomaly Analysis: The Power of LLMs
QiWen Wang, Junqi Yang, Zhenghao Lin et al.
QDTSynth: Quality-Driven Formal Theorem Synthesis for Enhancing Proving Performance of LLMs
Lei Wang, Ruobing Zuo, Gaolei He et al.
Debiasing the Fine-Grained Classification Task in LLMs with Bias-Aware PEFT
Daiying Zhao, Xinyu Yang, Hang Chen
Continual Gradient Low-Rank Projection Fine-Tuning for LLMs
Chenxu Wang, Yilin Lyu, Zicheng Sun et al.
Towards Objective Fine-tuning: How LLMs’ Prior Knowledge Causes Potential Poor Calibration?
Ziming Wang, Zeyu Shi, Haoyi Zhou et al.
Two Intermediate Translations Are Better Than One: Fine-tuning LLMs for Document-level Translation Refinement
Yichen Dong, Xinglin Lyu, Junhui Li et al.
Can LLMs Ground when they (Don’t) Know: A Study on Direct and Loaded Political Questions
Clara Lachenmaier, Judith Sieker, Sina Zarrieß
EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning
Xiaoqian Liu, Ke Wang, Yongbin Li et al.
Learning Together to Perform Better: Teaching Small-Scale LLMs to Collaborate via Preferential Rationale Tuning
Sohan Patnaik, Milan Aggarwal, Sumit Bhatia et al.
MasRouter: Learning to Route LLMs for Multi-Agent Systems
Yanwei Yue, Guibin Zhang, Boyang Liu et al.
Middle-Layer Representation Alignment for Cross-Lingual Transfer in Fine-Tuned LLMs
Danni Liu, Jan Niehues
The Alternative Annotator Test for LLM-as-a-Judge: How to Statistically Justify Replacing Human Annotators with LLMs
Nitay Calderon, Roi Reichart, Rotem Dror
EvolveBench: A Comprehensive Benchmark for Assessing Temporal Awareness in LLMs on Evolving Knowledge
Zhiyuan Zhu, Yusheng Liao, Zhe Chen et al.