Papers
Drift: Enhancing LLM Faithfulness in Rationale Generation via Dual-Reward Probabilistic Inference
Jiazheng Li, Hanqi Yan, Yulan He
Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs
Angelina Wang, Michelle Phan, Daniel E. Ho et al.
DRAG: Distilling RAG for SLMs from LLMs to Transfer Knowledge and Mitigate Hallucination via Evidence and Graph-based Distillation
Jennifer Chen, Aidar Myrzakhan, Yaxin Luo et al.
Rolling the DICE on Idiomaticity: How LLMs Fail to Grasp Context
Maggie Mi, Aline Villavicencio, Nafise Sadat Moosavi
MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion
Qizhi Pei, Lijun Wu, Zhuoshi Pan et al.
LLM as a Broken Telephone: Iterative Generation Distorts Information
Amr Mohamed, Mingmeng Geng, Michalis Vazirgiannis et al.
Enough Coin Flips Can Make LLMs Act Bayesian
Ritwik Gupta, Rodolfo Corona, Jiaxin Ge et al.
GAMEBoT: Transparent Assessment of LLM Reasoning in Games
Wenye Lin, Jonathan Roberts, Yunhan Yang et al.
A Text is Worth Several Tokens: Text Embedding from LLMs Secretly Aligns Well with The Key Tokens
Zhijie Nie, Richong Zhang, Zhanyu Wu
CER: Confidence Enhanced Reasoning in LLMs
Ali Razghandi, Seyed Mohammad Hadi Hosseini, Mahdieh Soleymani Baghshah
SynthesizeMe! Inducing Persona-Guided Prompts for Personalized Reward Models in LLMs
Michael J. Ryan, Omar Shaikh, Aditri Bhagirath et al.
Self-Error-Instruct: Generalizing from Errors for LLMs Mathematical Reasoning
Erxin Yu, Jing Li, Ming Liao et al.
MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents
Kunlun Zhu, Hongyi Du, Zhaochen Hong et al.
LLMs can Perform Multi-Dimensional Analytic Writing Assessments: A Case Study of L2 Graduate-Level Academic English Writing
Zhengxiang Wang, Veronika Makarova, Zhi Li et al.
SEUF: Is Unlearning One Expert Enough for Mixture-of-Experts LLMs?
Haomin Zhuang, Yihua Zhang, Kehan Guo et al.
LocAgent: Graph-Guided LLM Agents for Code Localization
Zhaoling Chen, Robert Tang, Gangda Deng et al.
CKnowEdit: A New Chinese Knowledge Editing Dataset for Linguistics, Facts, and Logic Error Correction in LLMs
Jizhan Fang, Tianhe Lu, Yunzhi Yao et al.
SkillVerse: Assessing and Enhancing LLMs with Tree Evaluation
Yufei Tian, Jiao Sun, Nanyun Peng et al.
CypherBench: Towards Precise Retrieval over Full-scale Modern Knowledge Graphs in the LLM Era
Yanlin Feng, Simone Papicchio, Sajjadur Rahman
Are LLMs effective psychological assessors? Leveraging adaptive RAG for interpretable mental health screening through psychometric practice
Federico Ravenda, Seyed Ali Bahrainian, Andrea Raballo et al.
Improving Preference Extraction In LLMs By Identifying Latent Knowledge Through Classifying Probes
Sharan Maiya, Yinhong Liu, Ramit Debnath et al.
White Men Lead, Black Women Help? Benchmarking and Mitigating Language Agency Social Biases in LLMs
Yixin Wan, Kai-Wei Chang
AIMSCheck: Leveraging LLMs for AI-Assisted Review of Modern Slavery Statements Across Jurisdictions
Adriana Eufrosina Bora, Akshatha Arodi, Duoyi Zhang et al.
Bitnet.cpp: Efficient Edge Inference for Ternary LLMs
Jinheng Wang, Hansong Zhou, Ting Song et al.
PIG: Privacy Jailbreak Attack on LLMs via Gradient-based Iterative In-Context Optimization
Yidan Wang, Yanan Cao, Yubing Ren et al.