Papers
DialogXpert: Driving Intelligent and Emotion-Aware Conversations Through Online Value-Based Reinforcement Learning with LLM Priors
Tazeek Bin Abdur Rakib, Ambuj Mehrish, Lay-Ki Soon et al.
Where Norms and References Collide: Evaluating LLMs on Normative Reasoning
Mitchell Abrams, Kaveh Eskandari Miandoab, Felix Gervits et al.
Beyond Next Token Probabilities: Learnable, Fast Detection of Hallucinations and Data Contamination on LLM Output Distributions
Guy Bar-Shalom, Fabrizio Frasca, Derek Lim et al.
Do LLMs Really Struggle at NL-FOL Translation? Revealing Their Strengths via a Novel Benchmarking Strategy
Andrea Brunello, Luca Geatti, Michele Mignani et al.
RaCoT: Plug-and-Play Contrastive Example Generation Mechanism for Enhanced LLM Reasoning Reliability
Kaitong Cai, Jusheng Zhang, Yijia Fan et al.
Does Question Really Matter? The Attribution of Answer Bias in LLM Evaluation
Boxi Cao, Ruotong Pan, Hongyu Lin et al.
Can Editing LLMs Inject Harm?
Canyu Chen, Baixiang Huang, Zekun Li et al.
DEPO: Dual-Efficiency Preference Optimization for LLM Agents
Sirui Chen, Mengshi Zhao, Lei Xu et al.
Activations as Features: Probing LLMs for Generalizable Essay Scoring Representations
Jinwei Chi, Ke Wang, Yu Chen et al.
HanjaBridge: Resolving Semantic Ambiguity in Korean LLMs via Hanja-Augmented Pre-Training
Seungho Choi, Sihyun Park, Minsang Kim et al.
Persistent Backdoor Attacks Under Continual Fine-Tuning of LLMs
Jing Cui, Yufei Han, Jianbin Jiao et al.
When Smiley Turns Hostile: Interpreting How Emojis Trigger LLMs’ Toxicity
Shiyao Cui, Xijia Feng, Yingkang Wang et al.
HLPD: Aligning LLMs to Human Language Preference for Machine-Revised Text Detection
Fangqi Dai, Xingjian Jiang, Zizhuang Deng
Measuring the Unmeasurable: Unveiling Latent Cognitive Capabilities of LLM
Cui Danxin, Sihang Jiang, Keyi Wang et al.
Guess or Recall? Training CNNs to Classify and Localize Memorization in LLMs
Jérémie Dentan, Davide Buscaldi, Sonia Vanier
MemGuide: Intent-Driven Memory Selection for Goal-Oriented Multi-Session LLM Agents
Yiming Du, Bingbing Wang, Yang He et al.
Graph of Verification: Structured Verification of LLM Reasoning with Directed Acyclic Graphs
Jiwei Fang, Bin Zhang, Changwei Wang et al.
Toward Better EHR Reasoning in LLMs: Reinforcement Learning with Expert Attention Guidance
Yue Fang, Yuxin Guo, Jiaran Gao et al.
FinMathBench: A Formula-Driven Benchmark for Evaluating LLMs’ Math Reasoning Capabilities in Finance
Yi He, Ping Wang, Shiqiang Xiong et al.
Format Matters: The Robustness of Multimodal LLMs in Reviewing Evidence from Tables and Charts
Xanh Ho, Yun-Ang Wu, Sunisth Kumar et al.
Benchmarking LLMs’ Mathematical Reasoning with Unseen Random Variables Questions
Zijin Hong, Hao Wu, Su Dong et al.
SPA: Achieving Consensus in LLM Alignment via Self-Priority Optimization
Yue Huang, Xiangqi Wang, Xiangliang Zhang
LiteLong: Resource-Efficient Long-Context Data Synthesis for LLMs
Junlong Jia, Xing Wu, Chaochen Gao et al.
Importance-Aware Data Selection for Efficient LLM Instruction Tuning
Tingyu Jiang, Shen Li, Yiyao Song et al.