Papers
Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis
Kejian Zhu, Shangqing Tu, Zhuoran Jin et al.
Do Large Language Models have an English Accent? Evaluating and Improving the Naturalness of Multilingual LLMs
Yanzhu Guo, Simone Conia, Zelin Zhou et al.
Enhancing Character-Level Understanding in LLMs through Token Internal Structure Learning
Zhu Xu, Zhiqiang Zhao, Zihan Zhang et al.
Confidence v.s. Critique: A Decomposition of Self-Correction Capability for LLMs
Zhe Yang, Yichang Zhang, Yudong Wang et al.
Automating Legal Interpretation with LLMs: Retrieval, Generation, and Evaluation
Kangcheng Luo, Quzhe Huang, Cong Jiang et al.
Game Development as Human-LLM Interaction
Jiale Hong, Hongqiu Wu, Hai Zhao
Can LLMs Simulate L2-English Dialogue? An Information-Theoretic Analysis of L1-Dependent Biases
Rena Gao, Xuetong Wu, Tatsuki Kuribayashi et al.
Auto-Arena: Automating LLM Evaluations with Agent Peer Battles and Committee Discussions
Ruochen Zhao, Wenxuan Zhang, Yew Ken Chia et al.
How Humans and LLMs Organize Conceptual Knowledge: Exploring Subordinate Categories in Italian
Andrea Pedrotti, Giulia Rambelli, Caterina Villani et al.
Stepwise Reasoning Disruption Attack of LLMs
Jingyu Peng, Maolin Wang, Xiangyu Zhao et al.
Uncertainty Propagation on LLM Agent
Qiwei Zhao, Dong Li, Yanchi Liu et al.
Are the Hidden States Hiding Something? Testing the Limits of Factuality-Encoding Capabilities in LLMs
Giovanni Servedio, Alessandro De Bellis, Dario Di Palma et al.
HD-NDEs: Neural Differential Equations for Hallucination Detection in LLMs
Qing Li, Jiahui Geng, Zongxiong Chen et al.
CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis
Bohan Zhang, Xiaokang Zhang, Jing Zhang et al.
Can Graph Descriptive Order Affect Solving Graph Problems with LLMs?
Yuyao Ge, Shenghua Liu, Baolong Bi et al.
GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs
Maxim Zhelnin, Viktor Moskvoretskii, Egor Shvetsov et al.
Biased LLMs can Influence Political Decision-Making
Jillian Fisher, Shangbin Feng, Robert Aron et al.
TheoremExplainAgent: Towards Video-based Multimodal Explanations for LLM Theorem Understanding
Max Ku, Cheuk Hei Chong, Jonathan Leung et al.
FineReason: Evaluating and Improving LLMs’ Deliberate Reasoning through Reflective Puzzle Solving
Guizhen Chen, Weiwen Xu, Hao Zhang et al.
The TIP of the Iceberg: Revealing a Hidden Class of Task-in-Prompt Adversarial Attacks on LLMs
Sergey Berezin, Reza Farahbakhsh, Noel Crespi
Drift: Enhancing LLM Faithfulness in Rationale Generation via Dual-Reward Probabilistic Inference
Jiazheng Li, Hanqi Yan, Yulan He
Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs
Angelina Wang, Michelle Phan, Daniel E. Ho et al.
DRAG: Distilling RAG for SLMs from LLMs to Transfer Knowledge and Mitigate Hallucination via Evidence and Graph-based Distillation
Jennifer Chen, Aidar Myrzakhan, Yaxin Luo et al.
Rolling the DICE on Idiomaticity: How LLMs Fail to Grasp Context
Maggie Mi, Aline Villavicencio, Nafise Sadat Moosavi
MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion
Qizhi Pei, Lijun Wu, Zhuoshi Pan et al.