Papers
LLMs Don’t Know Their Own Decision Boundaries: The Unreliability of Self-Generated Counterfactual Explanations
Harry Mayne, Ryan Othniel Kearns, Yushi Yang et al.
Grounding Multilingual Multimodal LLMs With Cultural Knowledge
Jean De Dieu Nyandwi, Yueqi Song, Simran Khanuja et al.
NEXUS: Network Exploration for eXploiting Unsafe Sequences in Multi-Turn LLM Jailbreaks
Javad Rafiei Asl, Sidhant Narula, Mohammad Ghasemigol et al.
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Simon A. Aytes, Jinheon Baek, Sung Ju Hwang
From Language to Cognition: How LLMs Outgrow the Human Language Network
Badr AlKhamissi, Greta Tuckute, Yingtian Tang et al.
Hallucination Detection in LLMs Using Spectral Features of Attention Maps
Jakub Binkowski, Denis Janiak, Albert Sawczyn et al.
Towards a Holistic and Automated Evaluation Framework for Multi-Level Comprehension of LLMs in Book-Length Contexts
Yuho Lee, Jiaqi Deng, Nicole Hee-Yeon Kim et al.
Evaluation and Facilitation of Online Discussions in the LLM Era: A Survey
Katerina Korre, Dimitris Tsirmpas, Nikos Gkoumas et al.
From Word to World: Evaluate and Mitigate Culture Bias in LLMs via Word Association Test
Xunlian Dai, Li Zhou, Benyou Wang et al.
AdaSteer: Your Aligned LLM is Inherently an Adaptive Jailbreak Defender
Weixiang Zhao, Jiahe Guo, Yulin Hu et al.
TFDP: Token-Efficient Disparity Audits for Autoregressive LLMs via Single-Token Masked Evaluation
Inderjeet Singh, Ramya Srinivasan, Roman Vainshtein et al.
Hidden in Plain Sight: Reasoning in Underspecified and Misspecified Scenarios for Multimodal LLMs
Qianqi Yan, Hongquan Li, Shan Jiang et al.
Code Execution as Grounded Supervision for LLM Reasoning
Dongwon Jung, Wenxuan Zhou, Muhao Chen
Subjective Behaviors and Preferences in LLM: Language of Browsing
Sai Sundaresan, Harshita Chopra, Atanu R. Sinha et al.
TactfulToM: Do LLMs have the Theory of Mind ability to understand White Lies?
Yiwei Liu, Emma Jane Pretty, Jiahao Huang et al.
Analyzing values about gendered language reform in LLMs’ revisions
Jules Watson, Xi Wang, Raymond Liu et al.
Stepwise Informativeness Search for Improving LLM Reasoning
Siyuan Wang, Enda Zhao, Xiang Ren
Harmful Prompt Laundering: Jailbreaking LLMs with Abductive Styles and Symbolic Encoding
Seongho Joo, Hyukhun Koh, Kyomin Jung
Amulet: Putting Complex Multi-Turn Conversations on the Stand with LLM Juries
Sahana Ramnath, Anurag Mudgil, Brihi Joshi et al.
CMedCalc-Bench: A Fine-Grained Benchmark for Chinese Medical Calculations in LLM
Yunyan Zhang, Zhihong Zhu, Xian Wu
Subtle Risks, Critical Failures: A Framework for Diagnosing Physical Safety of LLMs for Embodied Decision Making
Yejin Son, Minseo Kim, Sungwoong Kim et al.
HELENE: Hessian Layer-wise Clipping and Gradient Annealing for Accelerating Fine-tuning LLM with Zeroth-order Optimization
Huaqin Zhao, Jiaxi Li, Yi Pan et al.
From Parameters to Performance: A Data-Driven Study on LLM Structure and Development
Suqing Wang, Zuchao Li, Luohe Shi et al.
Speculating LLMs’ Chinese Training Data Pollution from Their Tokens
Qingjie Zhang, Di Wang, Haoting Qian et al.
The Stepwise Deception: Simulating the Evolution from True News to Fake News with LLM Agents
Yuhan Liu, Zirui Song, Juntian Zhang et al.