Papers
DyePack: Provably Flagging Test Set Contamination in LLMs Using Backdoors
Yize Cheng, Wenxiao Wang, Mazda Moayeri et al.
Jailbreak LLMs through Internal Stance Manipulation
Shuangjie Fu, Du Su, Beining Huang et al.
Benchmark Profiling: Mechanistic Diagnosis of LLM Benchmarks
Dongjun Kim, Gyuho Shim, Yongchan Chun et al.
Improving Chemical Understanding of LLMs via SMILES Parsing
Yunhui Jang, Jaehyung Kim, Sungsoo Ahn
The State of Multilingual LLM Safety Research: From Measuring The Language Gap To Mitigating It
Zheng Xin Yong, Beyza Ermis, Marzieh Fadaee et al.
From Capabilities to Performance: Evaluating Key Functional Properties of LLM Architectures in Penetration Testing
Lanxiao Huang, Daksh Dave, Tyler Cody et al.
The Staircase of Ethics: Probing LLM Value Priorities through Multi-Step Induction to Complex Moral Dilemmas
Ya Wu, Qiang Sheng, Danding Wang et al.
Sparse Neurons Carry Strong Signals of Question Ambiguity in LLMs
Zhuoxuan Zhang, Jinhao Duan, Edward Kim et al.
Comparing human and LLM politeness strategies in free production
Haoran Zhao, Robert D. Hawkins
CARMA: Enhanced Compositionality in LLMs via Advanced Regularisation and Mutual Information Alignment
Nura Aljaafari, Danilo Carvalho, Andre Freitas
Can LLMs simulate the same correct solutions to free-response math problems as real students?
Yuya Asano, Diane Litman, Erin Walker
Evaluating Behavioral Alignment in Conflict Dialogue: A Multi-Dimensional Comparison of LLM Agents and Humans
Deuksin Kwon, Kaleen Shrestha, Bin Han et al.
Attention Eclipse: Manipulating Attention to Bypass LLM Safety-Alignment
Pedram Zaree, Md Abdullah Al Mamun, Quazi Mishkatul Alam et al.
Implicit Values Embedded in How Humans and LLMs Complete Subjective Everyday Tasks
Arjun Arunasalam, Madison Pickering, Z. Berkay Celik et al.
Let’s Reason Formally: Natural-Formal Hybrid Reasoning Enhances LLM’s Math Capability
Ruida Wang, Yuxin Li, Yi R. Fung et al.
Fair or Framed? Political Bias in News Articles Generated by LLMs
Junho Yoo, Youhyun Shin
Calibrating LLMs for Text-to-SQL Parsing by Leveraging Sub-clause Frequencies
Terrance Liu, Shuyi Wang, Daniel Preotiuc-Pietro et al.
REACT: Representation Extraction And Controllable Tuning to Overcome Overfitting in LLM Knowledge Editing
Haitian Zhong, Yuhuan Liu, Ziyang Xu et al.
PychoAgent: Psychology-driven LLM Agents for Explainable Panic Prediction on Social Media during Sudden Disaster Events
Mengzhu Liu, Zhengqiu Zhu, Chuan Ai et al.
Stepwise Reasoning Checkpoint Analysis: A Test Time Scaling Method to Enhance LLMs’ Reasoning
Zezhong Wang, Xingshan Zeng, Weiwen Liu et al.
RJE: A Retrieval-Judgment-Exploration Framework for Efficient Knowledge Graph Question Answering with LLMs
Can Lin, Zhengwang Jiang, Ling Zheng et al.
Bias Mitigation or Cultural Commonsense? Evaluating LLMs with a Japanese Dataset
Taisei Yamamoto, Ryoma Kumon, Danushka Bollegala et al.
Chameleon LLMs: User Personas Influence Chatbot Personality Shifts
Jane Xing, Tianyi Niu, Shashank Srivastava
SynC-LLM: Generation of Large-Scale Synthetic Circuit Code with Hierarchical Language Models
Shang Liu, Yao Lu, Wenji Fang et al.
Why Stop at One Error? Benchmarking LLMs as Data Science Code Debuggers for Multi-Hop and Multi-Bug Errors
Zhiyu Yang, Shuo Wang, Yukun Yan et al.