Papers
Interpretation Meets Safety: A Survey on Interpretation Methods and Tools for Improving LLM Safety
Seongmin Lee, Aeree Cho, Grace C. Kim et al.
Improving Task Diversity in Label Efficient Supervised Finetuning of LLMs
Abhinav Arabelly, Jagrut Nemade, Robert D Nowak et al.
Memorization or Reasoning? Exploring the Idiom Understanding of LLMs
Jisu Kim, Youngwoo Shin, Uiji Hwang et al.
Learning to Ask: When LLM Agents Meet Unclear Instruction
Wenxuan Wang, Shi Juluan, Zixuan Ling et al.
StepSearch: Igniting LLMs Search Ability via Step-Wise Proximal Policy Optimization
Xuhui Zheng, Kang An, Ziliang Wang et al.
Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs
Jun Bai, Minghao Tong, Yang Liu et al.
Data-Efficient Selection via Grammatical Complexity in Continual Pre-training of Domain-Specific LLMs
Yizhou Ying, Geng Zhang, Cui Danxin et al.
DiffusionAttacker: Diffusion-Driven Prompt Manipulation for LLM Jailbreak
Hao Wang, Hao Li, Junda Zhu et al.
Massive Supervised Fine-tuning Experiments Reveal How Data, Layer, and Training Factors Shape LLM Alignment Quality
Yuto Harada, Yusuke Yamauchi, Yusuke Oda et al.
Internal Chain-of-Thought: Empirical Evidence for Layer-wise Subtask Scheduling in LLMs
Zhipeng Yang, Junzhuo Li, Siyu Xia et al.
Debiasing Multilingual LLMs in Cross-lingual Latent Space
Qiwei Peng, Guimin Hu, Yekun Chai et al.
Persona-Augmented Benchmarking: Evaluating LLMs Across Diverse Writing Styles
Kimberly Truong, Riccardo Fogliato, Hoda Heidari et al.
Job Unfair: An Investigation of Gender and Occupational Bias in Free-Form Text Completions by LLMs
Camilla Casula, Sebastiano Vecellio Salto, Elisa Leonardelli et al.
Understanding LLMs’ Cross-Lingual Context Retrieval: How Good It Is And Where It Comes From
Changjiang Gao, Hankun Lin, Xin Huang et al.
Exploring the Hidden Capacity of LLMs for One-Step Text Generation
Gleb Mezentsev, Ivan Oseledets
DCR: Quantifying Data Contamination in LLMs Evaluation
Cheng Xu, Nan Yan, Shuhao Guan et al.
Building Trust in Clinical LLMs: Bias Analysis and Dataset Transparency
Svetlana Maslenkova, Clement Christophe, Marco AF Pimentel et al.
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Yang Wang, Chenghao Xiao, Chia-Yi Hsiao et al.
InterIDEAS: Philosophical Intertextuality via LLMs
Yue Yang, Yinzhi Xu, Chenghao Huang et al.
GER-LLM: Efficient and Effective Geospatial Entity Resolution with Large Language Model
Haojia Zhu, Zhicheng Li, Jiahui Jin
RAcQUEt: Unveiling the Dangers of Overlooked Referential Ambiguity in Visual LLMs
Alberto Testoni, Barbara Plank, Raquel Fernández
Rethinking Text-based Protein Understanding: Retrieval or LLM?
Juntong Wu, Zijing Liu, He Cao et al.
Easy as PIE? Identifying Multi-Word Expressions with LLMs
Kai Golan Hashiloni, Ofri Hefetz, Kfir Bar
Graph-R1: Incentivizing the Zero-Shot Graph Learning Capability in LLMs via Explicit Reasoning
Yicong Wu, Guangyue Lu, Yuan Zuo et al.
Scalable and Culturally Specific Stereotype Dataset Construction via Human-LLM Collaboration
Weicheng Ma, John J. Guerrerio, Soroush Vosoughi