Papers
The Death and Life of Great Prompts: Analyzing the Evolution of LLM Prompts from the Structural Perspective
Yihan Ma, Xinyue Shen, Yixin Wu et al.
The Effect of Surprisal on Reading Times in Information Seeking and Repeated Reading
Keren Gruteke Klein, Yoav Meiri, Omer Shubi et al.
The effects of distance on NPI illusive effects in BERT
So Young Lee, Mai Ha Vu
The Emergence of Compositional Languages in Multi-entity Referential Games: from Image to Graph Representations
Daniel Akkerman, Phong Le, Raquel G. Alhama
The Empirical Variability of Narrative Perceptions of Social Media Texts
Joel Mire, Maria Antoniak, Elliott Ash et al.
The Factuality Tax of Diversity-Intervened Text-to-Image Generation: Benchmark and Fact-Augmented Intervention
Yixin Wan, Di Wu, Haoran Wang et al.
The Fall of ROME: Understanding the Collapse of LLMs in Model Editing
Wanli Yang, Fei Sun, Jiajun Tan et al.
The GenderQueer Test Suite
Steinunn Rut Friðriksdóttir
The Generation Gap: Exploring Age Bias in the Value Systems of Large Language Models
Siyang Liu, Trisha Maturi, Bowen Yi et al.
The Greatest Good Benchmark: Measuring LLMs’ Alignment with Utilitarian Moral Dilemmas
Giovanni Franco Gabriel Marraffini, Andrés Cotton, Noe Fabian Hsueh et al.
The Grid: A semi-automated tool to support expert-driven modeling
Allegra A. Beal Cohen, Maria Alexeeva, Keith Alcock et al.
The Illusion of Competence: Evaluating the Effect of Explanations on Users’ Mental Models of Visual Question Answering Systems
Judith Sieker, Simeon Junker, Ronja Utescher et al.
The Instinctive Bias: Spurious Images lead to Illusion in MLLMs
Tianyang Han, Qing Lian, Rui Pan et al.
The Language of Trauma: Modeling Traumatic Event Descriptions Across Domains with Explainable AI
Miriam Schirmer, Tobias Leemann, Gjergji Kasneci et al.
The LLM Effect: Are Humans Truly Using LLMs, or Are They Being Influenced By Them Instead?
Alexander S. Choi, Syeda Sabrina Akter, JP Singh et al.
The Lou Dataset - Exploring the Impact of Gender-Fair Language in German Text Classification
Andreas Waldis, Joel Birrer, Anne Lauscher et al.
Themis: A Reference-free NLG Evaluation Language Model with Flexibility and Interpretability
Xinyu Hu, Li Lin, Mingqi Gao et al.
The Moral Foundations Weibo Corpus
Renjie Cao, Miaoyan Hu, Jiahan Wei et al.
The Multilingual Alignment Prism: Aligning Global and Local Preferences to Reduce Harm
Aakanksha, Arash Ahmadian, Beyza Ermis et al.
The Mystery of Compositional Generalization in Graph-based Generative Commonsense Reasoning
Xiyan Fu, Anette Frank
The Mystery of In-Context Learning: A Comprehensive Survey on Interpretation and Analysis
Yuxiang Zhou, Jiazheng Li, Yanzheng Xiang et al.
The Mystery of the Pathological Path-star Task for Language Models
Arvid Frydenlund
The Odyssey of Commonsense Causality: From Foundational Benchmarks to Cutting-Edge Reasoning
Shaobo Cui, Zhijing Jin, Bernhard Schölkopf et al.
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts
Ruida Wang, Jipeng Zhang, Yizhen Jia et al.