Papers
When LLMs Annotate: Reliability Challenges in Low-Resource NLI
Solmaz Panahi, John Kelleher, Vasudevan Nedumpozhimana
When Meanings Meet: Investigating the Emergence and Quality of Shared Concept Spaces during Multilingual Language Model Training
Felicia Körner, Max Müller-Eberstein, Anna Korhonen et al.
When Multilingual Evaluation Assumptions Fail: Tokenization Effects Across Scripts
Manodyna K H, Luc De Nardi
When Prompt Optimization Becomes Jailbreaking: Adaptive Red-Teaming of Large Language Models
Zafir Shamsi, Nikhil Chekuru, Zachary Guzman et al.
When Semantic Overlap Is Not Enough: Cross-Lingual Euphemism Transfer Between Turkish and English
Hasan Can Biyik, Libby Barak, Jing Peng et al.
When Speed Meets Intelligence: Scalable Conversational NER in an Ever-evolving World
Karim Ghonim, Antonio Roberto, Davide Bernardi
When the Model Said ‘No Comment’, We Knew Helpfulness Was Dead, Honesty Was Alive, and Safety Was Terrified
Gautam Siddharth Kashyap, Mark Dras, Usman Naseem
When Words Wear Masks: Detecting Malicious Intents and Hostile Impacts of Online Hate Speech
Priyansh Singhal, Piyush Joshi
Where Are We at with Automatic Speech Recognition for the Bambara Language?
Seydou Diallo, Yacouba Diarra, Panga Azazia Kamaté et al.
Where Do LLMs Compose Meaning? A Layerwise Analysis of Compositional Robustness
Nura Aljaafari, Danilo Carvalho, Andre Freitas
Where do LLMs currently stand on biomedical NER in both clean and noisy settings?
Christophe Ye, Cassie S. Mitchell
Which course? Discourse! Teaching Discourse and Generation in the Era of LLMs
Junyi Jessy Li, Yang Janet Liu, Kanishka Misra et al.
Which Works Best for Vietnamese? A Practical Study of Information Retrieval Methods across Domains
Long S. T. Nguyen, Tho T. Quan
Who Judges the Judge? Evaluating LLM-as-a-Judge for French Medical open-ended QA
Ikram Belmadani, Oumaima El Khettari, Pacôme Constant dit Beaufils et al.
Whom to Trust? Analyzing the Divergence Between User Satisfaction and LLM-as-a-Judge in E-Commerce RAG Systems
Arif Türkmen, Kaan Efe Keleş
Who Plays Which Role? Protagonist Detection and Classification in Moral Discourse
Mirko Sommer, Maria Becker
Who You Are, What You Say: Intra- and Inter-Context Personality for Emotion Recognition in Conversation
Tazeek Bin Abdur Rakib, Lay-Ki Soon, Wern Han Lim
Why Are We Lonely? Leveraging LLMs to Measure and Understand Loneliness in Caregivers and Non-caregivers
Michelle Damin Kim, Ellie S. Paek, Yufen Lin et al.
WikiFirst: A Genre-Fixed, Content-controlled Corpus for Evaluating Content Effects in Authorship Analysis
Dung Nguyen, G. Çağatay Sat, Evgeny Pyshkin et al.
WikiLingDiv: a dataset for quantifying digital linguistic diversity using Wikipedia page views
Hannes Essfors, Andreas Baumann
Wikontic: Constructing Wikidata-Aligned, Ontology-Aware Knowledge Graphs with Large Language Models
Alla Chepurova, Aydar Bulatov, Mikhail Burtsev et al.
Word Surprisal Correlates with Sentential Contradiction in LLMs
Ning Shi, Bradley Hauer, David Basil et al.
WorkForceAgent-R1: Incentivizing Reasoning Capability in LLM-based Web Agents via Reinforcement Learning
Yuchen Zhuang, Di Jin, Jiaao Chen et al.
Wugnectives: Novel Entity Inferences of Language Models from Discourse Connectives
Daniel Brubaker, William Sheffield, Junyi Jessy Li et al.
xLM: A Python Package for Non-Autoregressive Language Models
Dhruvesh Patel, Durga Prasad Maram, Sai Sreenivas Chintha et al.