Papers
Do LLMs model human linguistic variation? A case study in Hindi-English Verb code-mixing
Mukund Choudhary, Madhur Jindal, Gaurja Aeron et al.
FactSelfCheck: Fact-Level Black-Box Hallucination Detection for LLMs
Albert Sawczyn, Jakub Binkowski, Denis Janiak et al.
What Matters to an LLM? Behavioral and Computational Evidences from Summarization
Yongxin Zhou, Changshun Wu, Philippe Mulhem et al.
Better Call CLAUSE: A Discrepancy Benchmark for Auditing LLMs Legal Reasoning Capabilities
Manan Roy Choudhury, Adithya Chandramouli, Mannan Anand et al.
Beyond a Single Extractor: Re-thinking HTML-to-Text Extraction for LLM Pre-training
Jeffrey Li, Joshua P Gardner, Doug Kang et al.
Can Models Help Us Create Better Models? Evaluating LLMs as Data Scientists
Michał Pietruszka, Łukasz Borchmann, Aleksander Jędrosz et al.
Argument-Based Consistency in Toxicity Explanations of LLMs
Ramaravind Kommiya Mothilal, Joanna Roy, Syed Ishtiaque Ahmed et al.
Quantifying Data Contamination in Psychometric Evaluations of LLMs
Jongwook Han, Woojung Song, Jonggeun Lee et al.
How to Contextualize Empirical Data for Risk Analysis with LLMs: A Case Study of Power Outages
Haiyun Huang, Yukun Li, Marco A Pretell et al.
Thinking Beyond the Local: Multi-View Instructed Adaptive Reasoning in KG-Enhanced LLMs
Minghan Zhang, Shu Zhao, Zhen Yang et al.
FINEST: Improving LLM Responses to Sensitive Topics Through Fine-Grained Evaluation
Juhyun Oh, Nayeon Lee, Chani Jung et al.
Turn-PPO: Turn-Level Advantage Estimation with PPO for Improved Multi-Turn RL in Agentic LLMs
Junbo Li, Peng Zhou, Rui Meng et al.
Decoding Time Series with LLMs: A Multi-Agent Framework for Cross-Domain Annotation
Minhua Lin, Zhengzhang Chen, Yanchi Liu et al.
Learning to Judge: LLMs Designing and Applying Evaluation Rubrics
Clemencia Siro, Pourya Aliannejadi, Mohammad Aliannejadi
Visual–Linguistic Abductive Reasoning with LLMs for Knowledge-based Visual Question Answering
Jieun Kim, Yujin Jeong, Sung-Bae Cho
MapCoder-Lite: Distilling Multi-Agent Coding into a Single Small LLM
Woongkyu Lee, Junhee Cho, Jungwook Choi
CrisiText: A dataset of warning messages for LLM training in emergency communication
Giacomo Gonella, Gian Maria Campedelli, Stefano Menini et al.
Cards Against Contamination: TCG-Bench for Difficulty-Scalable Multilingual LLM Reasoning
Sultan AlRashed, Jianghui Wang, Francesco Orabona
Arabic Dialect Translation with Small LLMs: Enhancing through Reasoning-Oriented Reinforcement Learning
Sohaila Abdulsattar, Keith Ross
Enhancing Urdu Sentiment Classification through Instruction-Tuned LLMs and Cross-Lingual Transfer
Hasan Faraz Khan, Noor Fatima, Irfan Ahmad
Current state of LLMs for Arabic dialectal machine translation
Josef Jon, Rawan Bondok, Ondřej Bojar
Reasoning Beyond Labels: Measuring LLM Sentiment in Low-Resource, Culturally Nuanced Contexts
Millicent Ochieng, Anja Thieme, Ignatius Ezeani et al.
Synthetic Data Generation Pipeline for Low-Resource Swahili Sentiment Analysis: Multi-LLM Judging with Human Validation
Samuel Gyamfi, Alfred Malengo Kondoro, Yankı Öztürk et al.
Building a Conversational AI Assistant for African Travel Services with LLMs and RAG
Grace Kevine Ngoufo, Shamsuddeen Hassan Muhammad, Kevin Jeff Fogang Fokoa
Hybrid Neural-LLM Pipeline for Morphological Glossing in Endangered Language Documentation: A Case Study of Jungar Tuvan
Siyu Liang, Talant Mawkanuli, Gina-Anne Levow