Papers
CrowdSelect: Synthetic Instruction Data Selection with Multi-LLM Wisdom
Yisen Li, Lingfeng Yang, Wenxuan Shen et al.
Breaking the Illusion of Reasoning in Polish LLMs: Quality over Quantity of Thought
Dzmitry Pihulski, Mikołaj Langner, Jan Eliasz et al.
WebNovelBench: Placing LLM Novelists on the Web Novel Distribution
Liangtao Lin, Jun Zheng, Haidong Wang
Feature Drift: How Fine-Tuning Repurposes Representations in LLMs
Andrey V. Galichin, Anton Korznikov, Alexey Dontsov et al.
MEDAL: A Framework for Benchmarking LLMs as Multilingual Open-Domain Dialogue Evaluators
John Mendonça, Alon Lavie, Isabel Trancoso
Foundations of LLM Knowledge Materialization: Termination, Reproducibility, Robustness
Luca Giordano, Simon Razniewski
Bias in the East, Bias in the West: A Bilingual Analysis of LLM Political Bias on U.S.- and China-Related Issues
Ying Ying Lim, Paul Röttger
A Simple and Efficient Learning-Style Prompting for LLM Jailbreaking
Xuan Luo, Yue Wang, Zefeng He et al.
Aggregating Crowd of LLMs for Cost-Effective Data Annotation
Jiacheng Liu, Xiaofeng Hou
Can LLMs Reason Like Doctors? Exploring the Limits of Large Language Models in Complex Medical Reasoning
Flavio Merenda, Jose Manuel Gomez-Perez, German Rigau
Testing Low-Resource Language Support in LLMs Using Language Proficiency Exams: the Case of Luxembourgish
Cedric Lothritz, Jordi Cabot, Laura Bernardy
Unveiling Decision-Making in LLMs for Text Classification: Extraction of influential and interpretable concepts with Sparse Autoencoders
Mathis Le Bail, Jérémie Dentan, Davide Buscaldi et al.
TextMineX: Data, Evaluation Framework and Ontology-guided LLM Pipeline for Humanitarian Mine Action
Chenyue Zhou, Gürkan Solmaz, Flavio Cirillo et al.
Are Multimodal LLMs Movie Buffs?
Carlo Bretti, Pascal Mettes, Nanne Van Noord
Ensemble Privacy Defense for Knowledge-Intensive LLMs against Membership Inference Attacks
Haowei Fu, Bo Ni, Han Xu et al.
SafeSearch: Do Not Trade Safety for Utility in LLM Search Agents
Qiusi Zhan, Angeline Budiman-Chan, Abdelrahman Zayed et al.
Better as Generators Than Classifiers: Leveraging LLMs and Synthetic Data for Low-Resource Multilingual Classification
Branislav Pecher, Jan Cegin, Robert Belanec et al.
Dialogue is Better Than Monologue: Instructing Medical LLMs via Strategic Conversations
Zijie Liu, Xinyu Zhao, Jie Peng et al.
Hearing Between the Lines: Unlocking the Reasoning Power of LLMs for Speech Evaluation
Arjun Chandra, Kevin Miller, Venkatesh Ravichandran et al.
FLAT-LLM: Fine-grained Low-rank Activation Space Transformation for Large Language Model Compression
Jiayi Tian, Ryan Solgi, Jinming Lu et al.
Analyzing LLM Instruction Optimization for Tabular Fact Verification
Xiaotang Du, Giwon Hong, Wai-Chung Kwan et al.
Imbalanced Gradients in RL Post-Training of Multi-Task LLMs
Runzhe Wu, Ankur Samanta, Ayush Jain et al.
SIRAJ: Diverse and Efficient Red-Teaming for LLM Agents via Distilled Structured Reasoning
Kaiwen Zhou, Ahmed Elgohary, A S M Iftekhar et al.
Harnessing Consistency for Robust Test-Time LLM Ensemble
Zhichen Zeng, Qi Yu, Xiao Lin et al.
AutoAnoEval: Semantic-Aware Model Selection via Tree-Guided LLM Reasoning for Tabular Anomaly Detection
Suhee Yoon, Sanghyu Yoon, Ye Seul Sim et al.