Papers
Submission for WMT25 Task 3
Govardhan Padmanabhan
Subtle Risks, Critical Failures: A Framework for Diagnosing Physical Safety of LLMs for Embodied Decision Making
Yejin Son, Minseo Kim, Sungwoong Kim et al.
sudoLLM: On Multi-role Alignment of Language Models
Soumadeep Saha, Akshay Chaturvedi, Joy Mahapatra et al.
SUE: Sparsity-based Uncertainty Estimation via Sparse Dictionary Learning
Tamás Ficsor, Gábor Berend
Sugar-Coated Poison: Benign Generation Unlocks Jailbreaking
Yuhang Wu, Yu-Jie Xiong, Hao Zhang et al.
Summarize-Exemplify-Reflect: Data-driven Insight Distillation Empowers LLMs for Few-shot Tabular Classification
Yifei Yuan, Jiatong Li, Weijia Zhang et al.
Summarizing Speech: A Comprehensive Survey
Fabian Retkowski, Maike Züfle, Andreas Sudmann et al.
Superficial Self-Improved Reasoners Benefit from Model Merging
Xiangchi Yuan, Chunhui Zhang, Zheyuan Liu et al.
Superpose Task-specific Features for Model Merging
Haiquan Qiu, You Wu, Dong Li et al.
Supervised Attention Mechanism for Low-quality Multimodal Data
Sijie Mai, Shiqin Han, Haifeng Hu
Supporting Online Discussions: Integrating AI Into the adhocracy+ Participation Platform To Enhance Deliberation
Maike Behrendt, Stefan Sylvius Wagner, Mira Warne et al.
SuPreME: A Supervised Pre-training Framework for Multimodal ECG Representation Learning
Mingsheng Cai, Jiuming Jiang, Wenhao Huang et al.
SURE: Safety Understanding and Reasoning Enhancement for Multimodal Large Language Models
Yuxin Gou, Xiaoning Dong, Qin Li et al.
Surge: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors
Bohan Lyu, Siqiao Huang, Zichen Liang et al.
Surprise Calibration for Better In-Context Learning
Zhihang Tan, Jingrui Hou, Ping Wang et al.
SurveyGen: Quality-Aware Scientific Survey Generation with Large Language Models
Tong Bao, Mir Tafseer Nayeem, Davood Rafiei et al.
SVeritas: Benchmark for Robust Speaker Verification under Diverse Conditions
Massa Baali, Sarthak Bisht, Francisco Teixeira et al.
SWAM: Adaptive Sliding Window and Memory-Augmented Attention Model for Rumor Detection
Mei Guo, Chen Chen, Chunyan Hou et al.
SWAN: An Efficient and Scalable Approach for Long-Context Language Modeling
Krishna C Puvvada, Faisal Ladhak, Santiago Akle Serrano et al.
SwarmAgentic: Towards Fully Automated Agentic System Generation via Swarm Intelligence
Yao Zhang, Chenyang Lin, Shijie Tang et al.
SWE-MERA: A Dynamic Benchmark for Agenticly Evaluating Large Language Models on Software Engineering Tasks
Adamenko Pavel, Ivanov Mikhail, Aidar Valeev et al.
SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation
Aurick Qiao, Zhewei Yao, Samyam Rajbhandari et al.
SwiftPrune: Hessian-Free Weight Pruning for Large Language Models
Yuhan Kang, Yang Shi, Mei Wen et al.
Sycophancy Mitigation Through Reinforcement Learning with Uncertainty-Aware Adaptive Reasoning Trajectories
Mohammad Beigi, Ying Shen, Parshin Shojaee et al.
SYNC: A Synthetic Long-Context Understanding Benchmark for Controlled Comparisons of Model Capabilities
Shuyang Cao, Kaijian Zou, Lu Wang