Papers
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Tianzhe Chu, Yuexiang Zhai, Jihan Yang et al.
SGD Jittering: A Training Strategy for Robust and Accurate Model-Based Architectures
Peimeng Guan, Mark A. Davenport
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
Hanshi Sun, Li-Wen Chang, Wenlei Bao et al.
SHARP-Distill: A 68× Faster Recommender System with Hypergraph Neural Networks and Language Models
Saman Forouzandeh, Parham Moradi, Mahdi Jalili
SHE: Streaming-media Hashing Retrieval
Ruitao Pu, Yang Qin, Xiaomin Song et al.
ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning
Zhaorun Chen, Mintong Kang, Bo Li
Shielded Diffusion: Generating Novel and Diverse Images using Sparse Repellency
Michael Kirchhof, James Thornton, Louis Béthune et al.
SHIELD: Multi-task Multi-distribution Vehicle Routing Solver with Sparsity and Hierarchy
Yong Liang Goh, Zhiguang Cao, Yining Ma et al.
Shifting Time: Time-series Forecasting with Khatri-Rao Neural Operators
Srinath Dama, Kevin Course, Prasanth B. Nair
Shortcut-connected Expert Parallelism for Accelerating Mixture of Experts
Weilin Cai, Juyong Jiang, Le Qin et al.
Should Decision-Makers Reveal Classifiers in Online Strategic Classification?
Han Shao, Shuo Xie, Kunhe Yang
Sidechain conditioning and modeling for full-atom protein sequence design with FAMPNN
Talal Widatalla, Richard W. Shuai, Brian Hie et al.
Signed Laplacians for Constrained Graph Clustering
John Stewart Fabila Carrasco, He Sun
SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning
Tianjian Li, Daniel Khashabi
Simple Path Structural Encoding for Graph Transformers
Louis Airale, Antonio Longa, Mattia Rigon et al.
Simple Policy Optimization
Zhengpeng Xie, Qiang Zhang, Fan Yang et al.
Simple Randomized Rounding for Max-Min Eigenvalue Augmentation
Jourdain Lamperski, Haeseong Yang, Oleg Prokopyev
Simplicity Bias and Optimization Threshold in Two-Layer ReLU Networks
Etienne Boursier, Nicolas Flammarion
Simplifying DINO via Coding Rate Regularization
Ziyang Wu, Jingyuan Zhang, Druv Pai et al.
Simultaneous Multi-Robot Motion Planning with Projected Diffusion Models
Jinhao Liang, Jacob K Christopher, Sven Koenig et al.
Since Faithfulness Fails: The Performance Limits of Neural Causal Discovery
Mateusz Olko, Mateusz Gajewski, Joanna Wojciechowska et al.
SING: Spatial Context in Large Language Model for Next-Gen Wearables
Ayushi Mishra, Yang Bai, Priyadarshan Narayanasamy et al.