Papers
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Hong Liu, Sang Michael Xie, Zhiyuan Li et al.
SAM operates far from home: eigenvalue regularization as a dynamical phenomenon
Atish Agarwala, Yann Dauphin
Sample and Predict Your Latent: Modality-free Sequential Disentanglement via Contrastive Estimation
Ilan Naiman, Nimrod Berman, Omri Azencot
Sample Complexity Bounds for Learning High-dimensional Simplices in Noisy Regimes
Seyed Amir Hossein Saberi, Amir Najafi, Abolfazl Motahari et al.
Sample Complexity of Probability Divergences under Group Symmetry
Ziyu Chen, Markos Katsoulakis, Luc Rey-Bellet et al.
Sampling-Based Accuracy Testing of Posterior Estimators for General Inference
Pablo Lemos, Adam Coogan, Yashar Hezaveh et al.
Sampling-based Nyström Approximation and Kernel Quadrature
Satoshi Hayakawa, Harald Oberhauser, Terry Lyons
Scalable Adaptive Computation for Iterative Generation
Allan Jabri, David J. Fleet, Ting Chen
Scalable Multi-Agent Reinforcement Learning through Intelligent Information Aggregation
Siddharth Nayak, Kenneth Choi, Wenqi Ding et al.
Scalable Safe Policy Improvement via Monte Carlo Tree Search
Alberto Castellini, Federico Bianchi, Edoardo Zorzi et al.
Scalable Set Encoding with Universal Mini-Batch Consistency and Unbiased Full Set Gradient Approximation
Jeffrey Willette, Seanie Lee, Bruno Andreis et al.
Scaling Laws for Generative Mixed-Modal Language Models
Armen Aghajanyan, Lili Yu, Alexis Conneau et al.
Scaling Laws for Multilingual Neural Machine Translation
Patrick Fernandes, Behrooz Ghorbani, Xavier Garcia et al.
Scaling Laws for Reward Model Overoptimization
Leo Gao, John Schulman, Jacob Hilton
Scaling of Class-wise Training Losses for Post-hoc Calibration
Seungjin Jung, Seungmo Seo, Yonghyun Jeong et al.
Scaling Spherical CNNs
Carlos Esteves, Jean-Jacques Slotine, Ameesh Makadia
Scaling Up Dataset Distillation to ImageNet-1K with Constant Memory
Justin Cui, Ruochen Wang, Si Si et al.
Scaling Vision Transformers to 22 Billion Parameters
Mostafa Dehghani, Josip Djolonga, Basil Mustafa et al.
Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data
Minshuo Chen, Kaixuan Huang, Tuo Zhao et al.
SDDM: Score-Decomposed Diffusion Models on Manifolds for Unpaired Image-to-Image Translation
Shikun Sun, Longhui Wei, Junliang Xing et al.
SE(3) diffusion model with application to protein backbone generation
Jason Yim, Brian L. Trippe, Valentin De Bortoli et al.
Searching Large Neighborhoods for Integer Linear Programs with Contrastive Learning
Taoan Huang, Aaron M Ferber, Yuandong Tian et al.
Second-Order Optimization with Lazy Hessians
Nikita Doikov, El Mahdi Chayti, Martin Jaggi
Second-order regression models exhibit progressive sharpening to the edge of stability
Atish Agarwala, Fabian Pedregosa, Jeffrey Pennington
Secure Federated Correlation Test and Entropy Estimation
Qi Pang, Lun Wang, Shuai Wang et al.