Papers
Scale-Free Graph-Language Models
Jianglin Lu, Yixuan Liu, Yitian Zhang et al.
Scaling and evaluating sparse autoencoders
Leo Gao, Tom Dupre la Tour, Henk Tillman et al.
Scaling Autonomous Agents via Automatic Reward Modeling And Planning
Zhenfang Chen, Delin Chen, Rui Sun et al.
Scaling Diffusion Language Models via Adaptation from Autoregressive Models
Shansan Gong, Shivam Agarwal, Yizhe Zhang et al.
Scaling FP8 training to trillion-token LLMs
Maxim Fishman, Brian Chmiel, Ron Banner et al.
Scaling Instruction-tuned LLMs to Million-token Contexts via Hierarchical Synthetic Data Generation
Linda He, Jue Wang, Maurice Weber et al.
Scaling In-the-Wild Training for Diffusion-based Illumination Harmonization and Editing by Imposing Consistent Light Transport
Lvmin Zhang, Anyi Rao, Maneesh Agrawala
Scaling Large Language Model-based Multi-Agent Collaboration
Chen Qian, Zihao Xie, YiFei Wang et al.
Scaling Laws for Downstream Task Performance in Machine Translation
Berivan Isik, Natalia Ponomareva, Hussein Hazimeh et al.
Scaling Laws for Precision
Tanishq Kumar, Zachary Ankner, Benjamin Frederick Spector et al.
Scaling LLM Test-Time Compute Optimally Can be More Effective than Scaling Parameters for Reasoning
Charlie Victor Snell, Jaehoon Lee, Kelvin Xu et al.
Scaling Long Context Training Data by Long-Distance Referrals
Yonghao Zhuang, Lanxiang Hu, Longfei Yun et al.
Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining
Jie Cheng, Ruixi Qiao, Yingwei Ma et al.
Scaling Optimal LR Across Token Horizons
Johan Bjorck, Alon Benhaim, Vishrav Chaudhary et al.
Scaling Speech-Text Pre-training with Synthetic Interleaved Data
Aohan Zeng, Zhengxiao Du, Mingdao Liu et al.
Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study
Shawn Tan, Songlin Yang, Aaron Courville et al.
Scaling Transformers for Low-Bitrate High-Quality Speech Coding
Julian D Parker, Anton Smirnov, Jordi Pons et al.
Scaling up Masked Diffusion Models on Text
Shen Nie, Fengqi Zhu, Chao Du et al.
Scaling Wearable Foundation Models
Girish Narayanswamy, Xin Liu, Kumar Ayush et al.
SCBench: A KV Cache-Centric Analysis of Long-Context Methods
Yucheng Li, Huiqiang Jiang, Qianhui Wu et al.
Schur's Positive-Definite Network: Deep Learning in the SPD cone with structure
Can Pouliquen, Mathurin Massias, Titouan Vayer
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
Ziru Chen, Shijie Chen, Yuting Ning et al.
SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding
Sihang Li, Jin Huang, Jiaxi Zhuang et al.