Papers
11,015 papers found
The Geometry of Categorical and Hierarchical Concepts in Large Language Models
Kiho Park, Yo Joong Choe, Yibo Jiang et al.
The Hidden Cost of Waiting for Accurate Predictions
Ali Shirali, Ariel D. Procaccia, Rediet Abebe
The Hyperfitting Phenomenon: Sharpening and Stabilizing LLMs for Open-Ended Text Generation
Fredrik Carlsson, Fangyu Liu, Daniel Ward et al.
The impact of allocation strategies in subset learning on the expressive power of neural networks
Ofir Schlisselberg, Ran Darshan
The Journey Matters: Average Parameter Count over Pre-training Unifies Sparse and Dense Scaling Laws
Tian Jin, Ahmed Imtiaz Humayun, Utku Evci et al.
The KoLMogorov Test: Compression by Code Generation
Ori Yoran, Kunhao Zheng, Fabian Gloeckle et al.
The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs
Hong Li, Nanxi Li, Yuanjie Chen et al.
The Last Iterate Advantage: Empirical Auditing and Principled Heuristic Analysis of Differentially Private SGD
Milad Nasr, Thomas Steinke, Borja Balle et al.
The "Law" of the Unconscious Contrastive Learner: Probabilistic Alignment of Unpaired Modalities
Yongwei Che, Benjamin Eysenbach
The OMG dataset: An Open MetaGenomic corpus for mixed-modality genomic language modeling
Andre Cornman, Jacob West-Roberts, Antonio Pedro Camargo et al.
The Optimization Landscape of SGD Across the Feature Learning Strength
Alexander Atanasov, Alexandru Meterez, James B Simon et al.
Theory, Analysis, and Best Practices for Sigmoid Self-Attention
Jason Ramapuram, Federico Danieli, Eeshan Gunesh Dhekane et al.
Theory on Mixture-of-Experts in Continual Learning
Hongbo Li, Sen Lin, Lingjie Duan et al.
Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers
Yuchen Liang, Peizhong Ju, Yingbin Liang et al.
The Pitfalls of Memorization: When Memorization Hurts Generalization
Reza Bayat, Mohammad Pezeshki, Elvis Dohmatob et al.
The Power of LLM-Generated Synthetic Data for Stance Detection in Online Political Discussions
Stefan Sylvius Wagner, Maike Behrendt, Marc Ziegele et al.
The Ramanujan Library - Automated Discovery on the Hypergraph of Integer Relations
Itay Beit Halachmi, Ido Kaminer
The Rise and Down of Babel Tower: Investigating the Evolution Process of Multilingual Code Large Language Model
Jiawei Chen, Wentao Chen, Jing Su et al.
ThermalGaussian: Thermal 3D Gaussian Splatting
Rongfeng Lu, Hangyu Chen, Zunjie Zhu et al.
The Robustness of Differentiable Causal Discovery in Misspecified Scenarios
Huiyang Yi, Yanyan He, Duxin Chen et al.
The Same but Different: Structural Similarities and Differences in Multilingual Language Modeling
Ruochen Zhang, Qinan Yu, Matianyu Zang et al.
The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities
Zhaofeng Wu, Xinyan Velocity Yu, Dani Yogatama et al.
The Superposition of Diffusion Models Using the Itô Density Estimator
Marta Skreta, Lazar Atanackovic, Joey Bose et al.
The Unreasonable Ineffectiveness of the Deeper Layers
Andrey Gromov, Kushal Tirumala, Hassan Shapourian et al.
The Utility and Complexity of In- and Out-of-Distribution Machine Unlearning
Youssef Allouah, Joshua Kazdan, Rachid Guerraoui et al.