Sanjeev Arora

64 papers · 2012–2025 · 6 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🗺️ Taxonomy Completionist (21) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (6) 🏠 Conference Loyalist (21) 🏆 Keyword Champion 👑 Triple Crown 👥 Mega-Team (22) 🔬 Deep Specialist (12) 🤝 Dynamic Duo (15) 🚀 Conference Pioneer 🔥 Unstoppable (14) ❓ The Questioner (7) ⚡ Prolific Year (10) 💎 Century Club (64) 🗃️ Keyword Collector (57) 📈 Trend Setter

Conferences

ICML (21) NIPS (19) ICLR (17) COLT (4) EMNLP (2) ACL (1)

Top co-authors

Zhiyuan Li (15) Nikunj Saunshi (11) Rong Ge (11) Sadhika Malladi (10) Kaifeng Lyu (10) Dingli Yu (9) Yi Zhang (8) Danqi Chen (7) Abhishek Panigrahi (7) Wei Hu (6)

Keywords

gradient descent (9) neural network optimization (6) representation learning (5) sample complexity (4) non-convex optimization (4) generalization bound (3) federated learning (3) neural network (3) stochastic differential equation (3) language model fine-tuning (3) learning rate (2) contrastive learning (2) data privacy (2) generative model (2) neural tangent kernel (2) sparse recovery (2) batch normalization (2) sparse coding (2) dictionary learning (2) benchmark evaluation (2)

Papers

On the Power of Context-Enhanced Learning in LLMs ICML 2025
Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs? ICML 2025
Weak-to-Strong Generalization Even in Random Feature Networks, Provably ICML 2025
Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning ICLR 2025
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization ICLR 2025
Provable unlearning in topic modeling and downstream tasks ICLR 2025
SKILL-MIX: a Flexible and Expandable Family of Evaluations for AI Models ICLR 2024
LESS: Selecting Influential Data for Targeted Instruction Tuning ICML 2024
Trainable Transformer in Transformer ICML 2024
Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving NIPS 2024
ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty NIPS 2024
Can Models Learn Skill Composition from Examples? NIPS 2024
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs NIPS 2024
Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates NIPS 2024
Language Models as Science Tutors ICML 2024
A Quadratic Synchronization Rule for Distributed Deep Learning ICLR 2024
Task-Specific Skill Localization in Fine-tuned Language Models ICML 2023
A Kernel-Based View of Language Model Fine-Tuning ICML 2023
Understanding Influence Functions and Datamodels via Harmonic Analysis ICLR 2023
Fine-Tuning Language Models with Just Forward Passes NIPS 2023
Why (and When) does Local SGD Generalize Better than SGD? ICLR 2023
Do Transformers Parse while Predicting the Masked Word? EMNLP 2023
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction NIPS 2022
Understanding Gradient Descent on the Edge of Stability in Deep Learning ICML 2022
On Predicting Generalization using GANs ICLR 2022
What Happens after SGD Reaches Zero Loss? --A Mathematical Framework ICLR 2022
Understanding Contrastive Learning Requires Incorporating Inductive Biases ICML 2022
On the SDEs and Scaling Rules for Adaptive Gradient Algorithms NIPS 2022
New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound NIPS 2022
Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent NIPS 2022
A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks ICLR 2021
Evaluating Gradient Inversion Attacks and Defenses in Federated Learning NIPS 2021
On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs) NIPS 2021
Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias NIPS 2021
Why Are Convolutional Nets More Sample-Efficient than Fully-Connected Nets? ICLR 2021
A Sample Complexity Separation between Non-Convex and Convex Meta-Learning ICML 2020
InstaHide: Instance-hiding Schemes for Private Distributed Learning ICML 2020
Provable Representation Learning for Imitation Learning via Bi-level Optimization ICML 2020
Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks ICLR 2020
An Exponential Learning Rate Schedule for Deep Learning ICLR 2020
Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality NIPS 2020
Reconciling Modern Deep Learning with Traditional Optimization Analyses: The Intrinsic Learning Rate NIPS 2020
TextHide: Tackling Data Privacy in Language Understanding Tasks EMNLP 2020
Explaining Landscape Connectivity of Low-cost Solutions for Multilayer Nets NIPS 2019
A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks ICLR 2019
Theoretical Analysis of Auto Rate-Tuning by Batch Normalization ICLR 2019
Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks ICML 2019
A Theoretical Analysis of Contrastive Unsupervised Representation Learning ICML 2019
On Exact Computation with an Infinitely Wide Neural Net NIPS 2019
Implicit Regularization in Deep Matrix Factorization NIPS 2019
A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors ACL 2018
Stronger Generalization Bounds for Deep Nets via a Compression Approach ICML 2018
A Compressed Sensing View of Unsupervised Text Embeddings, Bag-of-n-Grams, and LSTMs ICLR 2018
Do GANs learn the distribution? Some Theory and Empirics ICLR 2018
An Analysis of the t-SNE Algorithm for Data Visualization COLT 2018
On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization ICML 2018
On the Ability of Neural Nets to Express Distributions COLT 2017
Generalization and Equilibrium in Generative Adversarial Nets (GANs) ICML 2017
Provable Algorithms for Inference in Topic Models ICML 2016
Simple, Efficient, and Neural Algorithms for Sparse Coding COLT 2015
New Algorithms for Learning Incoherent and Overcomplete Dictionaries COLT 2014
Provable Bounds for Learning Some Deep Representations ICML 2014
A Practical Algorithm for Topic Modeling with Provable Guarantees ICML 2013
Provable ICA with Unknown Gaussian Noise, with Implications for Gaussian Mixtures and Autoencoders NIPS 2012