Jason Lee
87 papers · 2010–2025 · 13 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (24)
Keyword Pioneer
Hot Topic Early Bird
Academic Marathon (15)
Renaissance Researcher (7)
Interdisciplinary Bridge
Conference Loyalist (44)
Keyword Trendsetter Combo (5)
Dynamic Duo (10)
Triple Crown
Keyword Champion
Grand Slam
Deep Specialist (29)
The Questioner
Trend Setter
Conference Pioneer
Unstoppable (14)
Prolific Year (11)
Century Club (87)
Keyword Collector (89)
Conferences
NIPS (44)
ICML (13)
AISTATS (9)
COLT (7)
EMNLP (5)
ACL (2)
AAAI (1)
AACL (1)
CORL (1)
EACL (1)
ICLR (1)
IJCNLP (1)
NAACL (1)
Research topics
Keywords
gradient descent (21)
sample complexity (11)
neural network (10)
representation learning (7)
implicit bias (7)
learning theory (6)
neural tangent kernel (5)
non-convex optimization (5)
gradient flow (5)
stochastic gradient descent (5)
kernel methods (5)
neural network optimization (4)
function approximation (4)
reinforcement learning (3)
convergence guarantee (3)
generalization bound (3)
convex optimization (3)
latent variable model (3)
convergence rate (3)
few-shot learning (3)
Papers
Anytime Acceleration of Gradient Descent
COLT 2025
BranchOut: Capturing Realistic Multimodality in Autonomous Driving Decisions
CORL 2025
REST: Retrieval-Based Speculative Decoding
NAACL 2024
MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encoding
NIPS 2024
Computational-Statistical Gaps in Gaussian Single-Index Models (Extended Abstract)
COLT 2024
A Side-by-side Comparison of Transformers for Implicit Discourse Relation Classification
ACL 2023
LFTK: Handcrafted Features in Computational Linguistics
ACL 2023
Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability
NIPS 2023
Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning
NIPS 2023
Fine-Tuning Language Models with Just Forward Passes
NIPS 2023
Sample Complexity for Quadratic Bandits: Hessian Dependent Bounds and Optimal Algorithms
NIPS 2023
Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage
NIPS 2023
Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models
NIPS 2023
Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks
NIPS 2023
Prompt-based Learning for Text Readability Assessment
EACL 2023
Provable Hierarchy-Based Meta-Reinforcement Learning
AISTATS 2023
Reconstructing Training Data from Model Gradient, Provably
AISTATS 2023
Provably Efficient Reinforcement Learning via Surprise Bound
AISTATS 2023
Offline Reinforcement Learning with Realizability and Single-policy Concentrability
COLT 2022
Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems
NIPS 2022
Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials
NIPS 2022
From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent
NIPS 2022
On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
NIPS 2022
Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
NIPS 2022
Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov Games
AISTATS 2022
Optimization-Based Separations for Neural Networks
COLT 2022
Neural Networks can Learn Representations with Gradient Descent
COLT 2022
How Fine-Tuning Allows for Effective Meta-Learning
NIPS 2021
Going Beyond Linear RL: Sample Efficient Neural Function Approximation
NIPS 2021
Label Noise SGD Provably Prefers Flat Global Minimizers
NIPS 2021
Optimal Gradient-based Algorithms for Non-concave Bandit Optimization
NIPS 2021
Shape Matters: Understanding the Implicit Bias of the Noise Covariance
COLT 2021
How Important is the Train-Validation Split in Meta-Learning?
ICML 2021
Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks
COLT 2021
A Theory of Label Propagation for Subpopulation Shift
ICML 2021
Bilinear Classes: A Structural Framework for Provable Generalization in RL
ICML 2021
Near-Optimal Linear Regression under Distribution Shift
ICML 2021
Pushing on Text Readability Assessment: A Transformer Meets Handcrafted Linguistic Features
EMNLP 2021
Predicting What You Already Know Helps: Provable Self-Supervised Learning
NIPS 2021
Beyond Lazy Training for Over-parameterized Tensor Decomposition
NIPS 2020
Generalized Leverage Score Sampling for Neural Networks
NIPS 2020
Iterative Refinement in the Continuous Space for Non-Autoregressive Neural Machine Translation
EMNLP 2020
Latent-Variable Non-Autoregressive Neural Machine Translation with Deterministic Inference Using a Delta Posterior
AAAI 2020
Convergence of Meta-Learning with Task-Specific Adaptation over Partial Parameters
NIPS 2020
Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot
NIPS 2020
How to Characterize The Landscape of Overparameterized Convolutional Neural Networks
NIPS 2020
LXPER Index 2.0: Improving Text Readability Assessment Model for L2 English Students in Korea
AACL 2020
SGD Learns One-Layer Networks in WGANs
ICML 2020
Optimal transport mapping via input convex neural networks
ICML 2020
Towards Understanding Hierarchical Learning: Benefits of Neural Representations
NIPS 2020
Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy
NIPS 2020
Agnostic $Q$-learning with Function Approximation in Deterministic Systems: Near-Optimal Bounds on Approximation Error and Sample Complexity
NIPS 2020
On the Discrepancy between Density Estimation and Sequence Generation
EMNLP 2020
Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel
NIPS 2019
Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods
NIPS 2019
Countering Language Drift via Visual Grounding
IJCNLP 2019
Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models
ICML 2019
Gradient Descent Finds Global Minima of Deep Neural Networks
ICML 2019
Convergence of Gradient Descent on Separable Data
AISTATS 2019
Countering Language Drift via Visual Grounding
EMNLP 2019
Neural Temporal-Difference Learning Converges to Global Optima
NIPS 2019
Convergence of Adversarial Training in Overparametrized Neural Networks
NIPS 2019
Gradient Primal-Dual Algorithm Converges to Second-Order Stationary Solution for Nonconvex Distributed Optimization Over Networks
ICML 2018
Implicit Bias of Gradient Descent on Linear Convolutional Networks
NIPS 2018
Provably Correct Automatic Sub-Differentiation for Qualified Programs
NIPS 2018
On the Convergence and Robustness of Training GANs with Regularized Optimal Transport
NIPS 2018
Adding One Neuron Can Eliminate All Bad Local Minima
NIPS 2018
Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced
NIPS 2018
Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement
EMNLP 2018
Emergent Translation in Multi-Agent Communication
ICLR 2018
On the Power of Over-parametrization in Neural Networks with Quadratic Activation
ICML 2018
Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima
ICML 2018
Characterizing Implicit Bias in Terms of Optimization Geometry
ICML 2018
On the Learnability of Fully-Connected Neural Networks
AISTATS 2017
Black-box Importance Sampling
AISTATS 2017
Sketching Meets Random Projection in the Dual: A Provable Recovery Algorithm for Big and High-dimensional Data
AISTATS 2017
Gradient Descent Can Take Exponential Time to Escape Saddle Points
NIPS 2017
A Kernelized Stein Discrepancy for Goodness-of-fit Tests
ICML 2016
Matrix Completion has No Spurious Local Minimum
NIPS 2016
Evaluating the statistical significance of biclusters
NIPS 2015
Exact Post Model Selection Inference for Marginal Screening
NIPS 2014
Scalable Methods for Nonnegative Matrix Factorizations of Near-separable Tall-and-skinny Matrices
NIPS 2014
Using multiple samples to learn mixture models
NIPS 2013
Structure Learning of Mixed Graphical Models
AISTATS 2013
On model selection consistency of penalized M-estimators: a geometric theory
NIPS 2013
Proximal Newton-type methods for convex optimization
NIPS 2012
Practical Large-Scale Optimization for Max-norm Regularization
NIPS 2010