Jason D. Lee
56 papers · 2015–2025 · 7 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Taxonomy Completionist (17)
Interdisciplinary Bridge
Renaissance Researcher (5)
Conference Polyglot (7)
Conference Loyalist (25)
Triple Crown
Keyword Collector (122)
Prolific Year (19)
Conference Pioneer
Century Club (56)
Unstoppable (6)
Trend Setter
The Questioner (3)
Conferences
ICLR (25)
ICML (16)
NIPS (6)
JMLR (4)
AISTATS (2)
COLT (2)
L4DC (1)
Keywords
sample complexity (4)
gradient descent (3)
reinforcement learning (3)
neural network (3)
stochastic gradient descent (2)
policy optimization (2)
implicit bias (2)
language modeling (2)
neural network optimization (2)
neural network training (1)
transfer learning (1)
model quantization (1)
convex optimization (1)
matrix factorization (1)
non-convex optimization (1)
preference learning (1)
online learning (1)
optimal transport (1)
collaborative filtering (1)
model selection (1)
Papers
Learning Hierarchical Polynomials of Multiple Nonlinear Features
ICLR 2025
Transformers Provably Learn Two-Mixture of Linear Classification via Gradient Flow
ICLR 2025
Understanding Optimization in Deep Learning with Central Flows
ICLR 2025
Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization
ICLR 2025
Understanding Factual Recall in Transformers via Associative Memories
ICLR 2025
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
ICLR 2025
Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding
ICML 2025
How Well Can Transformers Emulate In-Context Newton's Method?
AISTATS 2025
Minimax Optimal Regret Bound for Reinforcement Learning with Trajectory Feedback
ICML 2025
Metastable Dynamics of Chain-of-Thought Reasoning: Provable Benefits of Search, RL and Distillation
ICML 2025
Discrepancies are Virtue: Weak-to-Strong Generalization through Lens of Intrinsic Dimension
ICML 2025
Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought
ICLR 2025
Exploiting Structure in Offline Multi-Agent RL: The Benefits of Low Interaction Rank
ICLR 2025
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
ICML 2024
Learning and Transferring Sparse Contextual Bigrams with Linear Transformers
NIPS 2024
REBEL: Reinforcement Learning via Regressing Relative Rewards
NIPS 2024
Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit
NIPS 2024
Scaling Laws in Linear Regression: Compute, Parameters, and Data
NIPS 2024
Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity
NIPS 2024
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
ICLR 2024
Learning Hierarchical Polynomials with Three-Layer Neural Networks
ICLR 2024
Horizon-Free Regret for Linear Markov Decision Processes
ICLR 2024
Provable Offline Preference-Based Reinforcement Learning
ICLR 2024
Provable Reward-Agnostic Preference-Based Reinforcement Learning
ICLR 2024
Teaching Arithmetic to Small Transformers
ICLR 2024
Provably Efficient CVaR RL in Low-rank MDPs
ICLR 2024
BitDelta: Your Fine-Tune May Only Be Worth One Bit
NIPS 2024
LoRA Training in the NTK Regime has No Spurious Local Minima
ICML 2024
An Information-Theoretic Analysis of In-Context Learning
ICML 2024
How Transformers Learn Causal Structure with Gradient Descent
ICML 2024
Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot
ICML 2024
Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark
ICML 2024
Can We Find Nash Equilibria at a Linear Rate in Markov Games?
ICLR 2023
Regret Guarantees for Online Deep Control
L4DC 2023
PAC Reinforcement Learning for Predictive State Representations
ICLR 2023
Decentralized Optimistic Hyperpolicy Mirror Descent: Provably No-Regret Learning in Markov Games
ICLR 2023
Efficient displacement convex optimization with particle gradient descent
ICML 2023
Looped Transformers as Programmable Computers
ICML 2023
Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing
ICML 2023
Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings
ICML 2023
Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning
ICML 2023
Optimal Sample Complexity Bounds for Non-convex Optimization under Kurdyka-Lojasiewicz Condition
AISTATS 2023
Self-Stabilization: The Implicit Bias of Gradient Descent at the Edge of Stability
ICLR 2023
Towards General Function Approximation in Zero-Sum Markov Games
ICLR 2022
Few-Shot Learning via Learning the Representation, Provably
ICLR 2021
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift
JMLR 2021
Impact of Representation Learning in Linear Bandits
ICLR 2021
Kernel and Rich Regimes in Overparametrized Models
COLT 2020
Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks
ICLR 2020
When is a Convolutional Filter Easy to Learn?
ICLR 2018
Learning One-hidden-layer Neural Networks with Landscape Design
ICLR 2018
Communication-efficient Sparse Regression
JMLR 2017
Distributed Stochastic Variance Reduced Gradient Methods by Sampling Extra Data with Replacement
JMLR 2017
Gradient Descent Only Converges to Minimizers
COLT 2016
L1-regularized Neural Networks are Improperly Learnable in Polynomial Time
ICML 2016
Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares
JMLR 2015