Difan Zou
50 papers · 2018–2026 · 11 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Hot Topic Early Bird
Taxonomy Completionist (13)
Interdisciplinary Bridge
Conference Polyglot (10)
Renaissance Researcher (6)
Grand Slam
Triple Crown
Topic Evolution
Dynamic Duo (25)
Keyword Collector (145)
The Questioner (6)
Prolific Year (13)
Century Club (48)
Unstoppable (8)
Conferences
ICML (14)
ICLR (12)
NIPS (12)
COLT (3)
AAAI (2)
AISTATS (2)
ACL (1)
CVPR (1)
EMNLP (1)
JMLR (1)
UAI (1)
Keywords
stochastic gradient descent (5)
stochastic gradient (4)
markov chain monte carlo (4)
linear regression (4)
variance reduction (4)
diffusion model (3)
learning theory (3)
langevin dynamics (3)
excess risk (3)
neural network optimization (3)
risk bound (2)
adversarial robustness (2)
implicit bias (2)
image generation (2)
hamiltonian monte carlo (2)
bayesian inference (2)
gradient descent (2)
global convergence (2)
generative model (2)
iterate averaging (2)
Papers
SIDE: Surrogate Conditional Data Extraction from Diffusion Models
AAAI 2026
Learning Diffusion Policy from Primitive Skills for Robot Manipulation
AAAI 2026
Masked Autoencoders Are Effective Tokenizers for Diffusion Models
ICML 2025
SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution
ACL 2025
Parallelized Autoregressive Visual Generation
CVPR 2025
Model Unlearning via Sparse Autoencoder Subspace Guided Projections
EMNLP 2025
Beyond Surface Structure: A Causal Assessment of LLMs' Comprehension Ability
ICLR 2025
How Does Critical Batch Size Scale in Pre-training?
ICLR 2025
HyPoGen: Optimization-Biased Hypernetworks for Generalizable Policy Generation
ICLR 2025
On the Feature Learning in Diffusion Models
ICLR 2025
Can Diffusion Models Learn Hidden Inter-Feature Rules Behind Images?
ICML 2025
Towards Understanding Fine-Tuning Mechanisms of LLMs via Circuit Analysis
ICML 2025
Faster Sampling via Stochastic Gradient Proximal Sampler
ICML 2024
Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo
COLT 2024
Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data
ICML 2024
How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?
ICLR 2024
Benign Oscillation of Stochastic Gradient Descent with Large Learning Rate
ICLR 2024
PRES: Toward Scalable Memory-Based Dynamic Graph Neural Networks
ICLR 2024
What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks
ICML 2024
Improving Group Robustness on Spurious Correlation Requires Preciser Group Inference
ICML 2024
The Implicit Bias of Adam on Separable Data
NIPS 2024
Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference
NIPS 2024
An In-depth Investigation of Sparse Rate Reduction in Transformer-like Models
NIPS 2024
How Transformers Utilize Multi-Head Attention in In-Context Learning? A Case Study on Sparse Linear Regression
NIPS 2024
Slight Corruption in Pre-training Data Makes Better Diffusion Models
NIPS 2024
Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization
ICLR 2023
Benign Overfitting of Constant-Stepsize SGD for Linear Regression
JMLR 2023
Towards Robust Graph Incremental Learning on Evolving Graphs
ICML 2023
Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron
ICML 2023
The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks
COLT 2023
The Benefits of Mixup for Feature Learning
ICML 2023
Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression
ICML 2022
The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift
NIPS 2022
Risk Bounds of Multi-Pass SGD for Least Squares in the Interpolation Regime
NIPS 2022
Self-training Converts Weak Learners to Strong Learners in Mixture Models
AISTATS 2022
Faster Convergence of Stochastic Gradient Langevin Dynamics for Non-Log-Concave Sampling
UAI 2021
Provable Robustness of Adversarial Training for Learning Halfspaces with Noise
ICML 2021
On the Convergence of Hamiltonian Monte Carlo with Stochastic Gradients
ICML 2021
How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks?
ICLR 2021
Benign Overfitting of Constant-Stepsize SGD for Linear Regression
COLT 2021
Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate
ICLR 2021
The Benefits of Implicit Regularization from SGD in Least Squares Problems
NIPS 2021
Improving Adversarial Robustness Requires Revisiting Misclassified Examples
ICLR 2020
On the Global Convergence of Training Deep Linear ResNets
ICLR 2020
Stochastic Gradient Hamiltonian Monte Carlo Methods with Recursive Variance Reduction
NIPS 2019
An Improved Analysis of Training Over-parameterized Deep Neural Networks
NIPS 2019
Sampling from Non-Log-Concave Distributions via Variance-Reduced Gradient Langevin Dynamics
AISTATS 2019
Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks
NIPS 2019
Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization
NIPS 2018
Stochastic Variance-Reduced Hamilton Monte Carlo Methods
ICML 2018