Difan Zou

50 papers · 2018–2026 · 11 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🗺️ Taxonomy Completionist (13) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (10)

🌈 Renaissance Researcher (6) 🏆 Grand Slam 👑 Triple Crown 🧬 Topic Evolution 🤝 Dynamic Duo (25) 🗃️ Keyword Collector (145) ❓ The Questioner (6) ⚡ Prolific Year (13) 💎 Century Club (48) 🔥 Unstoppable (8)

Conferences

ICML (14) ICLR (12) NIPS (12) COLT (3) AAAI (2) AISTATS (2) ACL (1) CVPR (1) EMNLP (1) JMLR (1) UAI (1)

Top co-authors

Quanquan Gu (25) Jingfeng Wu (10) Vladimir Braverman (9) Yuan Cao (7) Yujin Han (6) Sham Kakade (5) Pan Xu (5) Zixiang Chen (4) Sham M. Kakade (3) Yuanzhi Li (3)

Keywords

stochastic gradient descent (5) stochastic gradient (4) markov chain monte carlo (4) linear regression (4) variance reduction (4) diffusion model (3) learning theory (3) langevin dynamics (3) excess risk (3) neural network optimization (3) risk bound (2) adversarial robustness (2) implicit bias (2) image generation (2) hamiltonian monte carlo (2) bayesian inference (2) gradient descent (2) global convergence (2) generative model (2) iterate averaging (2)

Papers

SIDE: Surrogate Conditional Data Extraction from Diffusion Models · AAAI 2026
Learning Diffusion Policy from Primitive Skills for Robot Manipulation · AAAI 2026
Masked Autoencoders Are Effective Tokenizers for Diffusion Models · ICML 2025
SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution · ACL 2025
Parallelized Autoregressive Visual Generation · CVPR 2025
Model Unlearning via Sparse Autoencoder Subspace Guided Projections · EMNLP 2025
Beyond Surface Structure: A Causal Assessment of LLMs' Comprehension Ability · ICLR 2025
How Does Critical Batch Size Scale in Pre-training? · ICLR 2025
HyPoGen: Optimization-Biased Hypernetworks for Generalizable Policy Generation · ICLR 2025
On the Feature Learning in Diffusion Models · ICLR 2025
Can Diffusion Models Learn Hidden Inter-Feature Rules Behind Images? · ICML 2025
Towards Understanding Fine-Tuning Mechanisms of LLMs via Circuit Analysis · ICML 2025
Faster Sampling via Stochastic Gradient Proximal Sampler · ICML 2024
Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo · COLT 2024
Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data · ICML 2024
How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression? · ICLR 2024
Benign Oscillation of Stochastic Gradient Descent with Large Learning Rate · ICLR 2024
PRES: Toward Scalable Memory-Based Dynamic Graph Neural Networks · ICLR 2024
What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks · ICML 2024
Improving Group Robustness on Spurious Correlation Requires Preciser Group Inference · ICML 2024
The Implicit Bias of Adam on Separable Data · NIPS 2024
Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference · NIPS 2024
An In-depth Investigation of Sparse Rate Reduction in Transformer-like Models · NIPS 2024
How Transformers Utilize Multi-Head Attention in In-Context Learning? A Case Study on Sparse Linear Regression · NIPS 2024
Slight Corruption in Pre-training Data Makes Better Diffusion Models · NIPS 2024
Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization · ICLR 2023
Benign Overfitting of Constant-Stepsize SGD for Linear Regression · JMLR 2023
Towards Robust Graph Incremental Learning on Evolving Graphs · ICML 2023
Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron · ICML 2023
The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks · COLT 2023
The Benefits of Mixup for Feature Learning · ICML 2023
Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression · ICML 2022
The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift · NIPS 2022
Risk Bounds of Multi-Pass SGD for Least Squares in the Interpolation Regime · NIPS 2022
Self-training Converts Weak Learners to Strong Learners in Mixture Models · AISTATS 2022
Faster Convergence of Stochastic Gradient Langevin Dynamics for Non-Log-Concave Sampling · UAI 2021
Provable Robustness of Adversarial Training for Learning Halfspaces with Noise · ICML 2021
On the Convergence of Hamiltonian Monte Carlo with Stochastic Gradients · ICML 2021
How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks? · ICLR 2021
Benign Overfitting of Constant-Stepsize SGD for Linear Regression · COLT 2021
Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate · ICLR 2021
The Benefits of Implicit Regularization from SGD in Least Squares Problems · NIPS 2021
Improving Adversarial Robustness Requires Revisiting Misclassified Examples · ICLR 2020
On the Global Convergence of Training Deep Linear ResNets · ICLR 2020
Stochastic Gradient Hamiltonian Monte Carlo Methods with Recursive Variance Reduction · NIPS 2019
An Improved Analysis of Training Over-parameterized Deep Neural Networks · NIPS 2019
Sampling from Non-Log-Concave Distributions via Variance-Reduced Gradient Langevin Dynamics · AISTATS 2019
Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks · NIPS 2019
Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization · NIPS 2018
Stochastic Variance-Reduced Hamilton Monte Carlo Methods · ICML 2018