Denny Wu

28 papers · 2019–2025 · 5 top CS/AI conferences

Achievements


🌍 Conference Polyglot (5) 🏃 Academic Marathon (6) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (11)

🗺️ Taxonomy Completionist (30) 👑 Triple Crown 🏆 Keyword Champion (3) 🤝 Dynamic Duo (21) 🔥 Unstoppable (7) 🗃️ Keyword Collector (74) 💎 Century Club (28) ❓ The Questioner (2) ⚡ Prolific Year (7)

Conferences

NIPS (10) ICLR (8) COLT (4) ICML (4) AISTATS (2)

Top co-authors

Taiji Suzuki (21) Atsushi Nitanda (11) Kazusato Oko (9) Jimmy Ba (5) Murat A Erdogdu (5) Yujin Song (3) Zhichao Wang (3) Ji Xu (2) Jason D. Lee (2) Naoki Nishikawa (2)

Keywords

neural network (9) representation learning (4) gradient descent (4) sample complexity (3) single-index model (3) feature learning (3) stochastic gradient descent (3) ridge regression (2) mean-field langevin dynamics (2) neural network optimization (2) two-layer neural network (2) convergence analysis (1) global convergence (1) mean field theory (1) in-context learning (1) gradient boosting (1) convex optimization (1) nonlinear regression (1) binary classification (1) probability distribution (1)

Papers

Nonlinear transformers can perform inference-time feature learning · ICML 2025
Mean-field analysis of polynomial-width two-layer neural network beyond finite time horizon · COLT 2025
Learning Compositional Functions with Transformers from Easy-to-Hard Data · COLT 2025
Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics · ICLR 2025
Metastable Dynamics of Chain-of-Thought Reasoning: Provable Benefits of Search, RL and Distillation · ICML 2025
Learning sum of diverse features: computational hardness and efficient gradient-based training for ridge combinations · COLT 2024
Nonlinear spiked covariance matrices and signal propagation in deep neural networks · COLT 2024
Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit · NIPS 2024
SILVER: Single-loop variance reduction and application to federated learning · ICML 2024
Pretrained Transformer Efficiently Learns Low-Dimensional Target Functions In-Context · NIPS 2024
Improved statistical and computational complexity of the mean-field Langevin dynamics under structured data · ICLR 2024
Why is parameter averaging beneficial in SGD? An objective smoothing perspective · AISTATS 2024
Learning in the Presence of Low-dimensional Structure: A Spiked Random Matrix Perspective · NIPS 2023
Feature learning via mean-field Langevin dynamics: classifying sparse parities and beyond · NIPS 2023
Gradient-Based Feature Learning under Structured Data · NIPS 2023
Primal and Dual Analysis of Entropic Fictitious Play for Finite-sum Problems · ICML 2023
Uniform-in-time propagation of chaos for the mean-field gradient Langevin dynamics · ICLR 2023
Convergence of mean-field Langevin dynamics: time-space discretization, stochastic gradient, and variance reduction · NIPS 2023
High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation · NIPS 2022
Convex Analysis of the Mean Field Langevin Dynamics · AISTATS 2022
Understanding the Variance Collapse of SVGD in High Dimensions · ICLR 2022
Particle Stochastic Dual Coordinate Ascent: Exponential convergent algorithm for mean field neural network optimization · ICLR 2022
Two-layer neural network on infinite dimensional data: global optimization guarantee in the mean-field regime · NIPS 2022
When does preconditioning help or hurt generalization? · ICLR 2021
Particle Dual Averaging: Optimization of Mean Field Neural Network with Global Convergence Rate Analysis · NIPS 2021
Generalization of Two-layer Neural Networks: An Asymptotic Viewpoint · ICLR 2020
On the Optimal Weighted $\ell_2$ Regularization in Overparameterized Linear Regression · NIPS 2020
Post Selection Inference with Incomplete Maximum Mean Discrepancy Estimator · ICLR 2019