Hadi Daneshmand

16 papers · 2014–2025 · 4 top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🏃 Academic Marathon (11) 🐝 Cross-Pollinator (13) 🗺️ Taxonomy Completionist (19) 🧭 Keyword Pioneer

🐣 Hot Topic Early Bird 🌍 Conference Polyglot (4) 🏆 Keyword Champion (2) 💎 Century Club (16) 🗃️ Keyword Collector (70)

Conferences

NIPS (6) ICML (5) AISTATS (3) ICLR (2)

Top co-authors

Thomas Hofmann (7) Aurelien Lucchi (6) Amir Joudaki (4) Francis R. Bach (3) Jonas Köhler (3) Peiyuan Zhang (2) Antonio Orvieto (2) Le Song (1) Kwangjun Ahn (1) Aryan Mokhtari (1)

Keywords

batch normalization (4) stochastic gradient descent (3) adaptive sampling (2) convex optimization (2) non-convex optimization (1) neural network theory (1) iterative optimization (1) in-context learning (1) online learning (1) optimal transport (1) sample complexity (1) loss landscape (1) empirical risk minimization (1) accelerated optimization (1) neural network training (1) network inference (1) gradient descent (1) variance reduction (1) markov chain (1) representation learning (1)

Papers

Transformers Can Learn Temporal Difference Methods for In-Context Reinforcement Learning · ICLR 2025
Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion · ICLR 2024
Efficient displacement convex optimization with particle gradient descent · ICML 2023
On Bridging the Gap between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization · ICML 2023
On the impact of activation and normalization in obtaining isometric embeddings at initialization · NIPS 2023
Transformers learn to implement preconditioned gradient descent for in-context learning · NIPS 2023
Revisiting the Role of Euler Numerical Integration on Acceleration and Stability in Convex Optimization · AISTATS 2021
Batch Normalization Orthogonalizes Representations in Deep Random Networks · NIPS 2021
Rethinking the Variational Interpretation of Accelerated Optimization Methods · NIPS 2021
Batch normalization provably avoids rank collapse for randomly initialised deep networks · NIPS 2020
Local Saddle Point Optimization: A Curvature Exploitation Approach · AISTATS 2019
Exponential convergence rates for Batch Normalization: The power of length-direction decoupling in non-convex optimization · AISTATS 2019
Escaping Saddles with Stochastic Gradients · ICML 2018
Starting Small - Learning with Adaptive Sample Sizes · ICML 2016
Adaptive Newton Method for Empirical Risk Minimization to Statistical Accuracy · NIPS 2016
Estimating Diffusion Network Structures: Recovery Conditions, Sample Complexity & Soft-thresholding Algorithm · ICML 2014