Hadi Daneshmand
16 papers · 2014–2025 · 4 top CS/AI conferences
Achievements
Interdisciplinary Bridge
Academic Marathon (11)
Cross-Pollinator (13)
Taxonomy Completionist (19)
Keyword Pioneer
Hot Topic Early Bird
Conference Polyglot (4)
Keyword Champion (2)
Century Club (16)
Keyword Collector (70)
Conferences
NIPS (6)
ICML (5)
AISTATS (3)
ICLR (2)
Top co-authors
Keywords
batch normalization (4)
stochastic gradient descent (3)
adaptive sampling (2)
convex optimization (2)
non-convex optimization (1)
neural network theory (1)
iterative optimization (1)
in-context learning (1)
online learning (1)
optimal transport (1)
sample complexity (1)
loss landscape (1)
empirical risk minimization (1)
accelerated optimization (1)
neural network training (1)
network inference (1)
gradient descent (1)
variance reduction (1)
markov chain (1)
representation learning (1)
Papers
Transformers Can Learn Temporal Difference Methods for In-Context Reinforcement Learning
ICLR 2025
Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion
ICLR 2024
Efficient displacement convex optimization with particle gradient descent
ICML 2023
On Bridging the Gap between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization
ICML 2023
On the impact of activation and normalization in obtaining isometric embeddings at initialization
NIPS 2023
Transformers learn to implement preconditioned gradient descent for in-context learning
NIPS 2023
Revisiting the Role of Euler Numerical Integration on Acceleration and Stability in Convex Optimization
AISTATS 2021
Batch Normalization Orthogonalizes Representations in Deep Random Networks
NIPS 2021
Rethinking the Variational Interpretation of Accelerated Optimization Methods
NIPS 2021
Batch normalization provably avoids rank collapse for randomly initialised deep networks
NIPS 2020
Local Saddle Point Optimization: A Curvature Exploitation Approach
AISTATS 2019
Exponential convergence rates for Batch Normalization: The power of length-direction decoupling in non-convex optimization
AISTATS 2019
Escaping Saddles with Stochastic Gradients
ICML 2018
Starting Small - Learning with Adaptive Sample Sizes
ICML 2016
Adaptive Newton Method for Empirical Risk Minimization to Statistical Accuracy
NIPS 2016
Estimating Diffusion Network Structures: Recovery Conditions, Sample Complexity & Soft-thresholding Algorithm
ICML 2014