Andrew M Saxe
16 papers · 2016–2025 · 3 conferences · across top CS/AI conferences
Achievements
🐝 Cross-Pollinator (15) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🌍 Conference Polyglot (3)
🏃 Academic Marathon (9) ⚡ Prolific Year (6) 💎 Century Club (16) ❓ The Questioner (2)
Conferences
ICML (9)
ICLR (4)
NIPS (3)
Keywords
neural network (2)
representation learning (1)
continual learning (1)
stochastic gradient descent (1)
generalization error (1)
activation function (1)
hidden unit (1)
vanishing gradient (1)
teacher-student setup (1)
modular architecture (1)
rectified linear unit (1)
gated network (1)
relu nonlinearity (1)
task abstraction (1)
tensor switching (1)
weight specialization (1)
neural network architecture (1)
tensor switching network (1)
Papers
Training Dynamics of In-Context Learning in Linear Attention
ICML 2025
Strategy Coopetition Explains the Emergence and Transience of In-Context Learning
ICML 2025
Algorithm Development in Neural Networks: Insights from the Streaming Parity Task
ICML 2025
Make Haste Slowly: A Theory of Emergent Structured Mixed Selectivity in Feature Learning ReLU Networks
ICLR 2025
A Theory of Initialisation's Impact on Specialisation
ICLR 2025
Not all solutions are created equal: An analytical dissociation of functional and representational similarity in deep linear neural networks
ICML 2025
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
ICLR 2025
Tilting the Odds at the Lottery: the Interplay of Overparameterisation and Curricula in Neural Networks
ICML 2024
Flexible task abstractions emerge in linear networks with fast and bounded units
NIPS 2024
Why Do Animals Need Shaping? A Theory of Task Composition and Curriculum Learning
ICML 2024
What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation
ICML 2024
When Representations Align: Universality in Representation Learning Dynamics
ICML 2024
Understanding Unimodal Bias in Multimodal Deep Linear Networks
ICML 2024
On The Specialization of Neural Modules
ICLR 2023
Dynamics of stochastic gradient descent for two-layer neural networks in the teacher-student setup
NIPS 2019
Tensor Switching Networks
NIPS 2016