Darshil Doshi
4 papers · 2023–2025 · 3 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer · Conference Polyglot (3) · Cross-Pollinator (12) · Interdisciplinary Bridge · Taxonomy Completionist (10)
The Questioner
Conferences
NIPS (2)
ICLR (1)
ICML (1)
Top co-authors
Keywords
transformer architecture (1)
in-context learning (1)
out-of-distribution generalization (1)
gaussian process (1)
attention head (1)
residual connection (1)
skill composition (1)
modular arithmetic (1)
layer normalization (1)
neural network initialization (1)
neural network (1)
critical initialization (1)
partial jacobian (1)
Papers
(How) Can Transformers Predict Pseudo-Random Numbers?
ICML 2025
Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks
NIPS 2024
To Grok or not to Grok: Disentangling Generalization and Memorization on Corrupted Algorithmic Datasets
ICLR 2024
Critical Initialization of Wide and Deep Neural Networks using Partial Jacobians: General Theory and Applications
NIPS 2023