Eran Malach
28 papers · 2017–2025 · 6 top CS/AI conferences
Achievements
Academic Marathon (8)
Interdisciplinary Bridge
Hot Topic Early Bird
Taxonomy Completionist (39)
Keyword Pioneer
Conference Polyglot (6)
Deep Specialist (10)
Topic Evolution
Triple Crown
Keyword Champion (2)
Keyword Collector (92)
Prolific Year (5)
Conference Pioneer
Century Club (28)
Unstoppable (9)
The Questioner
Conferences
NIPS (11)
ICML (8)
ICLR (5)
COLT (2)
COLING (1)
JMLR (1)
Keywords
learning theory (4)
gradient descent (4)
neural network (3)
representation learning (3)
neural network optimization (3)
computational complexity (2)
depth separation (2)
stochastic gradient descent (2)
decision tree (2)
lottery ticket hypothesis (2)
differentiable learning (2)
feature learning (2)
sparse parity (2)
neural network training (2)
text generation (1)
in-context learning (1)
semi-supervised learning (1)
knowledge distillation (1)
noisy label learning (1)
sample efficiency (1)
Papers
The Power of Random Features and the Limits of Distribution-Free Gradient Descent
ICML 2025
LoRA Soups: Merging LoRAs for Practical Skill Composition Tasks
COLING 2025
Mixture of Parrots: Experts improve memorization more than reasoning
ICLR 2025
A New Perspective on Shampoo's Preconditioner
ICLR 2025
Don't Stop Me Now: Embedding Based Scheduling for LLMs
ICLR 2025
The Role of Sparsity for Length Generalization in LLMs
ICML 2025
Universal Length Generalization with Turing Programs
ICML 2025
Repeat After Me: Transformers are Better than State Space Models at Copying
ICML 2024
On the Power of Decision Trees in Auto-Regressive Language Modeling
NIPS 2024
The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains
NIPS 2024
Transcendence: Generative Models Can Outperform The Experts That Train Them
NIPS 2024
Auto-Regressive Next-Token Predictors are Universal Learners
ICML 2024
Pareto Frontiers in Deep Feature Learning: Data, Compute, Width, and Luck
NIPS 2023
Knowledge Distillation: Bad Models Can Be Good Role Models
NIPS 2022
When Hardness of Approximation Meets Hardness of Learning
JMLR 2022
Efficient Learning of CNNs using Patch Based Features
ICML 2022
Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit
NIPS 2022
The Connection Between Approximation, Depth Separation and Learnability in Neural Networks
COLT 2021
On the Power of Differentiable Learning versus PAC and SQ Learning
NIPS 2021
Computational Separation Between Convolutional and Fully-Connected Networks
ICLR 2021
Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels
ICML 2021
The Implications of Local Correlation on Learning Some Deep Functions
NIPS 2020
Learning Parities with Neural Networks
NIPS 2020
ID3 Learns Juntas for Smoothed Product Distributions
COLT 2020
Proving the Lottery Ticket Hypothesis: Pruning is All You Need
ICML 2020
Is Deeper Better only when Shallow is Good?
NIPS 2019
SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data
ICLR 2018
Decoupling "when to update" from "how to update"
NIPS 2017