Avner May
9 papers · 2019–2025 · 6 conferences · across top CS/AI conferences
Achievements
Hot Topic Early Bird
Interdisciplinary Bridge
Conference Polyglot (6)
Academic Marathon (6)
Cross-Pollinator (11)
Renaissance Researcher (6)
Taxonomy Completionist (19)
Keyword Pioneer
Trend Setter
The Questioner
Conferences
NIPS (4)
ACL (1)
AISTATS (1)
ICLR (1)
ICML (1)
JMLR (1)
Keywords
speculative decoding (3)
kernel approximation (2)
word embedding (2)
token generation (2)
inference acceleration (2)
model compression (2)
acoustic modeling (1)
model serving (1)
dynamic programming (1)
model inference (1)
generalization bound (1)
state space model (1)
batch processing (1)
language model (1)
downstream performance (1)
low-precision quantization (1)
consumer gpu (1)
linear rnn (1)
generalization performance (1)
pretrained embedding (1)
Papers
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding
ICLR 2025
Cost-efficient Collaboration between On-device and Cloud Language Models
ICML 2025
The Mamba in the Llama: Distilling and Accelerating Hybrid Models
NIPS 2024
Sequoia: Scalable and Robust Speculative Decoding
NIPS 2024
SpecExec: Massively Parallel Speculative Decoding For Interactive LLM Inference on Consumer Devices
NIPS 2024
Contextual Embeddings: When Are They Worth It?
ACL 2020
Kernel Approximation Methods for Speech Recognition
JMLR 2019
Low-Precision Random Fourier Features for Memory-constrained Kernel Approximation
AISTATS 2019
On the Downstream Performance of Compressed Word Embeddings
NIPS 2019