Tri Dao
40 papers · 2017–2025 · 6 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (14)
Keyword Pioneer
Interdisciplinary Bridge
Renaissance Researcher (6)
Conference Polyglot (6)
Academic Marathon (8)
Cross-Pollinator (9)
Topic Evolution
Dynamic Duo (26)
Deep Specialist (10)
Triple Crown
Unstoppable (9)
Century Club (40)
Trend Setter
Keyword Collector (169)
Prolific Year (9)
Conference Pioneer
Conferences
NIPS (18)
ICML (12)
ICLR (7)
AISTATS (1)
ICCV (1)
UAI (1)
Keywords
model compression (7)
large language model (4)
state space model (4)
attention mechanism (4)
neural network (4)
language model (3)
model architecture (3)
sequence model (3)
kernel methods (2)
structured matrix (2)
random fourier feature (2)
gpu optimization (2)
recurrent neural network (2)
long convolution (2)
distributed learning (2)
language modeling (2)
transfer learning (2)
computational efficiency (2)
model parallelism (2)
kernel approximation (2)
Papers
Long-Context State-Space Video World Models
ICCV 2025
Ladder-Residual: Parallelism-Aware Architecture for Accelerating Large Model Inference with Communication Overlapping
ICML 2025
BitDelta: Your Fine-Tune May Only Be Worth One Bit
NIPS 2024
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
ICML 2024
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
ICML 2024
Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling
ICML 2024
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
ICLR 2024
RedPajama: an Open Dataset for Training Large Language Models
NIPS 2024
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision
NIPS 2024
Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers
NIPS 2024
The Mamba in the Llama: Distilling and Accelerating Hybrid Models
NIPS 2024
Simple Hardware-Efficient Long Convolutions for Sequence Modeling
ICML 2023
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
ICML 2023
Hyena Hierarchy: Towards Larger Convolutional Language Models
ICML 2023
Effectively Modeling Time Series with Simple Discrete State Spaces
ICLR 2023
Hungry Hungry Hippos: Towards Language Modeling with State Space Models
ICLR 2023
Fine-tuning Language Models over Slow Networks using Activation Quantization with Guarantees
NIPS 2022
Transform Once: Efficient Operator Learning in Frequency Domain
NIPS 2022
S4ND: Modeling Images and Videos as Multidimensional Signals with State Spaces
NIPS 2022
Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models
ICLR 2022
Monarch: Expressive Structured Matrices for Efficient and Accurate Training
ICML 2022
ButterflyFlow: Building Invertible Layers with Butterfly Matrices
ICML 2022
Decentralized Training of Foundation Models in Heterogeneous Environments
NIPS 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
NIPS 2022
MONGOOSE: A Learnable LSH Framework for Efficient Neural Network Training
ICLR 2021
Combining Recurrent, Convolutional, and Continuous-time Models with Linear State Space Layers
NIPS 2021
Rethinking Neural Operations for Diverse Tasks
NIPS 2021
Scatterbrain: Unifying Sparse and Low-rank Attention
NIPS 2021
Catformer: Designing Stable Transformers via Sensitivity Analysis
ICML 2021
Knowledge Distillation as Semiparametric Inference
ICLR 2021
HiPPO: Recurrent Memory with Optimal Polynomial Projections
NIPS 2020
Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps
ICLR 2020
Adaptive Hashing for Model Counting
UAI 2019
Approximating the Permanent by Sampling from Adaptive Partitions
NIPS 2019
On the Downstream Performance of Compressed Word Embeddings
NIPS 2019
Low-Precision Random Fourier Features for Memory-constrained Kernel Approximation
AISTATS 2019
Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations
ICML 2019
A Kernel Theory of Modern Data Augmentation
ICML 2019
Learning Compressed Transforms with Low Displacement Rank
NIPS 2018
Gaussian Quadrature for Kernel Features
NIPS 2017