Albert Gu

32 papers · 2018–2025 · 5 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (5) 🏃 Academic Marathon (7) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐣 Hot Topic Early Bird

🗺️ Taxonomy Completionist (43) 🧬 Topic Evolution 👑 Triple Crown 🏆 Keyword Champion (5) 🤝 Dynamic Duo (18) 🗃️ Keyword Collector (92) ⚡ Prolific Year (5) 💎 Century Club (32) 🔥 Unstoppable (8) 📈 Trend Setter

Conferences

ICML (12) NIPS (11) ICLR (7) EMNLP (1) NAACL (1)

Top co-authors

Christopher Re (18) Tri Dao (11) Karan Goel (7) Atri Rudra (6) Razvan Pascanu (2) Ines Chami (2) Isys Johnson (2) Matthew Eichhorn (2) Ankit Gupta (2) Caglar Gulcehre (2)

Research topics

Models (1)

Keywords

state space model (6) sequence modeling (5) recurrent neural network (5) sequence model (3) model architecture (3) state-space model (3) model compression (3) dimensionality reduction (2) representation learning (2) long-range dependency (2) hyperbolic embedding (2) parallel training (2) long sequence modeling (2) long-range dependencies (2) data augmentation (1) knowledge distillation (1) attention mechanism (1) video classification (1) structured matrix (1) principal component analysis (1)

Papers

Understanding and Improving Length Generalization in Recurrent Models · ICML 2025
Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism · ICML 2025
On the Benefits of Memory for Modeling Time-Dependent PDEs · ICLR 2025
Towards Codec-LM Co-design for Neural Codec Language Models · NAACL 2025
Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers · NIPS 2024
Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models · NIPS 2024
Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling · ICML 2024
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality · ICML 2024
How to Train your HIPPO: State Space Models with Generalized Orthogonal Basis Projections · ICLR 2023
Structured State Space Models for In-Context Reinforcement Learning · NIPS 2023
Pretraining Without Attention · EMNLP 2023
Modelling Long Range Dependencies in $N$D: From Task-Specific to a General Purpose CNN · ICLR 2023
Resurrecting Recurrent Neural Networks for Long Sequences · ICML 2023
Efficiently Modeling Long Sequences with Structured State Spaces · ICLR 2022
On the Parameterization and Initialization of Diagonal State Space Models · NIPS 2022
Diagonal State Spaces are as Effective as Structured State Spaces · NIPS 2022
S4ND: Modeling Images and Videos as Multidimensional Signals with State Spaces · NIPS 2022
It’s Raw! Audio Generation with State-Space Models · ICML 2022
HoroPCA: Hyperbolic Dimensionality Reduction via Horospherical Projections · ICML 2021
Catformer: Designing Stable Transformers via Sensitivity Analysis · ICML 2021
Combining Recurrent, Convolutional, and Continuous-time Models with Linear State Space Layers · NIPS 2021
Model Patching: Closing the Subgroup Performance Gap with Data Augmentation · ICLR 2021
Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps · ICLR 2020
From Trees to Continuous Embeddings and Back: Hyperbolic Hierarchical Clustering · NIPS 2020
Improving the Gating Mechanism of Recurrent Neural Networks · ICML 2020
HiPPO: Recurrent Memory with Optimal Polynomial Projections · NIPS 2020
No Subclass Left Behind: Fine-Grained Robustness in Coarse-Grained Classification Problems · NIPS 2020
Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations · ICML 2019
A Kernel Theory of Modern Data Augmentation · ICML 2019
Learning Mixed-Curvature Representations in Product Spaces · ICLR 2019
Representation Tradeoffs for Hyperbolic Embeddings · ICML 2018
Learning Compressed Transforms with Low Displacement Rank · NIPS 2018