Joshua Ainslie

26 papers · 2020–2025 · 8 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (8) 🏃 Academic Marathon (5) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (3) 🌈 Renaissance Researcher (6) 🤝 Dynamic Duo (11) 👑 Triple Crown 👥 Mega-Team (22) 🔬 Deep Specialist (10) 💎 Century Club (26) ❓ The Questioner 📈 Trend Setter ⚡ Prolific Year (9) 🗃️ Keyword Collector (90) 🔥 Unstoppable (6)

Conferences

ACL (6) EMNLP (6) NAACL (4) ICLR (3) ICML (2) IJCNLP (2) NIPS (2) COLING (1)

Top co-authors

Santiago Ontanon (11) Sumit Sanghai (8) Yury Zemlyanskiy (7) Michiel De Jong (7) James Lee-Thorp (5) Fei Sha (4) Anirudh Ravula (4) Mandy Guo (4) David Uthus (3) Kumar Avinava Dubey (3)

Keywords

question answering (5) transformer architecture (4) model compression (4) long sequence (3) text summarization (3) efficient computing (3) transformer model (3) compositional generalization (3) document understanding (3) long document (2) multimodal learning (2) semantic parsing (2) attention mechanism (2) retrieval augmentation (2) knowledge-intensive task (2) conditional computation (2) language model (2) attention mask (2) information extraction (2) parse tree (2)

Papers

Linear Transformer Topological Masking with Graph Random Features ICLR 2025
Learning the RoPEs: Better 2D and 3D Position Encodings with STRING ICML 2025
Functional Interpolation for Relative Positions improves Long Context Transformers ICLR 2024
MEMORY-VQ: Compression for Tractable Internet-Scale Memory NAACL 2024
CoLT5: Faster Long-Range Transformers with Conditional Computation EMNLP 2023
Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference NIPS 2023
FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction ACL 2023
FiDO: Fusion-in-Decoder optimized for stronger performance and faster inference ACL 2023
A Suite of Generative Tasks for Multi-Level Multimodal Webpage Understanding EMNLP 2023
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints EMNLP 2023
mLongT5: A Multilingual and Efficient Text-To-Text Transformer for Longer Sequences EMNLP 2023
Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints ICLR 2023
Pre-computed memory or on-the-fly encoding? A hybrid approach to retrieval augmentation makes the most of your compute ICML 2023
LongT5: Efficient Text-To-Text Transformer for Long Sequences NAACL 2022
FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction ACL 2022
Generate-and-Retrieve: Use Your Predictions to Improve Retrieval for Semantic Parsing COLING 2022
Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT EMNLP 2022
Making Transformers Solve Compositional Tasks ACL 2022
FNet: Mixing Tokens with Fourier Transforms NAACL 2022
ReadTwice: Reading Very Large Documents with Memories NAACL 2021
Improving Compositional Generalization in Classification Tasks via Structure Annotations ACL 2021
RealFormer: Transformer Likes Residual Attention ACL 2021
Improving Compositional Generalization in Classification Tasks via Structure Annotations IJCNLP 2021
RealFormer: Transformer Likes Residual Attention IJCNLP 2021
Big Bird: Transformers for Longer Sequences NIPS 2020
ETC: Encoding Long and Structured Inputs in Transformers EMNLP 2020