Joshua Ainslie
26 papers · 2020–2025 · 8 conferences · across top CS/AI conferences
Achievements
🌍 Conference Polyglot (8)
🏃 Academic Marathon (5)
🧭 Keyword Pioneer
🌉 Interdisciplinary Bridge
🐝 Cross-Pollinator (3)
🌈 Renaissance Researcher (6)
🤝 Dynamic Duo (11)
👑 Triple Crown
👥 Mega-Team (22)
🔬 Deep Specialist (10)
💎 Century Club (26)
❓ The Questioner
📈 Trend Setter
⚡ Prolific Year (9)
🗃️ Keyword Collector (90)
🔥 Unstoppable (6)
Conferences
ACL (6)
EMNLP (6)
NAACL (4)
ICLR (3)
ICML (2)
IJCNLP (2)
NIPS (2)
COLING (1)
Keywords
question answering (5)
transformer architecture (4)
model compression (4)
long sequence (3)
text summarization (3)
efficient computing (3)
transformer model (3)
compositional generalization (3)
document understanding (3)
long document (2)
multimodal learning (2)
semantic parsing (2)
attention mechanism (2)
retrieval augmentation (2)
knowledge-intensive task (2)
conditional computation (2)
language model (2)
attention mask (2)
information extraction (2)
parse tree (2)
Papers
Linear Transformer Topological Masking with Graph Random Features
ICLR 2025
Learning the RoPEs: Better 2D and 3D Position Encodings with STRING
ICML 2025
Functional Interpolation for Relative Positions improves Long Context Transformers
ICLR 2024
MEMORY-VQ: Compression for Tractable Internet-Scale Memory
NAACL 2024
CoLT5: Faster Long-Range Transformers with Conditional Computation
EMNLP 2023
Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference
NIPS 2023
FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction
ACL 2023
FiDO: Fusion-in-Decoder optimized for stronger performance and faster inference
ACL 2023
A Suite of Generative Tasks for Multi-Level Multimodal Webpage Understanding
EMNLP 2023
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints
EMNLP 2023
mLongT5: A Multilingual and Efficient Text-To-Text Transformer for Longer Sequences
EMNLP 2023
Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints
ICLR 2023
Pre-computed memory or on-the-fly encoding? A hybrid approach to retrieval augmentation makes the most of your compute
ICML 2023
LongT5: Efficient Text-To-Text Transformer for Long Sequences
NAACL 2022
FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction
ACL 2022
Generate-and-Retrieve: Use Your Predictions to Improve Retrieval for Semantic Parsing
COLING 2022
Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT
EMNLP 2022
Making Transformers Solve Compositional Tasks
ACL 2022
FNet: Mixing Tokens with Fourier Transforms
NAACL 2022
ReadTwice: Reading Very Large Documents with Memories
NAACL 2021
Improving Compositional Generalization in Classification Tasks via Structure Annotations
ACL 2021
RealFormer: Transformer Likes Residual Attention
ACL 2021
Improving Compositional Generalization in Classification Tasks via Structure Annotations
IJCNLP 2021
RealFormer: Transformer Likes Residual Attention
IJCNLP 2021
Big Bird: Transformers for Longer Sequences
NIPS 2020
ETC: Encoding Long and Structured Inputs in Transformers
EMNLP 2020