Donald Metzler

37 papers · 2009–2025 · 9 top CS/AI conferences

Achievements

🌍 Conference Polyglot (9) 🏃 Academic Marathon (16) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (9)

🗺️ Taxonomy Completionist (54) 🧬 Topic Evolution 🤝 Dynamic Duo (25) 👑 Triple Crown 🔬 Deep Specialist (10) ❓ The Questioner (7) 📈 Trend Setter ⚡ Prolific Year (10) 💎 Century Club (37) 🔥 Unstoppable (6) 🗃️ Keyword Collector (111)

Conferences

ACL (7) EMNLP (7) ICLR (7) NAACL (5) ICML (4) IJCNLP (3) NIPS (2) COLING (1) IJCAI (1)

Top co-authors

Yi Tay (25) Dara Bahri (17) Mostafa Dehghani (11) Jai Gupta (11) Zhen Qin (8) Tal Schuster (8) Vamsi Aribandi (7) Jinfeng Rao (6) Kai Hui (6) Vinh Tran (6)

Research topics

Architectures (1) Optimization & Theory (1)

Keywords

information retrieval (5) model architecture (5) transformer architecture (4) large language model (4) few-shot learning (3) language model (3) zero-shot learning (3) scaling law (2) natural language inference (2) masked language modeling (2) attention mechanism (2) document ranking (2) constituency parsing (2) text generation (2) model compression (2) document retrieval (2) dependency parsing (2) transfer learning (2) efficient computing (2) convolutional neural network (2)

Papers

Tomato, Tomahto, Tomate: Do Multilingual Language Models Understand Based on Subword-Level Semantic Concepts? · NAACL 2025
SEMQA: Semi-Extractive Multi-Source Question Answering · NAACL 2024
OpenMSD: Towards Multilingual Scientific Documents Similarity Measurement · COLING 2024
Large Language Models are Effective Text Rankers with Pairwise Ranking Prompting · NAACL 2024
Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling? · EMNLP 2023
UL2: Unifying Language Learning Paradigms · ICLR 2023
PaRaDe: Passage Ranking using Demonstrations with LLMs · EMNLP 2023
How Does Generative Retrieval Scale to Millions of Passages? · EMNLP 2023
LAIT: Efficient Multi-Segment Encoding in Transformers with Layer-Adjustable Interaction · ACL 2023
Transcending Scaling Laws with 0.1% Extra Compute · EMNLP 2023
DSI++: Updating Transformer Memory with New Documents · EMNLP 2023
Confident Adaptive Language Modeling · NIPS 2022
Transformer Memory as a Differentiable Search Index · NIPS 2022
ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference · ACL 2022
Dense Feature Memory Augmented Transformers for COVID-19 Vaccination Search Classification · EMNLP 2022
Stretching Sentence-pair NLI Models to Reason over Long Documents and Clusters · EMNLP 2022
ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning · ICLR 2022
Scale Efficiently: Insights from Pretraining and Finetuning Transformers · ICLR 2022
Charformer: Fast Character Transformers via Gradient-based Subword Tokenization · ICLR 2022
Scarf: Self-Supervised Contrastive Learning using Random Feature Corruption · ICLR 2022
HyperPrompt: Prompt-based Task-Conditioning of Transformers · ICML 2022
Are Pretrained Convolutions Better than Pretrained Transformers? · ACL 2021
Are Pretrained Convolutions Better than Pretrained Transformers? · IJCNLP 2021
StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling · IJCNLP 2021
How Reliable are Model Diagnostics? · IJCNLP 2021
How Reliable are Model Diagnostics? · ACL 2021
HyperGrid Transformers: Towards A Single Model for Multiple Tasks · ICLR 2021
Long Range Arena: A Benchmark for Efficient Transformers · ICLR 2021
StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling · ACL 2021
Synthesizer: Rethinking Self-Attention for Transformer Models · ICML 2021
OmniNet: Omnidirectional Representations from Transformers · ICML 2021
Reverse Engineering Configurations of Neural Text Generation Models · ACL 2020
Sparse Sinkhorn Attention · ICML 2020
Learning with Sparse and Biased Feedback for Personal Search · IJCAI 2018
Structured Event Retrieval over Microblog Archives · NAACL 2012
An Empirical Evaluation of Data-Driven Paraphrase Generation Techniques · ACL 2011
Search Engine Adaptation by Feedback Control Adjustment for Time-sensitive Query · NAACL 2009