Dani Yogatama

42 papers · 2009–2026 · across 9 top CS/AI conferences

Achievements

🌍 Conference Polyglot (8) 🐣 Hot Topic Early Bird 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (16)

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🐝 Cross-Pollinator (11) 🌟 Keyword Trendsetter Combo (4) 🤝 Dynamic Duo (10) 👥 Mega-Team (69) 🌱 Topic Pioneer 🧬 Topic Evolution 🗃️ Keyword Collector (115) 🚀 Conference Pioneer 📈 Trend Setter ❓ The Questioner 🔥 Unstoppable (12) ⚡ Prolific Year (7) 💎 Century Club (40)

Conferences

ACL (11) EMNLP (9) ICLR (8) NIPS (4) ICML (3) IJCNLP (3) EACL (2) AISTATS (1) NAACL (1)

Top co-authors

Chris Dyer (10) Noah A. Smith (10) Lingpeng Kong (7) Phil Blunsom (4) Ting-Rui Chiang (4) Wang Ling (4) Sebastian Ruder (4) Hao Peng (3) Noah Smith (3) Robert Stanforth (3)

Research topics

Architectures (1) Optimization & Theory (1)

Keywords

language model (5) text classification (3) question answering (3) neural network (3) transformer model (2) in-context learning (2) adversarial robustness (2) transfer learning (2) masked language modeling (2) inference efficiency (2) formal verification (2) attention mechanism (2) symbol substitution (2) language modeling (2) interval bound propagation (2) word embedding (2) mathematical reasoning (1) logistic regression (1) zero-shot learning (1) natural language processing (1)

Papers

Pelican Soup Framework: A Theoretical Framework for Language Model Capabilities EACL 2026
FOL-Traces: Verified First-Order Logic Reasoning Traces at Scale EACL 2026
The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities ICLR 2025
DeLLMa: Decision Making Under Uncertainty with Large Language Models ICLR 2025
The Rotary Position Embedding May Cause Dimension Inefficiency in Attention Heads for Long-Distance Retrieval ACL 2025
On Retrieval Augmentation and the Limitations of Language Model Training NAACL 2024
Interpretable Diffusion via Information Decomposition ICLR 2024
The Distributional Hypothesis Does Not Fully Explain the Benefits of Masked Language Model Pretraining EMNLP 2023
Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling? EMNLP 2023
ABC: Attention with Bounded-memory Control ACL 2022
Scale Efficiently: Insights from Pretraining and Finetuning Transformers ICLR 2022
A Contrastive Framework for Neural Text Generation NIPS 2022
Finetuning Pretrained Transformers into RNNs EMNLP 2021
Mind the Gap: Assessing Temporal Generalization in Neural Language Models NIPS 2021
Random Feature Attention ICLR 2021
End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering NIPS 2021
A Mutual Information Maximization Perspective of Language Representation Learning ICLR 2020
On the Cross-lingual Transferability of Monolingual Representations ACL 2020
A Call for More Rigor in Unsupervised Cross-lingual Learning ACL 2020
Reducing Sentiment Bias in Language Models via Counterfactual Evaluation EMNLP 2020
Variational Smoothing in Recurrent Neural Network Language Models ICLR 2019
Achieving Verified Robustness to Symbol Substitutions via Interval Bound Propagation EMNLP 2019
Episodic Memory in Lifelong Language Learning NIPS 2019
Achieving Verified Robustness to Symbol Substitutions via Interval Bound Propagation IJCNLP 2019
Memory Architectures in Recurrent Neural Network Language Models ICLR 2018
LSTMs Can Learn Syntax-Sensitive Dependencies Well, But Modeling Structure Makes Them Better ACL 2018
Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems ACL 2017
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin ICML 2016
Bayesian Optimization of Text Representations EMNLP 2015
Extractive Summarization by Maximizing Semantic Volume EMNLP 2015
Sparse Overcomplete Word Vector Representations IJCNLP 2015
Embedding Methods for Fine Grained Entity Type Classification ACL 2015
Sparse Overcomplete Word Vector Representations ACL 2015
Embedding Methods for Fine Grained Entity Type Classification IJCNLP 2015
Learning Word Representations with Hierarchical Sparse Coding ICML 2015
Making the Most of Bag of Words: Sentence Regularization with Alternating Direction Method of Multipliers ICML 2014
Efficient Transfer Learning Method for Automatic Hyperparameter Tuning AISTATS 2014
Linguistic Structured Sparsity in Text Categorization ACL 2014
A Probabilistic Model for Canonicalizing Named Entity Mentions ACL 2012
Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments ACL 2011
Predicting a Scientific Community’s Response to an Article EMNLP 2011
Multilingual Spectral Clustering Using Document Similarity Propagation EMNLP 2009