Kazuki Irie
23 papers · 2016–2025 · 6 top CS/AI conferences
Achievements
🧭 Keyword Pioneer
🌍 Conference Polyglot (6)
🌉 Interdisciplinary Bridge
🌈 Renaissance Researcher (5)
🏃 Academic Marathon (9)
🐝 Cross-Pollinator (8)
🗺️ Taxonomy Completionist (43)
🤝 Dynamic Duo (13)
👑 Triple Crown
🏆 Keyword Champion (2)
🧬 Topic Evolution
🔥 Unstoppable (5)
🗃️ Keyword Collector (94)
⚡ Prolific Year (5)
❓ The Questioner
💎 Century Club (23)
🚀 Conference Pioneer
Conferences
INTERSPEECH (6)
NIPS (6)
EMNLP (4)
ICLR (3)
ICML (3)
ACL (1)
Research topics
Keywords
language model (6)
attention mechanism (6)
language modeling (4)
long short-term memory (3)
speech recognition (3)
automatic speech recognition (3)
mixture of expert (3)
recurrent neural network (3)
fast weight programmer (3)
positional encoding (2)
bidirectional lstm (2)
linear transformer (2)
neural network (2)
systematic generalization (2)
universal transformer (2)
knowledge distillation (1)
reinforcement learning (1)
bayesian learning (1)
contrastive learning (1)
transformer architecture (1)
Papers
Why Are Positional Encodings Nonessential for Deep Autoregressive Transformers? Revisiting a Petroglyph
ACL 2025
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
NIPS 2024
Exploring the Promise and Limits of Real-Time Recurrent Learning
ICLR 2024
MoEUT: Mixture-of-Experts Universal Transformers
NIPS 2024
Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers
NIPS 2024
Contrastive Training of Complex-Valued Autoencoders for Object Discovery
NIPS 2023
Practical Computational Power of Linear Transformers and Their Recurrent and Self-Referential Extensions
EMNLP 2023
Approximating Two-Layer Feedforward Networks for Efficient Transformers
EMNLP 2023
Images as Weight Matrices: Sequential Image Generation Through Synaptic Learning Rules
ICLR 2023
The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization
ICLR 2022
The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention
ICML 2022
A Modern Self-Referential Weight Matrix That Learns to Modify Itself
ICML 2022
Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules
NIPS 2022
CTL++: Evaluating Generalization on Never-Seen Compositional Patterns of Known Functions, and Compatibility of Neural Representations
EMNLP 2022
Going Beyond Linear Transformers with Recurrent Fast Weight Programmers
NIPS 2021
The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers
EMNLP 2021
Linear Transformers Are Secretly Fast Weight Programmers
ICML 2021
RWTH ASR Systems for LibriSpeech: Hybrid vs Attention
INTERSPEECH 2019
On the Choice of Modeling Unit for Sequence-to-Sequence Speech Recognition
INTERSPEECH 2019
Language Modeling with Deep Transformers
INTERSPEECH 2019
Improved Training of End-to-end Attention Models for Speech Recognition
INTERSPEECH 2018
Investigation on Estimation of Sentence Probability by Combining Forward, Backward and Bi-directional LSTM-RNNs
INTERSPEECH 2018
LSTM, GRU, Highway and a Bit of Attention: An Empirical Overview for Language Modeling in Speech Recognition
INTERSPEECH 2016