Kazuki Irie

23 papers · 2016–2025 · 6 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🌍 Conference Polyglot (6) 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🏃 Academic Marathon (9)

🐝 Cross-Pollinator (8) 🗺️ Taxonomy Completionist (43) 🤝 Dynamic Duo (13) 👑 Triple Crown 🏆 Keyword Champion (2) 🧬 Topic Evolution 🔥 Unstoppable (5) 🗃️ Keyword Collector (94) ⚡ Prolific Year (5) ❓ The Questioner 💎 Century Club (23) 🚀 Conference Pioneer

Conferences

INTERSPEECH (6) NIPS (6) EMNLP (4) ICLR (3) ICML (3) ACL (1)

Top co-authors

Jürgen Schmidhuber (13) Róbert Csordás (10) Hermann Ney (5) Ralf Schlüter (5) Albert Zeyer (3) Imanol Schlag (3) Juergen Schmidhuber (2) Anand Gopalakrishnan (2) Haim Sompolinsky (1) Tamer Alkhouli (1)

Research topics

Statistics (1)

Keywords

language model (6) attention mechanism (6) language modeling (4) long short-term memory (3) speech recognition (3) automatic speech recognition (3) mixture of expert (3) recurrent neural network (3) fast weight programmer (3) positional encoding (2) bidirectional lstm (2) linear transformer (2) neural network (2) systematic generalization (2) universal transformer (2) knowledge distillation (1) reinforcement learning (1) bayesian learning (1) contrastive learning (1) transformer architecture (1)

Papers

Why Are Positional Encodings Nonessential for Deep Autoregressive Transformers? A Petroglyph Revisited ACL 2025
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention NIPS 2024
Exploring the Promise and Limits of Real-Time Recurrent Learning ICLR 2024
MoEUT: Mixture-of-Experts Universal Transformers NIPS 2024
Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers NIPS 2024
Contrastive Training of Complex-Valued Autoencoders for Object Discovery NIPS 2023
Practical Computational Power of Linear Transformers and Their Recurrent and Self-Referential Extensions EMNLP 2023
Approximating Two-Layer Feedforward Networks for Efficient Transformers EMNLP 2023
Images as Weight Matrices: Sequential Image Generation Through Synaptic Learning Rules ICLR 2023
The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization ICLR 2022
The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention ICML 2022
A Modern Self-Referential Weight Matrix That Learns to Modify Itself ICML 2022
Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules NIPS 2022
CTL++: Evaluating Generalization on Never-Seen Compositional Patterns of Known Functions, and Compatibility of Neural Representations EMNLP 2022
Going Beyond Linear Transformers with Recurrent Fast Weight Programmers NIPS 2021
The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers EMNLP 2021
Linear Transformers Are Secretly Fast Weight Programmers ICML 2021
RWTH ASR Systems for LibriSpeech: Hybrid vs Attention INTERSPEECH 2019
On the Choice of Modeling Unit for Sequence-to-Sequence Speech Recognition INTERSPEECH 2019
Language Modeling with Deep Transformers INTERSPEECH 2019
Improved Training of End-to-end Attention Models for Speech Recognition INTERSPEECH 2018
Investigation on Estimation of Sentence Probability by Combining Forward, Backward and Bi-directional LSTM-RNNs INTERSPEECH 2018
LSTM, GRU, Highway and a Bit of Attention: An Empirical Overview for Language Modeling in Speech Recognition INTERSPEECH 2016