Hirofumi Inaguma

25 papers · 2017–2025 · across 6 top CS/AI conferences

Achievements


🏃 Academic Marathon (8) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (6) 🐝 Cross-Pollinator (12)

🌈 Renaissance Researcher (5) 👥 Mega-Team (62) 🤝 Dynamic Duo (10) 🔬 Deep Specialist (10) 🏆 Keyword Champion (3) 💎 Century Club (25) ⚡ Prolific Year (5) 🔥 Unstoppable (6) 🗃️ Keyword Collector (96)

Conferences

INTERSPEECH (11) ACL (9) EMNLP (2) ICLR (1) IJCNLP (1) NAACL (1)

Top co-authors

Shinji Watanabe (10) Tatsuya Kawahara (9) Jiatong Shi (7) Juan Pino (7) Xutai Ma (6) Yun Tang (6) Masato Mimura (5) Kevin Duh (5) Ilia Kulikov (4) Brian Yan (3)

Keywords

automatic speech recognition (10) speech translation (6) speech recognition (6) speech-to-speech translation (5) knowledge distillation (4) connectionist temporal classification (4) speech-to-text translation (3) neural machine translation (3) monotonic attention (3) self-supervised learning (3) end-to-end speech translation (2) speech processing (2) spoken language translation (2) streaming speech recognition (2) simultaneous translation (2) discrete representation (2) speech synthesis (2) model ensembling (2) encoder-decoder model (2) speech representation (2)

Papers

SSR: Alignment-Aware Modality Connector for Speech Language Models · ACL 2025
Seeing is Believing: Emotion-Aware Audio-Visual Language Modeling for Expressive Speech Generation · EMNLP 2025
Investigating Decoder-only Large Language Models for Speech-to-text Translation · INTERSPEECH 2024
MMM: Multi-Layer Multi-Residual Multi-Stream Discrete Speech Representation from Self-supervised Learning Model · INTERSPEECH 2024
Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction · ICLR 2024
Speech-to-Speech Translation for a Real-world Unwritten Language · ACL 2023
Findings of the IWSLT 2023 Evaluation Campaign · ACL 2023
Simple and Effective Unsupervised Speech Translation · ACL 2023
Exploration on HuBERT with Multiple Resolution · INTERSPEECH 2023
Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks · ACL 2023
UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units · ACL 2023
ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit · ACL 2023
Non-autoregressive Error Correction for CTC-based ASR with Phone-conditioned Masked LM · INTERSPEECH 2022
Source and Target Bidirectional Knowledge Distillation for End-to-end Speech Translation · NAACL 2021
ESPnet-ST IWSLT 2021 Offline Speech Translation System · ACL 2021
ESPnet-ST IWSLT 2021 Offline Speech Translation System · IJCNLP 2021
StableEmit: Selection Probability Discount for Reducing Emission Latency of Streaming Monotonic Attention ASR · INTERSPEECH 2021
VAD-Free Streaming Hybrid CTC/Attention ASR for Unsegmented Recording · INTERSPEECH 2021
CTC-Synchronous Training for Monotonic Attention Model · INTERSPEECH 2020
ESPnet-ST: All-in-One Speech Translation Toolkit · ACL 2020
Enhancing Monotonic Multihead Attention for Streaming ASR · INTERSPEECH 2020
Distilling the Knowledge of BERT for Sequence-to-Sequence ASR · INTERSPEECH 2020
End-to-End Speech-to-Dialog-Act Recognition · INTERSPEECH 2020
The JHU/KyotoU Speech Translation System for IWSLT 2018 · EMNLP 2018
Social Signal Detection in Spontaneous Dialogue Using Bidirectional LSTM-CTC · INTERSPEECH 2017