Takafumi Moriya

29 papers · 2018–2024 · 1 conference · across top CS/AI conferences

Achievements


🏃 Academic Marathon (6) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (12) 🐝 Cross-Pollinator (9)

🏠 Conference Loyalist (29) 🤝 Dynamic Duo (16) 🔬 Deep Specialist (16) 💎 Century Club (29) 🗃️ Keyword Collector (122) ❓ The Questioner (2) ⚡ Prolific Year (6) 🔥 Unstoppable (7)

Conferences

INTERSPEECH (29)

Top co-authors

Tomohiro Tanaka (16) Ryo Masumura (15) Takanori Ashihara (14) Hiroshi Sato (14) Marc Delcroix (12) Tsubasa Ochiai (8) Kohei Matsuura (7) Mana Ihori (7) Atsushi Ando (6) Yusuke Shinohara (6)

Research topics

Speech & Audio (1)

Keywords

automatic speech recognition (10) knowledge distillation (4) neural transducer (4) self-supervised learning (4) speech representation (3) end-to-end speech recognition (3) speech recognition (3) neural network (2) attention mechanism (2) speech summarization (2) domain adaptation (2) speech synthesis (2) speech enhancement (2) model compression (2) acoustic model (2) end-to-end model (2) multi-talker speech (2) speech separation (1) voice conversion (1) neural network pruning (1)

Papers

SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling INTERSPEECH 2024
Boosting Hybrid Autoregressive Transducer-based ASR with Internal Acoustic Model Training and Dual Blank Thresholding INTERSPEECH 2024
Pre-training Neural Transducer-based Streaming Voice Conversion for Faster Convergence and Alignment-free Training INTERSPEECH 2024
Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation INTERSPEECH 2024
Factor-Conditioned Speaking-Style Captioning INTERSPEECH 2024
Unified Multi-Talker ASR with and without Target-speaker Enrollment INTERSPEECH 2024
Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data INTERSPEECH 2023
Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization INTERSPEECH 2023
Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss INTERSPEECH 2023
End-to-End Joint Target and Non-Target Speakers ASR INTERSPEECH 2023
SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge? INTERSPEECH 2023
VC-T: Streaming Voice Conversion Based on Neural Transducer INTERSPEECH 2023
Domain Adversarial Self-Supervised Speech Representation Learning for Improving Unknown Domain Downstream Tasks INTERSPEECH 2022
Deep versus Wide: An Analysis of Student Architectures for Task-Agnostic Knowledge Distillation of Self-Supervised Speech Models INTERSPEECH 2022
Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations INTERSPEECH 2022
Streaming Target-Speaker ASR with Neural Transducer INTERSPEECH 2022
End-to-End Joint Modeling of Conversation History-Dependent and Independent ASR Systems with Multi-History Training INTERSPEECH 2022
Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition INTERSPEECH 2021
Streaming End-to-End Speech Recognition for Hybrid RNN-T/Attention Architecture INTERSPEECH 2021
Investigating the Impact of Spectral and Temporal Degradation on End-to-End Automatic Speech Recognition Performance INTERSPEECH 2021
Should We Always Separate?: Switching Between Enhanced and Observed Signals for Overlapping Speech Recognition INTERSPEECH 2021
Self-Distillation for Improving CTC-Transformer-Based ASR Systems INTERSPEECH 2020
A Joint End-to-End and DNN-HMM Hybrid Automatic Speech Recognition System with Transferring Sharable Knowledge INTERSPEECH 2019
Joint Maximization Decoder with Neural Converters for Fully Neural Network-Based Japanese Speech Recognition INTERSPEECH 2019
Neural Whispered Speech Detection with Imbalanced Learning INTERSPEECH 2019
End-to-End Automatic Speech Recognition with a Reconstruction Criterion Using Speech-to-Text and Text-to-Speech Encoder-Decoders INTERSPEECH 2019
Multi-task Learning with Augmentation Strategy for Acoustic-to-word Attention-based Encoder-decoder Speech Recognition INTERSPEECH 2018
Encoder Transfer for Attention-based Acoustic-to-word Speech Recognition INTERSPEECH 2018
Automatic DNN Node Pruning Using Mixture Distribution-based Group Regularization INTERSPEECH 2018