Tomohiro Nakatani

33 papers · 2016–2024 · 1 conference · across top CS/AI conferences

Achievements

🏃 Academic Marathon (8) 🗺️ Taxonomy Completionist (10) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird

🌈 Renaissance Researcher (6) 🏠 Conference Loyalist (33) 🔬 Deep Specialist (14) 🧬 Topic Evolution 🏆 Keyword Champion (2) 🤝 Dynamic Duo (22) 🗃️ Keyword Collector (52) 📈 Trend Setter 🔥 Unstoppable (9) 🚀 Conference Pioneer ⚡ Prolific Year (5) ❓ The Questioner 💎 Century Club (33)

Conferences

INTERSPEECH (33)

Top co-authors

Marc Delcroix (22) Keisuke Kinoshita (21) Atsunori Ogawa (18) Shoko Araki (11) Toshio Irino (7) Shigeki Karita (6) Tsubasa Ochiai (6) Dung T. Tran (4) Kenichi Arai (4) Katsuhiko Yamamoto (4)

Keywords

speech enhancement (14) automatic speech recognition (8) speech intelligibility (6) neural network (5) deep neural network (5) source separation (4) word error rate (4) noise robustness (3) speech separation (3) speaker separation (3) speech recognition (3) target speech extraction (3) recurrent neural network (3) acoustic modeling (3) acoustic model (3) acoustic model adaptation (3) end-to-end speech recognition (2) bottleneck feature (2) speaker extraction (2) attention mechanism (2)

Papers

Array Geometry-Robust Attention-Based Neural Beamformer for Moving Speakers · INTERSPEECH 2024
Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization · INTERSPEECH 2023
Impact of Residual Noise and Artifacts in Speech Enhancement Errors on Intelligibility of Human and Machine · INTERSPEECH 2023
Target Speech Extraction with Conditional Diffusion Model · INTERSPEECH 2023
Listen only to me! How well can target speech extraction handle false alarms? · INTERSPEECH 2022
PILOT: Introducing Transformers for Probabilistic Sound Event Localization · INTERSPEECH 2021
Comparison of Remote Experiments Using Crowdsourcing and Laboratory Experiments on Speech Intelligibility · INTERSPEECH 2021
Predicting Intelligibility of Enhanced Speech Using Posteriors Derived from DNN-Based ASR System · INTERSPEECH 2020
Multi-Talker ASR for an Unknown Number of Sources: Joint Training of Source Counting, Separation and ASR · INTERSPEECH 2020
Multi-Path RNN for Hierarchical Modeling of Long Sequential Data and its Application to Speaker Stream Separation · INTERSPEECH 2020
Computationally Efficient and Versatile Framework for Joint Optimization of Blind Speech Separation and Dereverberation · INTERSPEECH 2020
Simultaneous Denoising and Dereverberation for Low-Latency Applications Using Frame-by-Frame Online Unified Convolutional Beamformer · INTERSPEECH 2019
End-to-End SpeakerBeam for Single Channel Target Speech Recognition · INTERSPEECH 2019
Improving Transformer-Based End-to-End Speech Recognition with Connectionist Temporal Classification and Language Model Integration · INTERSPEECH 2019
Multimodal SpeakerBeam: Single Channel Target Speech Extraction with Audio-Visual Speaker Clues · INTERSPEECH 2019
Improved Deep Duel Model for Rescoring N-Best Speech Recognition List Using Backward LSTMLM and Ensemble Encoders · INTERSPEECH 2019
Predicting Speech Intelligibility of Enhanced Speech Using Phone Accuracy of DNN-Based ASR System · INTERSPEECH 2019
Multi-resolution Gammachirp Envelope Distortion Index for Intelligibility Prediction of Noisy Speech · INTERSPEECH 2018
Integrating Neural Network Based Beamforming and Weighted Prediction Error Dereverberation · INTERSPEECH 2018
Auxiliary Feature Based Adaptation of End-to-end ASR Systems · INTERSPEECH 2018
Neural Network-Based Spectrum Estimation for Online WPE Dereverberation · INTERSPEECH 2017
Improved Example-Based Speech Enhancement by Using Deep Neural Network Acoustic Model for Noise Robust Example Search · INTERSPEECH 2017
Forward-Backward Convolutional LSTM for Acoustic Modeling · INTERSPEECH 2017
Unfolded Deep Recurrent Convolutional Neural Network with Jump Ahead Connections for Acoustic Modeling · INTERSPEECH 2017
Deep Clustering-Based Beamforming for Separation with Unknown Number of Sources · INTERSPEECH 2017
Uncertainty Decoding with Adaptive Sampling for Noise Robust DNN-Based Acoustic Modeling · INTERSPEECH 2017
Predicting Speech Intelligibility Using a Gammachirp Envelope Distortion Index Based on the Signal-to-Distortion Ratio · INTERSPEECH 2017
Speaker-Aware Neural Network Based Beamformer for Speaker Extraction in Speech Mixtures · INTERSPEECH 2017
Speech Intelligibility Prediction Based on the Envelope Power Spectrum Model with the Dynamic Compressive Gammachirp Auditory Filterbank · INTERSPEECH 2016
Optimization of Speech Enhancement Front-End with Speech Recognition-Level Criterion · INTERSPEECH 2016
Factorized Linear Input Network for Acoustic Model Adaptation in Noisy Conditions · INTERSPEECH 2016
Context Adaptive Neural Network for Rapid Adaptation of Deep CNN Based Acoustic Models · INTERSPEECH 2016
Robust Example Search Using Bottleneck Features for Example-Based Speech Enhancement · INTERSPEECH 2016