Hiroshi Sato

20 papers · 2013–2024 · 3 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (12) 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🌍 Conference Polyglot (3) 🤝 Dynamic Duo (14) 🔬 Deep Specialist (11) 🚀 Conference Pioneer ⚡ Prolific Year (5) 🔥 Unstoppable (6) ❓ The Questioner (3) 🗃️ Keyword Collector (84) 💎 Century Club (20)

Conferences

INTERSPEECH (18) COLING (1) IJCAI (1)

Top co-authors

Takafumi Moriya (14) Ryo Masumura (13) Tomohiro Tanaka (12) Marc Delcroix (10) Tsubasa Ochiai (10) Mana Ihori (8) Takanori Ashihara (8) Nobukatsu Hojo (5) Saki Mizuno (5) Atsushi Ando (4)

Research topics

Speech & Audio (1)

Keywords

automatic speech recognition (7) speech enhancement (4) speech recognition (3) speaker verification (2) self-supervised learning (2) attention mechanism (2) speaker separation (2) speech separation (2) knowledge distillation (2) multi-talker speech (2) target speech extraction (2) neural transducer (2) orthogonal projection (1) multi-task learning (1) real-time processing (1) grammatical error correction (1) mutual information (1) model complexity (1) video analysis (1) feature representation (1)

Papers

SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling · INTERSPEECH 2024
Boosting Hybrid Autoregressive Transducer-based ASR with Internal Acoustic Model Training and Dual Blank Thresholding · INTERSPEECH 2024
End-to-End Joint Target and Non-Target Speakers ASR · INTERSPEECH 2023
Audio-Visual Praise Estimation for Conversational Video based on Synchronization-Guided Multimodal Transformer · INTERSPEECH 2023
Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data · INTERSPEECH 2023
Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss · INTERSPEECH 2023
Transcribing Speech as Spoken and Written Dual Text Using an Autoregressive Model · INTERSPEECH 2023
Streaming Target-Speaker ASR with Neural Transducer · INTERSPEECH 2022
Listen only to me! How well can target speech extraction handle false alarms? · INTERSPEECH 2022
Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations · INTERSPEECH 2022
Domain Adversarial Self-Supervised Speech Representation Learning for Improving Unknown Domain Downstream Tasks · INTERSPEECH 2022
Multi-Perspective Document Revision · COLING 2022
End-to-End Joint Modeling of Conversation History-Dependent and Independent ASR Systems with Multi-History Training · INTERSPEECH 2022
How bad are artifacts?: Analyzing the impact of speech enhancement errors on ASR · INTERSPEECH 2022
Streaming End-to-End Speech Recognition for Hybrid RNN-T/Attention Architecture · INTERSPEECH 2021
Should We Always Separate?: Switching Between Enhanced and Observed Signals for Overlapping Speech Recognition · INTERSPEECH 2021
Self-Distillation for Improving CTC-Transformer-Based ASR Systems · INTERSPEECH 2020
Neural Whispered Speech Detection with Imbalanced Learning · INTERSPEECH 2019
End-to-End Automatic Speech Recognition with a Reconstruction Criterion Using Speech-to-Text and Text-to-Speech Encoder-Decoders · INTERSPEECH 2019
Prior-Free Exploration Bonus for and beyond Near Bayes-Optimal Behavior · IJCAI 2013