Naoyuki Kanda
28 papers · 2016–2024 · 3 conferences · across top CS/AI conferences
Achievements
Hot Topic Early Bird
Taxonomy Completionist (19)
Interdisciplinary Bridge
Keyword Pioneer
Conference Polyglot (3)
Academic Marathon (8)
Conference Loyalist (26)
Keyword Champion (3)
Topic Evolution
Mega-Team (20)
Deep Specialist (15)
Dynamic Duo (14)
Trend Setter
Conference Pioneer
Prolific Year (5)
Century Club (28)
Keyword Collector (54)
Unstoppable (7)
Conferences
INTERSPEECH (26)
AAAI (1)
NAACL (1)
Keywords
automatic speech recognition (8)
speaker diarization (5)
speaker identification (4)
word error rate (4)
speech separation (4)
end-to-end speech recognition (4)
speech recognition (4)
serialized output training (3)
multimodal learning (3)
speaker counting (3)
end-to-end model (3)
transformer transducer (3)
acoustic model (3)
multi-talker speech recognition (2)
overlapped speech (2)
semi-supervised learning (2)
speaker embedding (2)
deep neural network (2)
speech enhancement (2)
language model (2)
Papers
An Investigation of Noise Robustness for Flow-Matching-Based Zero-Shot TTS
INTERSPEECH 2024
i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data
NAACL 2024
NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription
INTERSPEECH 2024
Total-Duration-Aware Duration Modeling for Text-to-Speech Systems
INTERSPEECH 2024
i-Code: An Integrative and Composable Multimodal Learning Framework
AAAI 2023
Factual Consistency Oriented Speech Recognition
INTERSPEECH 2023
Adapting Multi-Lingual ASR Models for Handling Multiple Talkers
INTERSPEECH 2023
Speaker Diarization for ASR Output with T-vectors: A Sequence Classification Approach
INTERSPEECH 2023
Leveraging Real Conversational Data for Multi-Channel Continuous Speech Separation
INTERSPEECH 2022
Separating Long-Form Speech with Group-wise Permutation Invariant Training
INTERSPEECH 2022
Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings
INTERSPEECH 2022
Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition
INTERSPEECH 2022
Streaming Multi-Talker ASR with Token-Level Serialized Output Training
INTERSPEECH 2022
Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone
INTERSPEECH 2021
On Minimum Word Error Rate Training of the Hybrid Autoregressive Transducer
INTERSPEECH 2021
End-to-End Speaker-Attributed ASR with Transformer
INTERSPEECH 2021
Streaming Multi-Talker Speech Recognition with Joint Speaker Identification
INTERSPEECH 2021
Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition
INTERSPEECH 2021
Investigation of Practical Aspects of Single Channel Speech Separation for ASR
INTERSPEECH 2021
Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of any Number of Speakers
INTERSPEECH 2020
Serialized Output Training for End-to-End Overlapped Speech Recognition
INTERSPEECH 2020
Auxiliary Interference Speaker Loss for Target-Speaker Speech Recognition
INTERSPEECH 2019
Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR
INTERSPEECH 2019
Multimodal Response Obligation Detection with Unsupervised Online Domain Adaptation
INTERSPEECH 2019
End-to-End Neural Speaker Diarization with Permutation-Free Objectives
INTERSPEECH 2019
Lattice-free State-level Minimum Bayes Risk Training of Acoustic Models
INTERSPEECH 2018
Maximum a posteriori Based Decoding for CTC Acoustic Models
INTERSPEECH 2016
Investigation of Semi-Supervised Acoustic Model Training Based on the Committee of Heterogeneous Neural Networks
INTERSPEECH 2016