Takuya Yoshioka

29 papers · 2016–2024 · 4 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (4) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (19) 🏃 Academic Marathon (8) 🏠 Conference Loyalist (26) 🔬 Deep Specialist (12) 👥 Mega-Team (20) 🤝 Dynamic Duo (15) 🔥 Unstoppable (7) ⚡ Prolific Year (5) 📈 Trend Setter 🚀 Conference Pioneer 💎 Century Club (29) 🗃️ Keyword Collector (54)

Conferences

INTERSPEECH (26) AAAI (1) EMNLP (1) NAACL (1)

Top co-authors

Zhuo Chen (15) Naoyuki Kanda (14) Xiaofei Wang (9) Zhong Meng (7) Jinyu Li (6) Yashesh Gaur (6) Yu Wu (5) Michael Zeng (5) Jian Wu (5) Sefik Emre Eskimez (5)

Keywords

automatic speech recognition (10) speech separation (8) speech enhancement (6) speaker diarization (4) meeting transcription (3) speech recognition (3) speaker identification (3) speaker embedding (3) word error rate (3) speaker counting (3) multimodal learning (3) serialized output training (3) model compression (3) bidirectional long short-term memory (2) end-to-end speech recognition (2) teacher-student learning (2) knowledge distillation (2) convolutional neural network (2) deep neural network (2) deep learning (2)

Papers

Target conversation extraction: Source separation using turn-taking dynamics · INTERSPEECH 2024
i-Code Studio: A Configurable and Composable Framework for Integrative AI · EMNLP 2024
i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data · NAACL 2024
Knowledge boosting during low-latency inference · INTERSPEECH 2024
i-Code: An Integrative and Composable Multimodal Learning Framework · AAAI 2023
Factual Consistency Oriented Speech Recognition · INTERSPEECH 2023
Real-Time Joint Personalized Speech Enhancement and Acoustic Echo Cancellation · INTERSPEECH 2023
Adapting Multi-Lingual ASR Models for Handling Multiple Talkers · INTERSPEECH 2023
Speaker Diarization for ASR Output with T-vectors: A Sequence Classification Approach · INTERSPEECH 2023
Streaming Multi-Talker ASR with Token-Level Serialized Output Training · INTERSPEECH 2022
Leveraging Real Conversational Data for Multi-Channel Continuous Speech Separation · INTERSPEECH 2022
Separating Long-Form Speech with Group-wise Permutation Invariant Training · INTERSPEECH 2022
Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings · INTERSPEECH 2022
Fast Real-time Personalized Speech Enhancement: End-to-End Enhancement Network (E3Net) and Knowledge Distillation · INTERSPEECH 2022
Investigation of Practical Aspects of Single Channel Speech Separation for ASR · INTERSPEECH 2021
Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone · INTERSPEECH 2021
End-to-End Speaker-Attributed ASR with Transformer · INTERSPEECH 2021
Human Listening and Live Captioning: Multi-Task Training for Speech Enhancement · INTERSPEECH 2021
Ultra Fast Speech Separation Model with Teacher Student Learning · INTERSPEECH 2021
Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of any Number of Speakers · INTERSPEECH 2020
Neural Speech Separation Using Spatially Distributed Microphones · INTERSPEECH 2020
Serialized Output Training for End-to-End Overlapped Speech Recognition · INTERSPEECH 2020
An End-to-End Architecture of Online Multi-Channel Speech Separation · INTERSPEECH 2020
Meeting Transcription Using Asynchronous Distant Microphones · INTERSPEECH 2019
Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks · INTERSPEECH 2018
Investigations on Data Augmentation and Loss Functions for Deep Learning Based Speech-Background Separation · INTERSPEECH 2018
Optimization of Speech Enhancement Front-End with Speech Recognition-Level Criterion · INTERSPEECH 2016
Robust Example Search Using Bottleneck Features for Example-Based Speech Enhancement · INTERSPEECH 2016
Context Adaptive Neural Network for Rapid Adaptation of Deep CNN Based Acoustic Models · INTERSPEECH 2016