Papers
Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System
INTERSPEECH 2024
Noise-robust Speech Separation with Fast Generative Correction
INTERSPEECH 2024
Unified Audio Visual Cues for Target Speaker Extraction
INTERSPEECH 2024
Target Speaker Extraction with Curriculum Learning
INTERSPEECH 2024
Enhanced Deep Speech Separation in Clustered Ad Hoc Distributed Microphone Environments
INTERSPEECH 2024
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization
INTERSPEECH 2024
Serialized Output Training by Learned Dominance
INTERSPEECH 2024
Improving Generalization of Speech Separation in Real-World Scenarios: Strategies in Simulation, Optimization, and Evaluation
INTERSPEECH 2024
Transcription-Free Fine-Tuning of Speech Separation Models for Noisy and Reverberant Multi-Speaker Automatic Speech Recognition
INTERSPEECH 2024
Multimodal Representation Loss Between Timed Text and Audio for Regularized Speech Separation
INTERSPEECH 2024
Towards Audio Codec-based Speech Separation
INTERSPEECH 2024
Text-aware Speech Separation for Multi-talker Keyword Spotting
INTERSPEECH 2024
Does the Lombard Effect Matter in Speech Separation? Introducing the Lombard-GRID-2mix Dataset
INTERSPEECH 2024
Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition
INTERSPEECH 2024