Papers
Multimodal Representation Loss Between Timed Text and Audio for Regularized Speech Separation
INTERSPEECH 2024
SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models
INTERSPEECH 2024
Parameter-efficient Fine-tuning of Speaker-Aware Dynamic Prompts for Speaker Verification
INTERSPEECH 2024
Tackling Missing Modalities in Audio-Visual Representation Learning Using Masked Autoencoders
INTERSPEECH 2024