Multimodal Learning
13,185 papers
Papers per year (bar chart; x-axis ticks '10, '15, '20, '25; yearly counts in chart order): 1, 3, 6, 2, 5, 2, 3, 6, 24, 20, 46, 109, 205, 299, 622, 675, 987, 1084, 1697, 2500, 3655, 1234
Papers
GPU-accelerated Guided Source Separation for Meeting Transcription
INTERSPEECH 2023
MyVoice: Arabic Speech Resource Collaboration Platform
INTERSPEECH 2023
PIAVE: A Pose-Invariant Audio-Visual Speaker Extraction Network
INTERSPEECH 2023
Spatial LibriSpeech: An Augmented Dataset for Spatial Audio Learning
INTERSPEECH 2023
Rethinking the Visual Cues in Audio-Visual Speaker Extraction
INTERSPEECH 2023
Improved DeepFake Detection Using Whisper Features
INTERSPEECH 2023