Multimodal Learning
13,057 papers
Papers per year (bar chart; x-axis ticks at '10, '15, '20, '25). Counts in chart order: 1, 3, 6, 2, 5, 2, 3, 6, 24, 20, 46, 109, 205, 299, 622, 675, 987, 1084, 1697, 2500, 3654, 1107
Papers
Prosodic alignment for off-screen automatic dubbing
INTERSPEECH 2022
SAQAM: Spatial Audio Quality Assessment Metric
INTERSPEECH 2022
Speech Quality Assessment through MOS using Non-Matching References
INTERSPEECH 2022
Deep Speech Synthesis from Articulatory Representations
INTERSPEECH 2022
NeMo Open Source Speaker Diarization System
INTERSPEECH 2022
DAVIS: Driver’s Audio-Visual Speech recognition
INTERSPEECH 2022
End-to-End Audio-Visual Neural Speaker Diarization
INTERSPEECH 2022
Event-related data conditioning for acoustic event classification
INTERSPEECH 2022
Separate What You Describe: Language-Queried Audio Source Separation
INTERSPEECH 2022
End-to-end Speech-to-Punctuated-Text Recognition
INTERSPEECH 2022