Multimodal Learning
13,185 papers
Papers per year (chart; x-axis ticks '10, '15, '20, '25)
Yearly counts, in chart order: 1, 3, 6, 2, 5, 2, 3, 6, 24, 20, 46, 109, 205, 299, 622, 675, 987, 1084, 1697, 2500, 3655, 1234
Papers
Target Speaker Extraction with Curriculum Learning
INTERSPEECH 2024
PitchFlow: adding pitch control to a Flow-matching based TTS model
INTERSPEECH 2024
Leveraging Language Model Capabilities for Sound Event Detection
INTERSPEECH 2024