conftrace_

Multimodal Learning

13,057 papers

Papers per year (chart; year axis labeled '10 to '25). Yearly counts, in the order shown: 1, 3, 6, 2, 5, 2, 3, 6, 24, 20, 46, 109, 205, 299, 622, 675, 987, 1084, 1697, 2500, 3654, 1107

Papers

Learning Audio-Text Agreement for Open-vocabulary Keyword Spotting INTERSPEECH 2022

Application for Real-time Personalized Speaker Extraction INTERSPEECH 2022

Audio-Visual Domain Adaptation Feature Fusion for Speech Emotion Recognition INTERSPEECH 2022

Exploiting Fine-tuning of Self-supervised Learning Models for Improving Bi-modal Sentiment Analysis and Emotion Recognition INTERSPEECH 2022

Context-aware Multimodal Fusion for Emotion Recognition INTERSPEECH 2022

Unsupervised Instance Discriminative Learning for Depression Detection from Speech Signals INTERSPEECH 2022

Automated Detection of Wilson’s Disease Based on Improved Mel-frequency Cepstral Coefficients with Signal Decomposition INTERSPEECH 2022

Automated Voice Pathology Discrimination from Continuous Speech Benefits from Analysis by Phonetic Context INTERSPEECH 2022

A Multimodal Strategy for Singing Language Identification INTERSPEECH 2022

Speaker Trait Enhancement for Cochlear Implant Users: A Case Study for Speaker Emotion Perception INTERSPEECH 2022

Comparison of Models for Detecting Off-Putting Speaking Styles INTERSPEECH 2022

Text-driven Emotional Style Control and Cross-speaker Style Transfer in Neural TTS INTERSPEECH 2022

Deep CNN-based Inductive Transfer Learning for Sarcasm Detection in Speech INTERSPEECH 2022

End-to-End Text-to-Speech Based on Latent Representation of Speaking Styles Using Spontaneous Dialogue INTERSPEECH 2022

Data-augmented cross-lingual synthesis in a teacher-student framework INTERSPEECH 2022

SoundDoA: Learn Sound Source Direction of Arrival and Semantics from Sound Raw Waveforms INTERSPEECH 2022

Visually-aware Acoustic Event Detection using Heterogeneous Graphs INTERSPEECH 2022

Analyzing the impact of SARS-CoV-2 variants on respiratory sound signals INTERSPEECH 2022

Personalized Acoustic Echo Cancellation for Full-duplex Communications INTERSPEECH 2022

Improving Spoken Language Understanding with Cross-Modal Contrastive Learning INTERSPEECH 2022

Speech2Slot: A Limited Generation Framework with Boundary Detection for Slot Filling from Speech INTERSPEECH 2022

On Breathing Pattern Information in Synthetic Speech INTERSPEECH 2022

Attacker Attribution of Audio Deepfakes INTERSPEECH 2022

Word Discovery in Visually Grounded, Self-Supervised Speech Models INTERSPEECH 2022

Detecting Heart Failure Through Voice Analysis using Self-Supervised Mode-Based Memory Fusion INTERSPEECH 2022