Multimodal Learning

13057 directly classified papers

Papers

Empirical Interpretation of Speech Emotion Perception with Attention Based Model for Speech Emotion Recognition INTERSPEECH 2020

An Efficient Subband Linear Prediction for LPCNet-Based Neural Synthesis INTERSPEECH 2020

Evaluating Automatically Generated Phoneme Captions for Images INTERSPEECH 2020

Risk Forecasting from Earnings Calls Acoustics and Network Correlations INTERSPEECH 2020

Dual-Path Transformer Network: Direct Context-Aware Modeling for End-to-End Monaural Speech Separation INTERSPEECH 2020

Advancing Multiple Instance Learning with Attention Modeling for Categorical Speech Emotion Recognition INTERSPEECH 2020

Predicting Collaborative Task Performance Using Graph Interlocutor Acoustic Network in Small Group Interaction INTERSPEECH 2020

Audio-Visual Multi-Speaker Tracking Based on the GLMB Framework INTERSPEECH 2020

Multi-Modal Fusion with Gating Using Audio, Lexical and Disfluency Features for Alzheimer’s Dementia Recognition from Spontaneous Speech INTERSPEECH 2020

A Comparison of Acoustic and Linguistics Methodologies for Alzheimer’s Dementia Recognition INTERSPEECH 2020

Tackling the ADReSS Challenge: A Multimodal Approach to the Automated Recognition of Alzheimer’s Dementia INTERSPEECH 2020

A Dynamic 3D Pronunciation Teaching Model Based on Pronunciation Attributes and Anatomy INTERSPEECH 2020

Recognition-Synthesis Based Non-Parallel Voice Conversion with Adversarial Learning INTERSPEECH 2020

PEIA: Personality and Emotion Integrated Attentive Model for Music Recommendation on Social Media Platforms AAAI 2020

DATA-GRU: Dual-Attention Time-Aware Gated Recurrent Unit for Irregular Multivariate Time Series AAAI 2020

Spatio-Temporal Graph Structure Learning for Traffic Forecasting AAAI 2020

Combining Real-Time Segmentation and Classification of Rehabilitation Exercises with LSTM Networks and Pointwise Boosting AAAI 2020

Cross-Modal Attention Network for Temporal Inconsistent Audio-Visual Event Localization AAAI 2020

Towards Omni-Supervised Face Alignment for Large Scale Unlabeled Videos AAAI 2020

HMSid and HMSid2 at PARSEME Shared Task 2020: Computational Corpus Linguistics and unseen-in-training MWEs COLING 2020

Generating Well-Formed Answers by Machine Reading with Stochastic Selector Networks AAAI 2020

NIT-Agartala-NLP-Team at SemEval-2020 Task 8: Building Multimodal Classifiers to Tackle Internet Humor COLING 2020

I Know Where You Are Coming From: On the Impact of Social Media Sources on AI Model Performance (Student Abstract) AAAI 2020

Instance-Adaptive Graph for EEG Emotion Recognition AAAI 2020

DCMN+: Dual Co-Matching Network for Multi-Choice Reading Comprehension AAAI 2020