Multimodal Learning

13057 directly classified papers

Papers

Speaker Direction-of-Arrival Estimation Based on Frequency-Independent Beampattern INTERSPEECH 2017

Weighted Spatial Covariance Matrix Estimation for MUSIC Based TDOA Estimation of Speech Source INTERSPEECH 2017

Time Delay Histogram Based Speech Source Separation Using a Planar Array INTERSPEECH 2017

Analysis of the Relationship Between Prosodic Features of Fillers and its Forms or Occurrence Positions INTERSPEECH 2017

Entrainment in Multi-Party Spoken Dialogues at Multiple Linguistic Levels INTERSPEECH 2017

Social Signal Detection in Spontaneous Dialogue Using Bidirectional LSTM-CTC INTERSPEECH 2017

Recurrent Neural Aligner: An Encoder-Decoder Neural Network Model for Sequence to Sequence Mapping INTERSPEECH 2017

An Investigation of Emotion Dynamics and Kalman Filtering for Speech-Based Emotion Prediction INTERSPEECH 2017

An Affect Prediction Approach Through Depression Severity Parameter Incorporation in Neural Networks INTERSPEECH 2017

Learning Weakly Supervised Multimodal Phoneme Embeddings INTERSPEECH 2017

“Did you laugh enough today?” — Deep Neural Networks for Mobile and Wearable Laughter Trackers INTERSPEECH 2017

The Feeling of Success: Does Touch Sensing Help Predict Grasp Outcomes? CORL 2017

Mapping Instructions and Visual Observations to Actions with Reinforcement Learning EMNLP 2017

Interactive Attention Networks for Aspect-Level Sentiment Classification IJCAI 2017

Natural Language Informs the Interpretation of Iconic Gestures: A Computational Approach IJCNLP 2017

Grounding Abstract Spatial Concepts for Language Interaction with Robots IJCAI 2017

Cross-Modal Analysis Between Phonation Differences and Texture Images Based on Sentiment Correlations INTERSPEECH 2017

Learning Cognitive Features from Gaze Data for Sentiment and Sarcasm Classification using Convolutional Neural Network ACL 2017

Parallel-Data-Free Many-to-Many Voice Conversion Based on DNN Integrated with Eigenspace Using a Non-Parallel Speech Corpus INTERSPEECH 2017

Acoustic Feature Learning via Deep Variational Canonical Correlation Analysis INTERSPEECH 2017

Here’s My Point: Joint Pointer Architecture for Argument Mining EMNLP 2017

Context-Dependent Sentiment Analysis in User-Generated Videos ACL 2017

FOIL it! Find One mismatch between Image and Language caption ACL 2017

Discriminative Bimodal Networks for Visual Localization and Detection With Natural Language Queries CVPR 2017

Draw and Tell: Multimodal Descriptions Outperform Verbal- or Sketch-Only Descriptions in an Image Retrieval Task IJCNLP 2017