conftrace
multimodal learning
4622 papers
Co-occurring keywords
large language model (12755)
vision-language model (2235)
visual question answering (1000)
video understanding (1647)
multi-modal learning (1276)
contrastive learning (3979)
representation learning (6174)
transfer learning (5442)
zero-shot learning (3637)
vision language model (752)
Papers
Quantum Cognitively Motivated Decision Fusion for Video Sentiment Analysis
AAAI 2021
YNU-HPCC at SemEval-2021 Task 6: Combining ALBERT and Text-CNN for Persuasion Detection in Texts and Images
ACL 2021
LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding
ACL 2021
Towards Visual Question Answering on Pathology Images
ACL 2021
Video Question Answering Using Language-Guided Deep Compressed-Domain Video Feature
ICCV 2021
Check It Again: Progressive Visual Question Answering via Visual Entailment
ACL 2021
MELINDA: A Multimodal Dataset for Biomedical Experiment Method Classification
AAAI 2021
VMLoc: Variational Fusion For Learning-Based Multimodal Camera Localization
AAAI 2021
CHEF: Cross-modal Hierarchical Embeddings for Food Domain Retrieval
AAAI 2021
SMIL: Multimodal Learning with Severely Missing Modality
AAAI 2021
MultiMET: A Multimodal Dataset for Metaphor Understanding
ACL 2021
Multimodal Knowledge Expansion
ICCV 2021
Multimodal Item Categorization Fully Based on Transformer
ACL 2021
Dual Compositional Learning in Interactive Image Retrieval
AAAI 2021
Inferring Emotion from Large-scale Internet Voice Data: A Semi-supervised Curriculum Augmentation based Deep Learning Approach
AAAI 2021
Audio-Visual Localization by Synthetic Acoustic Image Generation
AAAI 2021
Exploiting Audio-Visual Consistency with Partial Supervision for Spatial Audio Generation
AAAI 2021
How to leverage the multimodal EHR data for better medical prediction?
EMNLP 2021
WhyAct: Identifying Action Reasons in Lifestyle Vlogs
EMNLP 2021
CLIPScore: A Reference-free Evaluation Metric for Image Captioning
EMNLP 2021
Detecting Propaganda Techniques in Memes
ACL 2021
VLGrammar: Grounded Grammar Induction of Vision and Language
ICCV 2021
An animated picture says at least a thousand words: Selecting Gif-based Replies in Multimodal Dialog
EMNLP 2021
UniMF: A Unified Framework to Incorporate Multimodal Knowledge Bases into End-to-End Task-Oriented Dialogue Systems
IJCAI 2021
Learning Mutual Correlation in Multimodal Transformer for Speech Emotion Recognition
INTERSPEECH 2021