Research Explorer
Multi-Modal Learning
3194 directly classified papers
Papers per year
2003: 1
2010: 1
2011: 1
2013: 5
2014: 3
2015: 9
2016: 23
2017: 49
2018: 78
2019: 158
2020: 223
2021: 261
2022: 354
2023: 471
2024: 705
2025: 835
2026: 17
Papers
Adaptive Transformers for Learning Multimodal Representations
ACL 2020
Let Me Choose: From Verbal Context to Font Selection
ACL 2020
Cross-Modality Relevance for Reasoning on Language and Vision
ACL 2020
Improving Image Captioning with Better Use of Caption
ACL 2020
Multimodal Neural Graph Memory Networks for Visual Question Answering
ACL 2020
Aligned Dual Channel Graph Convolutional Network for Visual Question Answering
ACL 2020
Words Aren’t Enough, Their Order Matters: On the Robustness of Grounding Visual Referring Expressions
ACL 2020
Knowledge Supports Visual Language Grounding: A Case Study on Colour Terms
ACL 2020
Cross-modal Coherence Modeling for Caption Generation
ACL 2020
Amalgamation of protein sequence, structure and textual information for improving protein-protein interaction identification
ACL 2020
Shaping Visual Representations with Language for Few-Shot Classification
ACL 2020
Multimodal Transformer for Multimodal Machine Translation
ACL 2020
Enhancing Pre-trained Chinese Character Representation with Word-aligned Attention
ACL 2020
Improving Multimodal Named Entity Recognition via Entity Span Detection with Unified Multimodal Transformer
ACL 2020
A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine Translation
ACL 2020
Glyph2Vec: Learning Chinese Out-of-Vocabulary Word Embedding from Glyphs
ACL 2020
COCAS: A Large-Scale Clothes Changing Person Dataset for Re-Identification
CVPR 2020
Cross-Modal Deep Face Normals With Deactivable Skip Connections
CVPR 2020
A Shared Multi-Attention Framework for Multi-Label Zero-Shot Learning
CVPR 2020
TA-Student VQA: Multi-Agents Training by Self-Questioning
CVPR 2020
In Defense of Grid Features for Visual Question Answering
CVPR 2020
End-to-End Adversarial-Attention Network for Multi-Modal Clustering
CVPR 2020
Multimodal Categorization of Crisis Events in Social Media
CVPR 2020
Music Gesture for Visual Sound Separation
CVPR 2020
Recognizing Objects From Any View With Object and Viewer-Centered Representations
CVPR 2020