Research Explorer
Multimodal Learning
13057 directly classified papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
Cross-Modal Subspace Clustering via Deep Canonical Correlation Analysis
AAAI 2020
Semi-Supervised Multi-Modal Learning with Balanced Spectral Decomposition
AAAI 2020
EEMEFN: Low-Light Image Enhancement via Edge-Enhanced Multi-Exposure Fusion Network
AAAI 2020
Rethinking the Bottom-Up Framework for Query-Based Video Localization
AAAI 2020
Bilinear Attention Networks for Person Retrieval
ICCV 2019
DenseRaC: Joint 3D Pose and Shape Estimation by Dense Render-and-Compare
ICCV 2019
Learning to Reconstruct 3D Manhattan Wireframes From a Single Image
ICCV 2019
Unpaired Image-to-Speech Synthesis With Multimodal Information Bottleneck
ICCV 2019
View-LSTM: Novel-View Video Synthesis Through View Decomposition
ICCV 2019
VideoBERT: A Joint Model for Video and Language Representation Learning
ICCV 2019
Transferable Representation Learning in Vision-and-Language Navigation
ICCV 2019
Deep Single-Image Portrait Relighting
ICCV 2019
FSGAN: Subject Agnostic Face Swapping and Reenactment
ICCV 2019
Photo-Realistic Monocular Gaze Redirection Using Generative Adversarial Networks
ICCV 2019
ViSiL: Fine-Grained Spatio-Temporal Video Similarity Learning
ICCV 2019
Uncertainty-Aware Audiovisual Activity Recognition Using Deep Bayesian Variational Inference
ICCV 2019
Language-Agnostic Visual-Semantic Embeddings
ICCV 2019
Controllable Attention for Structured Layered Video Decomposition
ICCV 2019
Understanding Human Gaze Communication by Spatio-Temporal Graph Reasoning
ICCV 2019
Human Mesh Recovery From Monocular Images via a Skeleton-Disentangled Representation
ICCV 2019
SILCO: Show a Few Images, Localize the Common Object
ICCV 2019
Attention on Attention for Image Captioning
ICCV 2019
From Strings to Things: Knowledge-Enabled VQA Model That Can Read and Reason
ICCV 2019
G3raphGround: Graph-Based Language Grounding
ICCV 2019
Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning
ICCV 2019