conftrace
Multimodal Learning
13,057 papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
Modality Shifting Attention Network for Multi-Modal Video Question Answering
CVPR 2020
Vision-Language Navigation With Self-Supervised Auxiliary Reasoning Tasks
CVPR 2020
Hypergraph Attention Networks for Multimodal Learning
CVPR 2020
ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks
CVPR 2020
Straight to the Point: Fast-Forwarding Videos via Reinforcement Learning Using Textual Data
CVPR 2020
Video Object Grounding Using Semantic Roles in Language Description
CVPR 2020
Adaptive Hierarchical Down-Sampling for Point Cloud Classification
CVPR 2020
Sign Language Transformers: Joint End-to-End Sign Language Recognition and Translation
CVPR 2020
Cross-Modality Person Re-Identification With Shared-Specific Feature Transfer
CVPR 2020
SQuINTing at VQA Models: Introspecting VQA Models With Sub-Questions
CVPR 2020
What Makes Training Multi-Modal Classification Networks Hard?
CVPR 2020
Dynamic Convolution: Attention Over Convolution Kernels
CVPR 2020
Intelligent Home 3D: Automatic 3D-House Design From Linguistic Descriptions Only
CVPR 2020
MCEN: Bridging Cross-Modal Gap between Cooking Recipes and Dish Images with Latent Variable Model
CVPR 2020
Uncertainty-Aware Score Distribution Learning for Action Quality Assessment
CVPR 2020
Seeing Through Fog Without Seeing Fog: Deep Multimodal Sensor Fusion in Unseen Adverse Weather
CVPR 2020
Vision-Dialog Navigation by Exploring Cross-Modal Memory
CVPR 2020
Telling Left From Right: Learning Spatial Correspondence of Sight and Sound
CVPR 2020
Towards Accurate Scene Text Recognition With Semantic Reasoning Networks
CVPR 2020
Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection
CVPR 2020
On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering
CVPR 2020
REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments
CVPR 2020
Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text
CVPR 2020
Iterative Context-Aware Graph Inference for Visual Dialog
CVPR 2020
Sketchformer: Transformer-Based Representation for Sketched Structure
CVPR 2020