conftrace
Multimodal Learning
13,057 papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
Detector-Free Weakly Supervised Grounding by Separation
ICCV 2021
CrossCLR: Cross-Modal Contrastive Learning for Multi-Modal Video Representations
ICCV 2021
TransView: Inside, Outside, and Across the Cropping View Boundaries
ICCV 2021
TRAR: Routing the Attention Spans in Transformer for Visual Question Answering
ICCV 2021
How To Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild
ICCV 2021
Just Ask: Learning To Answer Questions From Millions of Narrated Videos
ICCV 2021
UniT: Multimodal Multitask Learning With a Unified Transformer
ICCV 2021
Compressing Visual-Linguistic Model via Knowledge Distillation
ICCV 2021
Telling the What While Pointing to the Where: Multimodal Queries for Image Retrieval
ICCV 2021
Motion Guided Region Message Passing for Video Captioning
ICCV 2021
AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis
ICCV 2021
TACo: Token-Aware Cascade Contrastive Learning for Video-Text Alignment
ICCV 2021
Zero-Shot Natural Language Video Localization
ICCV 2021
MDETR - Modulated Detection for End-to-End Multi-Modal Understanding
ICCV 2021
STVGBert: A Visual-Linguistic Transformer Based Framework for Spatio-Temporal Video Grounding
ICCV 2021
Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation
ICCV 2021
LapsCore: Language-Guided Person Search via Color Reasoning
ICCV 2021
Multimodal Clustering Networks for Self-Supervised Learning From Unlabeled Videos
ICCV 2021
Multi-Modality Associative Bridging Through Memory: Speech Sound Recollected From Face Video
ICCV 2021
Move2Hear: Active Audio-Visual Source Separation
ICCV 2021
Three Steps to Multimodal Trajectory Prediction: Modality Clustering, Classification and Synthesis
ICCV 2021
Consistency-Aware Graph Network for Human Interaction Understanding
ICCV 2021
Benchmark Platform for Ultra-Fine-Grained Visual Categorization Beyond Human Performance
ICCV 2021
GLoRIA: A Multimodal Global-Local Representation Learning Framework for Label-Efficient Medical Image Recognition
ICCV 2021
Summarize and Search: Learning Consensus-Aware Dynamic Convolution for Co-Saliency Detection
ICCV 2021