conftrace
Multimodal Learning
13,057 papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
Dynamic Refinement Network for Oriented and Densely Packed Object Detection
CVPR 2020
Universal Weighting Metric Learning for Cross-Modal Matching
CVPR 2020
PhraseCut: Language-Based Image Segmentation in the Wild
CVPR 2020
Learning User Representations for Open Vocabulary Image Hashtag Prediction
CVPR 2020
DAVD-Net: Deep Audio-Aided Video Decompression of Talking Heads
CVPR 2020
Referring Image Segmentation via Cross-Modal Progressive Comprehension
CVPR 2020
The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction
CVPR 2020
Fine-Grained Video-Text Retrieval With Hierarchical Graph Reasoning
CVPR 2020
Q-learning with Language Model for Edit-based Unsupervised Summarization
EMNLP 2020
Learning to Represent Image and Text with Denotation Graph
EMNLP 2020
Does my multimodal model learn cross-modal interactions? It’s harder to tell than you might think!
EMNLP 2020
Reading Between the Lines: Exploring Infilling in Visual Narratives
EMNLP 2020
Hashtags, Emotions, and Comments: A Large-Scale Dataset to Understand Fine-Grained Social Emotions to Online Topics
EMNLP 2020
Table Fact Verification with Structure-Aware Transformer
EMNLP 2020
Multimodal Routing: Improving Local and Global Interpretability of Multimodal Language Analysis
EMNLP 2020
Multistage Fusion with Forget Gate for Multimodal Summarization in Open-Domain Videos
EMNLP 2020
BiST: Bi-directional Spatio-Temporal Reasoning for Video-Grounded Dialogues
EMNLP 2020
Domain-Specific Lexical Grounding in Noisy Visual-Textual Documents
EMNLP 2020
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
EMNLP 2020
Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision
EMNLP 2020
Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News
EMNLP 2020
Multimodal Joint Attribute Prediction and Value Extraction for E-commerce Product
EMNLP 2020
Neural Deepfake Detection with Factual Structure of Text
EMNLP 2020
TED-CDB: A Large-Scale Chinese Discourse Relation Dataset on TED Talks
EMNLP 2020
STL-CQA: Structure-based Transformers with Localization and Encoding for Chart Question Answering
EMNLP 2020