Research Explorer
Multi-Modal Learning
3194 directly classified papers
Papers per year
2003: 1
2010: 1
2011: 1
2013: 5
2014: 3
2015: 9
2016: 23
2017: 49
2018: 78
2019: 158
2020: 223
2021: 261
2022: 354
2023: 471
2024: 705
2025: 835
2026: 17
Papers
ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding
EMNLP 2022
Cross-Modal Mutual Learning for Audio-Visual Speech Recognition and Manipulation
AAAI 2022
L-CoDe: Language-Based Colorization Using Color-Object Decoupled Conditions
AAAI 2022
Interact, Embed, and EnlargE: Boosting Modality-Specific Representations for Multi-Modal Person Re-identification
AAAI 2022
DOC2PPT: Automatic Presentation Slides Generation from Scientific Documents
AAAI 2022
VPAI_Lab at MedVidQA 2022: A Two-Stage Cross-modal Fusion Method for Medical Instructional Video Classification
ACL 2022
Comprehensive Multi-Modal Interactions for Referring Image Segmentation
ACL 2022
One Agent To Rule Them All: Towards Multi-agent Conversational AI
ACL 2022
XFUND: A Benchmark Dataset for Multilingual Visually Rich Form Understanding
ACL 2022
UNIMO-2: End-to-End Unified Vision-Language Grounded Learning
ACL 2022
Interpreting Gender Bias in Neural Machine Translation: Multilingual Architecture Matters
AAAI 2022
Self-Supervised Audio-and-Text Pre-training with Extremely Low-Resource Parallel Data
AAAI 2022
Team IITP-AINLPML at WASSA 2022: Empathy Detection, Emotion Classification and Personality Detection
ACL 2022
SSNCSE NLP@TamilNLP-ACL2022: Transformer based approach for detection of abusive comment for Tamil language
ACL 2022
Understanding Attention for Vision-and-Language Tasks
COLING 2022
On Guiding Visual Attention With Language Specification
CVPR 2022
M5Product: Self-Harmonized Contrastive Learning for E-Commercial Multi-Modal Pretraining
CVPR 2022
CAT-Det: Contrastively Augmented Transformer for Multi-Modal 3D Object Detection
CVPR 2022
Breaking Down Multilingual Machine Translation
ACL 2022
Assessing Multilingual Fairness in Pre-trained Multimodal Representations
ACL 2022
Modality-specific Learning Rates for Effective Multimodal Additive Late-fusion
ACL 2022
What do Models Learn From Training on More Than Text? Measuring Visual Commonsense Knowledge
ACL 2022
Data Augmented 3D Semantic Scene Completion With 2D Segmentation Priors
WACV 2022
Improving Single-Image Defocus Deblurring: How Dual-Pixel Images Help Through Multi-Task Learning
WACV 2022
DG-Labeler and DGL-MOTS Dataset: Boost the Autonomous Driving Perception
WACV 2022