Research Explorer
Multi-Modal Learning
1213 directly classified papers
Papers per year
2007: 2
2008: 1
2009: 1
2011: 2
2012: 5
2013: 5
2014: 1
2015: 5
2016: 8
2017: 21
2018: 42
2019: 42
2020: 69
2021: 72
2022: 149
2023: 143
2024: 258
2025: 370
2026: 17
Papers
Reliable Conflictive Multi-View Learning
AAAI 2024
LAFA: Multimodal Knowledge Graph Completion with Link Aware Fusion and Aggregation
AAAI 2024
Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization
WACV 2024
SURER: Structure-Adaptive Unified Graph Neural Network for Multi-View Clustering
AAAI 2024
Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance
CVPR 2024
Little Red Riding Hood Goes around the Globe: Crosslingual Story Planning and Generation with Large Language Models
COLING 2024
Knowledge-Guided Cross-Topic Visual Question Generation
COLING 2024
Improving Personalized Sentiment Representation with Knowledge-enhanced and Parameter-efficient Layer Normalization
COLING 2024
LGMRec: Local and Global Graph Learning for Multimodal Recommendation
AAAI 2024
Enhancing Multi-View Pedestrian Detection Through Generalized 3D Feature Pulling
WACV 2024
ActionIE: Action Extraction from Scientific Literature with Programming Languages
ACL 2024
Language-driven Grasp Detection
CVPR 2024
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs
CVPR 2024
Novel Class Discovery in Chest X-rays via Paired Images and Text
AAAI 2024
Unveiling Implicit Deceptive Patterns in Multi-Modal Fake News via Neuro-Symbolic Reasoning
AAAI 2024
Cross-spectral Gated-RGB Stereo Depth Estimation
CVPR 2024
Language-aware Visual Semantic Distillation for Video Question Answering
CVPR 2024
Towards Surveillance Video-and-Language Understanding: New Dataset, Baselines, and Challenges
CVPR 2024
BodyMAP - Jointly Predicting Body Mesh and 3D Applied Pressure Map for People in Bed
CVPR 2024
Semantic Fusion Augmentation and Semantic Boundary Detection: A Novel Approach to Multi-Target Video Moment Retrieval
WACV 2024
Controllable Text-to-Image Synthesis for Multi-Modality MR Images
WACV 2024
Ranking Distillation for Open-Ended Video Question Answering with Insufficient Labels
CVPR 2024
FELGA: Unsupervised Fragment Embedding for Fine-Grained Cross-Modal Association
WACV 2024
BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping
WACV 2024
RELI11D: A Comprehensive Multimodal Human Motion Dataset and Method
CVPR 2024