Research Explorer
Multi-Modal Learning
3194 directly classified papers
Papers per year
2003: 1
2010: 1
2011: 1
2013: 5
2014: 3
2015: 9
2016: 23
2017: 49
2018: 78
2019: 158
2020: 223
2021: 261
2022: 354
2023: 471
2024: 705
2025: 835
2026: 17
Papers
Multimodal Prior Learning with Double Constraint Alignment for Snapshot Spectral Compressive Imaging
IJCAI 2025
Latency Robust Cooperative Perception using Asynchronous Feature Fusion
WACV 2025
Music-Aligned Holistic 3D Dance Generation via Hierarchical Motion Modeling
ICCV 2025
SyncViolinist: Music-Oriented Violin Motion Generation Based on Bowing and Fingering
WACV 2025
SemTalk: Holistic Co-speech Motion Generation with Frame-level Semantic Emphasis
ICCV 2025
Deduce and Select Evidences with Language Models for Training-Free Video Goal Inference
WACV 2025
Instruction-Grounded Visual Projectors for Continual Learning of Generative Vision-Language Models
ICCV 2025
VMAs: Video-to-Music Generation via Semantic Alignment in Web Music Videos
WACV 2025
Scaling Omni-modal Pretraining with Multimodal Context: Advancing Universal Representation Learning Across Modalities
ICCV 2025
VideoGameBunny: Towards Vision Assistants for Video Games
WACV 2025
FedMVP: Federated Multimodal Visual Prompt Tuning for Vision-Language Models
ICCV 2025
GEXIA: Granularity Expansion and Iterative Approximation for Scalable Multi-Grained Video-Language Learning
WACV 2025
SketchAgent: Generating Structured Diagrams from Hand-Drawn Sketches
IJCAI 2025
HeightMapNet: Explicit Height Modeling for End-to-End HD Map Learning
WACV 2025
Rethinking Multi-modal Object Detection from the Perspective of Mono-Modality Feature Learning
ICCV 2025
Multi-Resolution Guided 3D GANs for Medical Image Translation
WACV 2025
Preserve or Modify? Context-Aware Evaluation for Balancing Preservation and Modification in Text-Guided Image Editing
CVPR 2025
Event-Guided Fusion-Mamba for Context-Aware 3D Human Pose Estimation
WACV 2025
Decoupling and Reconstructing: A Multimodal Sentiment Analysis Framework Towards Robustness
IJCAI 2025
Combining Inherent Knowledge of Vision-Language Models with Unsupervised Domain Adaptation through Strong-Weak Guidance
WACV 2025
Connecting Giants: Synergistic Knowledge Transfer of Large Multimodal Models for Few-Shot Learning
IJCAI 2025
Click&Describe: Multimodal Grounding and Tracking for Aerial Objects
WACV 2025
BMIP: Bi-directional Modality Interaction Prompt Learning for VLM
IJCAI 2025
PrevPredMap: Exploring Temporal Modeling with Previous Predictions for Online Vectorized HD Map Construction
WACV 2025
Learning Dynamic Similarity by Bidirectional Hierarchical Sliding Semantic Probe for Efficient Text Video Retrieval
AAAI 2025