Research Explorer
Multi-Modal Learning
3194 directly classified papers
Papers per year
2003: 1
2010: 1
2011: 1
2013: 5
2014: 3
2015: 9
2016: 23
2017: 49
2018: 78
2019: 158
2020: 223
2021: 261
2022: 354
2023: 471
2024: 705
2025: 835
2026: 17
Papers
LPCG: A Self-conditional Architecture for Labeled Point Cloud Generation
AAAI 2025
PoseLLaVA: Pose Centric Multimodal LLM for Fine-Grained 3D Pose Manipulation
AAAI 2025
SSLFusion: Scale and Space Aligned Latent Fusion Model for Multimodal 3D Object Detection
AAAI 2025
HeightMapNet: Explicit Height Modeling for End-to-End HD Map Learning
WACV 2025
L4DR: LiDAR-4DRadar Fusion for Weather-Robust 3D Object Detection
AAAI 2025
Asymmetric Reinforcing Against Multi-Modal Representation Bias
AAAI 2025
Multimodal Fusion Using Multi-View Domains for Data Heterogeneity in Federated Learning
AAAI 2025
Multi-Resolution Guided 3D GANs for Medical Image Translation
WACV 2025
CryoDomain: Sequence-free Protein Domain Identification from Low-resolution Cryo-EM Density Maps
AAAI 2025
Preserve or Modify? Context-Aware Evaluation for Balancing Preservation and Modification in Text-Guided Image Editing
CVPR 2025
Learning Dynamic Similarity by Bidirectional Hierarchical Sliding Semantic Probe for Efficient Text Video Retrieval
AAAI 2025
Event-Guided Fusion-Mamba for Context-Aware 3D Human Pose Estimation
WACV 2025
Retrieval-Augmented Dynamic Prompt Tuning for Incomplete Multimodal Learning
AAAI 2025
CTPD: Cross-Modal Temporal Pattern Discovery for Enhanced Multimodal Electronic Health Records Analysis
ACL 2025
From Specific-MLLMs to Omni-MLLMs: A Survey on MLLMs Aligned with Multi-modalities
ACL 2025
Combining Inherent Knowledge of Vision-Language Models with Unsupervised Domain Adaptation through Strong-Weak Guidance
WACV 2025
mmFAS: Multimodal Face Anti-Spoofing Using Multi-Level Alignment and Switch-Attention Fusion
AAAI 2025
MVL-SIB: A Massively Multilingual Vision-Language Benchmark for Cross-Modal Topical Matching
ACL 2025
VP-MEL: Visual Prompts Guided Multimodal Entity Linking
ACL 2025
Click&Describe: Multimodal Grounding and Tracking for Aerial Objects
WACV 2025
BottleHumor: Self-Informed Humor Explanation using the Information Bottleneck Principle
ACL 2025
CPIQA: Climate Paper Image Question Answering Dataset for Retrieval-Augmented Generation with Context-based Query Expansion
ACL 2025
Team INSAntive at SlavicNLP-2025 Shared Task: Data Augmentation and Enhancement via Explanations for Persuasion Technique Classification
ACL 2025
PrevPredMap: Exploring Temporal Modeling with Previous Predictions for Online Vectorized HD Map Construction
WACV 2025
AdaDARE-gamma: Balancing Stability and Plasticity in Multi-modal LLMs through Efficient Adaptation
CVPR 2025