Research Explorer
Multi-Modal Learning
1213 directly classified papers
Papers per year
2007: 2
2008: 1
2009: 1
2011: 2
2012: 5
2013: 5
2014: 1
2015: 5
2016: 8
2017: 21
2018: 42
2019: 42
2020: 69
2021: 72
2022: 149
2023: 143
2024: 258
2025: 370
2026: 17
Papers
MPBR: Multimodal Progressive Bidirectional Reasoning for Open-Set Fine-Grained Recognition
ICCV 2025
Hierarchical Divide-and-Conquer Grouping for Classification Adaptation of Pre-Trained Models
ICCV 2025
LIRA: Inferring Segmentation in Large Multi-modal Models with Local Interleaved Region Assistance
ICCV 2025
Multimodal Fusion and Coherence Modeling for Video Topic Segmentation
ACL 2025
Image Conductor: Precision Control for Interactive Video Synthesis
AAAI 2025
AGFSync: Leveraging AI-Generated Feedback for Preference Optimization in Text-to-Image Generation
AAAI 2025
CA-MLIF: Cross-Attention and Multimodal Low-Rank Interaction Fusion Framework for Tumor Prognostic Prediction
AAAI 2025
ReEdit: Multimodal Exemplar-Based Image Editing
WACV 2025
Retrieval Augmented Recipe Generation
WACV 2025
Active Learning for Vision Language Models
WACV 2025
Multispectral Object Detection Enhanced by Cross-Modal Information Complementary and Cosine Similarity Channel Resampling Modules
WACV 2025
Enhancing Novel Object Detection via Cooperative Foundational Models
WACV 2025
Point Cloud Color Upsampling with Attention-Based Coarse Colorization and Refinement
WACV 2025
MetaVIn: Meteorological and Visual Integration for Atmospheric Turbulence Strength Estimation
WACV 2025
DSTR: Dual Scenes Transformer for Cross-Modal Fusion in 3D Object Detection
WACV 2025
Unleashing Potentials of Vision-Language Models for Zero-Shot HOI Detection
WACV 2025
Cross-Aligned Fusion for Multimodal Understanding
WACV 2025
ReFu: Recursive Fusion for Exemplar-Free 3D Class-Incremental Learning
WACV 2025
Optimizing Vision-Language Model for Road Crossing Intention Estimation
WACV 2025
Asymmetric Reinforcing Against Multi-Modal Representation Bias
AAAI 2025
Solar Multimodal Transformer: Intraday Solar Irradiance Predictor using Public Cameras and Time Series
WACV 2025
Semantically Conditioned Prompts for Visual Recognition under Missing Modality Scenarios
WACV 2025
OccFlowNet: Occupancy Estimation via Differentiable Rendering and Occupancy Flow
WACV 2025
Cross-Domain Multi-Modal Few-Shot Object Detection via Rich Text
WACV 2025
From Generation to Detection: A Multimodal Multi-Task Dataset for Benchmarking Health Misinformation
EMNLP 2025