Research Explorer
Multi-Modal Learning
3194 directly classified papers
Papers per year
2003: 1
2010: 1
2011: 1
2013: 5
2014: 3
2015: 9
2016: 23
2017: 49
2018: 78
2019: 158
2020: 223
2021: 261
2022: 354
2023: 471
2024: 705
2025: 835
2026: 17
Papers
Super-Class Guided Transformer for Zero-Shot Attribute Classification
AAAI 2025
Retrieval-Augmented Dynamic Prompt Tuning for Incomplete Multimodal Learning
AAAI 2025
Utilizing Vision-Language Models for Detection of Leaf-Based Diseases in Tomatoes
AAAI 2025
ADIEE: Automatic Dataset Creation and Scorer for Instruction-Guided Image Editing Evaluation
ICCV 2025
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models
ICCV 2025
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data and Metric Perspectives
ICCV 2025
Flow4Agent: Long-form Video Understanding via Motion Prior from Optical Flow
ICCV 2025
AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning
ICCV 2025
VCRMNER: Visual Cue Refinement in Multimodal NER using CLIP Prompts
COLING 2025
PanSt3R: Multi-view Consistent Panoptic Segmentation
ICCV 2025
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models
CVPR 2025
G2SF: Geometry-Guided Score Fusion for Multimodal Industrial Anomaly Detection
ICCV 2025
BottleHumor: Self-Informed Humor Explanation using the Information Bottleneck Principle
ACL 2025
SA-Occ: Satellite-Assisted 3D Occupancy Prediction in Real World
ICCV 2025
Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding
CVPR 2025
CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation
ICCV 2025
Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment
CVPR 2025
Derm1M: A Million-scale Vision-Language Dataset Aligned with Clinical Ontology Knowledge for Dermatology
ICCV 2025
CTYUN-AI at SemEval-2025 Task 1: Learning to Rank for Idiomatic Expressions
SEMEVAL 2025
DADM: Dual Alignment of Domain and Modality for Face Anti-spoofing
ICCV 2025
Identifying and Mitigating Position Bias of Multi-image Vision-Language Models
CVPR 2025
ILLUME: Illuminating Your LLMs to See, Draw, and Self-Enhance
ICCV 2025
Joint Vision-Language Social Bias Removal for CLIP
CVPR 2025
ReMP-AD: Retrieval-enhanced Multi-modal Prompt Fusion for Few-Shot Industrial Visual Anomaly Detection
ICCV 2025
What Is That Talk About? A Video-to-Text Summarization Dataset for Scientific Presentations
ACL 2025