conftrace
Multimodal Learning
13,057 papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete
CVPR 2025
Enhancing Few-Shot Class-Incremental Learning via Training-Free Bi-Level Modality Calibration
CVPR 2025
Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models
CVPR 2025
Mamba as a Bridge: Where Vision Foundation Models Meet Vision Language Models for Domain-Generalized Semantic Segmentation
CVPR 2025
LOGICZSL: Exploring Logic-induced Representation for Compositional Zero-shot Learning
CVPR 2025
MLLM-as-a-Judge for Image Safety without Human Labeling
CVPR 2025
CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models
CVPR 2025
Forensics Adapter: Adapting CLIP for Generalizable Face Forgery Detection
CVPR 2025
Geometry-guided Online 3D Video Synthesis with Multi-View Temporal Consistency
CVPR 2025
Lost in Translation, Found in Context: Sign Language Translation with Contextual Cues
CVPR 2025
Semantic-guided Cross-Modal Prompt Learning for Skeleton-based Zero-shot Action Recognition
CVPR 2025
ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting
CVPR 2025
VEU-Bench: Towards Comprehensive Understanding of Video Editing
CVPR 2025
Touch2Shape: Touch-Conditioned 3D Diffusion for Shape Exploration and Reconstruction
CVPR 2025
MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research
CVPR 2025
CustAny: Customizing Anything from A Single Example
CVPR 2025
3D-LLaVA: Towards Generalist 3D LMMs with Omni Superpoint Transformer
CVPR 2025
VidHalluc: Evaluating Temporal Hallucinations in Multimodal Large Language Models for Video Understanding
CVPR 2025
StageDesigner: Artistic Stage Generation for Scenography via Theater Scripts
CVPR 2025
Align-A-Video: Deterministic Reward Tuning of Image Diffusion Models for Consistent Video Editing
CVPR 2025
Interpreting Object-level Foundation Models via Visual Precision Search
CVPR 2025
LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant
CVPR 2025
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level
CVPR 2025
GPAvatar: High-fidelity Head Avatars by Learning Efficient Gaussian Projections
CVPR 2025
CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment
CVPR 2025