Research Explorer
Papers
Trends
Conferences
Explore
Authors
Topics
Keywords
Achievements
About
Methodology
Multi-Modal Learning
1213 directly classified papers
Papers per year
2007: 2
2008: 1
2009: 1
2011: 2
2012: 5
2013: 5
2014: 1
2015: 5
2016: 8
2017: 21
2018: 42
2019: 42
2020: 69
2021: 72
2022: 149
2023: 143
2024: 258
2025: 370
2026: 17
Papers
Boosting Vision-Language Models with Transduction
NIPS 2024
Listenable Maps for Zero-Shot Audio Classifiers
NIPS 2024
MM-WLAuslan: Multi-View Multi-Modal Word-Level Australian Sign Language Recognition Dataset
NIPS 2024
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
NIPS 2024
LaSe-E2V: Towards Language-guided Semantic-aware Event-to-Video Reconstruction
NIPS 2024
MambaTree: Tree Topology is All You Need in State Space Model
NIPS 2024
Continual Audio-Visual Sound Separation
NIPS 2024
Conjugated Semantic Pool Improves OOD Detection with Pre-trained Vision-Language Models
NIPS 2024
CultureLLM: Incorporating Cultural Differences into Large Language Models
NIPS 2024
Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms
NIPS 2024
Mind's Eye of LLMs: Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models
NIPS 2024
MECD: Unlocking Multi-Event Causal Discovery in Video Reasoning
NIPS 2024
Boosting Weakly Supervised Referring Image Segmentation via Progressive Comprehension
NIPS 2024
Omnipotent Distillation with LLMs for Weakly-Supervised Natural Language Video Localization: When Divergence Meets Consistency
AAAI 2024
Federated Modality-Specific Encoders and Multimodal Anchors for Personalized Brain Tumor Segmentation
AAAI 2024
Learning Multi-Modal Cross-Scale Deformable Transformer Network for Unregistered Hyperspectral Image Super-resolution
AAAI 2024
Joint Demosaicing and Denoising for Spike Camera
AAAI 2024
Context-I2W: Mapping Images to Context-Dependent Words for Accurate Zero-Shot Composed Image Retrieval
AAAI 2024
Unifying Multi-Modal Uncertainty Modeling and Semantic Alignment for Text-to-Image Person Re-identification
AAAI 2024
Uncertainty-Aware Yield Prediction with Multimodal Molecular Features
AAAI 2024
FT-GAN: Fine-Grained Tune Modeling for Chinese Opera Synthesis
AAAI 2024
Transformer-Empowered Multi-Modal Item Embedding for Enhanced Image Search in E-commerce
AAAI 2024
TelTrans: Applying Multi-Type Telecom Data to Transportation Evaluation and Prediction via Multifaceted Graph Modeling
AAAI 2024
Spatial-Temporal Augmentation for Crime Prediction (Student Abstract)
AAAI 2024
Translation Deserves Better: Analyzing Translation Artifacts in Cross-lingual Visual Question Answering
ACL 2024