Research Explorer
Multi-Modal Learning
3194 directly classified papers
Papers per year
2003: 1
2010: 1
2011: 1
2013: 5
2014: 3
2015: 9
2016: 23
2017: 49
2018: 78
2019: 158
2020: 223
2021: 261
2022: 354
2023: 471
2024: 705
2025: 835
2026: 17
Papers
Narrating the Video: Boosting Text-Video Retrieval via Comprehensive Utilization of Frame-Level Captions
CVPR 2025
Decoupling and Reconstructing: A Multimodal Sentiment Analysis Framework Towards Robustness
IJCAI 2025
Multimodal Inverse Attention Network with Intrinsic Discriminant Feature Exploitation for Fake News Detection
IJCAI 2025
FIRM: Flexible Interactive Reflection ReMoval
AAAI 2025
Distilling Multi-modal Large Language Models for Autonomous Driving
CVPR 2025
MMGIA: Gradient Inversion Attack Against Multimodal Federated Learning via Intermodal Correlation
IJCAI 2025
Unified Molecule-Text Language Model with Discrete Token Representation
IJCAI 2025
Meme Trojan: Backdoor Attacks Against Hateful Meme Detection via Cross-Modal Triggers
AAAI 2025
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion
CVPR 2025
Generative Co-Design of Antibody Sequences and Structures via Black-Box Guidance in a Shared Latent Space
IJCAI 2025
Exploring Multimodal Foundation AI and Expert-in-the-Loop for Sustainable Management of Wild Salmon Fisheries in Indigenous Rivers
IJCAI 2025
Capturing the Unseen: Vision-Free Facial Motion Capture Using Inertial Measurement Units
AAAI 2025
Synthetic Data is an Elegant GIFT for Continual Vision-Language Models
CVPR 2025
Harnessing Vision Models for Time Series Analysis: A Survey
IJCAI 2025
CureGraph: Contrastive Multi-Modal Graph Representation Learning for Urban Living Circle Health Profiling and Prediction (Abstract Reprint)
IJCAI 2025
mmFAS: Multimodal Face Anti-Spoofing Using Multi-Level Alignment and Switch-Attention Fusion
AAAI 2025
DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation
CVPR 2025
Limitations in Employing Natural Language Supervision for Sensor-Based Human Activity Recognition - And Ways to Overcome Them
AAAI 2025
Multi-modal Deepfake Detection via Multi-task Audio-Visual Prompt Learning
AAAI 2025
Multi-View Incremental Learning with Structured Hebbian Plasticity for Enhanced Fusion Efficiency
AAAI 2025
FLAME: Frozen Large Language Models Enable Data-Efficient Language-Image Pre-training
CVPR 2025
Multi-to-Single: Reducing Multimodal Dependency in Emotion Recognition Through Contrastive Learning
AAAI 2025
Multimodal Fine-Grained Apparent Personality Trait Recognition: Joint Modeling of Big Five and Questionnaire Item-level Scores
AAAI 2025
Pose as a Modality: A Psychology-Inspired Network for Personality Recognition with a New Multimodal Dataset
AAAI 2025
DCHM: Dynamic Collaboration of Heterogeneous Models Through Isomerism Learning in a Blockchain-Powered Federated Learning Framework
AAAI 2025