Research Explorer
Multi-Modal Learning
1213 directly classified papers
Papers per year
2007: 2
2008: 1
2009: 1
2011: 2
2012: 5
2013: 5
2014: 1
2015: 5
2016: 8
2017: 21
2018: 42
2019: 42
2020: 69
2021: 72
2022: 149
2023: 143
2024: 258
2025: 370
2026: 17
Papers
Perceiving Longer Sequences With Bi-Directional Cross-Attention Transformers
NIPS 2024
MEANT: Multimodal Encoder for Antecedent Information
EMNLP 2024
Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media
AAAI 2024
Fusion from a Distributional Perspective: A Unified Symbiotic Diffusion Framework for Any Multisource Remote Sensing Data Classification
IJCAI 2024
Unified Physical-Digital Face Attack Detection
IJCAI 2024
Homology Consistency Constrained Efficient Tuning for Vision-Language Models
NIPS 2024
Pelican: Correcting Hallucination in Vision-LLMs via Claim Decomposition and Program of Thought Verification
EMNLP 2024
Extending Multi-modal Contrastive Representations
NIPS 2024
Exploiting Auxiliary Caption for Video Grounding
AAAI 2024
Object Attribute Matters in Visual Question Answering
AAAI 2024
DrFuse: Learning Disentangled Representation for Clinical Multi-Modal Fusion with Missing Modality and Modal Inconsistency
AAAI 2024
DTGB: A Comprehensive Benchmark for Dynamic Text-Attributed Graphs
NIPS 2024
BOK-VQA: Bilingual outside Knowledge-Based Visual Question Answering via Graph Representation Pretraining
AAAI 2024
An Empirical Study of Multilingual Reasoning Distillation for Question Answering
EMNLP 2024
Revisiting motion information for RGB-Event tracking with MOT philosophy
NIPS 2024
Rethinking Reverse Distillation for Multi-Modal Anomaly Detection
AAAI 2024
A Hierarchical Network for Multimodal Document-Level Relation Extraction
AAAI 2024
Beyond Entities: A Large-Scale Multi-Modal Knowledge Graph with Triplet Fact Grounding
AAAI 2024
Posture-Informed Muscular Force Learning for Robust Hand Pressure Estimation
NIPS 2024
Quantifying the Gaps Between Translation and Native Perception in Training for Multimodal, Multilingual Retrieval
EMNLP 2024
SocraticLM: Exploring Socratic Personalized Teaching with Large Language Models
NIPS 2024
Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2)
NIPS 2024
Analyzing Key Factors Influencing Emotion Prediction Performance of VLLMs in Conversational Contexts
EMNLP 2024
DreamCatcher: A Wearer-aware Multi-modal Sleep Event Dataset Based on Earables in Non-restrictive Environments
NIPS 2024
Multimodal Graph Neural Architecture Search under Distribution Shifts
AAAI 2024