Research Explorer
Multi-Modal Learning
1213 directly classified papers
Papers per year
2007: 2
2008: 1
2009: 1
2011: 2
2012: 5
2013: 5
2014: 1
2015: 5
2016: 8
2017: 21
2018: 42
2019: 42
2020: 69
2021: 72
2022: 149
2023: 143
2024: 258
2025: 370
2026: 17
Papers
DualReal: Adaptive Joint Training for Lossless Identity-Motion Fusion in Video Customization
ICCV 2025
VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation
ICCV 2025
Boundary Probing for Input Privacy Protection When Using LMM Services
ICCV 2025
Enrich and Detect: Video Temporal Grounding with Multimodal LLMs
ICCV 2025
DrivingGPT: Unifying Driving World Modeling and Planning with Multi-modal Autoregressive Transformers
ICCV 2025
Identity-aware Language Gaussian Splatting for Open-vocabulary 3D Semantic Segmentation
ICCV 2025
TAViS: Text-bridged Audio-Visual Segmentation with Foundation Models
ICCV 2025
ChatReID: Open-ended Interactive Person Retrieval via Hierarchical Progressive Tuning for Vision Language Models
ICCV 2025
End-to-End Multi-Modal Diffusion Mamba
ICCV 2025
PRACTIQ: A Practical Conversational Text-to-SQL dataset with Ambiguous and Unanswerable Queries
NAACL 2025
Hybrid Graphs for Table-and-Text based Question Answering using LLMs
NAACL 2025
Can LLMs Convert Graphs to Text-Attributed Graphs?
NAACL 2025
An Interpretable and Crosslingual Method for Evaluating Second-Language Dialogues
NAACL 2025
MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation
NAACL 2025
When and How to Augment Your Input: Question Routing Helps Balance the Accuracy and Efficiency of Large Language Models
NAACL 2025
VLind-Bench: Measuring Language Priors in Large Vision-Language Models
NAACL 2025
MRE-MI: A Multi-image Dataset for Multimodal Relation Extraction in Social Media Posts
NAACL 2025
Beyond the Mode: Sequence-Level Distillation of Multilingual Translation Models for Low-Resource Language Pairs
NAACL 2025
RusCode: Russian Cultural Code Benchmark for Text-to-Image Generation
NAACL 2025
Beyond Base Predictors: Using LLMs to Resolve Ambiguities in Akkadian Lemmatization
NAACL 2025
Does a code-switching dialogue system help users learn conversational fluency in Choctaw?
NAACL 2025
FUSE : A Ridge and Random Forest-Based Metric for Evaluating MT in Indigenous Languages
NAACL 2025
Enhancing Depression Detection via Question-wise Modality Fusion
NAACL 2025
From Posts to Timelines: Modeling Mental Health Dynamics from Social Media Timelines with Hybrid LLMs
NAACL 2025
Table Understanding and (Multimodal) LLMs: A Cross-Domain Case Study on Scientific vs. Non-Scientific Data
ACL 2025