Research Explorer
Multi-Modal Learning
1457 directly classified papers
Papers per year
2011: 1
2013: 4
2014: 3
2015: 3
2016: 9
2017: 11
2018: 27
2019: 61
2020: 109
2021: 87
2022: 153
2023: 213
2024: 391
2025: 384
2026: 1
Papers
Argumentative Stance Prediction: An Exploratory Study on Multimodality and Few-Shot Learning
EMNLP 2023
GC-Hunter at ImageArg Shared Task: Multi-Modal Stance and Persuasiveness Learning
EMNLP 2023
AVIS: Autonomous Visual Information Seeking with Large Language Model Agent
NIPS 2023
Open Visual Knowledge Extraction via Relation-Oriented Multimodality Model Prompting
NIPS 2023
Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval
NIPS 2023
Cross-modal Active Complementary Learning with Self-refining Correspondence
NIPS 2023
EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought
NIPS 2023
LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark
NIPS 2023
VisIT-Bench: A Dynamic Benchmark for Evaluating Instruction-Following Vision-and-Language Models
NIPS 2023
DataComp: In search of the next generation of multimodal datasets
NIPS 2023
Quantifying & Modeling Multimodal Interactions: An Information Decomposition Framework
NIPS 2023
Mass-Producing Failures of Multimodal Systems with Language Models
NIPS 2023
Brain encoding models based on multimodal transformers can transfer across language and vision
NIPS 2023
Symbol-LLM: Leverage Language Models for Symbolic System in Visual Human Activity Reasoning
NIPS 2023
Subject-driven Text-to-Image Generation via Apprenticeship Learning
NIPS 2023
Latent Field Discovery in Interacting Dynamical Systems with Neural Fields
NIPS 2023
Compressed Video Prompt Tuning
NIPS 2023
Foundation Model is Efficient Multimodal Multitask Model Selector
NIPS 2023
LOVM: Language-Only Vision Model Selection
NIPS 2023
From Pixels to UI Actions: Learning to Follow Instructions via Graphical User Interfaces
NIPS 2023
Visual Instruction Tuning
NIPS 2023
DesCo: Learning Object Recognition with Rich Language Descriptions
NIPS 2023
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face
NIPS 2023
What’s Left? Concept Grounding with Logic-Enhanced Foundation Models
NIPS 2023
SpokenWOZ: A Large-Scale Speech-Text Benchmark for Spoken Task-Oriented Dialogue Agents
NIPS 2023