Research Explorer
Multi-Modal Learning
1457 directly classified papers
Papers per year
2011: 1
2013: 4
2014: 3
2015: 3
2016: 9
2017: 11
2018: 27
2019: 61
2020: 109
2021: 87
2022: 153
2023: 213
2024: 391
2025: 384
2026: 1
Papers
One More Check: Making “Fake Background” Be Tracked Again
AAAI 2022
Memory-Guided Semantic Learning Network for Temporal Sentence Grounding
AAAI 2022
Improving Zero-Shot Phrase Grounding via Reasoning on External Knowledge and Spatial Relations
AAAI 2022
Self-Supervised Representation Learning Framework for Remote Physiological Measurement Using Spatiotemporal Augmentation Loss
AAAI 2022
End-to-End Transformer Based Model for Image Captioning
AAAI 2022
L-CoDe: Language-Based Colorization Using Color-Object Decoupled Conditions
AAAI 2022
Attribute-Based Progressive Fusion Network for RGBT Tracking
AAAI 2022
Multi-Head Modularization to Leverage Generalization Capability in Multi-Modal Networks
AAAI 2022
Eye of the Beholder: Improved Relation Generalization for Text-Based Reinforcement Learning Agents
AAAI 2022
Using Multimodal Data and AI to Dynamically Map Flood Risk
AAAI 2022
Knowledge-Enhanced Scene Graph Generation with Multimodal Relation Alignment (Student Abstract)
AAAI 2022
ALLURE: A Multi-Modal Guided Environment for Helping Children Learn to Solve a Rubik’s Cube with Automatic Solving and Interactive Explanations
AAAI 2022
LITMUS Predictor: An AI Assistant for Building Reliable, High-Performing and Fair Multilingual NLP Systems
AAAI 2022
Efficient Multi-View Stereo by Iterative Dynamic Cost Volume
CVPR 2022
Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline
CVPR 2022
UniTranSeR: A Unified Transformer Semantic Representation Framework for Multimodal Task-Oriented Dialog System
ACL 2022
There’s a Time and Place for Reasoning Beyond the Image
ACL 2022
Multimodal fusion via cortical network inspired losses
ACL 2022
Towards Video Text Visual Question Answering: Benchmark and Baseline
NIPS 2022
Why do We Need Large Batchsizes in Contrastive Learning? A Gradient-Bias Perspective
NIPS 2022
Transferring Pre-trained Multimodal Representations with Cross-modal Similarity Matching
NIPS 2022
Multi-Lingual Acquisition on Multimodal Pre-training for Cross-modal Retrieval
NIPS 2022
Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding
NIPS 2022
WinoGAViL: Gamified Association Benchmark to Challenge Vision-and-Language Models
NIPS 2022
Semantic Exploration from Language Abstractions and Pretrained Representations
NIPS 2022