Research Explorer
Papers
Trends
Conferences
Explore
Authors
Topics
Keywords
Achievements
About
Methodology
Multi-Modal Learning
1457 directly classified papers
Papers per year
2011: 1
2013: 4
2014: 3
2015: 3
2016: 9
2017: 11
2018: 27
2019: 61
2020: 109
2021: 87
2022: 153
2023: 213
2024: 391
2025: 384
2026: 1
Papers
HybriDialogue: An Information-Seeking Dialogue Dataset Grounded on Tabular and Textual Data
ACL 2022
Graph Neural Networks for Multiparallel Word Alignment
ACL 2022
VISITRON: Visual Semantics-Aligned Interactively Trained Object-Navigator
ACL 2022
Combining Static and Contextualised Multilingual Embeddings
ACL 2022
Enabling Multimodal Generation on CLIP via Vision-Language Knowledge Distillation
ACL 2022
Co-VQA : Answering by Interactive Sub Question Sequence
ACL 2022
xGQA: Cross-Lingual Visual Question Answering
ACL 2022
Selecting Stickers in Open-Domain Dialogue through Multitask Learning
ACL 2022
Prior Knowledge and Memory Enriched Transformer for Sign Language Translation
ACL 2022
Attention as Grounding: Exploring Textual and Cross-Modal Attention on Entities and Relations in Language-and-Vision Transformer
ACL 2022
Overview of the MedVidQA 2022 Shared Task on Medical Video Question-Answering
ACL 2022
Less Descriptive yet Discriminative: Quantifying the Properties of Multimodal Referring Utterances via CLIP
ACL 2022
Poirot at CMCL 2022 Shared Task: Zero Shot Crosslingual Eye-Tracking Data Prediction using Multilingual Transformer Models
ACL 2022
How does fake news use a thumbnail? CLIP-based Multimodal Detection on the Unrepresentative News Image
ACL 2022
Bridging the Gap between Recognition-level Pre-training and Commonsensical Vision-language Tasks
ACL 2022
A Selective Summary of Where to Hide a Stolen Elephant: Leaps in Creative Writing with Multimodal Machine Intelligence
ACL 2022
Multimodal Conversational AI: A Survey of Datasets and Approaches
ACL 2022
Relation-aware Video Reading Comprehension for Temporal Language Grounding
EMNLP 2021
Evidence Aware Neural Pornographic Text Identification for Child Protection
AAAI 2021
Improving Multimodal fusion via Mutual Dependency Maximisation
EMNLP 2021
Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text
EMNLP 2021
Global Fusion Attention for Vision and Language Understanding (Student Abstract)
AAAI 2021
Mutual-Learning Improves End-to-End Speech Translation
EMNLP 2021
Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization
EMNLP 2021
Language-Aligned Waypoint (LAW) Supervision for Vision-and-Language Navigation in Continuous Environments
EMNLP 2021