conftrace
multimodal learning
4645 papers
Co-occurring keywords
large language model (13587)
vision-language model (2348)
visual question answering (1017)
video understanding (1658)
multi-modal learning (1278)
contrastive learning (4032)
representation learning (6206)
transfer learning (5449)
zero-shot learning (3650)
vision language model (767)
Papers
Skating-Mixer: Long-Term Sport Audio-Visual Modeling with MLPs
AAAI 2023
Video-Text Pre-training with Learned Regions for Retrieval
AAAI 2023
PEN: Prediction-Explanation Network to Forecast Stock Price Movement with Better Explainability
AAAI 2023
Towards Unified, Explainable, and Robust Multisensory Perception
AAAI 2023
Exploring Better Text Image Translation with Multimodal Codebook
ACL 2023
FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction
ACL 2023
MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering
ACL 2023
LIDA: A Tool for Automatic Generation of Grammar-Agnostic Visualizations and Infographics using Large Language Models
ACL 2023
TAM of SCNU at SemEval-2023 Task 1: FCLL: A Fine-grained Contrastive Language-Image Learning Model for Cross-language Visual Word Sense Disambiguation
ACL 2023
Rutgers Multimedia Image Processing Lab at SemEval-2023 Task-1: Text-Augmentation-based Approach for Visual Word Sense Disambiguation
ACL 2023
Large Scale Generative Multimodal Attribute Extraction for E-commerce Attributes
ACL 2023
SCCS: Semantics-Consistent Cross-domain Summarization via Optimal Transport Alignment
ACL 2023
Measuring Progress in Fine-grained Vision-and-Language Understanding
ACL 2023
Rethinking Multimodal Entity and Relation Extraction from a Translation Point of View
ACL 2023
VisText: A Benchmark for Semantically Rich Chart Captioning
ACL 2023
Attractive Storyteller: Stylized Visual Storytelling with Unpaired Text
ACL 2023
A Facial Expression-Aware Multimodal Multi-task Learning Framework for Emotion Recognition in Multi-party Conversations
ACL 2023
Evaluating pragmatic abilities of image captioners on A3DS
ACL 2023
When to Use Efficient Self Attention? Profiling Text, Speech and Image Transformer Variants
ACL 2023
Transferring General Multimodal Pretrained Models to Text Recognition
ACL 2023
Retrieving Multimodal Prompts for Generative Visual Question Answering
ACL 2023
Adversarial Textual Robustness on Visual Dialog
ACL 2023
Weakly-Supervised Spoken Video Grounding via Semantic Interaction Learning
ACL 2023
PV2TEA: Patching Visual Modality to Textual-Established Information Extraction
ACL 2023
Unified Language Representation for Question Answering over Text, Tables, and Images
ACL 2023