conftrace
multimodal learning
4645 papers
Co-occurring keywords
large language model (13587)
vision-language model (2348)
visual question answering (1017)
video understanding (1658)
multi-modal learning (1278)
contrastive learning (4032)
representation learning (6206)
transfer learning (5449)
zero-shot learning (3650)
vision language model (767)
Papers
FashionVLP: Vision Language Transformer for Fashion Retrieval With Feedback
CVPR 2022
Revisiting the "Video" in Video-Language Understanding
CVPR 2022
Touch and Go: Learning from Human-Collected Vision and Touch
NIPS 2022
Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts
NIPS 2022
CAESAR: An Embodied Simulator for Generating Multimodal Referring Expression Datasets
NIPS 2022
Flamingo: a Visual Language Model for Few-Shot Learning
NIPS 2022
TANGO: Text-driven Photorealistic and Robust 3D Stylization via Lighting Decomposition
NIPS 2022
Semi-Supervised Video Paragraph Grounding With Contrastive Encoder
CVPR 2022
RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining
ACL 2022
Multimodal Dialogue Response Generation
ACL 2022
Contrastive Visual Semantic Pretraining Magnifies the Semantics of Natural Language Representations
ACL 2022
Image Retrieval from Contextual Descriptions
ACL 2022
M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database
ACL 2022
End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding
ACL 2022
M-SENA: An Integrated Platform for Multimodal Sentiment Analysis
ACL 2022
xGQA: Cross-Lingual Visual Question Answering
ACL 2022
DU-VLG: Unifying Vision-and-Language Generation via Dual Sequence-to-Sequence Pre-training
ACL 2022
Assessing Multilingual Fairness in Pre-trained Multimodal Representations
ACL 2022
UNIMO-2: End-to-End Unified Vision-Language Grounded Learning
ACL 2022
VPAI_Lab at MedVidQA 2022: A Two-Stage Cross-modal Fusion Method for Medical Instructional Video Classification
ACL 2022
Less Descriptive yet Discriminative: Quantifying the Properties of Multimodal Referring Utterances via CLIP
ACL 2022
Combining Language Models and Linguistic Information to Label Entities in Memes
ACL 2022
Detecting the Role of an Entity in Harmful Memes: Techniques and their Limitations
ACL 2022
Early Diagnosis of Lyme Disease by Recognizing Erythema Migrans Skin Lesion from Images Utilizing Deep Learning Techniques
IJCAI 2022
Learnable Irrelevant Modality Dropout for Multimodal Action Recognition on Modality-Specific Annotated Videos
CVPR 2022