conftrace
multimodal learning
4645 papers
Co-occurring keywords
large language model (13587)
vision-language model (2348)
visual question answering (1017)
video understanding (1658)
multi-modal learning (1278)
contrastive learning (4032)
representation learning (6206)
transfer learning (5449)
zero-shot learning (3650)
vision language model (767)
Papers
RealImpact: A Dataset of Impact Sound Fields for Real Objects (CVPR 2023)
S3C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning (CVPR 2023)
Affection: Learning Affective Explanations for Real-World Visual Data (CVPR 2023)
Improving Zero-Shot Generalization and Robustness of Multi-Modal Models (CVPR 2023)
You Can Ground Earlier Than See: An Effective and Efficient Pipeline for Temporal Sentence Grounding in Compressed Videos (CVPR 2023)
Fine-Grained Audible Video Description (CVPR 2023)
EXIF As Language: Learning Cross-Modal Associations Between Images and Camera Metadata (CVPR 2023)
Decoupled Multimodal Distilling for Emotion Recognition (CVPR 2023)
SmallCap: Lightweight Image Captioning Prompted With Retrieval Augmentation (CVPR 2023)
iCLIP: Bridging Image Classification and Contrastive Language-Image Pre-Training for Visual Recognition (CVPR 2023)
AutoAD: Movie Description in Context (CVPR 2023)
Grounding Counterfactual Explanation of Image Classifiers to Textual Concept Space (CVPR 2023)
Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning (CVPR 2023)
Image Manipulation via Multi-Hop Instructions - A New Dataset and Weakly-Supervised Neuro-Symbolic Approach (EMNLP 2023)
Predict and Use: Harnessing Predicted Gaze to Improve Multimodal Sarcasm Detection (EMNLP 2023)
Learning the Visualness of Text Using Large Vision-Language Models (EMNLP 2023)
Analyzing Modular Approaches for Visual Question Decomposition (EMNLP 2023)
A Framework for Vision-Language Warm-up Tasks in Multimodal Dialogue Models (EMNLP 2023)
Weakly-Supervised Learning of Visual Relations in Multimodal Pretraining (EMNLP 2023)
Support or Refute: Analyzing the Stance of Evidence to Detect Out-of-Context Mis- and Disinformation (EMNLP 2023)
Hallucination Detection for Grounded Instruction Generation (EMNLP 2023)
Debiasing Multimodal Models via Causal Information Minimization (EMNLP 2023)
Retrieving Multimodal Information for Augmented Generation: A Survey (EMNLP 2023)
Exploring Large Language Models for Multi-Modal Out-of-Distribution Detection (EMNLP 2023)
Black-Box Tuning of Vision-Language Models with Effective Gradient Approximation (EMNLP 2023)