conftrace
multimodal learning
4645 papers
Co-occurring keywords
large language model (13587)
vision-language model (2348)
visual question answering (1017)
video understanding (1658)
multi-modal learning (1278)
contrastive learning (4032)
representation learning (6206)
transfer learning (5449)
zero-shot learning (3650)
vision language model (767)
Papers
DiaLoc: An Iterative Approach to Embodied Dialog Localization
CVPR 2024
HHMR: Holistic Hand Mesh Recovery by Enhancing the Multimodal Controllability of Graph Diffusion Models
CVPR 2024
MuseChat: A Conversational Music Recommendation System for Videos
CVPR 2024
Step Differences in Instructional Video
CVPR 2024
The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective
CVPR 2024
On Scaling Up a Multilingual Vision and Language Model
CVPR 2024
Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts
CVPR 2024
VCoder: Versatile Vision Encoders for Multimodal Large Language Models
CVPR 2024
DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction
CVPR 2024
ASOS at ArAIEval Shared Task: Integrating Text and Image Embeddings for Multimodal Propaganda Detection in Arabic Memes
ACL 2024
Unveiling Opinion Evolution via Prompting and Diffusion for Short Video Fake News Detection
ACL 2024
Digital Life Project: Autonomous 3D Characters with Social Intelligence
CVPR 2024
MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval
CVPR 2024
Just Add π! Pose Induced Video Transformers for Understanding Activities of Daily Living
CVPR 2024
Modality-Collaborative Test-Time Adaptation for Action Recognition
CVPR 2024
Can Language Beat Numerical Regression? Language-Based Multimodal Trajectory Prediction
CVPR 2024
Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities
CVPR 2024
Narrative Action Evaluation with Prompt-Guided Multimodal Interaction
CVPR 2024
All in One Framework for Multimodal Re-identification in the Wild
CVPR 2024
Enhancing Multimodal Cooperation via Sample-level Modality Valuation
CVPR 2024
XAI for Better Exploitation of Text in Medical Decision Support
ACL 2024
Qalam: A Multimodal LLM for Arabic Optical Character and Handwriting Recognition
ACL 2024
ContextBLIP: Doubly Contextual Alignment for Contrastive Image Retrieval from Linguistically Complex Descriptions
ACL 2024
Amanda: Adaptively Modality-Balanced Domain Adaptation for Multimodal Emotion Recognition
ACL 2024
Zero-shot Commonsense Reasoning over Machine Imagination
EMNLP 2024