conftrace
multimodal learning
4645 papers
Co-occurring keywords
large language model (13587)
vision-language model (2348)
visual question answering (1017)
video understanding (1658)
multi-modal learning (1278)
contrastive learning (4032)
representation learning (6206)
transfer learning (5449)
zero-shot learning (3650)
vision language model (767)
Papers
GLaMM: Pixel Grounding Large Multimodal Model
CVPR 2024
MORE-3S: Multimodal-based Offline Reinforcement Learning with Shared Semantic Spaces
COLING 2024
Multimodal and Multilingual Laughter Detection in Stand-Up Comedy Videos
COLING 2024
Multimodal Cross-lingual Phrase Retrieval
COLING 2024
Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection
CVPR 2024
STimage-1K4M: A histopathology image-gene expression dataset for spatial transcriptomics
NIPS 2024
HOTVCOM: Generating Buzzworthy Comments for Videos
ACL 2024
Dynamic Knowledge Prompt for Chest X-ray Report Generation
COLING 2024
Binding Touch to Everything: Learning Unified Multimodal Tactile Representations
CVPR 2024
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding Reasoning and Planning
CVPR 2024
JRDB-Social: A Multifaceted Robotic Dataset for Understanding of Context and Dynamics of Human Interactions Within Social Groups
CVPR 2024
Generative Multimodal Models are In-Context Learners
CVPR 2024
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration
NIPS 2024
Modal-adaptive Knowledge-enhanced Graph-based Financial Prediction from Monetary Policy Conference Calls with LLM
COLING 2024
AIMA at SemEval-2024 Task 3: Simple Yet Powerful Emotion Cause Pair Analysis
NAACL 2024
VideoCon: Robust Video-Language Alignment via Contrast Captions
CVPR 2024
A Mapping on Current Classifying Categories of Emotions Used in Multimodal Models for Emotion Recognition
EACL 2024
Continual Multimodal Knowledge Graph Construction
IJCAI 2024
Medical Vision-Language Pre-Training for Brain Abnormalities
COLING 2024
MEVTR: A Multilingual Model Enhanced with Visual Text Representations
COLING 2024
HHMR: Holistic Hand Mesh Recovery by Enhancing the Multimodal Controllability of Graph Diffusion Models
CVPR 2024
DiaLoc: An Iterative Approach to Embodied Dialog Localization
CVPR 2024
WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences
NIPS 2024
A Unified Debiasing Approach for Vision-Language Models across Modalities and Tasks
NIPS 2024
Easy Regional Contrastive Learning of Expressive Fashion Representations
NIPS 2024