multimodal learning
4622 papers
Also known as
VLM
VLLM
MM
VLA
MLLMS
MLM
MML
MULLM
LMM
MLLM
MMT
Papers
Towards Surveillance Video-and-Language Understanding: New Dataset, Baselines, and Challenges
CVPR 2024
On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation
CVPR 2024
MAPLM: A Real-World Large-Scale Vision-Language Benchmark for Map and Traffic Scene Understanding
CVPR 2024
MMAD: Multi-modal Movie Audio Description
COLING 2024
ChatPose: Chatting about 3D Human Pose
CVPR 2024