multimodal learning
4622 papers
Also known as
VLM
VLLM
MM
VLA
MLLMS
MLM
MML
MULLM
LMM
MLLM
MMT
Papers
SaSR-Net: Source-Aware Semantic Representation Network for Enhancing Audio-Visual Question Answering
EMNLP 2024
VISTA-LLAMA: Reducing Hallucination in Video Language Models via Equal Distance to Visual Tokens
CVPR 2024
PERIA: Perceive, Reason, Imagine, Act via Holistic Language and Vision Planning for Manipulation
NeurIPS 2024