multimodal learning
4622 papers
Also known as
VLM
VLLM
MM
VLA
MLLMS
MLM
MML
MULLM
LMM
MLLM
MMT
Papers
A Vision Check-up for Language Models
CVPR 2024
“Image, Tell me your story!” Predicting the original meta-context of visual misinformation
EMNLP 2024
Towards Intelligent Speech Assistants in Operating Rooms: A Multimodal Model for Surgical Workflow Analysis
INTERSPEECH 2024
Generating Illustrated Instructions
CVPR 2024
Vlogger: Make Your Dream A Vlog
CVPR 2024
Enhancing Automated Audio Captioning via Large Language Models with Optimized Audio Encoding
INTERSPEECH 2024
An End-to-End Speech Summarization Using Large Language Model
INTERSPEECH 2024
Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models
INTERSPEECH 2024
LLM-Driven Multimodal Opinion Expression Identification
INTERSPEECH 2024
Participant-Pair-Wise Bottleneck Transformer for Engagement Estimation from Video Conversation
INTERSPEECH 2024