conftrace
multimodal learning
4645 papers
Co-occurring keywords
large language model (13587)
vision-language model (2348)
visual question answering (1017)
video understanding (1658)
multi-modal learning (1278)
contrastive learning (4032)
representation learning (6206)
transfer learning (5449)
zero-shot learning (3650)
vision language model (767)
Papers
Bridging Modalities: Enhancing Cross-Modality Hate Speech Detection with Few-Shot In-Context Learning
EMNLP 2024
ECIS-VQG: Generation of Entity-centric Information-seeking Questions from Videos
EMNLP 2024
VIEWS: Entity-Aware News Video Captioning
EMNLP 2024
Advancing Social Intelligence in AI Agents: Technical Challenges and Open Questions
EMNLP 2024
Diversify, Rationalize, and Combine: Ensembling Multiple QA Strategies for Zero-shot Knowledge-based VQA
EMNLP 2024
Multiple Knowledge-Enhanced Interactive Graph Network for Multimodal Conversational Emotion Recognition
EMNLP 2024
Plot Twist: Multimodal Models Don’t Comprehend Simple Chart Details
EMNLP 2024
Financial Forecasting from Textual and Tabular Time Series
EMNLP 2024
Semantic Token Reweighting for Interpretable and Controllable Text Embeddings in CLIP
EMNLP 2024
Individuation in Neural Models with and without Visual Grounding
EMNLP 2024
DCU ADAPT at WMT24: English to Low-resource Multi-Modal Translation Task
EMNLP 2024
VIXEN: Visual Text Comparison Network for Image Difference Captioning
AAAI 2024
EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE
AAAI 2024
Visual Chain-of-Thought Prompting for Knowledge-Based Visual Reasoning
AAAI 2024
CL2CM: Improving Cross-Lingual Cross-Modal Retrieval via Cross-Lingual Knowledge Transfer
AAAI 2024
Image as a Language: Revisiting Scene Text Recognition via Balanced, Unified and Synchronized Vision-Language Reasoning Network
AAAI 2024
MuLTI: Efficient Video-and-Language Understanding with Text-Guided MultiWay-Sampler and Multiple Choice Modeling
AAAI 2024
Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning
AAAI 2024
VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-trained Models
AAAI 2024
Visual Instruction Tuning with Polite Flamingo
AAAI 2024
CoPL: Contextual Prompt Learning for Vision-Language Understanding
AAAI 2024
Detecting and Preventing Hallucinations in Large Vision Language Models
AAAI 2024
Bootstrapping Large Language Models for Radiology Report Generation
AAAI 2024
KAM-CoT: Knowledge Augmented Multimodal Chain-of-Thoughts Reasoning
AAAI 2024
3MVRD: Multimodal Multi-task Multi-teacher Visually-Rich Form Document Understanding
ACL 2024