conftrace
multimodal learning
4645 papers
Co-occurring keywords
large language model (13587)
vision-language model (2348)
visual question answering (1017)
video understanding (1658)
multi-modal learning (1278)
contrastive learning (4032)
representation learning (6206)
transfer learning (5449)
zero-shot learning (3650)
vision language model (767)
Papers
P²Net: Parallel Pointer-based Network for Key Information Extraction with Complex Layouts
ACL 2025
Forget the Token and Pixel: Rethinking Gradient Ascent for Concept Unlearning in Multimodal Generative Models
ACL 2025
MAGIC-VQA: Multimodal And Grounded Inference with Commonsense Knowledge for Visual Question Answering
ACL 2025
Sign2Vis: Automated Data Visualization from Sign Language
ACL 2025
READoc: A Unified Benchmark for Realistic Document Structured Extraction
ACL 2025
Latent Distribution Decouple for Uncertain-Aware Multimodal Multi-label Emotion Recognition
ACL 2025
Can Vision Language Models Understand Mimed Actions?
ACL 2025
Challenging Multimodal LLMs with African Standardized Exams: A Document VQA Evaluation
ACL 2025
Experiential Semantic Information and Brain Alignment: Are Multimodal Models Better than Language Models?
ACL 2025
NAVER LABS Europe Submission to the Instruction-following Track
ACL 2025
Quantifying Memorization and Parametric Response Rates in Retrieval-Augmented Vision-Language Models
ACL 2025
Adaptive Linguistic Prompting (ALP) Enhances Phishing Webpage Detection in Multimodal Large Language Models
ACL 2025
Instruction-tuned QwenChart for Chart Question Answering
ACL 2025
UoR-NCL at SemEval-2025 Task 1: Using Generative LLMs and CLIP Models for Multilingual Multimodal Idiomaticity Representation
ACL 2025
Zhoumou at SemEval-2025 Task 1: Leveraging Multimodal Data Augmentation and Large Language Models for Enhanced Idiom Understanding
ACL 2025
Argumentative Fallacy Detection in Political Debates
ACL 2025
A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges
ACL 2025
AIGuard: A Benchmark and Lightweight Detection for E-commerce AIGC Risks
ACL 2025
Dynamic Graph Neural ODE Network for Multi-modal Emotion Recognition in Conversation
COLING 2025
Acquired TASTE: Multimodal Stance Detection with Textual and Structural Embeddings
COLING 2025
Improvement in Sign Language Translation Using Text CTC Alignment
COLING 2025
Improving the Efficiency of Visually Augmented Language Models
COLING 2025
Howard University-AI4PC at SemEval-2025 Task 1: Using GPT-4o and CLIP-ViLT to Decode Figurative Language Across Text and Images
ACL 2025
Multimodal Aspect-Based Sentiment Analysis under Conditional Relation
COLING 2025
Temporally Grounding Instructional Diagrams in Unconstrained Videos
WACV 2025