conftrace
Vision-Language Models
159 papers
Papers per year
2016: 1
2021: 1
2023: 1
2024: 7
2025: 3
2026: 146
Papers
REVEALER: Reinforcement-Guided Visual Reasoning for Element-Level Text-Image Alignment Evaluation
ACL 2026
LaMI: Augmenting Large Language Models via Late Multi-Image Fusion
ACL 2026
When More Words Say Less: Decoupling Length and Specificity in Image Description Evaluation
ACL 2026
DIXITWORLD: Evaluating Multimodal Abductive Reasoning in Vision-Language Models with Multi-Agent Dixit Gameplay
ACL 2026
Chain-of-Thought Degrades Visual Spatial Reasoning Capabilities of Multimodal LLMs
ACL 2026
Dash-M5H: An Interactive Dashboard for Multi-Modal, Multi-Model Mental Health Assessment
ACL 2026
Ryze: Evidence-Enriched Data Synthesis from Biomedical Papers
ACL 2026
Spectra: A Mechanistic Interpretability Library for Vision-Language Models
ACL 2026
Evaluation of Multilingual Ability to Use Spatial Deictic Expressions in Vision-Language Models
ACL 2026
UNIVID: Unified Vision-Language Model for Video Moderation
ACL 2026
Grounded Multimodal In-Context Learning for Product Weight Estimation at Scale in E-commerce
ACL 2026
ColorBrowserAgent: Complex Long-Horizon Browser Agent with Adaptive Knowledge Evolution
ACL 2026
From Relevance to Authority: Authority-aware Generative Retrieval in Web Search Engines
ACL 2026
Adaptive Weighted Proxy Tuning: Efficient Gray-Box Steering for Image Captioning
ACL 2026
A Multistage Extraction Pipeline for Long Scanned Financial Documents: An Empirical Study in Industrial KYC Workflows
ACL 2026
CatVLM: Enhancing Temporal Understanding in Cataract Surgery Videos with Boundary-Aware VLM
MIDL 2026
NeuroLangSeg: Language-Guided Subcortical Segmentation with Pseudo-Supervision and Anatomical–Linguistic Validation
MIDL 2026
REVEAL: Multimodal Vision–Language Alignment of Retinal Morphometry and Clinical Risks for Incident AD and Dementia Prediction
MIDL 2026
RadVLM-GRPO: Enhancing Chest X-ray Report Generation and Visual Grounding via Reinforcement Learning
MIDL 2026
Decoupling Vision and Reasoning: A Data-Efficient Pipeline for Surgical VQA
MIDL 2026
HiPro-CT: A Hierarchical Probabilistic Framework for 3D Medical Vision-Language Alignment
MIDL 2026
Functionality Understanding and Segmentation in 3D Scenes
CVPR 2025
Visual Prompt Engineering for Vision Language Models in Radiology
MIDL 2025
A Balancing Act: Optimizing Classification and Retrieval in Cross-Modal Vision Models
MIDL 2025
Learning to Learn Better Visual Prompts
AAAI 2024