conftrace
Vision-Language Models
159 papers
Papers per year
2016: 1
2021: 1
2023: 1
2024: 7
2025: 3
2026: 146
Papers
FastV-RAG: Towards Fast and Fine-Grained Video QA with Retrieval-Augmented Generation
ACL 2026
CoreGaze: Core Subgraph-Driven Visual Gaze Diffusion for Training-Free Referring Multimodal Large Language Models
ACL 2026
Leave My Images Alone: Preventing Multi-Modal Large Language Models from Analyzing Images via Visual Prompt Injection
ACL 2026
Mitigating Hallucinations in Large Vision-Language Models without Performance Degradation
ACL 2026
AraVQA: Building a New Arabic Factoid Visual Question Answering Dataset from Wikipedia
ACL 2026
CARES: Context-Aware Resolution Selector for VLMs
ACL 2026
DE-CLIP: Few-Shot Anomaly Detection via Difference-Guided Embedding Editing
ACL 2026
MMCLIP: Cross-Modal Attention Masked Modelling for Medical Language-Image Pre-Training
ACL 2026
Stable Language Guidance for Vision–Language–Action Models
ACL 2026
Simple-VGC: Enhancing Visual Grounding in Multimodal Reasoning via Adaptive Tool Composition
ACL 2026
Look Less, Reason More: Rollout-Guided Adaptive Pixel-Space Reasoning
ACL 2026
Spec-o3: A Tool-Augmented Vision-Language Agent for Rare Celestial Object Candidate Vetting via Automated Spectral Inspection
ACL 2026
Revisit What You See: Revealing Visual Semantics in Vision Tokens to Guide LVLM Decoding
ACL 2026
Visually-Guided Policy Optimization for Multimodal Reasoning
ACL 2026
Don’t Click That: Teaching Web Agents to Resist Deceptive Interfaces
ACL 2026
Reducing Token Redundancy in LVLMs: A Systematic Review of Token Pruning Methods
ACL 2026
ChartVerse: Scaling Chart Reasoning via Reliable Programmatic Synthesis from Scratch
ACL 2026
Protecting multimodal large language models against misleading visualizations
ACL 2026
Visual Attention Reasoning via Hierarchical Search and Self-Verification
ACL 2026
LVLMs and Humans Ground Differently in Referential Communication
ACL 2026
CityCube: Benchmarking Cross-view Spatial Reasoning on Vision-Language Models in Urban Environments
ACL 2026
VisPCO: Visual Token Pruning Configuration Optimization via Budget-Aware Pareto-Frontier Learning for Vision-Language Models
ACL 2026
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows
ACL 2026
“I See What You Did There”: Can Large Vision-Language Models Understand Multimodal Puns?
ACL 2026
Sparrow: Text-Anchored Window Attention with Visual-Semantic Glimpsing for Speculative Decoding in Video LLMs
ACL 2026