Vision-Language Models

159 papers

Papers per year (bar chart; year labels missing): 1, 1, 1, 7, 3, 146

Papers

Vocabulary Hijacking in LVLMs: Unveiling Critical Attention Heads by Excluding Inert Tokens to Mitigate Hallucination ACL 2026

Vision-Language Introspection: Mitigating Overconfident Hallucinations in MLLMs via Interpretable Bi-Causal Steering ACL 2026

HowToNarrate: A General-Domain Benchmark for Synchronized Video Narration with External Knowledge ACL 2026

Libra-VLA: Achieving Learning Equilibrium via Asynchronous Coarse-to-Fine Dual-System ACL 2026

Measuring Social Bias in Vision-Language Models with Face-Only Counterfactuals from Real Photos ACL 2026

E-ViC: Reasoning Beyond Text via Embodied Visual Chain for Spatial Intelligence ACL 2026

MirrorQA: Benchmarking Multimodal LLMs on Mirror-Orientation Reasoning ACL 2026

GeoRC: A Benchmark for Geolocation Reasoning Chains ACL 2026

The Visual Iconicity Challenge: Evaluating Vision-Language Models on Sign Language Form–Meaning Mapping ACL 2026

Mechanisms of Prompt-Induced Hallucination in Vision–Language Models ACL 2026

RealChart2Code: Bridging the Gap in Real-World Chart-to-Code Generation via Multi-Task Evaluation ACL 2026

Response-G1: Explicit Scene Graph Modeling for Proactive Streaming Video Understanding ACL 2026

Automatic and Reliable Evaluation for Academic Caption-to-Figure Generation with LMMs ACL 2026

Through the Magnifying Glass: Adaptive Perception Magnification for Hallucination-Free VLM Decoding ACL 2026

INFACT: A Diagnostic Benchmark for Induced Faithfulness and Factuality Hallucinations in Video-LLMs ACL 2026

When Does Language Matter? Multilingual Instructions Reveal Step-wise Language Sensitivity in Vision-Language-Action Models ACL 2026

VL-Calibration: Decoupled Confidence Calibration for Large Vision-Language Models Reasoning ACL 2026

I2E: From Image Pixels to Actionable Interactive Environments for Text-Guided Image Editing ACL 2026

MMErroR: A Benchmark for Erroneous Reasoning in Vision-Language Models ACL 2026

OMIBench: Benchmarking Olympiad-Level Multi-Image Reasoning in Large Vision-Language Models ACL 2026

Render-of-Thought: Rendering Textual Chain-of-Thought as Images for Visual Latent Reasoning ACL 2026

CEBC: Conformal Evidence-Bounded Control for Low-Hallucination Vision–Language Generation ACL 2026

From Words to Pixels: A Comprehensive Survey on Large Language Models in Visual Segmentation ACL 2026

GrAInS: Gradient-based Attribution for Inference-Time Steering of LLMs and VLMs ACL 2026

MathSight: A Benchmark Exploring Have Vision-Language Models Really Seen in University-Level Mathematical Reasoning? ACL 2026