Papers
Do You See Me: A Multidimensional Benchmark for Evaluating Visual Perception in Multimodal LLMs
EACL 2026
VipAct: Visual-Perception Enhancement via Specialized VLM Agent Collaboration and Tool-use
AAAI 2026
R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization
ICCV 2025
SemVink: Advancing VLMs’ Semantic Understanding of Optical Illusions via Visual Global Thinking
EMNLP 2025