Papers
KIA: Knowledge-Guided Implicit Vision-Language Alignment for Chest X-Ray Report Generation
COLING 2025
DDPA-3DVG: Vision-Language Dual-Decoupling and Progressive Alignment for 3D Visual Grounding
IJCAI 2025
Enhancing Spatial Reasoning in Multimodal Large Language Models through Reasoning-based Segmentation
ICCV 2025
Gaze-Language Alignment for Zero-Shot Prediction of Visual Search Targets from Human Gaze Scanpaths
ICCV 2025
Decoupled Proxy Alignment: Mitigating Language Prior Conflict for Multimodal Alignment in MLLMs
EMNLP 2025
DINOv2 Meets Text: A Unified Framework for Image- and Pixel-Level Vision-Language Alignment
CVPR 2025
DH-Set: Improving Vision-Language Alignment with Diverse and Hybrid Set-Embeddings Learning
CVPR 2025
Bridging the Gap between 2D and 3D Visual Question Answering: A Fusion Approach for 3D VQA
AAAI 2024
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
CVPR 2024