Papers
MANTA: A Large-Scale Multi-View and Visual-Text Anomaly Detection Dataset for Tiny Objects
CVPR 2025
What Makes for Good Image Captions?
EMNLP 2025
JNLP at SemEval-2025 Task 1: Multimodal Idiomaticity Representation with Large Language Models
ACL 2025
Argus: Benchmarking and Enhancing Vision-Language Models for 3D Radiology Report Generation
ACL 2025