Papers
GLIMPSE: Do Large Vision-Language Models Truly Think With Videos or Just Glimpse at Them?
EMNLP 2025
VideoComp: Advancing Fine-Grained Compositional and Temporal Alignment in Video-Text Models
CVPR 2025
Omnia de EgoTempo: Benchmarking Temporal Understanding of Multi-Modal LLMs in Egocentric Videos
CVPR 2025