Papers
Towards Video Thinking Test: A Holistic Benchmark for Advanced Video Reasoning and Understanding
ICCV 2025
FinMMR: Make Financial Numerical Reasoning More Multimodal, Comprehensive, and Challenging
ICCV 2025
Ref-Long: Benchmarking the Long-context Referencing Capability of Long-context Language Models
ACL 2025
In Benchmarks We Trust ... Or Not?
EMNLP 2025