Video Understanding

1592 directly classified papers

Papers

Learning Temporally Consistent Video Depth from Video Diffusion Priors CVPR 2025

HarmonySet: A Comprehensive Dataset for Understanding Video-Music Semantic Alignment and Temporal Synchronization CVPR 2025

KVQ: Boosting Video Quality Assessment via Saliency-guided Local Perception CVPR 2025

How Do Optical Flow and Textual Prompts Collaborate to Assist in Audio-Visual Semantic Segmentation? ICCV 2025

Enhancing Partially Relevant Video Retrieval with Hyperbolic Learning ICCV 2025

LVAgent: Long Video Understanding by Multi-Round Dynamical Collaboration of MLLM Agents ICCV 2025

SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree ICCV 2025

HERO: Human Reaction Generation from Videos ICCV 2025

VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos CVPR 2025

VisTRA: Visual Tool-use Reasoning Analyzer for Small Object Visual Question Answering ACL 2025

Punching Bag vs. Punching Person: Motion Transferability in Videos ICCV 2025

TemCoCo: Temporally Consistent Multi-modal Video Fusion with Visual-Semantic Collaboration ICCV 2025

VCA: Video Curious Agent for Long Video Understanding ICCV 2025

CoST: Efficient Collaborative Perception From Unified Spatiotemporal Perspective ICCV 2025

InteractionMap: Improving Online Vectorized HDMap Construction with Interaction CVPR 2025

Temporal Alignment-Free Video Matching for Few-shot Action Recognition CVPR 2025

ViSpeak: Visual Instruction Feedback in Streaming Videos ICCV 2025

Q-Bench-Video: Benchmark the Video Quality Understanding of LMMs CVPR 2025

Exploring Contextual Attribute Density in Referring Expression Counting CVPR 2025

Learning Beyond Still Frames: Scaling Vision-Language Models with Video ICCV 2025

LLaFEA: Frame-Event Complementary Fusion for Fine-Grained Spatiotemporal Understanding in LMMs ICCV 2025

RoMo: Robust Motion Segmentation Improves Structure from Motion ICCV 2025

Online Video Understanding: OVBench and VideoChat-Online CVPR 2025

Multi-Modal Few-Shot Temporal Action Segmentation ICCV 2025

CoTracker3: Simpler and Better Point Tracking by Pseudo-Labelling Real Videos ICCV 2025