Video Understanding

1098 directly classified papers

Papers

UMT: Unified Multi-Modal Transformers for Joint Video Moment Retrieval and Highlight Detection CVPR 2022

Set-Supervised Action Learning in Procedural Task Videos via Pairwise Order Consistency CVPR 2022

Ego4D: Around the World in 3,000 Hours of Egocentric Video CVPR 2022

JRDB-Act: A Large-Scale Dataset for Spatio-Temporal Action, Social Group and Activity Detection CVPR 2022

TAP-Vid: A Benchmark for Tracking Any Point in a Video NeurIPS 2022

Are all Frames Equal? Active Sparse Labeling for Video Action Detection NeurIPS 2022

PointTAD: Multi-Label Temporal Action Detection with Learnable Query Points NeurIPS 2022

Simple Unsupervised Object-Centric Learning for Complex and Naturalistic Videos NeurIPS 2022

How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios NeurIPS 2022

VITA: Video Instance Segmentation via Object Token Association NeurIPS 2022

Segmenting Moving Objects via an Object-Centric Layered Representation NeurIPS 2022

SAVi++: Towards End-to-End Object-Centric Learning from Real-World Videos NeurIPS 2022

Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding NeurIPS 2022

Look More but Care Less in Video Recognition NeurIPS 2022

Multi-modal Grouping Network for Weakly-Supervised Audio-Visual Video Parsing NeurIPS 2022

Masked Autoencoders As Spatiotemporal Learners NeurIPS 2022

Rethinking Resolution in the Context of Efficient Video Recognition NeurIPS 2022

Enabling Detailed Action Recognition Evaluation Through Video Dataset Augmentation NeurIPS 2022

Dynamic Multistep Reasoning based on Video Scene Graph for Video Question Answering NAACL 2022

PACE: Predictive and Contrastive Embedding for Unsupervised Action Segmentation IJCAI 2022

Scene Consistency Representation Learning for Video Scene Segmentation CVPR 2022

MAD: A Scalable Dataset for Language Grounding in Videos From Movie Audio Descriptions CVPR 2022

Weakly Supervised Temporal Action Localization via Representative Snippet Knowledge Propagation CVPR 2022

Improving Video Model Transfer With Dynamic Representation Learning CVPR 2022

Cannot See the Forest for the Trees: Aggregating Multiple Viewpoints To Better Classify Objects in Videos CVPR 2022