Video Understanding

1592 directly classified papers

Papers

SWEM: Towards Real-Time Video Object Segmentation With Sequential Weighted Expectation-Maximization CVPR 2022

An Empirical Study of End-to-End Temporal Action Detection CVPR 2022

Semi-Weakly-Supervised Learning of Complex Actions From Instructional Task Videos CVPR 2022

Explore Spatio-Temporal Aggregation for Insubstantial Object Detection: Benchmark Dataset and Baseline CVPR 2022

Revisiting Temporal Alignment for Video Restoration CVPR 2022

Progressive Attention on Multi-Level Dense Difference Maps for Generic Event Boundary Detection CVPR 2022

SpeechFormer: A Hierarchical Efficient Framework Incorporating the Characteristics of Speech INTERSPEECH 2022

How to Listen? Rethinking Visual Sound Localization INTERSPEECH 2022

Siamese Network with Interactive Transformer for Video Object Segmentation AAAI 2022

Memory-Guided Semantic Learning Network for Temporal Sentence Grounding AAAI 2022

Exploring Motion and Appearance Information for Temporal Sentence Grounding AAAI 2022

Temporal Action Proposal Generation with Background Constraint AAAI 2022

Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning AAAI 2022

Contrastive Spatio-Temporal Pretext Learning for Self-Supervised Video Representation AAAI 2022

D-vlog: Multimodal Vlog Dataset for Depression Detection AAAI 2022

Masking Modalities for Cross-Modal Video Retrieval WACV 2022

Hybrid Instance-Aware Temporal Fusion for Online Video Instance Segmentation AAAI 2022

Unsupervised Temporal Video Grounding with Deep Semantic Clustering AAAI 2022

SepFusion: Finding Optimal Fusion Structures for Visual Sound Separation AAAI 2022

C3D and Localization Model for Locating and Recognizing the Actions from Untrimmed Videos (Student Abstract) AAAI 2022

Building Goal-Oriented Dialogue Systems with Situated Visual Context AAAI 2022

FIBER: Fill-in-the-Blanks as a Challenging Video Understanding Evaluation Framework ACL 2022

Can Visual Dialogue Models Do Scorekeeping? Exploring How Dialogue Representations Incrementally Encode Shared Knowledge ACL 2022

M-SENA: An Integrated Platform for Multimodal Sentiment Analysis ACL 2022

Prior Knowledge and Memory Enriched Transformer for Sign Language Translation ACL 2022