Video Understanding

1592 directly classified papers

Papers

A Local-to-Global Approach to Multi-Modal Movie Scene Segmentation CVPR 2020

Temporally Grounding Language Queries in Videos by Contextual Boundary-Aware Prediction AAAI 2020

End-To-End Trainable Video Super-Resolution Based on a New Mechanism for Implicit Motion Estimation and Compensation WACV 2020

G-TAD: Sub-Graph Localization for Temporal Action Detection CVPR 2020

Active Speakers in Context CVPR 2020

MAST: A Memory-Augmented Self-Supervised Tracker CVPR 2020

Weakly-Supervised Video Re-Localization with Multiscale Attention Model AAAI 2020

Context Modulated Dynamic Networks for Actor and Action Video Segmentation with Language Queries AAAI 2020

Spatial-Temporal Multi-Cue Network for Continuous Sign Language Recognition AAAI 2020

A Transformer-Based Audio Captioning Model with Keyword Estimation INTERSPEECH 2020

SpeedNet: Learning the Speediness in Videos CVPR 2020

Cycle-Contrast for Self-Supervised Video Representation Learning NeurIPS 2020

Hierarchical Conditional Relation Networks for Video Question Answering CVPR 2020

Discourse-Aware Neural Extractive Text Summarization ACL 2020

Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors NeurIPS 2020

A Spherical Convolution Approach for Learning Long Term Viewport Prediction in 360 Immersive Video AAAI 2020

Efficient Video Semantic Segmentation with Labels Propagation and Refinement WACV 2020

Co-Saliency Spatio-Temporal Interaction Network for Person Re-Identification in Videos IJCAI 2020

Person Tube Retrieval via Language Description AAAI 2020

Temporally Distributed Networks for Fast Video Semantic Segmentation CVPR 2020

Detecting the Starting Frame of Actions in Video WACV 2020

Counting Out Time: Class Agnostic Video Repetition Counting in the Wild CVPR 2020

Motion-Attentive Transition for Zero-Shot Video Object Segmentation AAAI 2020

A Visually-grounded First-person Dialogue Dataset with Verbal and Non-verbal Responses EMNLP 2020

Multimodal Neural Graph Memory Networks for Visual Question Answering ACL 2020