Video Understanding

1592 directly classified papers

Papers

Local-Global Video-Text Interactions for Temporal Grounding CVPR 2020

Probabilistic Video Prediction From Noisy Data With a Posterior Confidence CVPR 2020

End-to-End Learning of Visual Representations From Uncurated Instructional Videos CVPR 2020

Caption Alignment for Low Resource Audio-Visual Data INTERSPEECH 2020

Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning EMNLP 2020

Screencast Tutorial Video Understanding CVPR 2020

Classifying, Segmenting, and Tracking Object Instances in Video with Mask Propagation CVPR 2020

Architecture Search of Dynamic Cells for Semantic Video Segmentation WACV 2020

Counterfactual Contrastive Learning for Weakly-Supervised Vision-Language Grounding NeurIPS 2020

A Benchmark for Structured Procedural Knowledge Extraction from Cooking Videos EMNLP 2020

Rethinking Temporal Fusion for Video-Based Person Re-Identification on Semantic and Time Aspect AAAI 2020

An End-to-End Visual-Audio Attention Network for Emotion Recognition in User-Generated Videos AAAI 2020

Weakly-Supervised Video Moment Retrieval via Semantic Completion Network AAAI 2020

Learning to Segment Actions from Observation and Narration ACL 2020

Temporal Aggregation with Clip-level Attention for Video-based Person Re-identification WACV 2020

Multi-Speaker Video Dialog with Frame-Level Temporal Localization AAAI 2020

RPM-Net: Robust Pixel-Level Matching Networks for Self-Supervised Video Object Segmentation WACV 2020

Simultaneous Machine Translation with Visual Context EMNLP 2020

Fantastic Answers and Where to Find Them: Immersive Question-Directed Visual Attention CVPR 2020

The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction CVPR 2020

Visuo-Linguistic Question Answering (VLQA) Challenge EMNLP 2020

Natural Language Rationales with Full-Stack Visual Reasoning: From Pixels to Semantic Frames to Commonsense Graphs EMNLP 2020

Object Relational Graph With Teacher-Recommended Learning for Video Captioning CVPR 2020

MIMAMO Net: Integrating Micro- and Macro-Motion for Video Emotion Recognition AAAI 2020

Self-Supervised 3D Keypoint Learning for Ego-Motion Estimation CoRL 2020