Video Understanding

1592 directly classified papers

Papers

Towards Long-Form Video Understanding CVPR 2021

Temporal Query Networks for Fine-Grained Video Understanding CVPR 2021

Hierarchical Video Prediction Using Relational Layouts for Human-Object Interactions CVPR 2021

GATSBI: Generative Agent-Centric Spatio-Temporal Object Interaction CVPR 2021

From Semantic Categories to Fixations: A Novel Weakly-Supervised Visual-Auditory Saliency Detection Approach CVPR 2021

Learning To Segment Actions From Visual and Language Instructions via Differentiable Weak Sequence Alignment CVPR 2021

Over-the-Air Adversarial Flickering Attacks Against Video Recognition Networks CVPR 2021

The DKU-Duke-Lenovo System Description for the Fearless Steps Challenge Phase III INTERSPEECH 2021

Online Credit Payment Fraud Detection via Structure-Aware Hierarchical Recurrent Neural Network IJCAI 2021

Weakly-Supervised Spatio-Temporal Anomaly Detection in Surveillance Video IJCAI 2021

DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization NAACL 2021

Is Space-Time Attention All You Need for Video Understanding? ICML 2021

Scalable Certified Segmentation via Randomized Smoothing ICML 2021

No Frame Left Behind: Full Video Action Recognition CVPR 2021

Phonovisual Biases in Language: is the Lexicon Tied to the Visual World? IJCAI 2021

Space-Time Crop & Attend: Improving Cross-Modal Video Representation Learning ICCV 2021

A Hybrid Video Anomaly Detection Framework via Memory-Augmented Flow Reconstruction and Flow-Guided Frame Prediction ICCV 2021

Multi-Modal Fusion Transformer for End-to-End Autonomous Driving CVPR 2021

Three Birds with One Stone: Multi-Task Temporal Action Detection via Recycling Temporal Annotations CVPR 2021

Traffic Flow Prediction with Vehicle Trajectories AAAI 2021

Cascaded Prediction Network via Segment Tree for Temporal Video Grounding CVPR 2021

Weakly Supervised Dense Video Captioning via Jointly Usage of Knowledge Distillation and Cross-modal Matching IJCAI 2021

TransformerFusion: Monocular RGB Scene Reconstruction using Transformers NeurIPS 2021

Dig into Multi-modal Cues for Video Retrieval with Hierarchical Alignment IJCAI 2021

Hierarchical Object-oriented Spatio-Temporal Reasoning for Video Question Answering IJCAI 2021