Research Explorer
Video Understanding
1592 directly classified papers
Papers per year
2006: 1
2012: 1
2013: 30
2014: 15
2015: 38
2016: 22
2017: 39
2018: 49
2019: 91
2020: 115
2021: 207
2022: 160
2023: 254
2024: 216
2025: 297
2026: 57
Papers
No More Shortcuts: Realizing the Potential of Temporal Self-Supervision
AAAI 2024
CausalChaos! Dataset for Comprehensive Causal Action Question Answering Over Longer Causal Chains Grounded in Dynamic Visual Scenes
NIPS 2024
Fewer Steps, Better Performance: Efficient Cross-Modal Clip Trimming for Video Moment Retrieval Using Language
AAAI 2024
MECD: Unlocking Multi-Event Causal Discovery in Video Reasoning
NIPS 2024
CityPulse: Fine-Grained Assessment of Urban Change with Street View Time Series
AAAI 2024
Video-Text Prompting for Weakly Supervised Spatio-Temporal Video Grounding
EMNLP 2024
Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval
CVPR 2024
Collaborative Weakly Supervised Video Correlation Learning for Procedure-Aware Instructional Video Analysis
AAAI 2024
DTF-AT: Decoupled Time-Frequency Audio Transformer for Event Classification
AAAI 2024
VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding
NIPS 2024
A Simple LLM Framework for Long-Range Video Question-Answering
EMNLP 2024
Lighthouse: A User-Friendly Library for Reproducible Video Moment Retrieval and Highlight Detection
EMNLP 2024
GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval
AAAI 2024
MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos
CVPR 2024
IndustReal: A Dataset for Procedure Step Recognition Handling Execution Errors in Egocentric Videos in an Industrial-Like Setting
WACV 2024
Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation
AAAI 2024
Context Enhanced Transformer for Single Image Object Detection in Video Data
AAAI 2024
Understanding Video Transformers via Universal Concept Discovery
CVPR 2024
LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding
NIPS 2024
OnlineTAS: An Online Baseline for Temporal Action Segmentation
NIPS 2024
Spatio-Temporal Pixel-Level Contrastive Learning-Based Source-Free Domain Adaptation for Video Semantic Segmentation
CVPR 2023
Unbiased Scene Graph Generation in Videos
CVPR 2023
VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning
AAAI 2023
Source-Free Video Domain Adaptation With Spatial-Temporal-Historical Consistency Learning
CVPR 2023
SkateboardAI: The Coolest Video Action Recognition for Skateboarding (Student Abstract)
AAAI 2023