Research Explorer
Scene Understanding
1887 directly classified papers
Papers per year
2006: 14
2007: 12
2008: 12
2009: 20
2010: 14
2011: 13
2012: 13
2013: 108
2014: 43
2015: 83
2016: 42
2017: 61
2018: 58
2019: 138
2020: 128
2021: 197
2022: 132
2023: 222
2024: 243
2025: 287
2026: 47
Papers
Amodal Scene Analysis via Holistic Occlusion Relation Inference and Generative Mask Completion
AAAI 2024
Large Spatial Model: End-to-end Unposed Images to Semantic 3D
NIPS 2024
On the Estimation of Image-matching Uncertainty in Visual Place Recognition
CVPR 2024
Video Discourse Parsing and Its Application to Multimodal Summarization: A Dataset and Baseline Approaches
EMNLP 2024
Dynamic Spiking Graph Neural Networks
AAAI 2024
Selective Vision is the Challenge for Visual Reasoning: A Benchmark for Visual Argument Understanding
EMNLP 2024
Benchmarking Vision Language Models for Cultural Understanding
EMNLP 2024
SphereCraft: A Dataset for Spherical Keypoint Detection, Matching and Camera Pose Estimation
WACV 2024
An Empirical Analysis on Spatial Reasoning Capabilities of Large Multimodal Models
EMNLP 2024
RSMPNet: Relationship Guided Semantic Map Prediction
WACV 2024
F3Loc: Fusion and Filtering for Floorplan Localization
CVPR 2024
Analyzing the Domain Shift Immunity of Deep Homography Estimation
WACV 2024
Beyond RGB: A Real World Dataset for Multispectral Imaging in Mobile Devices
WACV 2024
Sparse Convolutional Networks for Surface Reconstruction From Noisy Point Clouds
WACV 2024
Plot Twist: Multimodal Models Don’t Comprehend Simple Chart Details
EMNLP 2024
What if...?: Thinking Counterfactual Keywords Helps to Mitigate Hallucination in Large Multi-modal Models
EMNLP 2024
HIG: Hierarchical Interlacement Graph Approach to Scene Graph Generation in Video Understanding
CVPR 2024
VideoGrounding-DINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding
CVPR 2024
DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction
CVPR 2024
Single Domain Generalization for Crowd Counting
CVPR 2024
Effective Video Mirror Detection with Inconsistent Motion Cues
CVPR 2024
BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping
WACV 2024
Sound3DVDet: 3D Sound Source Detection Using Multiview Microphone Array and RGB Images
WACV 2024
SceneDiff: Generative Scene-Level Image Retrieval with Text and Sketch Using Diffusion Models
IJCAI 2024
Semi-Supervised Scene Change Detection by Distillation From Feature-Metric Alignment
WACV 2024