Research Explorer
Scene Understanding
1887 directly classified papers
Papers per year
2006: 14
2007: 12
2008: 12
2009: 20
2010: 14
2011: 13
2012: 13
2013: 108
2014: 43
2015: 83
2016: 42
2017: 61
2018: 58
2019: 138
2020: 128
2021: 197
2022: 132
2023: 222
2024: 243
2025: 287
2026: 47
Papers
Vision-Language Models Struggle to Align Entities across Modalities
ACL 2025
Semantically Conditioned Prompts for Visual Recognition under Missing Modality Scenarios
WACV 2025
Thermal Polarimetric Multi-view Stereo
ICCV 2025
Learning 3D Object Spatial Relationships from Pre-trained 2D Diffusion Models
ICCV 2025
Physics Context Builders: A Modular Framework for Physical Reasoning in Vision-Language Models
ICCV 2025
Planar Affine Rectification from Local Change of Scale and Orientation
ICCV 2025
A Hyperdimensional One Place Signature to Represent Them All: Stackable Descriptors For Visual Place Recognition
ICCV 2025
HUSH: Holistic Panoramic 3D Scene Understanding using Spherical Harmonics
CVPR 2025
Diorama: Unleashing Zero-shot Single-view 3D Indoor Scene Modeling
ICCV 2025
End-to-End Entity-Predicate Association Reasoning for Dynamic Scene Graph Generation
ICCV 2025
SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition
ICCV 2025
UAVScenes: A Multi-Modal Dataset for UAVs
ICCV 2025
Leveraging Panoptic Scene Graph for Evaluating Fine-Grained Text-to-Image Generation
ICCV 2025
Articulate3D: Holistic Understanding of 3D Scenes as Universal Scene Description
ICCV 2025
Auto-Controlled Image Perception in MLLMs via Visual Perception Tokens
ICCV 2025
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering
ACL 2025
Scene Coordinate Reconstruction Priors
ICCV 2025
Do It Yourself: Learning Semantic Correspondence from Pseudo-Labels
ICCV 2025
FastVLM: Self-Speculative Decoding for Fast Vision-Language Model Inference
IJCNLP 2025
INTERCHART: Benchmarking Visual Reasoning Across Decomposed and Distributed Chart Information
IJCNLP 2025
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
CVPR 2025
The Confidence Paradox: Can LLM Know When It’s Wrong?
IJCNLP 2025
FROSS: Faster-Than-Real-Time Online 3D Semantic Scene Graph Generation from RGB-D Images
ICCV 2025
MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling
CVPR 2025
RayZer: A Self-supervised Large View Synthesis Model
ICCV 2025