Research Explorer
Scene Understanding
1887 directly classified papers
Papers per year
2006: 14
2007: 12
2008: 12
2009: 20
2010: 14
2011: 13
2012: 13
2013: 108
2014: 43
2015: 83
2016: 42
2017: 61
2018: 58
2019: 138
2020: 128
2021: 197
2022: 132
2023: 222
2024: 243
2025: 287
2026: 47
Papers
HiLo: Exploiting High Low Frequency Relations for Unbiased Panoptic Scene Graph Generation
ICCV 2023
JaSPICE: Automatic Evaluation Metric Using Predicate-Argument Structures for Image Captioning Models
EMNLP 2023
Unifying Text, Tables, and Images for Multimodal Question Answering
EMNLP 2023
Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs
EMNLP 2023
3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding
EMNLP 2023
Query-based Image Captioning from Multi-context 360° Images
EMNLP 2023
Referring Image Segmentation via Joint Mask Contextual Embedding Learning and Progressive Alignment Network
EMNLP 2023
Weakly-Supervised Learning of Visual Relations in Multimodal Pretraining
EMNLP 2023
DepWiGNN: A Depth-wise Graph Neural Network for Multi-hop Spatial Reasoning in Text
EMNLP 2023
Hallucination Detection for Grounded Instruction Generation
EMNLP 2023
ARO-Net: Learning Implicit Fields From Anchored Radial Observations
CVPR 2023
A Dual Semantic-Aware Recurrent Global-Adaptive Network for Vision-and-Language Navigation
IJCAI 2023
RealGraph: A Multiview Dataset for 4D Real-world Context Graph Generation
ICCV 2023
Context-Aware Entity Grounding with Open-Vocabulary 3D Scene Graphs
CoRL 2023
Measuring Faithful and Plausible Visual Grounding in VQA
EMNLP 2023
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
ICML 2023
UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model
EMNLP 2023
Learning Spatial-context-aware Global Visual Feature Representation for Instance Image Retrieval
ICCV 2023
Scene-Aware Feature Matching
ICCV 2023
MaXM: Towards Multilingual Visual Question Answering
EMNLP 2023
A Critical View of Vision-Based Long-Term Dynamics Prediction Under Environment Misalignment
ICML 2023
TextPSG: Panoptic Scene Graph Generation from Textual Descriptions
ICCV 2023
Scratching Visual Transformer's Back with Uniform Attention
ICCV 2023
Unbiased Heterogeneous Scene Graph Generation with Relation-Aware Message Passing Neural Network
AAAI 2023
Scalable Theory-Driven Regularization of Scene Graph Generation Models
AAAI 2023