Scene Understanding

1887 directly classified papers

Papers

Conditional 360-degree Image Synthesis for Immersive Indoor Scene Decoration ICCV 2023

HiLo: Exploiting High Low Frequency Relations for Unbiased Panoptic Scene Graph Generation ICCV 2023

Does Visual Pretraining Help End-to-End Reasoning? NIPS 2023

Improving Transformer-based Image Matching by Cascaded Capturing Spatially Informative Keypoints ICCV 2023

RealGraph: A Multiview Dataset for 4D Real-world Context Graph Generation ICCV 2023

TextPSG: Panoptic Scene Graph Generation from Textual Descriptions ICCV 2023

Learning Spatial-context-aware Global Visual Feature Representation for Instance Image Retrieval ICCV 2023

Scene-Aware Feature Matching ICCV 2023

LICO: Explainable Models with Language-Image COnsistency NIPS 2023

Scratching Visual Transformer's Back with Uniform Attention ICCV 2023

Estimating Generic 3D Room Structures from 2D Annotations NIPS 2023

CVSformer: Cross-View Synthesis Transformer for Semantic Scene Completion ICCV 2023

Learning Long-Range Information with Dual-Scale Transformers for Indoor Scene Completion ICCV 2023

Separating Partially-Polarized Diffuse and Specular Reflection Components Under Unpolarized Light Sources WACV 2023

Human-centric Scene Understanding for 3D Large-scale Scenarios ICCV 2023

Emergent Correspondence from Image Diffusion NIPS 2023

Language Guided Visual Question Answering: Elevate Your Multimodal Language Model Using Knowledge-Enriched Prompts EMNLP 2023

Puzzlefusion: Unleashing the Power of Diffusion Models for Spatial Puzzle Solving NIPS 2023

Scene Graph Enhanced Pseudo-Labeling for Referring Expression Comprehension EMNLP 2023

ROME: Evaluating Pre-trained Vision-Language Models on Reasoning beyond Visual Common Sense EMNLP 2023

LayoutDIT: Layout-Aware End-to-End Document Image Translation with Multi-Step Conductive Decoder EMNLP 2023

M2C: Towards Automatic Multimodal Manga Complement EMNLP 2023

Unifying Text, Tables, and Images for Multimodal Question Answering EMNLP 2023

Query-based Image Captioning from Multi-context 360-degree Images EMNLP 2023

Referring Image Segmentation via Joint Mask Contextual Embedding Learning and Progressive Alignment Network EMNLP 2023