Scene Understanding

1887 directly classified papers

Papers

HomoMatcher: Achieving Dense Feature Matching with Semi-Dense Efficiency by Homography Estimation AAAI 2025

Semantic Segmentation on Raindrop Degraded Images Using Two-Stage Dual Teacher-Student Learning AAAI 2025

Textured Mesh Saliency: Bridging Geometry and Texture for Human Perception in 3D Graphics AAAI 2025

Bootstraping Clustering of Gaussians for View-consistent 3D Scene Understanding AAAI 2025

UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios AAAI 2025

Out of Length Text Recognition with Sub-String Matching AAAI 2025

Seeing Culture: A Benchmark for Visual Reasoning and Grounding EMNLP 2025

DiscoSG: Towards Discourse-Level Text Scene Graph Parsing through Iterative Graph Refinement EMNLP 2025

CLIP is Almost All You Need: Towards Parameter-Efficient Scene Text Retrieval without OCR CVPR 2025

Multi-Dimensional Hyena for Spatial Inductive Bias AISTATS 2024

Beyond Accuracy: Ensuring Correct Predictions With Correct Rationales NIPS 2024

Iteratively Refined Early Interaction Alignment for Subgraph Matching based Graph Retrieval NIPS 2024

CYCLO: Cyclic Graph Transformer Approach to Multi-Object Relationship Modeling in Aerial Videos NIPS 2024

II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models NIPS 2024

OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding NIPS 2024

Map It Anywhere: Empowering BEV Map Prediction using Large-scale Public Datasets NIPS 2024

UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling NIPS 2024

Decomposing and Interpreting Image Representations via Text in ViTs Beyond CLIP NIPS 2024

Understanding Bias in Large-Scale Visual Datasets NIPS 2024

Unsupervised Homography Estimation on Multimodal Image Pair via Alternating Optimization NIPS 2024

ETO: Efficient Transformer-based Local Feature Matching by Organizing Multiple Homography Hypotheses NIPS 2024

Harmonizing Stochasticity and Determinism: Scene-responsive Diverse Human Motion Prediction NIPS 2024

SceneCraft: Layout-Guided 3D Scene Generation NIPS 2024

When does perceptual alignment benefit vision representations? NIPS 2024

A General Protocol to Probe Large Vision Models for 3D Physical Understanding NIPS 2024