Stephen Gould

79 papers · 2008–2025 · 11 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🗺️ Taxonomy Completionist (11) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (11)

🌟 Keyword Trendsetter Combo (7) 🏠 Conference Loyalist (28) 🤝 Dynamic Duo (12) 🔬 Deep Specialist (14) 👑 Triple Crown 🏆 Keyword Champion (2) 🔥 Unstoppable (14) ❓ The Questioner (3) 🗃️ Keyword Collector (333) 📈 Trend Setter 💎 Century Club (79) 🚀 Conference Pioneer ⚡ Prolific Year (5)

Conferences

CVPR (28) ICCV (11) NIPS (10) WACV (10) ECCV (5) ICML (5) ICLR (4) EMNLP (2) JMLR (2) ACL (1) IJCAI (1)

Top co-authors

Dylan Campbell (12) Yizhak Ben-Shabat (11) Liang Zheng (10) Basura Fernando (8) Weijian Deng (8) Anoop Cherian (7) Yicong Hong (7) Cristian Rodriguez (7) Chamin Hewa Koneputugodage (6) Ming Xu (6)

Keywords

object detection (7) point cloud (7) action recognition (6) convolutional neural network (6) surface reconstruction (6) video understanding (5) representation learning (4) 3d vision (4) semantic segmentation (4) multimodal learning (4) implicit neural representation (4) neural network (4) vision-language navigation (4) contrastive learning (3) feature learning (3) 3d reconstruction (3) message passing (3) multi-modal learning (3) image captioning (3) riemannian optimization (3)

Papers

Pos3R: 6D Pose Estimation for Unseen Objects Made Easy · CVPR 2025
VI^3NR: Variance Informed Initialization for Implicit Neural Representations · CVPR 2025
Temporally Grounding Instructional Diagrams in Unconstrained Videos · WACV 2025
Leaps and Bounds: An Improved Point Cloud Winding Number Formulation for Fast Normal Estimation and Surface Reconstruction · ICCV 2025
Can We Predict Performance of Large Models across Vision-Language Tasks? · ICML 2025
Manual-PA: Learning 3D Part Assembly from Instruction Diagrams · ICCV 2025
Bi-Directional Training for Composed Image Retrieval via Text Prompt Learning · WACV 2024
An Empirical Study Into What Matters for Calibrating Vision-Language Models · ICML 2024
Small Steps and Level Sets: Fitting Neural Surface Models with Point Guidance · CVPR 2024
Unsupervised Dense Prediction using Differentiable Normalized Cuts · ECCV 2024
NeRFEditor: Differentiable Style Decomposition for 3D Scene Editing · WACV 2024
Guiding Neural Collapse: Optimising Towards the Nearest Simplex Equiangular Tight Frame · NIPS 2024
Neural Experts: Mixture of Experts for Implicit Neural Representations · NIPS 2024
LipAT: Beyond Style Transfer for Controllable Neural Simulation of Lipstick Using Cosmetic Attributes · WACV 2024
Towards Optimal Feature-Shaping Methods for Out-of-Distribution Detection · ICLR 2024
The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models? · ECCV 2024
IKEA Ego 3D Dataset: Understanding Furniture Assembly Actions From Ego-View 3D Point Clouds · WACV 2024
Ray Deformation Networks for Novel View Synthesis of Refractive Objects · WACV 2024
3DInAction: Understanding Human Actions in 3D Point Clouds · CVPR 2024
Learning to Select Views for Efficient Multi-View Understanding · CVPR 2024
Temporally Consistent Unbalanced Optimal Transport for Unsupervised Action Segmentation · CVPR 2024
Differentiable Neural Surface Refinement for Modeling Transparent Objects · CVPR 2024
Exploring Predicate Visual Context in Detecting of Human-Object Interactions · ICCV 2023
Aligning Step-by-Step Instructional Diagrams to Video Demonstrations · CVPR 2023
High-Fidelity Guided Image Synthesis With Latent Diffusion Models · CVPR 2023
Octree Guided Unoriented Surface Reconstruction · CVPR 2023
Revisiting Implicit Differentiation for Learning Problems in Optimal Control · NIPS 2023
Semi-Supervised Semantic Segmentation under Label Noise via Diverse Learning Groups · ICCV 2023
Scaling Data Generation in Vision-and-Language Navigation · ICCV 2023
Learning Navigational Visual Representations with Semantic Map Supervision · ICCV 2023
Confidence and Dispersity Speak: Characterizing Prediction Matrix for Unsupervised Accuracy Estimation · ICML 2023
Deep Declarative Dynamic Time Warping for End-to-End Learning of Alignment Paths · ICLR 2023
Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation · CVPR 2022
Efficient Two-Stage Detection of Human-Object Interactions With a Novel Unary-Pairwise Transformer · CVPR 2022
On the Strong Correlation Between Model Invariance and Generalization · NIPS 2022
DiGS: Divergence Guided Shape Implicit Neural Representation for Unoriented Point Clouds · CVPR 2022
Image Retrieval on Real-Life Images With Pre-Trained Vision-and-Language Models · ICCV 2021
Rethinking conditional GAN training: An approach using geometrically structured latent manifolds · NIPS 2021
Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking · CVPR 2021
VLN BERT: A Recurrent Vision-and-Language BERT for Navigation · CVPR 2021
Spatially Conditioned Graphs for Detecting Human-Object Interactions · ICCV 2021
Contextually Plausible and Diverse 3D Human Motion Prediction · ICCV 2021
Conditional Generative Modeling via Learning the Latent Space · ICLR 2021
What Does Rotation Prediction Tell Us about Classifier Accuracy under Varying Testing Environments? · ICML 2021
DORi: Discovering Object Relationships for Moment Localization of a Natural Language Query in a Video · WACV 2021
The IKEA ASM Dataset: Understanding People Assembling Furniture Through Actions, Objects and Pose · WACV 2021
DeepFit: 3D Surface Fitting via Neural Network Weighted Least Squares · ECCV 2020
Multiview Detection with Feature Perspective Transformation · ECCV 2020
Solving the Blind Perspective-n-Point Problem End-To-End With Robust Differentiable Geometric Optimization · ECCV 2020
Proposal-free Temporal Moment Localization of a Natural-Language Query in Video using Guided Attention · WACV 2020
Blended Convolution and Synthesis for Efficient Discrimination of 3D Shapes · WACV 2020
A Stochastic Conditioning Scheme for Diverse Human Motion Prediction · CVPR 2020
A Signal Propagation Perspective for Pruning Neural Networks at Initialization · ICLR 2020
Language and Visual Entity Relationship Graph for Agent Navigation · NIPS 2020
Sub-Instruction Aware Vision-and-Language Navigation · EMNLP 2020
Learning to Structure an Image With Few Colors · CVPR 2020
A Multi-modal Approach to Fine-grained Opinion Mining on Video Reviews · ACL 2020
The Alignment of the Spheres: Globally-Optimal Spherical Mixture Alignment for Camera Pose Estimation · CVPR 2019
Learning to Find Common Objects Across Few Image Collections · ICCV 2019
Partially-Supervised Image Captioning · NIPS 2018
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering · CVPR 2018
Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments · CVPR 2018
Non-Linear Temporal Subspace Representations for Activity Recognition · CVPR 2018
Video Representation Learning Using Discriminative Pooling · CVPR 2018
Self-Supervised Video Representation Learning With Odd-One-Out Networks · CVPR 2017
Guided Open Vocabulary Image Captioning with Constrained Beam Search · EMNLP 2017
Generalized Rank Pooling for Activity Recognition · CVPR 2017
DeepPermNet: Visual Permutation Learning · CVPR 2017
Dynamic Image Networks for Action Recognition · CVPR 2016
Learning End-to-end Video Classification with Rank-Pooling · ICML 2016
Discriminative Hierarchical Rank Pooling for Activity Recognition · CVPR 2016
Hierarchical Higher-Order Regression Forest Fields: An Application to 3D Indoor Scene Labelling · ICCV 2015
An Exemplar-based CRF for Multi-instance Object Segmentation · CVPR 2014
Efficient Extraction and Representation of Spatial Information from Video Data · IJCAI 2013
DARWIN: A Framework for Machine Learning and Computer Vision Research and Development · JMLR 2012
Region-based Segmentation and Object Detection · NIPS 2009
Cascaded Classification Models: Combining Models for Holistic Scene Understanding · NIPS 2008
Learning Bounded Treewidth Bayesian Networks · JMLR 2008
Learning Bounded Treewidth Bayesian Networks · NIPS 2008