Junsong Yuan
99 papers · 2012–2026 · 9 top CS/AI conferences
Achievements
Keyword Pioneer
Interdisciplinary Bridge
Conference Polyglot
(9)
Academic Marathon
(14)
Taxonomy Completionist
(11)
Conference Loyalist
(33)
Deep Specialist
(18)
Topic Evolution
Keyword Champion
Mega-Team
(35)
Keyword Collector
(369)
Prolific Year
(11)
Conference Pioneer
Trend Setter
Century Club
(97)
Unstoppable
(15)
The Questioner
Conferences
CVPR (33)
ICCV (24)
ECCV (21)
AAAI (9)
IJCAI (4)
WACV (4)
NIPS (2)
ACML (1)
ICLR (1)
Keywords
hand pose estimation
(10)
action recognition
(9)
video understanding
(8)
point cloud
(6)
domain adaptation
(6)
object detection
(5)
semantic segmentation
(5)
weakly supervised learning
(5)
3d hand pose estimation
(5)
convolutional neural network
(5)
3d reconstruction
(5)
3d vision
(4)
depth image
(4)
3d pose estimation
(4)
data augmentation
(3)
zero-shot learning
(3)
pedestrian detection
(3)
semi-supervised learning
(3)
human pose estimation
(3)
synthetic data
(3)
Papers
Textured Geometry Evaluation: Perceptual 3D Textured Shape Metric via 3D Latent-Geometry Network
AAAI 2026
Chain-of-Look Spatial Reasoning for Dense Surgical Instrument Counting
WACV 2026
SRAM: Shape-Realism Alignment Metric for No Reference 3D Shape Evaluation
AAAI 2026
dFLMoE: Decentralized Federated Learning via Mixture of Experts for Medical Data Analysis
CVPR 2025
PathDiff: Histopathology Image Synthesis with Unpaired Text and Mask Conditions
ICCV 2025
Recognizing Actions from Robotic View for Natural Human-Robot Interaction
ICCV 2025
Text2Outfit: Controllable Outfit Generation with Multimodal Language Models
ICCV 2025
CompSlider: Compositional Slider for Disentangled Multiple-Attribute Image Generation
ICCV 2025
UST-SSM: Unified Spatio-Temporal State Space Models for Point Cloud Video Modeling
ICCV 2025
IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation
ECCV 2024
Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation
ECCV 2024
Spectrum AUC Difference (SAUCD): Human-aligned 3D Shape Evaluation
CVPR 2024
GRiT: A Generative Region-to-text Transformer for Object Understanding
ECCV 2024
Forecasting Future Videos from Novel Views via Disentangled 3D Scene Representation
ECCV 2024
FSC: Few-point Shape Completion
CVPR 2024
Show Your Face: Restoring Complete Facial Images From Partial Observations for VR Meeting
WACV 2024
Interaction-centric Spatio-Temporal Context Reasoning for Multi-Person Video HOI Recognition
ECCV 2024
Divide and Fuse: Body Part Mesh Recovery from Partially Visible Human Images
ECCV 2024
Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation
NIPS 2024
High Fidelity 3D Hand Shape Reconstruction via Scalable Graph Frequency Decomposition
CVPR 2023
Progressive Multi-View Human Mesh Recovery with Self-Supervision
AAAI 2023
Neural Voting Field for Camera-Space 3D Hand Pose Estimation
CVPR 2023
3D-Aware Facial Landmark Detection via Multi-View Consistent Training on Synthetic Data
CVPR 2023
Towards Generic Image Manipulation Detection with Weakly-Supervised Self-Consistency Learning
ICCV 2023
SOAR: Scene-debiasing Open-set Action Recognition
ICCV 2023
Open Set Video HOI detection from Action-Centric Chain-of-Look Prompting
ICCV 2023
NeuRBF: A Neural Fields Representation with Adaptive Radial Basis Functions
ICCV 2023
Uncertainty-aware State Space Transformer for Egocentric 3D Hand Trajectory Forecasting
ICCV 2023
Self-Supervised Distilled Learning for Multi-Modal Misinformation Identification
WACV 2023
Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth
WACV 2023
AiATrack: Attention in Attention for Transformer Visual Tracking
ECCV 2022
PREF: Predictability Regularized Neural Motion Fields
ECCV 2022
Neural Correspondence Field for Object Pose Estimation
ECCV 2022
Efficient Video Instance Segmentation via Tracklet Query and Proposal
CVPR 2022
OVIS: Open-Vocabulary Visual Instance Search via Visual-Semantic Aligned Representation Learning
AAAI 2022
Learning Transferable Human-Object Interaction Detector With Natural Language Supervision
CVPR 2022
MixSTE: Seq2seq Mixed Spatio-Temporal Encoder for 3D Human Pose Estimation in Video
CVPR 2022
Stacked Homography Transformations for Multi-View Pedestrian Detection
ICCV 2021
Model-Based 3D Hand Reconstruction via Self-Supervised Learning
CVPR 2021
Track To Detect and Segment: An Online Multi-Object Tracker
CVPR 2021
Rethinking Soft Labels for Knowledge Distillation: A Bias–Variance Tradeoff Perspective
ICLR 2021
Robust Knowledge Transfer via Hybrid Forward on the Teacher-Student Model
AAAI 2021
ACSNet: Action-Context Separation Network for Weakly Supervised Temporal Action Localization
AAAI 2021
Weakly Supervised Temporal Action Localization Through Learning Explicit Subspaces for Action and Context
AAAI 2021
A Unified 3D Human Motion Synthesis Model via Conditional Variational Auto-Encoder
ICCV 2021
High Quality Disparity Remapping With Two-Stage Warping
ICCV 2021
Discovering Human Interactions With Large-Vocabulary Objects via Query and Multi-Scale Detection
ICCV 2021
Discovering Human Interactions With Novel Objects via Zero-Shot Learning
CVPR 2020
Two-Stream Consensus Network for Weakly-Supervised Temporal Action Localization
ECCV 2020
Learning Progressive Joint Propagation for Human Motion Prediction
ECCV 2020
Temporal Distinct Representation Learning for Action Recognition
ECCV 2020
Clustering Driven Deep Autoencoder for Video Anomaly Detection
ECCV 2020
Measuring Generalisation to Unseen Viewpoints, Articulations, Shapes and Objects for 3D Hand Pose Estimation under Hand-Object Interaction
ECCV 2020
Hand-Transformer: Non-Autoregressive Structured Modeling for 3D Hand Pose Estimation
ECCV 2020
Structure-Aware Human-Action Generation
ECCV 2020
Learning Diverse Stochastic Human-Action Generators by Learning Smooth Latent Transitions
AAAI 2020
Temporal-Context Enhanced Detection of Heavily Occluded Pedestrians
CVPR 2020
3DV: 3D Dynamic Voxel for Action Recognition in Depth Video
CVPR 2020
SO-HandNet: Self-Organizing Network for 3D Hand Pose Estimation With Semi-Supervised Learning
ICCV 2019
Bayesian Uncertainty Matching for Unsupervised Domain Adaptation
IJCAI 2019
A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation From a Single Depth Image
ICCV 2019
PointCloud Saliency Maps
ICCV 2019
Exploiting Spatial-Temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks
ICCV 2019
Temporal Structure Mining for Weakly Supervised Action Detection
ICCV 2019
Discriminative Feature Transformation for Occluded Pedestrian Detection
ICCV 2019
SPAGAN: Shortest Path Graph Attention Network
IJCAI 2019
Exploiting Local Feature Patterns for Unsupervised Domain Adaptation
AAAI 2019
Kervolutional Neural Networks
CVPR 2019
Joint Representative Selection and Feature Learning: A Semi-Supervised Approach
CVPR 2019
3D Hand Shape and Pose Estimation From a Single RGB Image
CVPR 2019
Conditional Generative Adversarial Network for Structured Domain Adaptation
CVPR 2018
Salience Guided Depth Calibration for Perceptually Optimized Compressive Light Field 3D Display
CVPR 2018
Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals
CVPR 2018
Bi-box Regression for Pedestrian Detection and Occlusion Estimation
ECCV 2018
Point-to-Point Regression PointNet for 3D Hand Pose Estimation
ECCV 2018
Weakly-supervised 3D Hand Pose Estimation from Monocular RGB Images
ECCV 2018
Hand PointNet: 3D Hand Pose Estimation Using Point Sets
CVPR 2018
Deformable Pose Traversal Convolution for 3D Action and Gesture Recognition
ECCV 2018
Product Quantization Network for Fast Image Retrieval
ECCV 2018
Multi-View Harmonized Bilinear Network for 3D Object Recognition
CVPR 2018
Recognizing Human Actions as the Evolution of Pose Estimation Maps
CVPR 2018
Multi-Label Learning of Part Detectors for Heavily Occluded Pedestrian Detection
ICCV 2017
Spatio-Temporal Naive-Bayes Nearest-Neighbor (ST-NBNN) for Skeleton-Based Action Recognition
CVPR 2017
Object Co-Skeletonization With Co-Segmentation
CVPR 2017
Is My Object in This Video? Reconstruction-based Object Search in Videos
IJCAI 2017
3D Convolutional Neural Networks for Efficient and Robust Hand Pose Estimation From Single Depth Images
CVPR 2017
HOPE: Hierarchical Object Prototype Encoding for Efficient Object Instance Search in Videos
CVPR 2017
Fried Binary Embedding for High-Dimensional Visual Features
CVPR 2017
Compressive Quantization for Fast Object Instance Search in Videos
ICCV 2017
Common Action Discovery and Localization in Unconstrained Videos
ICCV 2017
To Project More or to Quantize More: Minimize Reconstruction Bias for Learning Compact Binary Codes
IJCAI 2016
Robust 3D Hand Pose Estimation in Single Depth Images: From Single-View CNN to Multi-View CNNs
CVPR 2016
From Keyframes to Key Objects: Video Summarization by Representative Object Proposal Selection
CVPR 2016
Adaptive Exponential Smoothing for Online Filtering of Pixel Prediction Maps
ICCV 2015
Fast Action Proposals for Human Action Detection and Search
CVPR 2015
Multi-feature Spectral Clustering with Minimax Optimization
CVPR 2014
Topical Video Object Discovery from Key Frames by Modeling Word Co-occurrence Prior
CVPR 2013
Max-Margin Structured Output Regression for Spatio-Temporal Action Localization
NIPS 2012
Spatial Locality-Aware Sparse Coding and Dictionary Learning
ACML 2012