Ruimao Zhang
45 papers · 2016–2025 · 10 top CS/AI conferences
Achievements
Renaissance Researcher (8)
Interdisciplinary Bridge
Conference Polyglot (10)
Academic Marathon (9)
Taxonomy Completionist (70)
Keyword Pioneer
Hot Topic Early Bird
Grand Slam
Dynamic Duo (13)
Topic Evolution
Deep Specialist (10)
Triple Crown
Unstoppable (7)
Prolific Year (11)
Century Club (45)
Keyword Collector (168)
Conferences
CVPR (14)
ICCV (7)
ECCV (5)
NIPS (5)
ICLR (4)
AAAI (2)
CORL (2)
ICML (2)
IJCAI (2)
MIDL (2)
Keywords
semantic segmentation (8)
convolutional neural network (4)
multimodal large language model (4)
knowledge distillation (3)
pose estimation (3)
image generation (3)
multi-modal learning (3)
point cloud (3)
virtual try-on (2)
image segmentation (2)
visual prompt (2)
diffusion model (2)
neural network optimization (2)
motion generation (2)
image retrieval (2)
3D vision (2)
batch normalization (2)
cross-modal learning (2)
3D object detection (2)
human parsing (2)
Papers
ScaMo: Exploring the Scaling Law in Autoregressive Motion Generation Model
CVPR 2025
High-Dynamic Radar Sequence Prediction for Weather Nowcasting Using Spatiotemporal Coherent Gaussian Representation
ICLR 2025
DriveGEN: Generalized and Robust 3D Detection in Driving via Controllable Text-to-Image Diffusion Generation
CVPR 2025
RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints
ICCV 2025
WorldSimBench: Towards Video Generation Models as World Simulators
ICML 2025
Ensuring Force Safety in Vision-Guided Robotic Manipulation via Implicit Tactile Calibration
CORL 2025
CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion
CORL 2025
HumanTOMATO: Text-aligned Whole-body Motion Generation
ICML 2024
KptLLM: Unveiling the Power of Large Language Model for Keypoint Comprehension
NIPS 2024
X4D-SceneFormer: Enhanced Scene Understanding on 4D Point Cloud Videos through Cross-Modal Knowledge Transfer
AAAI 2024
Open-World Human-Object Interaction Detection via Multi-modal Prompts
CVPR 2024
SmartEdit: Exploring Complex Instruction-based Image Editing with Multimodal Large Language Models
CVPR 2024
FreeMan: Towards Benchmarking 3D Human Pose Estimation under Real-World Conditions
CVPR 2024
MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception
CVPR 2024
SEED-Bench: Benchmarking Multimodal Large Language Models
CVPR 2024
F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions
ECCV 2024
X-Pose: Detecting Any Keypoints
ECCV 2024
Enhancing Human-AI Collaboration Through Logic-Guided Reasoning
ICLR 2024
Discovering Intrinsic Spatial-Temporal Logic Rules to Explain Human Actions
NIPS 2023
Inherent Consistent Learning for Accurate Semi-supervised Medical Image Segmentation
MIDL 2023
Neural Interactive Keypoint Detection
ICCV 2023
SupFusion: Supervised LiDAR-Camera Fusion for 3D Object Detection
ICCV 2023
Semantic Human Parsing via Scalable Semantic Transfer Over Multiple Label Domains
CVPR 2023
Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation
ICLR 2023
Motion-X: A Large-scale 3D Expressive Whole-body Human Motion Dataset
NIPS 2023
Toward Unpaired Multi-modal Medical Image Segmentation via Learning Structured Semantic Consistency
MIDL 2023
Let Images Give You More: Point Cloud Cross-Modal Training for Shape Analysis
NIPS 2022
AMOS: A Large-Scale Abdominal Multi-Organ Benchmark for Versatile Medical Image Segmentation
NIPS 2022
Weakly Supervised Object Localization via Transformer with Implicit Spatial Calibration
ECCV 2022
2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds
ECCV 2022
End-to-End Dense Video Captioning With Parallel Decoding
ICCV 2021
Sparse Single Sweep LiDAR Point Cloud Segmentation via Learning Contextual Shape Priors from Scene Completion
AAAI 2021
InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds Through Instance Multi-Level Contextual Referring
ICCV 2021
PointLIE: Locally Invertible Embedding for Point Cloud Sampling and Recovery
IJCAI 2021
Parser-Free Virtual Try-On via Distilling Appearance Flows
CVPR 2021
Towards Photo-Realistic Virtual Try-On by Adaptively Generating-Preserving Image Content
CVPR 2020
Exemplar Normalization for Learning Deep Representation
CVPR 2020
Towards Content-Independent Multi-Reference Super-Resolution: Adaptive Pattern Matching and Feature Aggregation
ECCV 2020
SSN: Learning Sparse Switchable Normalization via SparsestMax
CVPR 2019
Differentiable Learning-to-Group Channels via Groupable Convolutional Neural Networks
ICCV 2019
Once a MAN: Towards Multi-Target Attack via Learning Multi-Target Adversarial Network Once
ICCV 2019
Differentiable Learning-to-Normalize via Switchable Normalization
ICLR 2019
DeepFashion2: A Versatile Benchmark for Detection, Pose Estimation, Segmentation and Re-Identification of Clothing Images
CVPR 2019
Deep Structured Scene Parsing by Learning With Image Descriptions
CVPR 2016
Geometric Scene Parsing with Hierarchical LSTM
IJCAI 2016