Zhaoxiang Zhang
123 papers · 2016–2026 · 12 top CS/AI conferences
Achievements
Keyword Pioneer
Taxonomy Completionist (19)
Interdisciplinary Bridge
Renaissance Researcher (6)
Hot Topic Early Bird
Conference Loyalist (41)
Dynamic Duo (24)
Grand Slam
Mega-Team (26)
Deep Specialist (18)
Topic Evolution
Keyword Champion (4)
Trend Setter
The Questioner
Century Club (121)
Keyword Collector (520)
Conference Pioneer
Prolific Year (8)
Unstoppable (10)
Conferences
CVPR (41)
ICCV (22)
ECCV (16)
AAAI (12)
ACL (8)
ICLR (8)
IJCAI (7)
NIPS (5)
EMNLP (1)
ICML (1)
JMLR (1)
WACV (1)
Research topics
Keywords
semantic segmentation (15)
autonomous driving (14)
object detection (14)
domain adaptation (9)
instance segmentation (8)
convolutional neural network (8)
3D object detection (7)
unsupervised learning (6)
weakly supervised learning (6)
vision-language model (5)
neural network (5)
person re-identification (5)
bird's eye view (4)
pseudo label (4)
world model (4)
large language model (4)
point cloud (4)
transfer learning (4)
depth estimation (4)
feature learning (4)
Papers
AdaField: Generalizable Surface Pressure Modeling with Physics-Informed Pre-training and Flow-Conditioned Adaptation
AAAI 2026
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization
ACL 2026
DexVLG: Dexterous Vision-Language-Grasp Model at Scale
ICCV 2025
Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness
ICCV 2025
LayerAnimate: Layer-level Control for Animation
ICCV 2025
MIO: A Foundation Model on Multimodal Tokens
EMNLP 2025
FlexDrive: Toward Trajectory Flexibility in Driving Scene Gaussian Splatting Reconstruction and Rendering
CVPR 2025
FreeSim: Toward Free-viewpoint Camera Simulation in Driving Scenes
CVPR 2025
Better to Teach than to Give: Domain Generalized Semantic Segmentation via Agent Queries with Diffusion Model Guidance
ICML 2025
Reconstructive Visual Instruction Tuning
ICLR 2025
Enhancing End-to-End Autonomous Driving with Latent World Model
ICLR 2025
FreeVS: Generative View Synthesis on Free Driving Trajectory
ICLR 2025
FIRM: Flexible Interactive Reflection ReMoval
AAAI 2025
SceneX: Procedural Controllable Large-Scale Scene Generation
AAAI 2025
Top-Down Guidance for Learning Object-Centric Representations
IJCAI 2025
AutoGUI: Scaling GUI Grounding with Automatic Functionality Annotations from LLMs
ACL 2025
Activation Steering Decoding: Mitigating Hallucination in Large Vision-Language Models through Bidirectional Hidden State Intervention
ACL 2025
M2RC-EVAL: Massively Multilingual Repository-level Code Completion Evaluation
ACL 2025
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?
ACL 2025
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
ACL 2025
C2KD: Cross-layer and Cross-head Knowledge Distillation for Small Language Model-based Recommendation
ACL 2025
CityGaussianV2: Efficient and Geometrically Accurate Reconstruction for Large-Scale Scenes
ICLR 2025
McEval: Massively Multilingual Code Evaluation
ICLR 2025
MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models
ICLR 2025
Images as Noisy Labels: Unleashing the Potential of the Diffusion Model for Open-Vocabulary Semantic Segmentation
ICCV 2025
MCOP: Multi-UAV Collaborative Occupancy Prediction
ICCV 2025
DrivingGPT: Unifying Driving World Modeling and Planning with Multi-modal Autoregressive Transformers
ICCV 2025
End-to-End Driving with Online Trajectory Evaluation via BEV World Model
ICCV 2025
UIPro: Unleashing Superior Interaction Capability For GUI Agents
ICCV 2025
CSOT: Cross-Scan Object Transfer for Semi-Supervised LiDAR Object Detection
ECCV 2024
RoleAgent: Building, Interacting, and Benchmarking High-quality Role-Playing Agents from Scripts
NIPS 2024
OpenSatMap: A Fine-grained High-resolution Satellite Dataset for Large-scale Map Construction
NIPS 2024
VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization
NIPS 2024
Voxel Mamba: Group-Free State Space Models for Point Cloud based 3D Object Detection
NIPS 2024
Fully Data-Driven Pseudo Label Estimation for Pointly-Supervised Panoptic Segmentation
AAAI 2024
Compositional Inversion for Stable Diffusion Models
AAAI 2024
RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models
ACL 2024
Enhancing Visual Continual Learning with Language-Guided Supervision
CVPR 2024
Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving
CVPR 2024
RCL: Reliable Continual Learning for Unified Failure Detection
CVPR 2024
PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation
CVPR 2024
Continual Forgetting for Pre-trained Vision Models
CVPR 2024
MemoNav: Working Memory Model for Visual Navigation
CVPR 2024
Robust Depth Enhancement via Polarization Prompt Fusion Tuning
CVPR 2024
HardMo: A Large-Scale Hardcase Dataset for Motion Capture
CVPR 2024
OneTrack: Demystifying the Conflict Between Detection and Tracking in End-to-End 3D Trackers
ECCV 2024
Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation
ECCV 2024
Point-supervised Panoptic Segmentation via Estimating Pseudo Labels from Learnable Distance
ECCV 2024
CityGaussian: Real-time High-quality Large-Scale Scene Rendering with Gaussians
ECCV 2024
DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model
NIPS 2024
Monocular Occupancy Prediction for Scalable Indoor Scenes
ECCV 2024
General Geometry-aware Weakly Supervised 3D Object Detection
ECCV 2024
Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention
ECCV 2024
MixSup: Mixed-grained Supervision for Label-efficient LiDAR-based 3D Object Detection
ICLR 2024
3D Video Object Detection With Learnable Object-Centric Global Optimization
CVPR 2023
LMR: A Large-Scale Multi-Reference Dataset for Reference-Based Super-Resolution
ICCV 2023
FPR: False Positive Rectification for Weakly Supervised Semantic Segmentation
ICCV 2023
DDG-Net: Discriminability-Driven Graph Network for Weakly-supervised Temporal Action Localization
ICCV 2023
Sharpness-Aware Gradient Matching for Domain Generalization
CVPR 2023
Once Detected, Never Lost: Surpassing Human Performance in Offline LiDAR based 3D Object Detection
ICCV 2023
Bi-Directional Frame Interpolation for Unsupervised Video Anomaly Detection
WACV 2023
Hard Patches Mining for Masked Image Modeling
CVPR 2023
Robust Feature Rectification of Pretrained Vision Models for Object Recognition
AAAI 2023
BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision
CVPR 2023
FrustumFormer: Adaptive Instance-Aware Resampling for Multi-View 3D Detection
CVPR 2023
Intrinsic Physical Concepts Discovery With Object-Centric Predictive Models
CVPR 2023
Graphics Capsule: Learning Hierarchical 3D Face Representations From 2D Images
CVPR 2023
SSF: Accelerating Training of Spiking Neural Networks with Stabilized Spiking Flow
ICCV 2023
Blind Video Deflickering by Neural Filtering With a Flawed Atlas
CVPR 2023
BAEFormer: Bi-Directional and Early Interaction Transformers for Bird's Eye View Semantic Segmentation
CVPR 2023
Informative Data Mining for One-Shot Cross-Domain Semantic Segmentation
ICCV 2023
Stereo Depth Estimation with Echoes
ECCV 2022
Deconfounding Physical Dynamics with Global Causal Relation and Confounder Transmission for Counterfactual Prediction
AAAI 2022
DATA: Domain-Aware and Task-Aware Self-Supervised Learning
CVPR 2022
Densely Constrained Depth Estimator for Monocular 3D Object Detection
ECCV 2022
RRSR: Reciprocal Reference-Based Image Super-Resolution with Progressive Feature Alignment and Selection
ECCV 2022
Pointly-Supervised Panoptic Segmentation
ECCV 2022
Self-Guided Hard Negative Generation for Unsupervised Person Re-Identification
IJCAI 2022
Object Dynamics Distillation for Scene Decomposition and Representation
ICLR 2022
Sparse Instance Activation for Real-Time Instance Segmentation
CVPR 2022
Embracing Single Stride 3D Object Detector With Sparse Transformer
CVPR 2022
HP-Capsule: Unsupervised Face Part Discovery by Hierarchical Parsing Capsule Network
CVPR 2022
Implicit Sample Extension for Unsupervised Person Re-Identification
CVPR 2022
Remember the Difference: Cross-Domain Few-Shot Semantic Segmentation via Meta-Memory Transfer
CVPR 2022
Continual Stereo Matching of Continuous Driving Scenes With Growing Architecture
CVPR 2022
The Devil Is in the Details: Window-Based Attention for Image Compression
CVPR 2022
Towards Noiseless Object Contours for Weakly Supervised Semantic Segmentation
CVPR 2022
Uncertainty-Aware Pseudo Label Refinery for Domain Adaptive Semantic Segmentation
ICCV 2021
RangeDet: In Defense of Range View for LiDAR-Based 3D Object Detection
ICCV 2021
GAIA: A Transfer Learning System of Object Detection That Fits Your Needs
CVPR 2021
RefineMask: Towards High-Quality Instance Segmentation With Fine-Grained Features
CVPR 2021
Look Closer To Segment Better: Boundary Patch Refinement for Instance Segmentation
CVPR 2021
Unsupervised Object Detection With LIDAR Clues
CVPR 2021
Bottom-Up Human Pose Estimation via Disentangled Keypoint Regression
CVPR 2021
Learnable Graph Matching: Incorporating Graph Partitioning With Deep Feature Learning for Multiple Object Tracking
CVPR 2021
Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation
AAAI 2021
Distractor-Aware Fast Tracking via Dynamic Convolutions and MOT Philosophy
CVPR 2021
Clothing Status Awareness for Long-Term Person Re-Identification
ICCV 2021
Instance Guided Proposal Network for Person Search
CVPR 2020
Bi-Directional Interaction Network for Person Search
CVPR 2020
CIAN: Cross-Image Affinity Net for Weakly Supervised Semantic Segmentation
AAAI 2020
Cascading Convolutional Color Constancy
AAAI 2020
Context-Aware Attention Network for Image-Text Retrieval
CVPR 2020
Boosting Decision-based Black-box Adversarial Attacks with Random Sign Flip
ECCV 2020
Large-Scale Object Detection in the Wild From Imbalanced Multi-Labels
CVPR 2020
Learning Integral Objects With Intra-Class Discriminator for Weakly-Supervised Semantic Segmentation
CVPR 2020
Generalizing Person Re-Identification by Camera-Aware Invariance Learning and Cross-Domain Mixup
ECCV 2020
Employing Multi-Estimations for Weakly-Supervised Semantic Segmentation
ECCV 2020
Spectral Feature Transformation for Person Re-Identification
ICCV 2019
Improving Pedestrian Attribute Recognition With Weakly-Supervised Multi-Scale Attribute-Specific Localization
ICCV 2019
Human-Like Delicate Region Erasing Strategy for Weakly Supervised Detection
AAAI 2019
Scale-Aware Trident Networks for Object Detection
ICCV 2019
Sequence Level Semantics Aggregation for Video Object Detection
ICCV 2019
SimpleDet: A Simple and Versatile Distributed Framework for Object Detection and Instance Recognition
JMLR 2019
Attention-Aware Sampling via Deep Reinforcement Learning for Action Recognition
AAAI 2019
POD: Practical Object Detection With Scale-Sensitive Network
ICCV 2019
Hard-Aware Point-to-Set Deep Metric for Person Re-identification
ECCV 2018
Multi-task Layout Analysis for Historical Handwritten Documents Using Fully Convolutional Networks
IJCAI 2018
Deep Convolutional Neural Networks with Merge-and-Run Mappings
IJCAI 2018
Dynamic Multi-Task Learning with Convolutional Neural Network
IJCAI 2017
Diverse Neuron Type Selection for Convolutional Neural Networks
IJCAI 2017
Random Shifting for CNN: a Solution to Reduce Information Loss in Down-Sampling Layers
IJCAI 2017
GIFT: A Real-Time and Scalable 3D Shape Search Engine
CVPR 2016