Kai Xu

94 papers · 2017–2026 · 13 top CS/AI conferences

Achievements


🐣 Hot Topic Early Bird 🌍 Conference Polyglot (13) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (8)

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🐝 Cross-Pollinator (9) 🏠 Conference Loyalist (33) 🤝 Dynamic Duo (13) 👑 Triple Crown 🏆 Grand Slam 🔬 Deep Specialist (30) 🏆 Keyword Champion 🚀 Conference Pioneer 🗃️ Keyword Collector (394) 📈 Trend Setter ⚡ Prolific Year (14) 🔥 Unstoppable (9) 💎 Century Club (91)

Conferences

CVPR (33) AAAI (14) ICCV (12) ICLR (7) ICML (6) NIPS (6) ECCV (5) COLING (3) AISTATS (2) IJCAI (2) RSS (2) CORL (1) EMNLP (1)

Top co-authors

Chenyang Zhu (13) Renjiao Yi (9) Yulan Guo (9) Angela Yao (7) Akash Srivastava (7) Yifei Shi (6) Zheng Qin (5) Ruizhen Hu (5) Jiazhao Zhang (5) Xinwang Liu (4)

Research topics

Architectures (2)

Keywords

point cloud (13) neural network (7) variational autoencoder (5) point cloud registration (5) 3d reconstruction (5) deep reinforcement learning (4) depth estimation (4) neural radiance field (4) 3d gaussian splatting (4) image restoration (4) generative model (4) gaussian splatting (4) semantic segmentation (4) diffusion model (4) zero-shot learning (3) 6d pose estimation (3) scene understanding (3) edge detection (3) 3d vision (3) differentiable rendering (3)

Papers

AnchorHOI: Zero-shot Generation of 4D Human-Object Interaction via Anchor-based Prior Distillation AAAI 2026
Topology-Inspired Backward-Free Framework for Test-Time Adaptation in Medical Detection AAAI 2026
Unified Mixture-of-Experts Framework for Joint Cardiac and Vascular Ultrasound Analysis and Report Generation AAAI 2026
ArticulatedGS: Self-supervised Digital Twin Modeling of Articulated Objects using 3D Gaussian Splatting CVPR 2025
OnlineAnySeg: Online Zero-Shot 3D Segmentation by Visual Foundation Model Guided 2D Mask Merging CVPR 2025
Progressive Correspondence Regenerator for Robust 3D Registration CVPR 2025
CityEQA: A Hierarchical LLM Agent on Embodied Question Answering Benchmark in City Space EMNLP 2025
VasTSD: Learning 3D Vascular Tree-state Space Diffusion Model for Angiography Synthesis CVPR 2025
RestorGS: Depth-aware Gaussian Splatting for Efficient 3D Scene Restoration CVPR 2025
Hierarchically-Structured Open-Vocabulary Indoor Scene Synthesis with Pre-trained Large Language Model AAAI 2025
Physical-aware Neural Radiance Fields for Efficient Exposure Correction AAAI 2025
EvA: Erasing Spurious Correlations with Activations ICLR 2025
Unveiling the Secret Recipe: A Guide For Supervised Fine-Tuning Small LLMs ICLR 2025
An Efficient Dialogue Policy Agent with Model-Based Causal Reinforcement Learning COLING 2025
CogNav: Cognitive Process Modeling for Object Goal Navigation with LLMs ICCV 2025
Wave-MambaAD: Wavelet-driven State Space Model for Multi-class Unsupervised Anomaly Detection ICCV 2025
Self-supervised Learning of Hybrid Part-aware 3D Representations of 2D Gaussians and Superquadrics ICCV 2025
MonoMobility: Zero-Shot 3D Mobility Analysis from Monocular Videos ICCV 2025
Diagnosing Pretrained Models for Out-of-distribution Detection ICCV 2025
Curve-Aware Gaussian Splatting for 3D Parametric Curve Reconstruction ICCV 2025
A Constrained Optimization Approach for Gaussian Splatting from Coarsely-posed Images and Noisy Lidar Point Clouds ICCV 2025
PIN-WM: Learning Physics-INformed World Models for Non-Prehensile Manipulation RSS 2025
LaDi-WM: A Latent Diffusion-Based World Model for Predictive Manipulation CORL 2025
VideoDirector: Precise Video Editing via Text-to-Video Models CVPR 2025
Scaling for Training Time and Post-hoc Out-of-distribution Detection Enhancement ICLR 2024
Learning High-Frequency Functions Made Easy with Sinusoidal Positional Encoding ICML 2024
Enhancing Video Super-Resolution via Implicit Resampling-based Alignment CVPR 2024
GliDe with a CaPE: A Low-Hassle Method to Accelerate Speculative Decoding ICML 2024
Deep Demonstration Tracing: Learning Generalizable Imitator Policy for Runtime Imitation from a Single Demonstration ICML 2024
Learning Cross-hand Policies of High-DOF Reaching and Grasping ECCV 2024
InterFusion: Text-Driven Generation of 3D Human-Object Interaction ECCV 2024
DiffusionEdge: Diffusion Probabilistic Model for Crisp Edge Detection AAAI 2024
Learning Instance-Aware Correspondences for Robust Multi-Instance Point Cloud Registration in Cluttered Scenes CVPR 2024
Privacy without Noisy Gradients: Slicing Mechanism for Generative Model Training NIPS 2024
MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval CVPR 2024
A Fast and High-quality Text-to-Speech Method with Compressed Auxiliary Corpus and Limited Target Speaker Corpus COLING 2024
Deep Reinforcement Learning-based Dialogue Policy with Graph Convolutional Q-network COLING 2024
Practical Hamiltonian Monte Carlo on Riemannian Manifolds via Relativity Theory ICML 2024
Synthetic Data Generation of Many-to-Many Datasets via Random Graph Generation ICLR 2023
Multi-Symmetry Ensembles: Improving Diversity and Generalization via Opposing Symmetries ICML 2023
3D-Aware Object Goal Navigation via Simultaneous Exploration and Identification CVPR 2023
Deep Graph-Based Spatial Consistency for Robust Non-Rigid Point Cloud Registration CVPR 2023
NEF: Neural Edge Fields for 3D Parametric Curve Reconstruction From Multi-View Images CVPR 2023
Weakly-Supervised Single-View Image Relighting CVPR 2023
BUFFER: Balancing Accuracy, Efficiency, and Generalizability in Point Cloud Registration CVPR 2023
DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training ICLR 2023
SOCS: Semantically-Aware Object Coordinate Space for Category-Level 6D Object Pose Estimation under Large Shape Variations ICCV 2023
PlaneRecTR: Unified Query Learning for 3D Plane Recovery from a Single View ICCV 2023
Accurate and Fast Compressed Video Captioning ICCV 2023
2D3D-MATR: 2D-3D Matching Transformer for Detection-Free Registration Between Images and Point Clouds ICCV 2023
Multi-Resolution Monocular Depth Map Fusion by Self-Supervised Gradient-Based Composition AAAI 2023
AutoTransition: Learning to Recommend Video Transition Effects ECCV 2022
Efficient One-Pass Multi-View Subspace Clustering with Consensus Anchors AAAI 2022
Fusion Multiple Kernel K-means AAAI 2022
RIM-Net: Recursive Implicit Fields for Unsupervised Learning of Hierarchical Shape Structures CVPR 2022
Accelerating Video Object Segmentation With Compressed Video CVPR 2022
DisARM: Displacement Aware Relation Module for 3D Detection CVPR 2022
RayMVSNet: Learning Ray-Based 1D Implicit Fields for Accurate Multi-View Stereo CVPR 2022
Decoupling Makes Weakly Supervised Local Feature Better CVPR 2022
Geometric Transformer for Fast and Robust Point Cloud Registration CVPR 2022
Learning Efficient Online 3D Bin Packing on Packing Configuration Trees ICLR 2022
StablePose: Learning 6D Object Poses From Geometrically Stable Patches CVPR 2021
Learning Fine-Grained Segmentation of 3D Shapes Without Part Labels CVPR 2021
Objective-aware Traffic Simulation via Inverse Reinforcement Learning IJCAI 2021
A Bayesian-Symbolic Approach to Reasoning and Learning in Intuitive Physics NIPS 2021
Couplings for Multinomial Hamiltonian Monte Carlo AISTATS 2021
Targeted Neural Dynamical Modeling NIPS 2021
Online 3D Bin Packing with Constrained Deep Reinforcement Learning AAAI 2021
Learning in the Frequency Domain CVPR 2020
Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation CVPR 2020
PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes CVPR 2020
PIE-NET: Parametric Inference of Point Cloud Edges NIPS 2020
Generative Ratio Matching Networks ICLR 2020
Telescoping Density-Ratio Estimation NIPS 2020
Learning Canonical Shape Space for Category-Level 6D Object Pose and Size Estimation CVPR 2020
MLCVNet: Multi-Level Context VoteNet for 3D Object Detection CVPR 2020
Learning Part Generation and Assembly for Structure-Aware Shape Synthesis AAAI 2020
NeoNav: Improving the Generalization of Visual Navigation via Generating Next Expected Observations AAAI 2020
Toward A Thousand Lights: Decentralized Deep Reinforcement Learning for Large-Scale Traffic Signal Control AAAI 2020
MetaLight: Value-Based Meta-Reinforcement Learning for Traffic Signal Control AAAI 2020
Deep Differentiable Grasp Planner for High-DOF Grippers RSS 2020
AdaCoSeg: Adaptive Shape Co-Segmentation With Group Consistency Loss CVPR 2020
Scalable Spike Source Localization in Extracellular Recordings using Amortized Variational Inference NIPS 2019
Rescan: Inductive Instance Segmentation for Indoor RGBD Scans ICCV 2019
PartNet: A Recursive Part Decomposition Network for Fine-Grained and Hierarchical Shape Segmentation CVPR 2019
Shape2Motion: Joint Analysis of Motion Parts and Attributes From 3D Shapes CVPR 2019
Hierarchy Denoising Recursive Autoencoders for 3D Scene Layout Prediction CVPR 2019
Spatiotemporal CNN for Video Object Segmentation CVPR 2019
Variational Russian Roulette for Deep Bayesian Nonparametrics ICML 2019
Turing: A Language for Flexible Probabilistic Inference AISTATS 2018
Im2Struct: Recovering 3D Shape Structure From a Single RGB Image CVPR 2018
PlaneMatch: Patch Coplanarity Prediction for Robust RGB-D Reconstruction ECCV 2018
LAPRAN: A Scalable Laplacian Pyramid Reconstructive Adversarial Network for Flexible Compressive Sensing Reconstruction ECCV 2018
Bridging the Gap between Observation and Decision Making: Goal Recognition and Flexible Resource Allocation in Dynamic Network Interdiction IJCAI 2017