Xiaojuan Qi
110 papers · 2015–2026 · 9 conferences · across top CS/AI conferences
Achievements
Academic Marathon (10)
Conference Polyglot (8)
Keyword Pioneer
Interdisciplinary Bridge
Cross-Pollinator (14)
Renaissance Researcher (10)
Taxonomy Completionist (122)
Conference Loyalist (44)
Grand Slam
Deep Specialist (30)
Dynamic Duo (26)
Triple Crown
Keyword Champion (4)
Unstoppable (11)
Prolific Year (19)
Conference Pioneer
Century Club (108)
Keyword Collector (423)
The Questioner (4)
Conferences
CVPR (44)
ICCV (18)
NIPS (16)
ECCV (15)
ICLR (7)
AAAI (6)
ICML (2)
ACL (1)
EMNLP (1)
Research topics
Keywords
semantic segmentation (18)
instance segmentation (10)
point cloud (10)
object detection (10)
convolutional neural network (8)
3d object detection (8)
knowledge distillation (8)
representation learning (7)
contrastive learning (7)
novel view synthesis (6)
image segmentation (5)
transfer learning (5)
generative adversarial network (4)
image synthesis (4)
vision-language model (4)
video generation (4)
neural rendering (4)
model compression (4)
depth estimation (4)
pseudo label (4)
Papers
TRAC: Teacher-Guided Token Reward with Adaptive Calibration for Robust Policy Optimization
ACL 2026
ASSIST-3D: Adapted Scene Synthesis for Class-Agnostic 3D Instance Segmentation
AAAI 2026
Mixture Compressor for Mixture-of-Experts LLMs Gains More
ICLR 2025
SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix
ICLR 2025
UniScene: Unified Occupancy-centric Driving Scene Generation
CVPR 2025
Equipping Vision Foundation Model with Mixture of Experts for Out-of-Distribution Detection
ICCV 2025
DiST-4D: Disentangled Spatiotemporal Diffusion with Metric Depth for 4D Driving Scene Generation
ICCV 2025
"Principal Components" Enable A New Language of Images
ICCV 2025
Mixture-of-Scores: Robust Image-Text Data Valuation via Three Lines of Code
ICCV 2025
Aligning Effective Tokens with Video Anomaly in Large Language Models
ICCV 2025
VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection
CVPR 2025
Deformable Radial Kernel Splatting
CVPR 2025
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
ICML 2025
Learning from Neighbors: Category Extrapolation for Long-Tail Learning
CVPR 2025
ObjectMover: Generative Object Movement with Video Prior
CVPR 2025
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning
CVPR 2025
Mitigating Hallucinations in Large Vision-Language Models by Self-Injecting Hallucinations
EMNLP 2025
Holistic Tokenizer for Autoregressive Image Generation
ICCV 2025
How Far are AI-generated Videos from Simulating the 3D Visual World: A Learned 3D Evaluation Approach
ICCV 2025
Data Pruning by Information Maximization
ICLR 2025
EscherNet: A Generative Model for Scalable View Synthesis
CVPR 2024
Let the Avatar Talk using Texts without Paired Training Data
ECCV 2024
EA-VTR: Event-Aware Video-Text Retrieval
ECCV 2024
Decoupled Kullback-Leibler Divergence Loss
NIPS 2024
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
ICML 2024
UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation
ECCV 2024
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models
ECCV 2024
Can OOD Object Detectors Learn from Foundation Models?
ECCV 2024
V-IRL: Grounding Virtual Intelligence in Real Life
ECCV 2024
Text-to-3D with Classifier Score Distillation
ICLR 2024
What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights
NIPS 2024
Splatter a Video: Video Gaussian Representation for Versatile Processing
NIPS 2024
Spec-Gaussian: Anisotropic View-Dependent Appearance for 3D Gaussian Splatting
NIPS 2024
RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding
CVPR 2024
SC-GS: Sparse-Controlled Gaussian Splatting for Editable Dynamic Scenes
CVPR 2024
SaCo Loss: Sample-wise Affinity Consistency for Vision-Language Pre-training
CVPR 2024
Classes Are Not Equal: An Empirical Study on Image Recognition Fairness
CVPR 2024
Total-Decom: Decomposed 3D Scene Reconstruction with Minimal Interaction
CVPR 2024
How to Make Cross Encoder a Good Teacher for Efficient Image-Text Retrieval?
CVPR 2024
CL-NeRF: Continual Learning of Neural Radiance Fields for Evolving Scene Representation
NIPS 2023
CoDet: Co-occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection
NIPS 2023
Command-Driven Articulated Object Understanding and Manipulation
CVPR 2023
IST-Net: Prior-Free Category-Level Pose Estimation with Implicit Space Transformation
ICCV 2023
Learning a Room with the Occ-SDF Hybrid: Signed Distance Function Mingled with Occupancy Aids Scene Representation
ICCV 2023
Parametric Classification for Generalized Category Discovery: A Baseline Study
ICCV 2023
MarS3D: A Plug-and-Play Motion-Aware Model for Semantic Segmentation on Multi-Scan 3D Point Clouds
CVPR 2023
PLA: Language-Driven Open-Vocabulary 3D Scene Understanding
CVPR 2023
VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking
CVPR 2023
Understanding Imbalanced Semantic Segmentation Through Neural Collapse
CVPR 2023
LargeKernel3D: Scaling Up Kernels in 3D Sparse CNNs
CVPR 2023
Hybrid Neural Rendering for Large-Scale Scenes With Motion Blur
CVPR 2023
Edge Guided GANs with Contrastive Learning for Semantic Image Synthesis
ICLR 2023
ISS: Image as Stepping Stone for Text-Guided 3D Shape Generation
ICLR 2023
Is Synthetic Data from Generative Models Ready for Image Recognition?
ICLR 2023
Learning Context-Aware Classifier for Semantic Segmentation
AAAI 2023
Context-Aware Transformer for 3D Point Cloud Automatic Annotation
AAAI 2023
Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video
ICCV 2023
MGFN: Magnitude-Contrastive Glance-and-Focus Network for Weakly-Supervised Video Anomaly Detection
AAAI 2023
Texture Generation on 3D Meshes with Point-UV Diffusion
ICCV 2023
Data Pruning via Moving-one-Sample-out
NIPS 2023
Knowledge Distillation As Efficient Pre-Training: Faster Convergence, Higher Data-Efficiency, and Better Transferability
CVPR 2022
Spatial Pruned Sparse Convolution for Efficient 3D Object Detection
NIPS 2022
Prototypical VoteNet for Few-Shot 3D Point Cloud Object Detection
NIPS 2022
Self-Supervised Visual Representation Learning with Semantic Grouping
NIPS 2022
Unifying Voxel-based Representation with Transformer for 3D Object Detection
NIPS 2022
Towards Efficient 3D Object Detection with Knowledge Distillation
NIPS 2022
Rethinking Resolution in the Context of Efficient Video Recognition
NIPS 2022
TWIST: Two-Way Inter-Label Self-Training for Semi-Supervised 3D Instance Segmentation
CVPR 2022
Voxel Field Fusion for 3D Object Detection
CVPR 2022
Towards Implicit Text-Guided 3D Shape Generation
CVPR 2022
Slot-VPS: Object-Centric Representation Learning for Video Panoptic Segmentation
CVPR 2022
HINT: Hierarchical Neuron Concept Explainer
CVPR 2022
Progressive End-to-End Object Detection in Crowded Scenes
CVPR 2022
Video Demoireing With Relation-Based Temporal Consistency
CVPR 2022
Stratified Transformer for 3D Point Cloud Segmentation
CVPR 2022
Towards Efficient and Scale-Robust Ultra-High-Definition Image Demoiréing
ECCV 2022
DODA: Data-Oriented Sim-to-Real Domain Adaptation for 3D Semantic Segmentation
ECCV 2022
Multimodal Transformer for Automatic 3D Annotation and Object Detection
ECCV 2022
Re-Distributing Biased Pseudo Labels for Semi-Supervised Semantic Segmentation: A Baseline Investigation
ICCV 2021
Learning Geometry-Disentangled Representation for Complementary Understanding of 3D Object Point Cloud
AAAI 2021
Fully Convolutional Networks for Panoptic Segmentation
CVPR 2021
One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation
CVPR 2021
3D-to-2D Distillation for Indoor Scene Parsing
CVPR 2021
PAConv: Position Adaptive Convolution With Dynamic Kernel Assembling on Point Clouds
CVPR 2021
ST3D: Self-Training for Unsupervised Domain Adaptation on 3D Object Detection
CVPR 2021
Aggregation With Feature Detection
ICCV 2021
Memory Selection Network for Video Propagation
ECCV 2020
CN: Channel Normalization For Point Cloud Recognition
ECCV 2020
Few-shot Action Recognition with Permutation-invariant Attention
ECCV 2020
Domain-invariant Stereo Matching Networks
ECCV 2020
Unifying Training and Inference for Panoptic Segmentation
CVPR 2020
ManiGAN: Text-Guided Image Manipulation
CVPR 2020
Global Texture Enhancement for Fake Face Detection in the Wild
CVPR 2020
An Adversarial Perturbation Oriented Domain Adaptation Approach for Semantic Segmentation
AAAI 2020
Lightweight Generative Adversarial Networks for Text-Guided Image Manipulation
NIPS 2020
3D Motion Decomposition for RGBD Future Dynamic Scene Synthesis
CVPR 2019
Improved Techniques for Training Adaptive Deep Networks
ICCV 2019
AGSS-VOS: Attention Guided Single-Shot Video Object Segmentation
ICCV 2019
Controllable Text-to-Image Generation
NIPS 2019
GeoNet: Geometric Neural Network for Joint Depth and Surface Normal Estimation
CVPR 2018
Semi-Parametric Image Synthesis
CVPR 2018
Referring Image Segmentation via Recurrent Refinement Networks
CVPR 2018
Image Inpainting via Generative Multi-column Convolutional Neural Networks
NIPS 2018
GAL: Geometric Adversarial Loss for Single-View 3D-Object Reconstruction
ECCV 2018
ICNet for Real-Time Semantic Segmentation on High-Resolution Images
ECCV 2018
3D Graph Neural Networks for RGBD Semantic Segmentation
ICCV 2017
Pyramid Scene Parsing Network
CVPR 2017
DCAN: Deep Contour-Aware Networks for Accurate Gland Segmentation
CVPR 2016
Multi-Scale Patch Aggregation (MPA) for Simultaneous Detection and Segmentation
CVPR 2016
Semantic Segmentation With Object Clique Potential
ICCV 2015