Xiu Li

83 papers · 2016–2026 · 11 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🌍 Conference Polyglot (11) 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (10) 🏃 Academic Marathon (9)

🐝 Cross-Pollinator (11) 🌈 Renaissance Researcher (11) 👑 Triple Crown 🤝 Dynamic Duo (13) 🏆 Keyword Champion (2) 🔬 Deep Specialist (12) 🏆 Grand Slam 🧬 Topic Evolution 💎 Century Club (81) 📈 Trend Setter 🔥 Unstoppable (10) ⚡ Prolific Year (12) 🗃️ Keyword Collector (341)

Conferences

CVPR (19) AAAI (13) NIPS (13) ICCV (11) ECCV (8) ICLR (8) ICML (6) EMNLP (2) ACL (1) IJCAI (1) NAACL (1)

Top co-authors

Jiafei Lyu (14) Yachao Zhang (10) Kai Li (10) Zunnan Xu (9) Chunming He (8) Ronghui Li (7) Rui Yang (7) Yulun Zhang (6) Longxiang Tang (6) Zongqing Lu (6)

Research topics

Core AI (1)

Keywords

diffusion model (12) reinforcement learning (7) video generation (4) image generation (4) semantic segmentation (3) self-supervised learning (3) 3d vision (3) image restoration (3) motion generation (3) state space model (3) pseudo labeling (3) multi-modal learning (3) text-to-image generation (3) reward model (3) transformer architecture (3) adversarial learning (2) contrastive learning (2) object detection (2) weakly supervised learning (2) sample efficiency (2)

Papers

GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning AAAI 2026
Zo3T: Zero-Shot 3D-Aware Trajectory-Guided Image-to-Video Generation via Test-Time Training AAAI 2026
VLP: Vision-Language Preference Learning for Embodied Manipulation EMNLP 2025
TCPO: Thought-Centric Preference Optimization for Effective Embodied Decision-making EMNLP 2025
HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation CVPR 2025
Reti-Diff: Illumination Degradation Image Restoration with Retinex-based Latent Diffusion Model ICLR 2025
X-NeMo: Expressive Neural Motion Reenactment via Disentangled Latent Attention ICLR 2025
InstantSwap: Fast Customized Concept Swapping across Sharp Shape Differences ICLR 2025
MagicArticulate: Make Your 3D Models Articulation-Ready CVPR 2025
GIVEPose: Gradual Intra-class Variation Elimination for RGB-based Category-Level Object Pose Estimation CVPR 2025
SkillMimic: Learning Basketball Interaction Skills from Demonstrations CVPR 2025
MVPortrait: Text-Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation CVPR 2025
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation CVPR 2025
AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward CVPR 2025
Dora: Sampling and Benchmarking for 3D Shape Variational Auto-Encoders CVPR 2025
Cross-Domain Offline Policy Adaptation with Optimal Transport and Dataset Constraint ICLR 2025
World Models with Hints of Large Language Models for Goal Achieving NAACL 2025
LoRA-Gen: Specializing Large Language Model via Online LoRA Generation ICML 2025
Taming Rectified Flow for Inversion and Editing ICML 2025
Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation AAAI 2025
MultiBooth: Towards Generating All Your Concepts in an Image from Text AAAI 2025
SUMO: Search-Based Uncertainty Estimation for Model-Based Offline Reinforcement Learning AAAI 2025
Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration ACL 2025
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation ICCV 2025
MaTe: Images Are All You Need for Material Transfer via Diffusion Transformer ICCV 2025
REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment ICCV 2025
InterSyn: Interleaved Learning for Dynamic Motion Synthesis in the Wild ICCV 2025
A Plug-and-Play Physical Motion Restoration Approach for In-the-Wild High-Difficulty Motions ICCV 2025
FAFA: Frequency-Aware Flow-Aided Self-Supervision for Underwater Object Pose Estimation ECCV 2024
MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models NIPS 2024
ODRL: A Benchmark for Off-Dynamics Reinforcement Learning NIPS 2024
MambaTree: Tree Topology is All You Need in State Space Model NIPS 2024
Bridging the Divide: Reconsidering Softmax and Linear Attention NIPS 2024
COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing NIPS 2024
Real-world Image Dehazing with Coherence-based Pseudo Labeling and Cooperative Unfolding Network NIPS 2024
Follow Your Pose: Pose-Guided Text-to-Video Generation Using Pose-Free Videos AAAI 2024
Chain of Generation: Multi-Modal Gesture Synthesis via Cascaded Conditional Control AAAI 2024
Cross-Modal Match for Language Conditioned 3D Object Grounding AAAI 2024
Dual Mapping of 2D StyleGAN for 3D-Aware Image Generation and Manipulation (Student Abstract) AAAI 2024
STViT: Improving Self-Supervised Multi-Camera Depth Estimation with Spatial-Temporal Context and Adversarial Geometry Regularization (Student Abstract) AAAI 2024
Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection CVPR 2024
Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives CVPR 2024
Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model CVPR 2024
Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators ECCV 2024
GRA: Detecting Oriented Objects through Group-wise Rotating and Attention ECCV 2024
Realistic Human Motion Generation with Cross-Diffusion Models ECCV 2024
Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models ECCV 2024
SEABO: A Simple Search-Based Method for Offline Imitation Learning ICLR 2024
Strategic Preys Make Acute Predators: Enhancing Camouflaged Object Detectors by Generating Camouflaged Objects ICLR 2024
PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation ICML 2024
Cross-Domain Policy Adaptation by Capturing Representation Mismatch ICML 2024
Exploration and Anti-Exploration with Distributional Random Network Distillation ICML 2024
SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation ICML 2024
BATON: Aligning Text-to-Audio Model Using Human Preference Feedback IJCAI 2024
Camouflaged Object Detection With Feature Decomposition and Edge Reconstruction CVPR 2023
FLAG3D: A 3D Fitness Activity Dataset With Language Instruction CVPR 2023
SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation NIPS 2023
CASR: Generating Complex Sequences with Autoregressive Self-Boost Refinement ICLR 2023
MeGraph: Capturing Long-Range Interactions by Alternating Local and Hierarchical Aggregation on Multi-Scaled Graph Hierarchy NIPS 2023
Data-Efficient Image Quality Assessment with Attention-Panel Decoder AAAI 2023
Adversarial Alignment for Source Free Object Detection AAAI 2023
Weakly-Supervised Concealed Object Segmentation with SAM-based Pseudo Labeling and Multi-scale Feature Grouping NIPS 2023
GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction NIPS 2023
FineDance: A Fine-grained Choreography Dataset for 3D Full Body Dance Generation ICCV 2023
BoxSnake: Polygonal Instance Segmentation with Box Supervision ICCV 2023
Degradation-Resistant Unfolding Network for Heterogeneous Image Fusion ICCV 2023
Neural Capture of Animatable 3D Human from Monocular Video ECCV 2022
ScalableViT: Rethinking the Context-Oriented Generalization of Vision Transformer ECCV 2022
Mildly Conservative Q-Learning for Offline Reinforcement Learning NIPS 2022
Double Check Your State Before Trusting It: Confidence-Aware Bidirectional Offline Model-Based Imagination NIPS 2022
OrdinalCLIP: Learning Rank Prompts for Language-Guided Ordinal Regression NIPS 2022
Rethinking Goal-Conditioned Supervised Learning and Its Connection to Offline RL ICLR 2022
Efficient Continuous Control with Double Actors and Regularized Critics AAAI 2022
A Self-Boosting Framework for Automated Radiographic Report Generation CVPR 2021
Frequency-Aware Spatiotemporal Transformers for Video Inpainting Detection ICCV 2021
Universal and Flexible Optical Aberration Correction Using Deep-Prior Based Deconvolution ICCV 2021
Self-Supervised Video Hashing via Bidirectional Transformers CVPR 2021
Disentangled Non-local Neural Networks ECCV 2020
4D Association Graph for Realtime Multi-Person Motion Capture Using Multiple Video Cameras CVPR 2020
Neighborhood Preserving Hashing for Scalable Video Retrieval ICCV 2019
Structure From Recurrent Motion: From Rigidity to Recurrency CVPR 2018
Scale-Aware Face Detection CVPR 2017
Joint Training of Cascaded CNN for Face Detection CVPR 2016