conftrace_

Xihui Liu

61 papers · 2017–2026 · 8 conferences · across top CS/AI conferences

Achievements

🏃 Academic Marathon (8) 🌍 Conference Polyglot (7) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐝 Cross-Pollinator (10) 🌈 Renaissance Researcher (8) 🏆 Grand Slam 👑 Triple Crown 🤝 Dynamic Duo (10) 🔬 Deep Specialist (17) 🧬 Topic Evolution 🏆 Keyword Champion (2) 🗃️ Keyword Collector (234) 🔥 Unstoppable (5) 🚀 Conference Pioneer 💎 Century Club (59) Prolific Year (17)

Conferences

CVPR (19) ICCV (14) NIPS (11) ECCV (7) ICML (3) WACV (3) AAAI (2) ICLR (2)

Papers

Self-NPO: Data-Free Diffusion Model Enhancement via Truncated Diffusion Fine-Tuning · AAAI 2026
GENMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration · AAAI 2026
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D Capabilities · ICCV 2025
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation · ICCV 2025
LiT: Delving into a Simple Linear Diffusion Transformer for Image Generation · ICCV 2025
GameFactory: Creating New Games with Generative Interactive Videos · ICCV 2025
V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding · ICCV 2025
RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints · ICCV 2025
Moto: Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos · ICCV 2025
DreamCube: RGB-D Panorama Generation via Multi-plane Synchronization · ICCV 2025
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation · ICCV 2025
WorldSimBench: Towards Video Generation Models as World Simulators · ICML 2025
T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation · CVPR 2025
HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation · CVPR 2025
T2ISafety: Benchmark for Assessing Fairness, Toxicity, and Privacy in Image Generation · CVPR 2025
MBQ: Modality-Balanced Quantization for Large Vision-Language Models · CVPR 2025
Parallelized Autoregressive Visual Generation · CVPR 2025
MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation · CVPR 2025
UniMC: Taming Diffusion Transformer for Unified Keypoint-Guided Multi-Class Image Generation · ICML 2025
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding · ICLR 2025
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation · ICCV 2025
EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI · CVPR 2024
DreamComposer: Controllable 3D Object Generation via Multi-View Conditions · CVPR 2024
Point Transformer V3: Simpler Faster Stronger · CVPR 2024
ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities · ECCV 2024
TC4D: Trajectory-Conditioned Text-to-4D Generation · ECCV 2024
FiT: Flexible Vision Transformer for Diffusion Model · ICML 2024
PredBench: Benchmarking Spatio-Temporal Prediction across Diverse Disciplines · ECCV 2024
GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing · NIPS 2024
Hierarchical Diffusion Autoencoders and Disentangled Image Manipulation · WACV 2024
Shape-Guided Diffusion With Inside-Outside Attention · WACV 2024
HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion · ICLR 2024
Scene Graph Disentanglement and Composition for Generalizable Complex Image Generation · NIPS 2024
4Diffusion: Multi-view Video Diffusion Model for 4D Generation · NIPS 2024
LVD-2M: A Long-take Video Dataset with Temporally Dense Captions · NIPS 2024
BEACON: Benchmark for Comprehensive RNA Tasks and Language Models · NIPS 2024
HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting · CVPR 2024
Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training · CVPR 2024
Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning · CVPR 2023
Seeing is not always believing: Benchmarking Human and Model Perception of AI-Generated Images · NIPS 2023
CorresNeRF: Image Correspondence Priors for Neural Radiance Fields · NIPS 2023
OV-PARTS: Towards Open-Vocabulary Part Segmentation · NIPS 2023
T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation · NIPS 2023
GLeaD: Improving GANs With a Generator-Leading Task · CVPR 2023
RIFormer: Keep Your Vision Backbone Effective but Removing Token Mixer · CVPR 2023
Back to the Source: Diffusion-Driven Adaptation To Test-Time Corruption · CVPR 2023
Learning Transferable Spatiotemporal Representations From Natural Script Knowledge · CVPR 2023
DDP: Diffusion Model for Dense Visual Prediction · ICCV 2023
More Control for Free! Image Synthesis With Semantic Diffusion Guidance · WACV 2023
Point Transformer V2: Grouped Vector Attention and Partition-based Pooling · NIPS 2022
Bridging Video-Text Retrieval With Multiple Choice Questions · CVPR 2022
MILES: Visual BERT Pre-training with Injected Language Semantics for Video-Text Retrieval · ECCV 2022
Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions · ECCV 2020
CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval · ICCV 2019
Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis · NIPS 2019
Improving Referring Expression Grounding With Cross-Modal Attention-Guided Erasing · CVPR 2019
Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association · ECCV 2018
Show, Tell and Discriminate: Image Captioning by Self-retrieval with Partially Labeled Data · ECCV 2018
HydraPlus-Net: Attentive Deep Features for Pedestrian Analysis · ICCV 2017
Orientation Invariant Feature Embedding and Spatial Temporal Regularization for Vehicle Re-Identification · ICCV 2017
Object Detection in Videos With Tubelet Proposal Networks · CVPR 2017