Zicheng Liu

98 papers · 2013–2026 · 11 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (12) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🌍 Conference Polyglot (11)

🐝 Cross-Pollinator (10) 🌟 Keyword Trendsetter Combo (4) 🏠 Conference Loyalist (30) 🏆 Grand Slam 👑 Triple Crown 🤝 Dynamic Duo (43) 🔬 Deep Specialist (15) 🧬 Topic Evolution 🏆 Keyword Champion (2) 🗃️ Keyword Collector (345) ⚡ Prolific Year (21) ❓ The Questioner 💎 Century Club (95) 🔥 Unstoppable (9) 🚀 Conference Pioneer

Conferences

CVPR (30) ICLR (12) ECCV (9) ICML (9) NIPS (9) AAAI (8) ICCV (6) ACL (5) WACV (5) IJCAI (3) EMNLP (2)

Top co-authors

Lijuan Wang (43) Jianfeng Wang (21) Kevin Lin (19) Zhengyuan Yang (18) Stan Z. Li (17) Yinpeng Chen (17) Linjie Li (17) Siyuan Li (16) Chung-Ching Lin (13) Zhe Gan (13)

Keywords

diffusion model (10) image generation (9) multimodal learning (8) object detection (6) transfer learning (6) image captioning (5) zero-shot learning (5) vision-language model (5) large language model (5) generative model (5) image segmentation (4) knowledge distillation (4) video generation (4) representation learning (4) vector quantization (3) video understanding (3) efficient computing (3) model compression (3) semi-supervised learning (3) few-shot learning (3)

Papers

Reliable Use of Lemmas via Eligibility Reasoning and Section-Aware Reinforcement Learning · ACL 2026
TrinityDNA: A Bio-Inspired Foundational Model for Efficient Long-Sequence DNA Modeling · AAAI 2026
MergeDNA: Context-Aware Genome Modeling with Dynamic Tokenization Through Token Merging · AAAI 2026
Conditional Text-to-Image Generation with Reference Guidance · WACV 2026
CBGBench: Fill in the Blank of Protein-Molecule Complex Binding Graph · ICLR 2025
DaCapo: Score Distillation as Stacked Bridge for Fast and High-quality 3D Editing · CVPR 2025
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer · CVPR 2025
Taming LLMs with Gradient Grouping · ACL 2025
Self-Taught Agentic Long Context Understanding · ACL 2025
B-VLLM: A Vision Large Language Model with Balanced Spatio-Temporal Tokens · ICCV 2025
Agent Laboratory: Using LLM Agents as Research Assistants · EMNLP 2025
TTT-Bench: A Benchmark for Evaluating Reasoning Ability with Simple and Novel Tic-Tac-Toe-style Games · EMNLP 2025
TaDA: Training-free recipe for Decoding with Adaptive KV Cache Compression and Mean-centering · ACL 2025
Exploring Invariance in Images through One-way Wave Equations · ICML 2025
Masked Autoencoders Are Effective Tokenizers for Diffusion Models · ICML 2025
Tuning Timestep-Distilled Diffusion Model Using Pairwise Sample Optimization · ICLR 2025
EVA: Geometric Inverse Design for Fast Protein Motif-Scaffolding with Coupled Flow · ICLR 2025
Bridging the Gap between Database Search and De Novo Peptide Sequencing with SearchNovo · ICLR 2025
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization · CVPR 2025
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities · ICML 2024
VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling · ICML 2024
Completing Visual Objects via Bridging Generation and Segmentation · ICML 2024
Bring Metric Functions into Diffusion Models · IJCAI 2024
MM-Narrator: Narrating Long-form Videos with Multimodal In-Context Learning · CVPR 2024
Training Diffusion Models Towards Diverse Image Generation with Reinforcement Learning · CVPR 2024
Segment and Caption Anything · CVPR 2024
DisCo: Disentangled Control for Realistic Human Dance Generation · CVPR 2024
Idea2Img: Iterative Self-Refinement with GPT-4V for Automatic Image Design and Generation · ECCV 2024
LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory · IJCAI 2024
IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation · ECCV 2024
PSC-CPI: Multi-Scale Protein Sequence-Structure Contrasting for Efficient and Generalizable Compound-Protein Interaction Prediction · AAAI 2024
ORES: Open-Vocabulary Responsible Visual Synthesis · AAAI 2024
PPFLOW: Target-Aware Peptide Design with Torsional Flow Matching · ICML 2024
MPT: Mesh Pre-Training With Transformers for Human Pose and Mesh Reconstruction · WACV 2024
Short-Long Convolutions Help Hardware-Efficient Linear Attention to Focus on Long Sequences · ICML 2024
StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis · ICML 2024
MogaNet: Multi-order Gated Aggregation Network · ICLR 2024
SemiReward: A General Reward Model for Semi-supervised Learning · ICLR 2024
RDesign: Hierarchical Data-efficient Representation Learning for Tertiary Structure-based RNA Design · ICLR 2024
Taming Diffusion Prior for Image Super-Resolution with Domain Shift SDEs · NIPS 2024
GRiT: A Generative Region-to-text Transformer for Object Understanding · ECCV 2024
Binary Latent Diffusion · CVPR 2023
Learning 3D Photography Videos via Self-supervised Diffusion on Single Images · IJCAI 2023
NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation · ACL 2023
TransMOT: Spatial-Temporal Graph Transformer for Multiple Object Tracking · WACV 2023
MMPTRACK: Large-Scale Densely Annotated Multi-Camera Multiple People Tracking Benchmark · WACV 2023
Deep Frequency Filtering for Domain Generalization · CVPR 2023
Adaptive Human Matting for Dynamic Videos · CVPR 2023
An Empirical Study of End-to-End Video-Language Transformers With Masked Visual Modeling · CVPR 2023
ReCo: Region-Controlled Text-to-Image Generation · CVPR 2023
LAVENDER: Unifying Video-Language Understanding As Masked Language Modeling · CVPR 2023
Neural Voting Field for Camera-Space 3D Hand Pose Estimation · CVPR 2023
PaintSeg: Painting Pixels for Training-free Segmentation · NIPS 2023
Harnessing Hard Mixed Samples with Decoupled Regularizer · NIPS 2023
OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning · NIPS 2023
Equivariant Similarity for Vision-Language Foundation Models · ICCV 2023
Layer Grafted Pre-training: Bridging Contrastive Learning And Masked Image Modeling For Label-Efficient Representations · ICLR 2023
Energy-Inspired Self-Supervised Pretraining for Vision Models · ICLR 2023
Injecting Semantic Concepts Into End-to-End Image Captioning · CVPR 2022
An Empirical Study of Training End-to-End Vision-and-Language Transformers · CVPR 2022
Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone · NIPS 2022
Towards Reasonable Budget Allocation in Untargeted Graph Structure Attacks via Gradient Debias · NIPS 2022
NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis · NIPS 2022
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA · AAAI 2022
OVIS: Open-Vocabulary Visual Instance Search via Visual-Semantic Aligned Representation Learning · AAAI 2022
SwinBERT: End-to-End Transformers With Sparse Attention for Video Captioning · CVPR 2022
Cross-Modal Representation Learning for Zero-Shot Action Recognition · CVPR 2022
ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models · NIPS 2022
Unsupervised Learning of Full-Waveform Inversion: Connecting CNN and Partial Differential Equation in a Loop · ICLR 2022
Playing Lottery Tickets with Vision and Language · AAAI 2022
An Intriguing Property of Geophysics Inversion · ICML 2022
Lifelong Unsupervised Domain Adaptive Person Re-Identification With Coordinated Anti-Forgetting and Adaptation · CVPR 2022
Mobile-Former: Bridging MobileNet and Transformer · CVPR 2022
AutoMix: Unveiling the Power of Mixup for Stronger Classifiers · ECCV 2022
Should All Proposals Be Treated Equally in Object Detection? · ECCV 2022
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling · ECCV 2022
A Simple Approach and Benchmark for 21,000-Category Object Detection · ECCV 2022
Generalized Clustering and Multi-Manifold Learning With Geometric Structure Preservation · WACV 2022
Scaling Up Vision-Language Pre-Training for Image Captioning · CVPR 2022
End-to-End Semi-Supervised Object Detection With Soft Teacher · ICCV 2021
VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning · AAAI 2021
Probabilistic Model Distillation for Semantic Correspondence · CVPR 2021
End-to-End Human Pose and Mesh Reconstruction with Transformers · CVPR 2021
Compressing Visual-Linguistic Model via Knowledge Distillation · ICCV 2021
Stronger NAS with Weaker Predictors · NIPS 2021
Mesh Graphormer · ICCV 2021
MicroNet: Improving Image Recognition With Extremely Low FLOPs · ICCV 2021
Revisiting Dynamic Convolution via Matrix Decomposition · ICLR 2021
SEED: Self-supervised Distillation For Visual Representation · ICLR 2021
Rethinking Classification and Localization for Object Detection · CVPR 2020
Dynamic ReLU · ECCV 2020
Dynamic Convolution: Attention Over Convolution Kernels · CVPR 2020
Large Scale Incremental Learning · CVPR 2019
Reinforced Temporal Attention and Split-Rate Transfer for Depth-Based Person Re-Identification · ECCV 2018
Tensor-Based Human Body Modeling · CVPR 2013
Semi-supervised Node Splitting for Random Forest Construction · CVPR 2013
Probabilistic Graphlet Cut: Exploiting Spatial Structure Cue for Weakly Supervised Image Segmentation · CVPR 2013
HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences · CVPR 2013