Zhongang Qi
27 papers · 2019–2025 · 7 top CS/AI conferences
Achievements
Interdisciplinary Bridge · Renaissance Researcher (5) · Academic Marathon (6) · Conference Polyglot (7) · Taxonomy Completionist (55)
Keyword Pioneer
Hot Topic Early Bird
Dynamic Duo
(22)
The Questioner
Prolific Year
(8)
Century Club
(27)
Keyword Collector
(135)
Unstoppable
(7)
Conferences
CVPR (8)
AAAI (7)
ICCV (6)
NIPS (3)
ECCV (1)
ICML (1)
IJCAI (1)
Keywords
diffusion model
(5)
controllable generation
(3)
object detection
(3)
image synthesis
(3)
contrastive learning
(2)
image generation
(2)
spherical geometry
(2)
convolutional neural network
(2)
image-text retrieval
(2)
fine-grained understanding
(2)
transfer learning
(2)
video understanding
(2)
multimodal large language model
(2)
text-to-image generation
(2)
video captioning
(1)
mathematical reasoning
(1)
video generation
(1)
benchmark evaluation
(1)
disparity estimation
(1)
multimodal learning
(1)
Papers
VisionMath: Vision-Form Mathematical Problem-Solving
ICCV 2025
Mono2Stereo: A Benchmark and Empirical Study for Stereo Conversion
CVPR 2025
Taming Rectified Flow for Inversion and Editing
ICML 2025
CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities
AAAI 2025
Less is More: Empowering GUI Agent with Context-Aware Simplification
ICCV 2025
DOGR: Towards Versatile Visual Document Grounding and Referring
ICCV 2025
Mamba-3VL: Taming State Space Model for 3D Vision Language Learning
ICCV 2025
E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding
NIPS 2024
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
CVPR 2024
How to Make Cross Encoder a Good Teacher for Efficient Image-Text Retrieval?
CVPR 2024
T2I-Adapter: Learning Adapters to Dig Out More Controllable Ability for Text-to-Image Diffusion Models
AAAI 2024
EA-VTR: Event-Aware Video-Text Retrieval
ECCV 2024
SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model
AAAI 2024
SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation
IJCAI 2023
Exploiting Contextual Objects and Relations for 3D Visual Grounding
NIPS 2023
Tagging before Alignment: Integrating Multi-Modal Tags for Video-Text Retrieval
AAAI 2023
Accelerating the Training of Video Super-resolution Models
AAAI 2023
LayoutDiffusion: Controllable Diffusion Model for Layout-to-Image Generation
CVPR 2023
ViLEM: Visual-Language Error Modeling for Image-Text Retrieval
CVPR 2023
Order-Prompted Tag Sequence Generation for Video Tagging
ICCV 2023
MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing
ICCV 2023
BTS: A Bi-Lingual Benchmark for Text Segmentation in the Wild
CVPR 2022
Open-Book Video Captioning With Retrieve-Copy-Generate Network
CVPR 2021
Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution
NIPS 2021
Visualizing Deep Networks by Optimizing with Integrated Gradients
AAAI 2020
ScaleNet - Improve CNNs through Recursively Rescaling Objects
AAAI 2020
PointConv: Deep Convolutional Networks on 3D Point Clouds
CVPR 2019