Xintao Wang
76 papers · 2018–2026 · 11 top CS/AI conferences
Achievements
Academic Marathon (7)
Conference Polyglot (11)
Keyword Pioneer
Interdisciplinary Bridge
Cross-Pollinator (11)
Renaissance Researcher (6)
Taxonomy Completionist (88)
Conference Loyalist (22)
Topic Evolution
Dynamic Duo (37)
Keyword Champion (7)
Triple Crown
Grand Slam
Deep Specialist (25)
Prolific Year (12)
Keyword Collector (279)
Unstoppable (5)
The Questioner
Century Club (74)
Conferences
CVPR (22)
AAAI (10)
NIPS (8)
ECCV (7)
ICCV (7)
ACL (6)
ICLR (6)
EMNLP (5)
ICML (3)
IJCAI (1)
NAACL (1)
Keywords
diffusion model (21)
video generation (12)
image restoration (8)
large language model (8)
video super-resolution (7)
role-playing agent (6)
text-to-image generation (6)
image super-resolution (6)
text-to-video generation (5)
image generation (5)
generative adversarial network (5)
image editing (4)
convolutional neural network (3)
video editing (3)
latent space (3)
attention mechanism (3)
zero-shot learning (3)
language model (3)
image synthesis (3)
motion control (3)
Papers
Can LLMs Learn to Map the World from Local Descriptions?
ACL 2026
HumanLLM: Benchmarking and Improving LLM Anthropomorphism via Human Cognitive Patterns
ACL 2026
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
ICCV 2025
Guess What I am Thinking: A Benchmark for Inner Thought Reasoning of Role-Playing Language Agents
EMNLP 2025
Character is Destiny: Can Persona-assigned Language Models Make Personal Choices?
EMNLP 2025
Curse of Knowledge: Your Guidance and Provided Knowledge are biasing LLM Judges in Complex Evaluation
EMNLP 2025
PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise Video Super-Resolution
CVPR 2025
StyleMaster: Stylize Your Video with Artistic Generation and Translation
CVPR 2025
SketchVideo: Sketch-based Video Generation and Editing
CVPR 2025
SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints
ICLR 2025
FullDiT: Video Generative Foundation Models with Multimodal Control via Full Attention
ICCV 2025
GameFactory: Creating New Games with Generative Interactive Videos
ICCV 2025
CoSER: Coordinating LLM-Based Persona Simulation of Established Roles
ICML 2025
Image Conductor: Precision Control for Interactive Video Synthesis
AAAI 2025
CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities
AAAI 2025
Anti-Diffusion: Preventing Abuse of Modifications of Diffusion-Based Models
AAAI 2025
Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation
NAACL 2025
BOOKWORLD: From Novels to Interactive Agent Societies for Story Creation
ACL 2025
3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation
ICLR 2025
Teaching Large Language Models to Express Knowledge Boundary from Their Own Signals
ACL 2025
FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling
ICLR 2024
ReVideo: Remake a Video with Motion and Content Control
NIPS 2024
VideoTetris: Towards Compositional Text-to-Video Generation
NIPS 2024
MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions
NIPS 2024
Follow Your Pose: Pose-Guided Text-to-Video Generation Using Pose-Free Videos
AAAI 2024
T2I-Adapter: Learning Adapters to Dig Out More Controllable Ability for Text-to-Image Diffusion Models
AAAI 2024
SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model
AAAI 2024
InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews
ACL 2024
Light Up the Shadows: Enhance Long-Tailed Entity Grounding with Concept-Guided Vision-Language Models
ACL 2024
Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners
CVPR 2024
Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis
CVPR 2024
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
CVPR 2024
SmartEdit: Exploring Complex Instruction-based Image Editing with Multimodal Large Language Models
CVPR 2024
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
CVPR 2024
X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model
CVPR 2024
DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing
CVPR 2024
EvalCrafter: Benchmarking and Evaluating Large Video Generation Models
CVPR 2024
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
CVPR 2024
MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model
ECCV 2024
BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion
ECCV 2024
DreamDiffusion: High-Quality EEG-to-Image Generation with Temporal Masked Signal Modeling and CLIP Alignment
ECCV 2024
Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation
ECCV 2024
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
ECCV 2024
Evaluating Character Understanding of Large Language Models via Character Profiling from Fictional Works
EMNLP 2024
Capturing Minds, Not Just Words: Enhancing Role-Playing Language Models with Personality-Indicative Data
EMNLP 2024
ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models
ICLR 2024
Making LLaMA SEE and Draw with SEED Tokenizer
ICLR 2024
DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models
ICLR 2024
Unifying Image Processing as Visual Prompting Question Answering
ICML 2024
Accelerating the Training of Video Super-resolution Models
AAAI 2023
DeSRA: Detect and Delete the Artifacts of GAN-based Real-World Super-Resolution Models
ICML 2023
Inserting Anybody in Diffusion Models via Celeb Basis
NIPS 2023
OSRT: Omnidirectional Image Super-Resolution With Distortion-Aware Transformer
CVPR 2023
Activating More Pixels in Image Super-Resolution Transformer
CVPR 2023
Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models
CVPR 2023
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
ICCV 2023
MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing
ICCV 2023
FateZero: Fusing Attentions for Zero-shot Text-based Video Editing
ICCV 2023
MAPS-KB: A Million-Scale Probabilistic Simile Knowledge Base
AAAI 2023
Mitigating Artifacts in Real-World Video Super-resolution Models
AAAI 2023
Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
NIPS 2023
AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos
NIPS 2022
Language Models as Knowledge Embeddings
IJCAI 2022
Metric Learning Based Interactive Modulation for Real-World Super-Resolution
ECCV 2022
Rethinking Alignment in Video Super-Resolution Transformers
NIPS 2022
VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder
ECCV 2022
Towards Real-World Blind Face Restoration With Generative Facial Prior
CVPR 2021
Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution
NIPS 2021
Positional Encoding As Spatial Inductive Bias in GANs
CVPR 2021
BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond
CVPR 2021
Robust Reference-Based Super-Resolution via C2-Matching
CVPR 2021
Towards Vivid and Diverse Image Colorization With Generative Color Prior
ICCV 2021
Understanding Deformable Alignment in Video Super-Resolution
AAAI 2021
GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution
CVPR 2021
Deep Network Interpolation for Continuous Imagery Effect Transition
CVPR 2019
Recovering Realistic Texture in Image Super-Resolution by Deep Spatial Feature Transform
CVPR 2018