Sergey Tulyakov
88 papers · 2015–2025 · 11 top CS/AI conferences
Achievements
Academic Marathon (10)
Conference Polyglot (11)
Keyword Pioneer
Interdisciplinary Bridge
Cross-Pollinator (5)
Renaissance Researcher (8)
Taxonomy Completionist (83)
Conference Loyalist (39)
Dynamic Duo (39)
Triple Crown
Keyword Champion (20)
Grand Slam
Deep Specialist (25)
Century Club (88)
Prolific Year (18)
Keyword Collector (352)
Unstoppable (8)
Trend Setter
The Questioner
Conferences
CVPR (39)
ICLR (11)
NIPS (11)
ICCV (10)
ECCV (8)
ICML (3)
WACV (2)
AAAI (1)
ACL (1)
EMNLP (1)
NAACL (1)
Research topics
Keywords
video generation (20)
diffusion model (18)
image generation (9)
novel view synthesis (8)
multimodal learning (7)
3d reconstruction (6)
video diffusion (5)
generative model (5)
neural rendering (5)
knowledge distillation (5)
model compression (4)
generative adversarial network (4)
diffusion transformer (4)
text-to-image generation (4)
video synthesis (4)
unsupervised learning (4)
neural radiance field (4)
volumetric rendering (4)
text-to-video generation (3)
3d generation (3)
Papers
SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training
CVPR 2025
Can Text-to-Video Generation help Video-Language Alignment?
CVPR 2025
4Real-Video: Learning Generalizable Photo-Realistic 4D Video Diffusion
CVPR 2025
Wonderland: Navigating 3D Scenes from a Single Image
CVPR 2025
Multi-subject Open-set Personalization in Video Generation
CVPR 2025
AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers
CVPR 2025
SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device
CVPR 2025
Omni-ID: Holistic Identity Representation Designed for Generative Tasks
CVPR 2025
Mind the Time: Temporally-Controlled Multi-Event Video Generation
CVPR 2025
Video Motion Transfer with Diffusion Transformers
CVPR 2025
DELTA: Dense Efficient Long-Range 3D Tracking for Any Video
ICLR 2025
GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement
ICLR 2025
Lightweight Predictive 3D Gaussian Splats
ICLR 2025
Scalable Ranked Preference Optimization for Text-to-Image Generation
ICCV 2025
T2Bs: Text-to-Character Blendshapes via Video Generation
ICCV 2025
AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation
ICCV 2025
MaskControl: Spatio-Temporal Control for Masked Motion Synthesis
ICCV 2025
Improving the Diffusability of Autoencoders
ICML 2025
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models
ICML 2025
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
ICLR 2025
TextCraftor: Your Text Encoder Can be Image Quality Controller
CVPR 2024
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
CVPR 2024
VIMI: Grounding Video Generation through Multi-modal Instruction
EMNLP 2024
Efficient Training with Denoised Neural Weights
ECCV 2024
UpFusion: Novel View Diffusion from Unposed Sparse View Observations
ECCV 2024
SceneTex: High-Quality Texture Synthesis for Indoor Scenes via Diffusion Priors
CVPR 2024
Towards Text-guided 3D Scene Composition
CVPR 2024
E$^2$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation
ICML 2024
4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models
NIPS 2024
AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation
NIPS 2024
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
NIPS 2024
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
CVPR 2024
TC4D: Trajectory-Conditioned Text-to-4D Generation
ECCV 2024
MyVLM: Personalizing VLMs for User-Specific Queries
ECCV 2024
SF-V: Single Forward Video Generation Model
NIPS 2024
Hierarchical Patch Diffusion Models for High-Resolution Video Generation
CVPR 2024
Evaluating Very Long-Term Conversational Memory of LLM Agents
ACL 2024
HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion
ICLR 2024
Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors
ICLR 2024
SPAD: Spatially Aware Multi-View Diffusers
CVPR 2024
4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling
CVPR 2024
Control-NeRF: Editable Feature Volumes for Scene Rendering and Manipulation
WACV 2023
SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds
NIPS 2023
LightSpeed: Light and Fast Neural Light Fields on Mobile Devices
NIPS 2023
Autodecoding Latent 3D Diffusion Models
NIPS 2023
DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-Aware Scene Synthesis
CVPR 2023
Make-a-Story: Visual Memory Conditioned Consistent Story Generation
CVPR 2023
SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation
CVPR 2023
Invertible Neural Skinning
CVPR 2023
Affection: Learning Affective Explanations for Real-World Visual Data
CVPR 2023
Real-Time Neural Light Field on Mobile Devices
CVPR 2023
3DAvatarGAN: Bridging Domains for Personalized Editable Avatars
CVPR 2023
Unsupervised Volumetric Animation
CVPR 2023
ShapeTalk: A Language Dataset and Framework for 3D Shape Edits and Deformations
CVPR 2023
Rethinking Vision Transformers for MobileNet Size and Speed
ICCV 2023
Text2Tex: Text-driven Texture Synthesis via Diffusion Models
ICCV 2023
InfiniCity: Infinite-Scale City Synthesis
ICCV 2023
Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation
ICLR 2023
3D generation on ImageNet
ICLR 2023
StyleGAN-V: A Continuous Video Generator With the Price, Image Quality and Perks of StyleGAN2
CVPR 2022
InOut: Diverse Image Outpainting via GAN Inversion
CVPR 2022
Layer Freezing & Data Sieving: Missing Pieces of a Generic Framework for Sparse Training
NIPS 2022
EfficientFormer: Vision Transformers at MobileNet Speed
NIPS 2022
Playable Environments: Video Manipulation in Space and Time
CVPR 2022
InfinityGAN: Towards Infinite-Pixel Image Synthesis
ICLR 2022
F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
ICLR 2022
EpiGRAF: Rethinking training of 3D GANs
NIPS 2022
R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis
ECCV 2022
Quantized GAN for Complex Music Generation from Dance Videos
ECCV 2022
Cross-Modal 3D Shape Generation and Manipulation
ECCV 2022
Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning
CVPR 2022
A Good Image Generator Is What You Need for High-Resolution Video Synthesis
ICLR 2021
SMIL: Multimodal Learning with Severely Missing Modality
AAAI 2021
Task-Assisted Domain Adaptation With Anchor Tasks
WACV 2021
Flow Guided Transformable Bottleneck Networks for Motion Retargeting
CVPR 2021
Teachers Do More Than Teach: Compressing Image-to-Image Models
CVPR 2021
Playable Video Generation
CVPR 2021
Motion Representations for Articulated Animation
CVPR 2021
Neural Hair Rendering
ECCV 2020
Transformable Bottleneck Networks
ICCV 2019
Laplace Landmark Localization
ICCV 2019
Animating Arbitrary Objects via Deep Motion Transfer
CVPR 2019
3D Guided Fine-Grained Face Manipulation
CVPR 2019
Train One Get One Free: Partially Supervised Neural Network for Bug Report Duplicate Detection and Clustering
NAACL 2019
First Order Motion Model for Image Animation
NIPS 2019
MoCoGAN: Decomposing Motion and Content for Video Generation
CVPR 2018
Self-Adaptive Matrix Completion for Heart Rate Estimation From Face Videos Under Realistic Conditions
CVPR 2016
Regressing a 3D Face Shape From a Single Image
ICCV 2015