Video Generation

1433 directly classified papers

Papers

Hierarchical Patch Diffusion Models for High-Resolution Video Generation CVPR 2024

EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling CVPR 2024

Bipartite Graph Diffusion Model for Human Interaction Generation WACV 2024

SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset NeurIPS 2024

Compressed 3D Gaussian Splatting for Accelerated Novel View Synthesis CVPR 2024

VideoBooth: Diffusion-based Video Generation with Image Prompts CVPR 2024

InstructVideo: Instructing Video Diffusion Models with Human Feedback CVPR 2024

DiffPerformer: Iterative Learning of Consistent Latent Guidance for Diffusion-based Human Video Generation CVPR 2024

Video Interpolation with Diffusion Models CVPR 2024

TSA2: Temporal Segment Adaptation and Aggregation for Video Harmonization WACV 2024

Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation CVPR 2024

PEEKABOO: Interactive Video Generation via Masked-Diffusion CVPR 2024

FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation CVPR 2024

Bidirectional Autoregressive Diffusion Model for Dance Generation CVPR 2024

AvatarOne: Monocular 3D Human Animation WACV 2024

Vript: A Video Is Worth Thousands of Words NeurIPS 2024

BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics CVPR 2024

Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis CVPR 2024

VBench: Comprehensive Benchmark Suite for Video Generative Models CVPR 2024

PI3D: Efficient Text-to-3D Generation with Pseudo-Image Diffusion CVPR 2024

Scale-Adaptive Feature Aggregation for Efficient Space-Time Video Super-Resolution WACV 2024

Towards Real-World HDR Video Reconstruction: A Large-Scale Benchmark Dataset and A Two-Stage Alignment Network CVPR 2024

CCEdit: Creative and Controllable Video Editing via Diffusion Models CVPR 2024

Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models CVPR 2024

LLM Knows Body Language, Too: Translating Speech Voices into Human Gestures ACL 2024