Research Explorer
video prediction
140 papers
Co-occurring keywords
video generation (703)
recurrent neural network (1790)
diffusion model (3720)
representation learning (6174)
self-supervised learning (3751)
motion prediction (135)
generative model (2889)
unsupervised learning (3255)
world model (180)
convolutional neural network (4216)
Papers
Show Me: Unifying Instructional Image and Video Generation with Diffusion Models
WACV 2026
RAPTOR: Real-Time High-Resolution UAV Video Prediction with Efficient Video Attention
AAAI 2026
H-GAR: A Hierarchical Interaction Framework via Goal-Driven Observation-Action Refinement for Robotic Manipulation
AAAI 2026
Human2Robot: Learning Robot Actions from Paired Human-Robot Videos
AAAI 2026
Unified Video Action Model
RSS 2025
Diffusion-Based Imaginative Coordination for Bimanual Manipulation
ICCV 2025
MoMaps: Semantics-Aware Scene Motion Generation with Motion Maps
ICCV 2025
DFDNet: Disentangling and Filtering Dynamics for Enhanced Video Prediction
AAAI 2025
STDD: Spatio-Temporal Dual Diffusion for Video Generation
CVPR 2025
GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control
CVPR 2025
STLight: A Fully Convolutional Approach for Efficient Predictive Learning by Spatio-Temporal Joint Processing
WACV 2025
PhysGen3D: Crafting a Miniature Interactive World from a Single Image
CVPR 2025
Aether: Geometric-Aware Unified World Modeling
ICCV 2025
AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction
ICCV 2025
Top-Down Guidance for Learning Object-Centric Representations
IJCAI 2025
OCK: Unsupervised Dynamic Video Prediction with Object-Centric Kinematics
ICCV 2025
Disentangled World Models: Learning to Transfer Semantic Knowledge from Distracting Videos for Reinforcement Learning
ICCV 2025
SyncVP: Joint Diffusion for Synchronous Multi-Modal Video Prediction
CVPR 2025
MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction
CVPR 2025
Tracktention: Leveraging Point Tracking to Attend Videos Faster and Better
CVPR 2025
Learning from Streaming Video with Orthogonal Gradients
CVPR 2025
CAGE: Unsupervised Visual Composition and Animation for Controllable Video Generation
AAAI 2025
PredToken: Predicting Unknown Tokens and Beyond with Coarse-to-Fine Iterative Decoding
CVPR 2024
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability
NeurIPS 2024
iVideoGPT: Interactive VideoGPTs are Scalable World Models
NeurIPS 2024