GenVOG-DiT: A Transformer-Based Diffusion Model for Pose-Driven, Patient-Agnostic Nystagmus VOG Video Generation

Aimon Rahman; Kemar E. Green; Vishal M. Patel

MIDL 2026

Abstract

Nystagmus, an involuntary eye movement indicative of neurological and vestibular disorders, is traditionally diagnosed using costly equipment or expert visual inspection, both of which limit accessibility in nonspecialist settings. Recent advances in computer vision and deep learning present an opportunity to automate the detection of nystagmus from standard video recordings. However, progress is hindered by the scarcity of publicly available video datasets, a consequence of privacy concerns surrounding ocular biometric data. In this work, we propose the use of synthetically generated eye movement videos to mitigate these data limitations. Using video diffusion models, we simulate diverse, clinically plausible nystagmus patterns without relying on real patient data, enabling scalable training while preserving privacy. We show that models trained on synthetic data generalize effectively to real-world settings and have potential for integration into telehealth applications. Our approach advances the development of accessible, generalizable, and privacy-aware diagnostic tools for eye movement disorders.

Topics

Deep Learning > Models > Diffusion Models; Computer Vision > Generation > Video Generation; Healthcare & Medicine > Clinical > Medical AI

Keywords

video generation; privacy-preserving learning; synthetic data generation; diffusion model; nystagmus detection
