Method and Device for Generating Synthetic Video Data from a Text Prompt

Yumeng Li, Anna Khoreva, Dan Zhang

Sep 25, 2025

Abstract
A method for generating synthetic video data from a text prompt, particularly for providing video data for training and/or testing and/or verifying and/or validating a machine learning model. The method includes: providing an input text prompt descriptive of the content of the video data to be generated; decomposing the provided text prompt into at least two text sub-prompts by a large language model; generating a text embedding for each of the at least two text sub-prompts; and generating synthetic video data by a Video Diffusion Model based on the generated text embeddings.
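The control flow of the four steps in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the function names, the one-sub-prompt-per-line reply format, and the instruction wording are all hypothetical, and the large language model, text encoder, and Video Diffusion Model are passed in as plain callables so that any real component could be substituted. The toy stand-ins at the bottom exist only to make the sketch runnable.

```python
import hashlib
from typing import Any, Callable, List, Sequence

Embedding = List[float]


def decompose_prompt(prompt: str, llm: Callable[[str], str]) -> List[str]:
    """Ask an LLM to split a prompt into sub-prompts.

    Assumption: the LLM replies with one sub-prompt per line.
    """
    instruction = (
        "Split the following video description into at least two "
        "sub-prompts, one per line:\n" + prompt
    )
    reply = llm(instruction)
    sub_prompts = [line.strip() for line in reply.splitlines() if line.strip()]
    if len(sub_prompts) < 2:
        raise ValueError("expected at least two sub-prompts")
    return sub_prompts


def text_to_video(
    prompt: str,
    llm: Callable[[str], str],
    encoder: Callable[[str], Embedding],
    video_model: Callable[[Sequence[Embedding]], Any],
) -> Any:
    """Run the pipeline: decompose, embed each sub-prompt, generate."""
    sub_prompts = decompose_prompt(prompt, llm)
    embeddings = [encoder(p) for p in sub_prompts]
    # How the embeddings condition the diffusion model is not specified
    # in the abstract; here they are simply handed over as a list.
    return video_model(embeddings)


# --- Toy stand-ins (hypothetical, for illustration only) ---------------

def toy_llm(instruction: str) -> str:
    """Fake LLM that always returns two fixed sub-prompts."""
    return "a car approaches a crossing\nthe car stops for a pedestrian"


def toy_encoder(text: str, dim: int = 8) -> Embedding:
    """Fake text encoder: deterministic pseudo-embedding from a hash."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


def toy_video_model(embeddings: Sequence[Embedding]) -> dict:
    """Fake video model: reports what it was conditioned on."""
    return {"num_conditions": len(embeddings), "dim": len(embeddings[0])}


if __name__ == "__main__":
    result = text_to_video(
        "A car approaches a crossing and stops for a pedestrian.",
        toy_llm,
        toy_encoder,
        toy_video_model,
    )
    print(result)
```

Passing the three models in as callables keeps the sketch independent of any particular library; in practice each stand-in would be replaced by a real model wrapper.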

Patent

Yumeng Li

Applied Scientist