Official implementation of 'Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics'
[arXiv] [Demo] [Project] [BibTeX]
- 2024-Sept-25 Training script released.
- 2024-Aug-17 Pre-trained checkpoints and demo released on Hugging Face. Check here for the demo and here for the code.
See the data folder.
We provide a minimum viable training script to demonstrate how to use our dataset to fine-tune Stable Video Diffusion.
You can use the following command:
```bash
accelerate launch --num_processes 1 --mixed_precision fp16 train.py --config configs/train-puppet-master.yaml
```
To reduce memory overhead, we cache all the latents and CLIP embeddings of the rendered frames.
Note that this is only a working example: our final model is trained on a combined dataset of Objaverse-Animation-HQ and Drag-a-Move.
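For reference, the caching step might look roughly like the sketch below. The base checkpoint name, output format, and `cache_frame` helper are illustrative assumptions, not the repository's actual caching code.

```python
# Hypothetical sketch: pre-compute and store the VAE latent and CLIP image
# embedding of a rendered frame so the training loop can skip these passes.
import torch
from PIL import Image
from diffusers import AutoencoderKLTemporalDecoder
from diffusers.image_processor import VaeImageProcessor
from transformers import CLIPImageProcessor, CLIPVisionModelWithProjection

device = "cuda"
base = "stabilityai/stable-video-diffusion-img2vid"  # assumed SVD base checkpoint

vae = AutoencoderKLTemporalDecoder.from_pretrained(
    base, subfolder="vae", torch_dtype=torch.float16
).to(device)
image_encoder = CLIPVisionModelWithProjection.from_pretrained(
    base, subfolder="image_encoder", torch_dtype=torch.float16
).to(device)
feature_extractor = CLIPImageProcessor.from_pretrained(base, subfolder="feature_extractor")
vae_processor = VaeImageProcessor(vae_scale_factor=8)


@torch.no_grad()
def cache_frame(frame_path: str, out_path: str) -> None:
    image = Image.open(frame_path).convert("RGB")

    # VAE latent: normalise pixels to [-1, 1], encode, apply the scaling factor.
    pixels = vae_processor.preprocess(image).to(device, torch.float16)
    latent = vae.encode(pixels).latent_dist.sample() * vae.config.scaling_factor

    # CLIP image embedding, used as the conditioning signal in SVD.
    clip_inputs = feature_extractor(images=image, return_tensors="pt")
    clip_embed = image_encoder(
        pixel_values=clip_inputs.pixel_values.to(device, torch.float16)
    ).image_embeds

    torch.save({"latent": latent.cpu(), "clip_embed": clip_embed.cpu()}, out_path)
```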
We provide an interactive demo here. Check it out!
Our evaluation utilizes an unseen test set of Drag-a-Move, consisting of 100 examples.
The whole test set is provided in the DragAMove-test-batches folder.
The test examples can be read directly from the xxxxx.pkl files and are in the same format as those loaded from DragVideoDataset, implemented in dataset.py.
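A minimal sketch of how the test batches can be inspected is shown below; the directory name follows the folder above, and the exact keys inside each pickle are whatever DragVideoDataset produces, so print them before relying on a particular structure.

```python
# Hypothetical sketch: iterate over the pre-serialised test batches and print a
# brief summary; the exact contents follow DragVideoDataset in dataset.py.
import glob
import pickle

for path in sorted(glob.glob("DragAMove-test-batches/*.pkl")):
    with open(path, "rb") as f:
        batch = pickle.load(f)
    if isinstance(batch, dict):
        # Show each key with its tensor/array shape where available.
        print(path, {k: getattr(v, "shape", type(v)) for k, v in batch.items()})
    else:
        print(path, type(batch))
```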
- Release pre-trained checkpoint & inference code.
- Release training code.
- Release Objaverse-Animation & Objaverse-Animation-HQ and the rendering script.
```bibtex
@article{li2024puppetmaster,
  title   = {Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics},
  author  = {Li, Ruining and Zheng, Chuanxia and Rupprecht, Christian and Vedaldi, Andrea},
  journal = {arXiv preprint arXiv:2408.04631},
  year    = {2024}
}
```