Could you share some information about how much training data you used and the cost of training? Thanks.
Hi, thanks for your interest!
The model is trained on the HD subset of the OpenVid dataset, which contains 0.4M high-quality videos. The training is performed on 16 NVIDIA A100 GPUs, with a total batch size of 16. Framer is trained in two stages. Specifically, the UNet is first trained for 100k iterations, then the controlling branch is trained for 10k iterations.
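To summarize the setup described above, here is a minimal sketch of the training configuration as a Python dict. The field names and structure are illustrative only and are not taken from the Framer codebase; the values come from this reply.

```python
# Hypothetical summary of the reported training setup.
# Names are illustrative; values are from the maintainer's reply above.
training_config = {
    "dataset": "OpenVid-HD",        # HD subset, ~0.4M high-quality videos
    "num_gpus": 16,                 # NVIDIA A100
    "total_batch_size": 16,         # i.e. batch size 1 per GPU
    "stages": [
        {"module": "unet", "iterations": 100_000},            # stage 1
        {"module": "controlling_branch", "iterations": 10_000} # stage 2
    ],
}
```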
Thanks for your reply. What a great job your team did!