
animate-anyone-reproduction

reproduction of AnimateAnyone using SVD

To-do list

  • pipeline based on SVD
  • train V0.9, which can only generate 14 frames per reference image
  • train an animate-anyone-like pipeline V1 that can generate an arbitrary number of frames per reference image (one possible windowing scheme is sketched after this list)
  • enhance face quality and temporal consistency (tricks found by analyzing Animate Anyone app cases)
  • release V1 inference code and model
  • train an SVD + cross-attention based animate-anyone V2
  • release V1.1
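
This repo does not document how V1 extends SVD's fixed 14-frame window to an arbitrary number of frames. Below is a minimal sketch of one common approach, assuming overlapping windows whose first frames are conditioned on previously generated frames; `generate_clip`, `WINDOW`, and `OVERLAP` are hypothetical names, not this repo's API.

```python
from typing import Callable, List

WINDOW = 14   # frames SVD generates per call (per the V0.9 note above)
OVERLAP = 4   # hypothetical number of frames reused as temporal context

def generate_long_video(
    pose_frames: List[object],
    generate_clip: Callable[[List[object], List[object]], List[object]],
) -> List[object]:
    """Generate one output frame per pose frame using overlapping 14-frame windows."""
    frames: List[object] = []
    start = 0
    while start < len(pose_frames):
        window_poses = pose_frames[start:start + WINDOW]
        context = frames[-OVERLAP:]  # last generated frames, reused as context
        clip = generate_clip(window_poses, context)
        # The first OVERLAP frames of each later window duplicate frames
        # already emitted, so drop them.
        frames.extend(clip if not frames else clip[OVERLAP:])
        start += WINDOW - OVERLAP
    return frames[:len(pose_frames)]

# Toy usage: an identity "pipeline" that echoes the pose frames.
video = generate_long_video(list(range(30)), lambda poses, ctx: list(poses))
assert len(video) == 30
```

The overlap trades some throughput for smoother transitions between windows.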

2024-04-18 update

  • We found that the original SVD has problems with long-range consistency, so we borrow the structure of ReferenceNet to keep long-range consistency while still using the powerful SVD pretrained weights (see the sketch below).
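
A minimal sketch of that ReferenceNet idea, assuming the design from the AnimateAnyone paper: features from a reference UNet are concatenated into the keys and values of the denoising UNet's self-attention, so every generated frame attends to the same reference features. The module below is illustrative, not this repo's actual code.

```python
import torch
import torch.nn as nn

class ReferenceConditionedAttention(nn.Module):
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, hidden: torch.Tensor, ref_hidden: torch.Tensor) -> torch.Tensor:
        # hidden:     (batch, seq, dim) tokens from the denoising SVD UNet
        # ref_hidden: (batch, ref_seq, dim) tokens from the reference UNet
        # Self-attention whose keys/values also see the reference tokens,
        # letting every generated frame attend to the same reference features.
        kv = torch.cat([hidden, ref_hidden], dim=1)
        out, _ = self.attn(hidden, kv, kv)
        return hidden + out

# Toy usage with random tensors (dimensions are illustrative).
layer = ReferenceConditionedAttention(dim=320)
x = torch.randn(2, 64, 320)
ref = torch.randn(2, 64, 320)
print(layer(x, ref).shape)  # torch.Size([2, 64, 320])
```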

SVD + cross-attention cases

4.18.1.mp4
4.18.mp4

2024-02-25 update

  • The V1 checkpoint can be downloaded now.
  • We cannot release V1.1, which is the latest version, but we will release V1.1 once we have V1.2; the released version will always be one version behind the latest.
  • We also provide a test case to reproduce the V1 result, shown below.
  • The original results have poor quality on human faces, so we use SimSwap to enhance them. More details can be found in the issues.
  • You should first download the SVD model, then replace its original UNet with the UNet we provide (a hedged sketch follows this list).
  • We find that the model generalizes to some degree in appearance and temporal consistency, but lacks the ability to generalize to unseen poses, so V1 performs better on UBC-style poses.
  • We only added 300 high-quality videos to achieve the V1.1 results; you can fine-tune on your own dataset.
  • We do not have any plans to release the training script, but svd-temporal-controlnet may work.
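
A minimal sketch of the UNet-replacement step, assuming the released checkpoint loads as a diffusers `UNetSpatioTemporalConditionModel` (the actual inference code is based on svd-temporal-controlnet, so treat this exact loading path as an assumption); the local directory name is a placeholder.

```python
import torch
from diffusers import StableVideoDiffusionPipeline, UNetSpatioTemporalConditionModel

# Load the UNet weights released with this repo (local path is a placeholder).
unet = UNetSpatioTemporalConditionModel.from_pretrained(
    "./animate-anyone-v1-unet", torch_dtype=torch.float16
)

# Load the official SVD pipeline, swapping in our UNet for the original one.
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid",
    unet=unet,
    torch_dtype=torch.float16,
)
pipe.to("cuda")
```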

2024-02-05 update

  • Because of the issues raised, we decided to release the inference code early; it is not well organized, but it works.
  • For face post-processing, you can use any video face-swap framework (a minimal sketch follows this list). More details can be found in the issues.
  • Our inference code is mainly based on svd-temporal-controlnet; you can also use its training code to train your own model.
  • Our dataset is only UBC, but the model can generalize to other simple domains. We will continue collecting high-quality video data.
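
A minimal sketch of that face post-processing: split the generated video into frames with ffmpeg, run any per-frame face swap (this repo used SimSwap), and re-encode. `swap_face` is a placeholder stub, and the 8 fps frame rate is an assumption.

```python
import subprocess
from pathlib import Path

def swap_face(src_face: str, target_frame: str) -> None:
    """Placeholder: call your face-swap framework (e.g. SimSwap) on one frame."""
    pass  # hypothetical hook, not part of this repo

def postprocess_faces(video_in: str, ref_face: str, video_out: str) -> None:
    frames = Path("frames")
    frames.mkdir(exist_ok=True)
    # 1. Decompose the generated video into PNG frames.
    subprocess.run(["ffmpeg", "-y", "-i", video_in, str(frames / "%05d.png")], check=True)
    # 2. Swap the face in every frame.
    for frame in sorted(frames.glob("*.png")):
        swap_face(src_face=ref_face, target_frame=str(frame))
    # 3. Re-encode the frames into a video (8 fps is an assumption).
    subprocess.run(
        ["ffmpeg", "-y", "-framerate", "8", "-i", str(frames / "%05d.png"),
         "-c:v", "libx264", "-pix_fmt", "yuv420p", video_out],
        check=True,
    )
```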

2024-01-25 update

  • From analyzing Animate Anyone app cases, we find there may be some tricks involved beyond model training, so we will update the cases with better face quality achieved without additional training.
  • The face-enhancement result is shown below in the V1 section.

V1.1 animate-anyone reference-image case

2.19.mp4

V1

cross-domain case

test_12.4.mp4

with face enhancement

474d4434-cf9f-40a1-a63a-8474d38bbb09.mp4

original result

test-_7_.mp4
test.9.mp4

V0.9

test-_4_.mp4
test-_2_.mp4