junjiehe96/UniPortrait

About Face Recognition Network Choice

Closed this issue · 16 comments

Hi, thanks for the great work. Why did you use CurricularFace instead of ArcFace?

Hi, thanks for your question. We chose CurricularFace over ArcFace because most of the available ArcFace models come from insightface and are licensed for non-commercial research purposes only. The CurricularFace model we used was trained by our own team and is completely open source. You can find more information about our face recognition model here: https://modelscope.cn/models/iic/cv_ir101_facerecognition_cfglint (97.47@IJB-C(1E-4)). Additionally, our code will be fully open sourced in the future. Thank you for your interest in our work!

@junjiehe96 Oh, I see. Why did you not use AdaFace or a similar model? What I am really asking is whether using different face recognition models (with almost the same accuracy) has any effect on image generation quality. The loss function and network architecture of course determine the space of identities; do the statistics of this space affect image generation, and if so, how? If you have any insights on that, I would really like to hear them.

@Oguzhanercan You are right, these factors may have some impact, but we have not done more detailed experiments yet. In fact, we initially used ArcFace, and the results generated with ArcFace and CurricularFace are indeed different, but on both quantitative and qualitative metrics there is no significant gap between the two when compared with the actual reference images. I hope this answer is helpful to you.

Yes, that is helpful. Thanks a lot. Is there any example produced with ArcFace that you can share? If not, what I am wondering now is this: InstantID outputs are not really realistic, while UniPortrait performs well in my experiments (although when I use LCM LoRA with InstantID, the outputs become realistic, probably a rectified-flow effect). Have you seen anything like this in your experiments?

@Oguzhanercan It seems not. And I don't think this is due to differences in the face recognition models. The training data and the UNet architecture/checkpoint used by InstantID (SDXL vs. SD1.5) are all different from ours.

Thanks for answering

@junjiehe96 Hi, as we discussed, I am curious about the effect of the FR model on image generation quality, and I am working on it right now. Is it possible to share the CurricularFace parameters with me? Right now, in my test case, ArcFace (the WebFace-trained version) beats the others. (The difference comes from the models and the datasets they were trained on.) I can also see that UniPortrait's output is really good, so I want to see the effect of CurricularFace.

Thanks @junjiehe96. In my experiments, a face recognition model trained on WebFace42M preserves identity better than one trained on Glint360K. Since the code release is still two months away, if you have the compute, maybe you could train CurricularFace on WebFace and fine-tune the LDM with the new face recognition model. This could serve as a reference for how the data scale of the face recognition model affects facial image generation. If help is needed, I am open to contributing.

Thx for sharing your experiment results, @Oguzhanercan . These findings are very helpful for us. Could you provide some more detailed comparison results? Also, thank you for offering to contribute. We welcome anyone to further improve UniPortrait.

Of course. As an improvement to InstantID, I trained it with an ArcFace model trained on WebFace42M. Face similarity on my test set (5,000 images) increased by about 4%. The default model was also ArcFace, but trained on Glint360K.
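
For anyone who wants to reproduce this kind of comparison: the thread does not state exactly how face similarity was measured, but a common choice is the cosine similarity between FR embeddings of the reference image and the generated image, averaged over the test set. A minimal sketch under that assumption follows. The function name `mean_face_similarity` and the random stand-in data are hypothetical and not part of InstantID or UniPortrait; in practice the embeddings would come from whichever FR model is being evaluated.

```python
import numpy as np


def mean_face_similarity(ref_emb: np.ndarray, gen_emb: np.ndarray) -> float:
    """Mean cosine similarity between paired embeddings.

    Both arrays have shape (N, D); row i of each belongs to the same identity.
    """
    # L2-normalize each row so the dot product equals the cosine similarity.
    ref = ref_emb / np.linalg.norm(ref_emb, axis=1, keepdims=True)
    gen = gen_emb / np.linalg.norm(gen_emb, axis=1, keepdims=True)
    return float(np.mean(np.sum(ref * gen, axis=1)))


# Hypothetical stand-in data: 8 identities with 512-dimensional embeddings.
# Real embeddings would be extracted from reference and generated images.
rng = np.random.default_rng(0)
ref = rng.normal(size=(8, 512))
gen = ref + 0.1 * rng.normal(size=(8, 512))  # slightly perturbed copies

score = mean_face_similarity(ref, gen)
print(f"mean face similarity: {score:.4f}")
```

When comparing two conditioning FR models this way, it may be worth scoring both sets of generations with the same independent FR model, since each model's own embedding space can favor images generated from it.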

We will definitely consider your suggestion and fine-tune the LDM using the new face recognition model.

If you have any questions about it, you can contact me at oguzhanercancs@gmail.com. If you get a chance to train CurricularFace on WebFace42M, I would be very interested in the training logs. One last thing you could try: as far as I know, there is no work on the effect of ViT-based FR models for controlling diffusion models. The ArcFace logs for the ViT backbone are promising. Maybe a CurricularFace with a ViT backbone.

Good luck with the new experiments.

Thanks for your insightful findings. Could you share the weights of ArcFace trained on WebFace42M? Thanks.

@junjiehe96 hi, do you have any new experiments?

Hi, we are currently working on extending our approach to more advanced diffusion models and more general subjects. We have released the inference code and will not be releasing more results in the near future. Perhaps you can try it on your own.