maum-ai/hififace

A few minor issues

ThereforeGames opened this issue · 5 comments

First, thanks for taking the time to implement this faceswapping model - seems like it has a lot of potential!

That said, I noticed a few issues while setting it up on my machine:

  • There's a typo on line 98 of hififace_inference.py: "args.args.output_image_path" has one too many "args."
  • The inference examples reference directories called "asset" and "inference_sample" while the provided code has "assets" and "inference_samples" instead.
  • More of a suggestion than an issue: the documentation on "Pre-Trained Models for ArcFace" could be improved. As I understand it, one must download the file called "ms1mv3_arcface_r100_fp16" and extract "backbone.pth" to the root of the hififace directory. Additionally, it must be renamed to "ms1mv3_arcface_r100_fp16_backbone.pth" in order for the inference script to function.
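For anyone following the ArcFace step in the last bullet, here is a minimal sketch of the copy-and-rename. It assumes the download has already been extracted to a folder containing "backbone.pth"; the function name and both directory arguments are hypothetical and not part of the hififace repository.

```python
import shutil
from pathlib import Path

# Target filename taken from the issue text above.
TARGET_NAME = "ms1mv3_arcface_r100_fp16_backbone.pth"


def install_arcface_backbone(extracted_dir, repo_root):
    """Copy backbone.pth into the repo root under the expected name.

    extracted_dir: folder that holds the extracted backbone.pth (assumption).
    repo_root:     root of the hififace checkout (assumption).
    """
    src = Path(extracted_dir) / "backbone.pth"
    if not src.is_file():
        raise FileNotFoundError(f"backbone.pth not found in {extracted_dir}")
    dst = Path(repo_root) / TARGET_NAME
    shutil.copyfile(src, dst)  # copy rather than move, so the download stays intact
    return dst
```

Calling install_arcface_backbone("ms1mv3_arcface_r100_fp16", ".") from the repository root would leave the renamed file next to hififace_inference.py.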

While I am able to produce images with hififace_inference, I must say that the resulting faces look quite strange and lack the visual fidelity shown in the examples. Is this because my images have not been properly aligned? And if so, is there a way to automate the alignment script and prepend it to the inference execution?

On a similar note, the inference seems to crash when target images have different dimensions. Is this a known limitation?

Thank you again for your hard work!

@WhiteSigility Thanks for all your work! First of all, I am very glad that someone has actually tried my code.
This will be very helpful to me and other users. Can you make a pull request for these issues? It would be your contribution. If you'd rather not, I'll just fix them myself.

About the inference results: we aligned the training dataset with the landmarks extracted from 3DDFA_v2. If you use a different landmark extractor, it will affect the inference quality.
The alignment should be performed before the inference.

On a similar note, the inference seems to crash when target images have different dimensions. Is this a known limitation?

Can you tell me more details about this error?

Sure - here's the error in detail:

Traceback (most recent call last):
  File "hififace_inference.py", line 93, in <module>
    output_img = net(source_img, target_img)
  File "T:\programs\anaconda3\envs\hififace\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "T:\programs\hififace\hififace\hififace_pl.py", line 29, in forward
    i_r, _, _, _ = self.generator(source_img, target_img)
  File "T:\programs\anaconda3\envs\hififace\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "T:\programs\hififace\hififace\model\hififace.py", line 257, in forward
    i_r, i_low, m_r, m_low = self.sff_module(i_target, z_enc, z_dec, id_vector)
  File "T:\programs\anaconda3\envs\hififace\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "T:\programs\hififace\hififace\model\hififace.py", line 171, in forward
    z_fuse = m_low * z_dec + (1 - m_low) * z_enc
RuntimeError: The size of tensor a (112) must match the size of tensor b (114) at non-singleton dimension 3

In this case, one target image was 954x954 px and the other was 579x929 px.

I should also mention that I'm evaluating your code in Anaconda, haven't installed Docker or set up for training. I'm using Python v3.7. It's certainly possible this issue is unique to my setup!
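One possible reading of the 112 vs. 114 mismatch in the traceback: if the network halves a feature map with floor rounding and later doubles it back, a width that is not divisible by the total downsampling factor does not return to its original size. This is an assumption about the architecture, not something confirmed in this thread, and the helper below is purely illustrative arithmetic.

```python
def round_trip(width, levels):
    """Halve `width` `levels` times (floor division), then double it back."""
    w = width
    for _ in range(levels):
        w //= 2  # downsampling step, rounds down on odd widths
    for _ in range(levels):
        w *= 2   # upsampling step
    return w


print(round_trip(114, 2))  # 114 -> 57 -> 28 -> 56 -> 112
print(round_trip(112, 2))  # 112 -> 56 -> 28 -> 56 -> 112
```

Under that assumption, a skip connection carrying the 114-wide tensor would no longer match the 112-wide tensor coming back up, which is the shape of error reported here.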

@WhiteSigility The input image should be square. You should pre-process it with 3DDFA_v2's landmarks, in the same way FFHQ was aligned.
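As a stopgap for the size crash only, a rough sketch of center-cropping an image to a square and resizing it with Pillow. This is not a substitute for the 3DDFA_v2 landmark alignment described above, so output quality may still be poor. The 256 px target size and both function names are assumptions, not taken from the repository.

```python
def center_square_box(width, height):
    """Return the (left, top, right, bottom) box of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def make_square(in_path, out_path, size=256):
    """Center-crop to a square and resize to size x size (size is an assumption)."""
    from PIL import Image  # imported lazily so the helper above has no dependency

    with Image.open(in_path) as img:
        box = center_square_box(*img.size)
        img.crop(box).resize((size, size)).save(out_path)


print(center_square_box(579, 929))  # (0, 175, 579, 754)
```

For the 579x929 image mentioned earlier, this keeps the middle 579x579 region; the 954x954 image would pass through the crop unchanged.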

I have the same issue as @WhiteSigility.
I deployed this model on Google Colab: I set it up and downloaded only the pretrained weights, for the single-image inference task.
The images from the example folder (assets/inference_samples/) work fine, but my custom images produce the error above.

Things I have tried:

  • Image pair with different sizes.
  • Image pair with exact same size.
  • Identical image pair.

Is there a specific image size requirement for inference?

I got a "ModuleNotFoundError: No module named 'lpips'" error when running inference.py, and even after I git cloned the lpips repo into /hififace/model, it still didn't work. Could you please tell me how to solve this problem?
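A ModuleNotFoundError like this usually means the package is not installed in the Python environment that runs the script; cloning a repository into a subfolder does not by itself make it importable. A small sketch for checking this before running inference follows. The helper name is hypothetical, and the suggested command assumes the package is installed from PyPI under the name lpips.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["lpips"])
    if missing:
        # Install into the same environment that runs the inference script.
        print("Missing packages; try: pip install " + " ".join(missing))
    else:
        print("All required modules were found.")
```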