mk-minchul/AdaFace

Can't run inference.py successfully. The error: IndexError: too many indices for array: array is 0-dimensional, but 3 were indexed

nicken opened this issue · 5 comments

I tried to run the demo in inference.py right after cloning the code,
but I get this error:

Traceback (most recent call last):
  File "inference.py", line 41, in <module>
    input = to_input(aligned_rgb_img)
  File "inference.py", line 24, in to_input
    brg_img = ((np_img[:,:,::-1] / 255.) - 0.5) / 0.5
IndexError: too many indices for array: array is 0-dimensional, but 3 were indexed

I found the error happens because face detection returns no face, and I don't know why no face is found. I only ran the code in a fresh environment and didn't change anything in the code.

I think the environment may be different; this is mine:

Package                 Version
----------------------- -----------
absl-py                 1.1.0
aiohttp                 3.8.1
aiosignal               1.2.0
async-timeout           4.0.2
asynctest               0.13.0
attrs                   21.4.0
bcolz                   1.2.1
cachetools              5.2.0
certifi                 2022.6.15
charset-normalizer      2.0.12
cycler                  0.11.0
fonttools               4.33.3
frozenlist              1.3.0
fsspec                  2022.5.0
future                  0.18.2
google-auth             2.8.0
google-auth-oauthlib    0.4.6
graphviz                0.8.4
grpcio                  1.47.0
idna                    3.3
imageio                 2.19.3
importlib-metadata      4.12.0
joblib                  1.1.0
kiwisolver              1.4.3
Markdown                3.3.7
matplotlib              3.5.2
menpo                   0.11.0
multidict               6.0.2
mxnet                   1.9.1
networkx                2.6.3
numpy                   1.21.6
oauthlib                3.2.0
opencv-python           4.6.0.66
packaging               21.3
pandas                  1.3.5
Pillow                  9.1.1
pip                     22.1.2
prettytable             3.3.0
protobuf                3.19.4
pyasn1                  0.4.8
pyasn1-modules          0.2.8
pyDeprecate             0.3.1
pyparsing               3.0.9
python-dateutil         2.8.2
pytorch-lightning       1.4.4
pytz                    2022.1
PyWavelets              1.3.0
PyYAML                  6.0
requests                2.28.0
requests-oauthlib       1.3.1
rsa                     4.8
scikit-image            0.19.3
scikit-learn            1.0.2
scipy                   1.7.3
setuptools              62.6.0
six                     1.16.0
tensorboard             2.9.1
tensorboard-data-server 0.6.1
tensorboard-plugin-wit  1.8.1
threadpoolctl           3.1.0
tifffile                2021.11.2
torch                   1.8.1+cu111
torchaudio              0.8.1
torchmetrics            0.6.0
torchvision             0.9.1+cu111
tqdm                    4.64.0
typing_extensions       4.2.0
urllib3                 1.26.9
wcwidth                 0.2.5
Werkzeug                2.1.2
wheel                   0.37.1
yarl                    1.7.2
zipp                    3.8.0

Hi, this happens when the MTCNN face detector cannot find a face properly.
MTCNN detects and aligns the image the same way as the training data, so this step is necessary to get the best performance.
One thing you can do is tweak the MTCNN parameter in the line

self.thresholds = [0.6,0.7,0.9]

or use a more powerful face detector such as https://github.com/serengil/retinaface (with this one you have to check whether the alignment is the same as MTCNN's).
Another thing you could do, if you need to force inference, is to bypass the alignment step entirely: resize the input to 112x112x3 and feed it to the model directly. Since the AdaFace model is a convnet, it does not throw an error even if the face is not aligned. Just make sure the color channel order is BGR.
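That bypass can be sketched like this (a minimal sketch; `to_model_input` is a hypothetical helper that mirrors the normalization in `to_input` from inference.py, just without detection):

```python
import numpy as np
from PIL import Image

def to_model_input(pil_rgb_img):
    """Resize to 112x112, convert RGB -> BGR, scale to [-1, 1],
    and return a CHW float32 array -- no detection or alignment."""
    resized = pil_rgb_img.resize((112, 112))
    np_img = np.array(resized)                         # HWC, RGB, uint8
    bgr = ((np_img[:, :, ::-1] / 255.0) - 0.5) / 0.5   # flip to BGR, normalize
    return bgr.transpose(2, 0, 1).astype(np.float32)   # CHW

# usage (model loading as in inference.py):
# model = load_pretrained_model('ir_50')
# batch = torch.tensor(to_model_input(img)).unsqueeze(0)  # (1, 3, 112, 112)
# feature, _ = model(batch)
```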

Can I change the face detector code myself to use RetinaFace instead of MTCNN? If so, where do I change it?

OK thanks

Following up on this, how do we bypass the detection phase? I understand that there's an `align` function in the class:

    def align(self, img):
        _, landmarks = self.detect_faces(img, self.min_face_size, self.thresholds, self.nms_thresholds, self.factor)
        facial5points = [[landmarks[0][j], landmarks[0][j + 5]] for j in range(5)]
        warped_face = warp_and_crop_face(np.array(img), facial5points, self.refrence, crop_size=self.crop_size)
        return Image.fromarray(warped_face)

However, it still goes through face detection. In my case I already have cropped face images, so the bounding box would be the whole input image.

Edit:
I think a better question would be: what does the `bounding_boxes` array returned by `MTCNN.detect_faces` represent? What does each dimension of the array refer to, e.g. (x, y) coordinate pairs?
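Not the author, but judging from the landmark indexing in the `align` function above and the usual MTCNN convention (worth verifying against `detect_faces` in this repo), the arrays are typically laid out per detected face like this:

```python
import numpy as np

# Typical MTCNN output layout (one row per detected face):
# bounding_boxes: (n, 5)  -> [x1, y1, x2, y2, confidence]
# landmarks:      (n, 10) -> [x_eye_l, x_eye_r, x_nose, x_mouth_l, x_mouth_r,
#                             y_eye_l, y_eye_r, y_nose, y_mouth_l, y_mouth_r]
bounding_boxes = np.array([[30.0, 40.0, 150.0, 180.0, 0.99]])
landmarks = np.array([[60., 120., 90., 65., 115., 80., 80., 110., 140., 140.]])

# This matches the indexing in align(): landmarks[0][j] is the j-th x
# coordinate, landmarks[0][j + 5] is the matching y coordinate.
facial5points = [[landmarks[0][j], landmarks[0][j + 5]] for j in range(5)]
x1, y1, x2, y2, score = bounding_boxes[0]
```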

@mk-minchul

Hi @nicken, I did what @mk-minchul's comment suggests: I solved it by editing the `try/except` block in the align.py file and calling a pretrained RetinaFace or YOLO model there, in prediction mode.

So in align.py, in the `get_aligned_face` function, just change the exception-handling code to:

def get_aligned_face(image_path, rgb_pil_image=None):
    if rgb_pil_image is None:
        img = Image.open(image_path).convert('RGB')
    else:
        assert isinstance(rgb_pil_image, Image.Image), 'Face alignment module requires PIL image or path to the image'
        img = rgb_pil_image
    # find face
    try:
        bboxes, faces = mtcnn_model.align_multi(img, limit=1)
        face = faces[0]
    except Exception:  # CHANGE CODE HERE: fall back to another detector
        width, height = img.size
        # Call YOLO or RetinaFace and obtain the box as normalized
        # (x1, y1, x2, y2) fractions of the image size.
        results = model(img)  # this example uses a YOLO model instantiated beforehand, see https://docs.ultralytics.com/tasks/detect/#val
        x, y, w, h = results2xywh(results)  # the biggest face in the image, from the YOLO output
        left = width * x
        top = height * y
        right = width * w
        bottom = height * h
        img2 = img.crop((left, top, right, bottom))
        face = img2.resize((112, 112))
    return face

Don't forget to reload the modules and re-import.
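For the `results2xywh` helper referenced above (not defined in the comment), a possible sketch, assuming the ultralytics prediction API where `results[0].boxes.xyxyn` holds corner coordinates normalized to [0, 1]:

```python
from types import SimpleNamespace
import numpy as np

def results2xywh(results):
    """Hypothetical helper: pick the largest detected box and return it as
    normalized (x1, y1, x2, y2) fractions of the image size, which is how
    the get_aligned_face() fallback above consumes it."""
    boxes = results[0].boxes.xyxyn  # (n, 4), values in [0, 1] (ultralytics Boxes)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    x1, y1, x2, y2 = boxes[areas.argmax()].tolist()
    return x1, y1, x2, y2

# quick check with a stand-in for a YOLO result (no ultralytics needed):
fake = [SimpleNamespace(boxes=SimpleNamespace(
    xyxyn=np.array([[0.1, 0.1, 0.3, 0.3],      # small face
                    [0.2, 0.2, 0.9, 0.8]])))]  # big face -> picked
print(results2xywh(fake))
```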

I hope this helps you, @nicken. I have been trying to train with custom data and I ran into #75.
Did you find a solution, or do you even have a working training script? That would be great!!