qinenergy/webvision-2.0-benchmarks

How to load images of different formats into the same data format

Opened this issue · 3 comments

I'm trying to implement a training project on my own with the WebVision 2.0 data. I found that it contains 11 image formats in total, namely

jpg, jpeg, png, gif, bmp, tif, jpe, tiff, ppm, jfif, mpo
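A list like this can be checked against a local copy of the dataset by tallying file extensions. The snippet below is only a minimal sketch: the function name `count_extensions` and the `root` argument are illustrative, and an extension is just a hint, since a file's real encoding may differ from its name.

```python
import os
from collections import Counter


def count_extensions(root):
    """Count files under `root` by lowercased extension (without the dot)."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # "photo.JPG" -> "jpg"; files without an extension are counted under ""
            ext = os.path.splitext(name)[1].lower().lstrip(".")
            counts[ext] += 1
    return counts
```

Calling `count_extensions("/path/to/dataset")` (hypothetical path) returns a `Counter` whose keys are the extensions found under that directory.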

I cannot find a unified way to load images of every format into the same data format. For example, cv2.imread() cannot handle the gif format, and Pillow's Image.open(x).convert("RGB") can only load ppm images as grayscale.

How is this problem solved in your project? Judging by your code here, you simply use cv2.imread. Won't it cause a problem when loading the gif images?

Thanks for your help!

Hi!

We found that cv2 is rather robust (or modest) when given data that it cannot handle: it fails quietly instead of raising an error.

According to the documentation and this opencv issue:

If the image cannot be read (because of missing file, improper permissions, unsupported or invalid format), the function returns an empty matrix ( Mat::data==NULL ).

For example, for GIF data it will just return None. GIF data is in any case ill-defined for image classification, as a single GIF sample can contain multiple frames.

!wget https://raw.githubusercontent.com/takerum/vat_tf/master/vat.gif
import cv2
print(cv2.imread("vat.gif"))  # prints None

Thus we simply skip these samples in the iterator and print their filenames to keep track of them.

#Line 212 in imagenet5k.py
if im is not None:
    yield [im, label]
else:
    print(fname, label)

You could also wrap the call in a broader try-except block to catch unexpected cv2 errors.

corrupted_list = []  # filenames that failed to load
try:
    im = cv2.imread(fname, cv2.IMREAD_COLOR)
    if im is not None:
        yield [im, label]
    else:
        print(fname, label)
        corrupted_list.append(fname)
except Exception as e:  # a bare except would also swallow GeneratorExit
    print("cv2 error", fname, e)  # just for debugging
    corrupted_list.append(fname)
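The skip-and-log pattern above can also be written as a standalone generator. This is a sketch rather than the code in imagenet5k.py: the name `iter_valid_samples` and the `load` and `skipped` parameters are made up for illustration, and `samples` is assumed to be an iterable of `(fname, label)` pairs.

```python
def iter_valid_samples(samples, load=None, skipped=None):
    """Yield [image, label] for every (fname, label) pair that loads.

    `load` is any callable mapping a path to an image or None; it defaults
    to cv2.imread with IMREAD_COLOR. Filenames that fail are appended to
    `skipped` so they can be inspected after the pass.
    """
    if load is None:
        import cv2  # imported lazily so a custom loader does not need OpenCV

        def load(path):
            return cv2.imread(path, cv2.IMREAD_COLOR)

    if skipped is None:
        skipped = []
    for fname, label in samples:
        try:
            im = load(fname)
        except Exception as exc:  # unexpected decoder error
            print("load error", fname, exc)
            im = None
        if im is None:  # unreadable or unsupported file
            skipped.append(fname)
            continue
        # The yield sits outside the try block, so closing the generator
        # early is never mistaken for a loading error.
        yield [im, label]
```

Passing your own list as `skipped` gives you the ignored filenames once the iterator is exhausted, which makes it easy to count how many samples were dropped.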

Hope this can help you!

Oh... So in your project you simply ignore these gif images, right?

Hmm... will that have any influence on the results?

BTW, I double-checked the method Image.open(x).convert("RGB"), and it seems to work. Previously I thought it loaded ppm images as grayscale, but it turns out the test image I used was itself grayscale. So... maybe that is a solution.

Yes. We ignore all GIF-encoded images in this training recipe.

By the way, cv2.imread returns images in BGR channel order, whereas Image.open(x).convert("RGB") gives RGB. If you are going to use Image.open as well, you may need to convert its output to BGR so that both loaders agree.
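For reference, a Pillow-based loader that matches the channel order of cv2.imread could look like the sketch below. It is not part of this repository: the helper name `load_bgr` is hypothetical, it assumes NumPy and Pillow are installed, and for multi-frame files such as GIF it uses only the first frame.

```python
import numpy as np
from PIL import Image


def load_bgr(path):
    """Load an image with Pillow and return an HxWx3 uint8 array in BGR order."""
    with Image.open(path) as img:
        # convert("RGB") expands grayscale and palette images to 3 channels;
        # for multi-frame files only the current (first) frame is converted.
        rgb = np.asarray(img.convert("RGB"))
    # Reverse the channel axis (RGB -> BGR) and make the result contiguous.
    return np.ascontiguousarray(rgb[:, :, ::-1])
```

The array has the same layout as a cv2.imread(path, cv2.IMREAD_COLOR) result, though pixel values can differ slightly between the two decoders for lossy formats such as JPEG.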