Images in dimension (H=1024, W=2048): unable to use the notebook to generate the dataset.
QingXIA233 opened this issue · 6 comments
Hi,
Your repository is super useful and interesting, good work! I am trying to use it for object detection on Cityscapes images, which are 1024x2048 instead of 512x512. I have prepared the images and the corresponding annotations (PASCAL VOC format). Following the information in the README.md file, I made these changes:
IMAGE_H, IMAGE_W = 1024, 2048
GRID_H, GRID_W = 32, 64 # GRID size = IMAGE size / 32
LABELS = ('car', 'pedestrian')
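For reference, the grid size can be derived from the image size instead of being hard-coded, which avoids mismatches when switching datasets. A minimal sketch, reusing the same config variable names as above:

```python
# YOLO v2 downsamples the input by a factor of 32, so each grid cell
# covers a 32x32 pixel region. Deriving the grid from the image size
# keeps H and W consistent for non-square images.
IMAGE_H, IMAGE_W = 1024, 2048
GRID_H, GRID_W = IMAGE_H // 32, IMAGE_W // 32  # -> 32, 64
```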
I have put my images and labels in the path that is shown in your repository. Everything seems all right. But when I run
train_dataset = None
train_dataset= get_dataset(train_image_folder, train_annot_folder, LABELS, TRAIN_BATCH_SIZE)
It shows "ParseError: no element found: line 1, column 0".
Then I changed to IMAGE_H, IMAGE_W = 2048, 1024, and this error no longer appeared. However, the images and annotations are not correctly matched. Could you please give me a hint about this kind of issue? Thank you. The notebook built on the basis of yours with Cityscapes data is here: https://colab.research.google.com/drive/19g62IznotKZEgtNEowgyKmLcXIjy4OTJ?usp=sharing.
Hi, I need time to investigate this problem. At first glance, it may be that width and height are inverted somewhere in the code.
Hi, thank you for replying. I solved the dataset-generation problem mentioned above: one of my label files (PASCAL VOC format) was empty, which I hadn't noticed, and that caused the error. However, after investigating further, I still believe there are problems in the code concerning image width and height. I found three errors:
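For anyone hitting the same "ParseError: no element found: line 1, column 0", empty or truncated VOC XML files can be located with a quick scan before building the dataset. A minimal sketch (the helper name is illustrative, not part of the repository):

```python
import os
import xml.etree.ElementTree as ET

def find_bad_annotations(annot_folder):
    """Return the names of VOC XML files that are empty or fail to parse.

    An empty file raises exactly "ParseError: no element found:
    line 1, column 0", the error seen when building the dataset.
    """
    bad = []
    for name in sorted(os.listdir(annot_folder)):
        if not name.endswith(".xml"):
            continue
        try:
            ET.parse(os.path.join(annot_folder, name))
        except ET.ParseError:
            bad.append(name)
    return bad
```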
- For data augmentation, the generated bounding boxes do not match the augmented images (shown in the figure below).
- In your notebook, in the 3.3. Process data to YOLO prediction format part, testing the generator pipeline with the line "img, detector_mask, matching_true_boxes, class_one_hot, true_boxes = next(train_gen)" raises the error: index 36 is out of bounds for axis 1 with size 32. I think the height and width of the images or bounding boxes are falsely inverted somewhere, but I have not succeeded in finding where yet.
- In your notebook, in the 4.1. Loss function part, testing the loss raises the same error as error 2 above; the cause could be the same.
I am not very familiar with your code and not very good at coding, but I really want to apply your work to more datasets, including Cityscapes. If you could help me solve the problems mentioned above, that would help a lot. Thank you!
Thank you for your observations! I have always used this code with square images... I will try to find the error as soon as possible.
Thank you! I am looking forward to your update. Or if you have any idea how to solve the errors, please tell me and I'll change it myself.
I fixed the notebook with TensorFlow 1 and added support for non-square images. You can try the Yolo_V2_tf_eager.ipynb notebook (the Yolo_V2_tf_2.ipynb notebook with TensorFlow 2 is not fixed yet).
Thank you for the help! It's really useful. I'll close the issue now. Thanks.