PracticalDL/Practical-Deep-Learning-Book

TFRecords for multi-label data

abdulsam opened this issue · 5 comments

I'm trying to make TFRecords for my dataset. How can I create TFRecords for multi-label data with thousands of files for training and validation?
I tried the code in storing-data-as-tfrecord.ipynb, which works fine for a single image.
I also tried build_image_data.py, but it doesn't work with TensorFlow 2.x.

Consider experimenting with code/chapter-14/generate_tfrecord.py, which was originally meant for object detection but can work for multi-label tasks as well. The code has been tested with TF 1.14. In TF 2.x, tf.app has been removed, so you may have to make some edits.
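For anyone who wants a TF 2.x starting point that does not depend on tf.app, here is a minimal sketch of writing multi-label data into sharded TFRecord files. It is not the book's generate_tfrecord.py. The helper names (`make_example`, `write_shards`), the feature keys (`image/encoded`, `image/labels`), and the assumption that your data is a list of `(image_path, [label ids])` pairs are all hypothetical, so adapt them to your dataset.

```python
import tensorflow as tf


def make_example(image_bytes, labels):
    """Build one tf.train.Example from encoded image bytes and integer label ids."""
    feature = {
        # Hypothetical feature keys; any consistent names work.
        "image/encoded": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[image_bytes])),
        # A variable-length list of class ids covers the multi-label case.
        "image/labels": tf.train.Feature(
            int64_list=tf.train.Int64List(value=labels)),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


def write_shards(samples, prefix, images_per_shard=100):
    """Write (image_path, [label ids]) pairs into numbered TFRecord shards.

    Returns the list of shard paths that were written.
    """
    shard_paths = []
    for start in range(0, len(samples), images_per_shard):
        shard_path = f"{prefix}-{start // images_per_shard:05d}.tfrecord"
        with tf.io.TFRecordWriter(shard_path) as writer:
            for image_path, labels in samples[start:start + images_per_shard]:
                # Store the file's bytes as they are on disk (no decoding).
                with open(image_path, "rb") as f:
                    image_bytes = f.read()
                writer.write(make_example(image_bytes, labels).SerializeToString())
        shard_paths.append(shard_path)
    return shard_paths
```

Usage would look something like `write_shards(train_samples, "train", images_per_shard=100)`, called once for the training list and once for the validation list. Because the sketch copies the encoded file bytes unchanged, each shard should come out close to the combined size of the images it holds.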

I created TFRecords for my dataset, one TFRecord file per 100 images, but the TFRecords are taking up a huge amount of memory.

Do you mean that the created TFRecords are occupying more space than the total for 100 images? Are you using the newer tf.data to load the data?
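In case it helps, loading with tf.data could look roughly like the sketch below. It assumes each record holds a JPEG under `image/encoded` and a variable-length list of class ids under `image/labels`. Those keys, `NUM_CLASSES`, the 224x224 resize, and the function names are hypothetical choices for illustration, and `tf.data.AUTOTUNE` assumes TF 2.4 or newer.

```python
import tensorflow as tf

NUM_CLASSES = 5  # hypothetical; set to the number of labels in your dataset

# Hypothetical feature spec; it must match the keys used when writing.
FEATURES = {
    "image/encoded": tf.io.FixedLenFeature([], tf.string),
    "image/labels": tf.io.VarLenFeature(tf.int64),
}


def parse_example(serialized):
    """Decode one serialized Example into (image, multi-hot label vector)."""
    parsed = tf.io.parse_single_example(serialized, FEATURES)
    image = tf.io.decode_jpeg(parsed["image/encoded"], channels=3)
    # Resize to a fixed shape so examples can be batched; scale to [0, 1].
    image = tf.image.resize(image, [224, 224]) / 255.0
    label_ids = tf.sparse.to_dense(parsed["image/labels"])
    # Sum the one-hot rows and cap at 1 to get a multi-hot vector.
    multi_hot = tf.minimum(
        tf.reduce_sum(tf.one_hot(label_ids, NUM_CLASSES), axis=0), 1.0)
    return image, multi_hot


def load_dataset(shard_paths, batch_size=32):
    """Stream records from disk, decoding images one batch at a time."""
    ds = tf.data.TFRecordDataset(shard_paths)
    ds = ds.map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)
```

With this kind of pipeline the records are read and decoded as they are needed, so only the current batches (plus the prefetch buffer) sit in memory rather than the whole dataset.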

I was doing something wrong. The created file itself was larger than it should have been, which is why it occupied more space after being loaded into memory.

I'm closing this issue now, since there has been no activity in the past 2 weeks.