PracticalDL/Practical-Deep-Learning-Book

chapter-5/1-develop-tool.ipynb

Opened this issue · 0 comments

when train model (transfer learning)

# select the percentage of layers to be trained while using the transfer learning
# technique. The selected layers will be close to the output/final layers.
unfreeze_percentage = 0

learning_rate = 0.001

if training_format == "scratch":
    print("Training a model from scratch")
    model = scratch(train, val, learning_rate)
elif training_format == "transfer_learning":
    print("Fine Tuning the MobileNet model")
    model = transfer_learn(train, val, unfreeze_percentage, learning_rate)

i see messages like that:
Corrupt JPEG data: 65 extraneous bytes before marker 0xd9

i googled that and the problem is with corrupted images in dataset (cats and dogs). to fix that one needs to use code to clean dataset:

import os
num_skipped = 0
for folder_name in ("Cat", "Dog"):
    folder_path = os.path.join("PetImages", folder_name)
    for fname in os.listdir(folder_path):
        fpath = os.path.join(folder_path, fname)
        try:
            fobj = open(fpath, "rb")
            is_jfif = tf.compat.as_bytes("JFIF") in fobj.peek(10)
        finally:
            fobj.close()

        if not is_jfif:
            num_skipped += 1
            # Delete corrupted image
            os.remove(fpath)
print("Deleted %d images" % num_skipped)

before training.
see also - https://discuss.tensorflow.org/t/first-steps-in-keras-error/8049/11

but i have no idea how to clean it if i have dataset presented as tfrecords:

cats_vs_dogs-train.tfrecord-00000-of-00008
cats_vs_dogs-train.tfrecord-00001-of-00008
cats_vs_dogs-train.tfrecord-00002-of-00008
cats_vs_dogs-train.tfrecord-00003-of-00008
cats_vs_dogs-train.tfrecord-00004-of-00008
cats_vs_dogs-train.tfrecord-00005-of-00008
cats_vs_dogs-train.tfrecord-00006-of-00008
cats_vs_dogs-train.tfrecord-00007-of-00008

how you fix that and if is not will the model will be correct ?