Handling invalid image paths or corrupted image files.
somso2e opened this issue · 0 comments
The problem:
In the current implementation, if a path in the annotation file provided to DatasetBuilder
does not exist, or one of the files is corrupted, the entire training run halts and all progress is lost.
Since it can take hours to iterate through all the images, this is very frustrating.
I ran into this problem because a few images (around 50) were somehow corrupted while downloading the MJSynth dataset. I did try to clean them up as this solution suggested, but I'm still encountering weird nonsensical errors:
[30] try:
---> [31] model.fit(train_ds,
[32] epochs=EPOCHS,
[33] callbacks=callbacks,
[34] validation_data=val_ds,
[35] use_multiprocessing=True)
[36] except KeyboardInterrupt:
[37] pass
File c:\Users\somso\AppData\Local\Programs\Python\Python38\lib\site-packages\keras\utils\traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
67 filtered_tb = _process_traceback_frames(e.__traceback__)
68 # To get the full stack trace, call:
69 # `tf.debugging.disable_traceback_filtering()`
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb
File c:\Users\somso\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\eager\execute.py:54, in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
52 try:
53 ctx.ensure_initialized()
---> 54 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
55 inputs, attrs, num_outputs)
56 except core._NotOkStatusException as e:
57 if name is not None:
InvalidArgumentError: Graph execution error:
2 root error(s) found.
(0) INVALID_ARGUMENT: jpeg::Uncompress failed. Invalid JPEG data or crop window.
[[{{node DecodeJpeg}}]]
[[IteratorGetNext]]
[[assert_equal_3/Assert/Assert/data_0/_4]]
(1) INVALID_ARGUMENT: jpeg::Uncompress failed. Invalid JPEG data or crop window.
[[{{node DecodeJpeg}}]]
[[IteratorGetNext]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_11983]
What could be done:
Catch the exceptions thrown while loading images, log the caught exception to the terminal, and skip the offending sample.
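A minimal sketch of that skip-and-log idea in plain Python (I don't know DatasetBuilder's internals, so `safe_image_loader` and `load_fn` are hypothetical names, not existing APIs in this repo):

```python
import logging

def safe_image_loader(paths, load_fn):
    """Yield (path, image) pairs, logging and skipping any path whose
    load_fn call fails (missing file, corrupted data, decode error, ...)."""
    for path in paths:
        try:
            image = load_fn(path)
        except Exception as exc:
            logging.warning("Skipping %s: %s", path, exc)
            continue
        yield path, image
```

If the pipeline is a tf.data.Dataset, TensorFlow also offers `dataset.ignore_errors(log_warnings=True)` (or `tf.data.experimental.ignore_errors()` on older versions), which silently drops elements whose processing raised an error; that may be the least invasive fix here.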
I honestly tried to come up with a solution myself, but I still can't understand how DatasetBuilder works. lol
I'd be happy to make a PR myself if you have an idea of how to fix this.