Doodleverse/segmentation_gym

Issue with make_nd_datasets

sbosse12 opened this issue · 29 comments

Gym is having an issue reshaping an array. It seems to be a problem with the target size; I have tried many different target sizes with no luck. The error is below, as well as my config file.

(gym) C:\Users\sbosse\segmentation_gym>python make_nd_dataset.py
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/npzForModel
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/config/2022_DeCoast_watermask_nadir_2class_batch6.json
Using GPU
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/labels/labels
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/images/images
Found 500 image and 500 label files
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\externals\loky\process_executor.py", line 428, in _process_worker
r = call_item()
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\externals\loky\process_executor.py", line 275, in __call__
return self.fn(*self.args, **self.kwargs)
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\_parallel_backends.py", line 620, in __call__
return self.func(*args, **kwargs)
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\parallel.py", line 288, in __call__
return [func(*args, **kwargs)
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\parallel.py", line 288, in <listcomp>
return [func(*args, **kwargs)
File "C:\Users\sbosse\segmentation_gym\make_nd_dataset.py", line 75, in do_resize_label
result = scale(lab,TARGET_SIZE[0],TARGET_SIZE[1])
File "C:\Users\sbosse\segmentation_gym\make_nd_dataset.py", line 53, in scale
return np.array(tmp).reshape((nR,nC))
ValueError: cannot reshape array of size 13284432 into shape (1719,2576)
"""

The above exception was the direct cause of the following exception:

My config file:

"TARGET_SIZE": [1719,2576],
"MODEL": "resunet",
"NCLASSES": 2,
"BATCH_SIZE": 6,
"N_DATA_BANDS": 3,
"DO_TRAIN": true,
"PATIENCE": 25,
"MAX_EPOCHS": 200,
"VALIDATION_SPLIT": 0.75,
"FILTERS":6,
"KERNEL":7,
"STRIDE":2,
"LOSS": "dice",
"DROPOUT":0.1,
"DROPOUT_CHANGE_PER_LAYER":0.0,
"DROPOUT_TYPE":"standard",
"USE_DROPOUT_ON_UPSAMPLING":false,
"ROOT_STRING": "DE_Coast_water_mask",
"FILTER_VALUE": 3,
"DOPLOT": true,
"USEMASK": false,
"RAMPUP_EPOCHS": 20,
"SUSTAIN_EPOCHS": 0.0,
"EXP_DECAY": 0.9,
"START_LR": 1e-7,
"MIN_LR": 1e-7,
"MAX_LR": 1e-5,
"AUG_ROT": 0,
"AUG_ZOOM": 0.0,
"AUG_WIDTHSHIFT": 0.05,
"AUG_HEIGHTSHIFT": 0.05,
"AUG_HFLIP": true,
"AUG_VFLIP": true,
"AUG_LOOPS": 3,
"AUG_COPIES": 2,
"TESTTIMEAUG": false,
"SET_GPU": "0",
"do_crf": true,
"SET_PCI_BUS_ID": true

Traceback (most recent call last):
File "C:\Users\sbosse\segmentation_gym\make_nd_dataset.py", line 251, in <module>
w = Parallel(n_jobs=-2, verbose=0, max_nbytes=None)(delayed(do_resize_label)(os.path.normpath(lfile), TARGET_SIZE) for lfile in label_files)
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\parallel.py", line 1098, in __call__
self.retrieve()
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\parallel.py", line 975, in retrieve
self._output.extend(job.get(timeout=self.timeout))
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\_parallel_backends.py", line 567, in wrap_future_result
return future.result(timeout=timeout)
File "C:\ProgramData\Anaconda3\envs\gym\lib\concurrent\futures\_base.py", line 446, in result
return self.__get_result()
File "C:\ProgramData\Anaconda3\envs\gym\lib\concurrent\futures\_base.py", line 391, in __get_result
raise self._exception
ValueError: cannot reshape array of size 13284432 into shape (1719,2576)

Hi Stephen, I don't think a "TARGET_SIZE" of [1719,2576] is supported. Could you try [768, 1024]?

Also, did you update to the latest doodleverse-utils? If not, from within an activated gym conda env, run `pip install doodleverse-utils -U` (`-U` stands for upgrade; it should upgrade to the latest version, 0.0.10)

Also note that with the new changes (should be the last major changes for a while!), NCLASSES=2 means 'class and no class' (replaces the former NCLASSES=1)

Hi Dan,

I've tried [768, 1024] and [768, 768]. No luck. I'll run that update, though, and see if that helps.

Looks like NCLASSES should be 3? The error is

"ValueError: cannot reshape array of size 13284432 into shape (1719,2576)"

from the code that resizes the labels.

1719 x 2576 x 3 = 13284432

I may be wrong that a "TARGET_SIZE" of [1719,2576] is not supported (I have not tried it), but I was under the impression that odd dimensions such as 1719 would not work
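The arithmetic above can be checked directly; this numpy sketch (an illustration, not Gym's actual code) reproduces the error with a hypothetical 3-band label array:

```python
import numpy as np

# A 3-band (RGB) label with the dimensions from the error message
rgb_label = np.zeros((1719, 2576, 3), dtype=np.uint8)
print(rgb_label.size)  # 13284432 -- matches the size in the ValueError

# 1719*2576*3 elements cannot be reshaped into a (1719, 2576) plane
try:
    rgb_label.reshape((1719, 2576))
except ValueError as err:
    print(err)

# A single-band greyscale label of the same dimensions reshapes fine
grey_label = np.zeros((1719, 2576), dtype=np.uint8)
print(grey_label.reshape((1719, 2576)).shape)  # (1719, 2576)
```

So the size in the error message is exactly what a 3-band label of the original image dimensions would produce.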

Is that correct if I'm only doing water masking?

I had a [768, 1024] target size originally, but it gave me that error, so I thought I would try the dimensions of the imagery I'm training on. Still got the error, so I tried [768, 768]; no good.

This is puzzling. You are correct that NCLASSES=2 for water/nowater. Have you verified you are using the latest doodleverse-utils version?

If so, can you zip up 10 pairs of images and labels and post them here? Perhaps the target size is a red herring and the true problem is something else ....

Here ya go! I updated Seg Gym last night, and when I ran that doodleverse-utils update it said:
Requirement already satisfied: doodleverse-utils in c:\programdata\anaconda3\envs\gym\lib\site-packages (0.0.10)
Requirement already satisfied: versioneer in c:\programdata\anaconda3\envs\gym\lib\site-packages (from doodleverse-utils) (0.26)

By that, I took it the updates were already made.

array_issue_test.zip
"TARGET_SIZE": [768, 1024],
"MODEL": "resunet",
"NCLASSES": 2,
"BATCH_SIZE": 6,
"N_DATA_BANDS": 3,
"DO_TRAIN": true,
"PATIENCE": 25,
"MAX_EPOCHS": 200,
"VALIDATION_SPLIT": 0.75,
"FILTERS":6,
"KERNEL":7,
"STRIDE":2,
"LOSS": "dice",
"DROPOUT":0.1,
"DROPOUT_CHANGE_PER_LAYER":0.0,
"DROPOUT_TYPE":"standard",
"USE_DROPOUT_ON_UPSAMPLING":false,
"ROOT_STRING": "DE_Coast_water_mask",
"FILTER_VALUE": 3,
"DOPLOT": true,
"USEMASK": false,
"RAMPUP_EPOCHS": 20,
"SUSTAIN_EPOCHS": 0.0,
"EXP_DECAY": 0.9,
"START_LR": 1e-7,
"MIN_LR": 1e-7,
"MAX_LR": 1e-5,
"AUG_ROT": 0,
"AUG_ZOOM": 0.0,
"AUG_WIDTHSHIFT": 0.05,
"AUG_HEIGHTSHIFT": 0.05,
"AUG_HFLIP": true,
"AUG_VFLIP": true,
"AUG_LOOPS": 3,
"AUG_COPIES": 2,
"TESTTIMEAUG": false,
"SET_GPU": "0",
"do_crf": true,
"SET_PCI_BUS_ID": true

Hi @sbosse12 - I see the images in the zip, but not the actual labels. Instead of labels (which should be greyscale images, super dark), I see the colorized labels. Do you have the greyscale masks?

If these are the labels you are using, I would guess that make_nd_datasets is failing because it expects a label to be a 1-band (greyscale) image rather than this 3-band image.

After converting to greyscale (B&W), I am getting the message below. (Attached are images with B&W labels.)
array_issue_test.zip

It's trying to resize the images and labels and put them in resized label/image folders. It populates the image folder with the new size [768, 1024], but not the label folder, which then causes the crash.

(gym) C:\Users\sbosse\segmentation_gym>python make_nd_dataset.py
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/npzForModel
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/config/2022_DeCoast_watermask_nadir_2class_batch6.json
Using GPU
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/labels/labels
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/images/images
Found 500 image and 500 label files
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/resized_labels/resized_labels already exists: skipping the image resizing step
0 label files
1 sets of 500 image files
Creating non-augmented subset
Version: 2.6.0
Eager mode: True
2022-10-18 09:02:28.129971: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-18 09:02:28.694129: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 8964 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:3b:00.0, compute capability: 7.5
Traceback (most recent call last):
File "C:\Users\sbosse\segmentation_gym\make_nd_dataset.py", line 406, in <module>
dataset = tf.data.Dataset.list_files(filenames, shuffle=False)
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 1229, in list_files
assert_not_empty = control_flow_ops.Assert(
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\tensorflow\python\util\dispatch.py", line 206, in wrapper
return target(*args, **kwargs)
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\tensorflow\python\util\tf_should_use.py", line 247, in wrapped
return _add_should_use_warning(fn(*args, **kwargs),
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 160, in Assert
raise errors.InvalidArgumentError(
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected 'tf.Tensor(False, shape=(), dtype=bool)' to be true. Summarized data: b'No files matched pattern: '

Hi @sbosse12 - the issue here is that the label names do not correspond to the image names (hence the "No files matched pattern" error message). I see this in the folder you sent:
the image is: 2022-0610-152545-DSC02130-N7251F.jpg
and the label is: 2022-0610-152545-DSC02130-N7251F_label2022-09-06-10-48_Enter-user-ID.jpg
The labels seem to have duplicated part of the string in the naming.
One idea is that you could strip off the 2nd part of the name in all the labels (here it would be 2022-09-06-10-48_Enter-user-ID), but I am not sure how you converted the RGBs to greyscale and whether that will lead to problems. Specifically, it's not that the image itself is greyscale; it's that each label is 'label encoded' - class 1 is all 0 values, class 2 is all 1 values, etc.
If you converted the RGB to greyscale just with, say, ImageMagick, then the labels might not have the specific, needed mapping of pixel values...
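That 'label encoded' requirement can be checked with a few lines of numpy. This is an illustrative sketch; the helper name is made up and is not part of Gym:

```python
import numpy as np

def is_label_encoded(lab, nclasses):
    """True if the mask is single-band and uses only class indices 0..nclasses-1."""
    return lab.ndim == 2 and set(np.unique(lab)) <= set(range(nclasses))

# A properly encoded 2-class water mask: pixel values are exactly 0 and 1
good = np.array([[0, 0, 1], [0, 1, 1]], dtype=np.uint8)
print(is_label_encoded(good, 2))  # True

# A B/W image from a generic converter: pixel values are 0 and 255
bad = np.array([[0, 0, 255], [0, 255, 255]], dtype=np.uint8)
print(is_label_encoded(bad, 2))   # False
```

Loading a label file (e.g. with `PIL.Image.open` plus `np.array`) and running this check is a quick way to tell a valid mask from a colorized or thresholded one.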

SO - I think we should just back up for a moment. I am guessing you are working from Doodler output, right?
If this is the case, can I ask that you follow this workflow and see how it goes:

  1. activate the doodler conda environment
  2. navigate back into the doodler directory
  3. navigate into /utils/
  4. run `gen_images_and_labels.py`, and select the folder of Doodler results you want to process (you may need to collect all your Doodler output into a single folder)
  5. Once the script is done, navigate to the results folder you selected.
  6. the output of this script is several new folders in the Doodler results folder. Two will be 'images' and 'labels', and they will be named correctly.
  7. copy/paste those two folders into your Gym directory, and try using those with make_datasets.

this is the 'official' way to get images and labels from Doodler -> Gym, and should work..

https://doodleverse.github.io/dash_doodler/docs/tutorial-extras/next-steps

Hi Evan,

I did a small test run where I removed the extra portion of the file name, including the date and time of the doodle, so that each label image has the name filename_label.jpg. This also did not work. I used IrfanView to convert the imagery: in the advanced batch conversion options, I selected "change color depth", then "2 colors (B/W)".

I'll give the Doodler utilities a try though and let you know!

Yep, that is what I would expect - that it would not work. The labels are not just normal greyscale images, but have that special 'label' encoding.

Yes, please use the doodler pipeline and report back!

@sbosse12 any update?

If you have B/W images, that means you probably have pixels that are 0 and 255. This can be dealt with using "REMAP_CLASSES": {"0":0, "255":1} which will reclassify the 255 as 1 ...

But the Doodler workflow (utils/`gen_images_and_labels.py`) is probably best overall - simpler, well tested, and no extra steps involving 3rd-party software
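The effect of that "REMAP_CLASSES" entry can be sketched in numpy (an illustration of the idea, not Gym's implementation):

```python
import numpy as np

# A B/W mask as produced by a generic converter: values are 0 and 255
mask = np.array([[0, 255], [255, 0]], dtype=np.uint8)

# Equivalent of "REMAP_CLASSES": {"0": 0, "255": 1}
remap = {0: 0, 255: 1}
out = np.zeros_like(mask)
for src, dst in remap.items():
    out[mask == src] = dst

print(np.unique(out))  # [0 1] -- valid class indices for NCLASSES=2
```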

Hi Dan and Evan,

I generated a new set of image/label files (a few attached in zip)
array_issue_test.zip

I am now getting an error where the resized folders are not getting populated, and the files are therefore not being recognized to make the training dataset.

(gym) C:\Users\sbosse\segmentation_gym>python make_nd_dataset.py
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/npzForModel
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/config/2022_DeCoast_watermask_nadir_2class_batch6.json
Using GPU
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/labels/labels
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/images/images
Found 500 image and 500 label files
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/resized_labels/resized_labels already exists: skipping the image resizing step
0 label files
1 sets of 0 image files
Creating non-augmented subset
Version: 2.6.0
Eager mode: True
2022-10-20 11:02:15.364899: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-20 11:02:15.923135: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 8964 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:3b:00.0, compute capability: 7.5
Traceback (most recent call last):
File "C:\Users\sbosse\segmentation_gym\make_nd_dataset.py", line 406, in <module>
dataset = tf.data.Dataset.list_files(filenames, shuffle=False)
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 1229, in list_files
assert_not_empty = control_flow_ops.Assert(
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\tensorflow\python\util\dispatch.py", line 206, in wrapper
return target(*args, **kwargs)
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\tensorflow\python\util\tf_should_use.py", line 247, in wrapped
return _add_should_use_warning(fn(*args, **kwargs),
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 160, in Assert
raise errors.InvalidArgumentError(
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected 'tf.Tensor(False, shape=(), dtype=bool)' to be true. Summarized data: b'No files matched pattern: '

Hi @sbosse12 ,

First, please upgrade Gym (or just download the new version of make_nd_dataset) - I made a very small change (1ccde41)

Second, with the labels, images, and configs you provided, using a fresh install of Gym, make dataset works for me:

[image: DE_Coast_water_masknoaug_ex11]

My thinking is that, since the folders exist, the entire resizing operation is being skipped:

C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/resized_labels/resized_labels already exists: skipping the image resizing step

However, from the next lines in the output, it seems like the resize folders on your machine are empty:

C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/resized_labels/resized_labels already exists: skipping the image resizing step
0 label files
1 sets of 0 image files

The easiest solution right now is to just delete the two folders, resized_images and resized_labels. Then make_datasets will remake those folders and the resized images.

can you try this fix and report back?
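The failure mode can be illustrated with a minimal sketch of the guard (assumed logic, not Gym's actual code): if the output folder already exists, the resize step is skipped entirely, even when the folder is empty from an earlier failed run.

```python
import os
import tempfile

def maybe_resize(outdir):
    # If the output folder already exists, skip resizing entirely --
    # even if the folder is empty from an earlier failed run.
    if os.path.isdir(outdir):
        return "already exists: skipping"
    os.makedirs(outdir)
    return "resizing"

tmp = tempfile.mkdtemp()
stale = os.path.join(tmp, "resized_labels")
os.makedirs(stale)             # leftover empty folder from a failed run
print(maybe_resize(stale))     # already exists: skipping
print(len(os.listdir(stale)))  # 0 -- hence '0 label files' downstream
```

Deleting the stale folders makes the guard fall through to the resize branch on the next run.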

I'll try the update. Whenever I delete them and run it, it says the folders don't exist and therefore skips the step again

After the update and deleting the resize folders, I got this:

(gym) C:\Users\sbosse\segmentation_gym>python make_nd_dataset.py
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/npzForModel
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/config/2022_DeCoast_watermask_nadir_2class_batch6.json
Using GPU
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/labels/labels
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/images/images
Found 500 image and 500 label files
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\externals\loky\process_executor.py", line 428, in _process_worker
r = call_item()
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\externals\loky\process_executor.py", line 275, in __call__
return self.fn(*self.args, **self.kwargs)
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\_parallel_backends.py", line 620, in __call__
return self.func(*args, **kwargs)
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\parallel.py", line 288, in __call__
return [func(*args, **kwargs)
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\parallel.py", line 288, in <listcomp>
return [func(*args, **kwargs)
File "C:\Users\sbosse\segmentation_gym\make_nd_dataset.py", line 104, in do_resize_image
imsave(fdirout+os.sep+f.split(os.sep)[-1].replace('.jpg','.png'), result.astype('uint8'), check_contrast=False, compression=0)
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\skimage\io\_io.py", line 143, in imsave
return call_plugin('imsave', fname, arr, plugin=plugin, **plugin_args)
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\skimage\io\manage_plugins.py", line 207, in call_plugin
return func(*args, **kwargs)
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\imageio\v2.py", line 238, in imwrite
with imopen(uri, "wi", **imopen_args) as file:
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\imageio\core\imopen.py", line 118, in imopen
request = Request(uri, io_mode, format_hint=format_hint, extension=extension)
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\imageio\core\request.py", line 248, in __init__
self._parse_uri(uri)
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\imageio\core\request.py", line 412, in _parse_uri
raise FileNotFoundError("The directory %r does not exist" % dn)
FileNotFoundError: The directory 'C:\Users\sbosse\segmentation_gym\my_seggym_datasets\2022_DE_Coast\FromDoodler\resized_images\resized_images' does not exist
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "C:\Users\sbosse\segmentation_gym\make_nd_dataset.py", line 248, in <module>
w = Parallel(n_jobs=-2, verbose=0, max_nbytes=None)(delayed(do_resize_image)(os.path.normpath(f), TARGET_SIZE) for f in files)
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\parallel.py", line 1098, in __call__
self.retrieve()
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\parallel.py", line 975, in retrieve
self._output.extend(job.get(timeout=self.timeout))
File "C:\ProgramData\Anaconda3\envs\gym\lib\site-packages\joblib\_parallel_backends.py", line 567, in wrap_future_result
return future.result(timeout=timeout)
File "C:\ProgramData\Anaconda3\envs\gym\lib\concurrent\futures\_base.py", line 446, in result
return self.__get_result()
File "C:\ProgramData\Anaconda3\envs\gym\lib\concurrent\futures\_base.py", line 391, in __get_result
raise self._exception
FileNotFoundError: The directory 'C:\Users\sbosse\segmentation_gym\my_seggym_datasets\2022_DE_Coast\FromDoodler\resized_images\resized_images' does not exist

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Users\sbosse\segmentation_gym\make_nd_dataset.py", line 250, in <module>
w = Parallel(n_jobs=-2, verbose=0, max_nbytes=None)(delayed(do_resize_image)(os.path.normpath(f), TARGET_SIZE) for f in files.squeeze())
AttributeError: 'list' object has no attribute 'squeeze'

(gym) C:\Users\sbosse\segmentation_gym>

OK, I got it to work. It appears you need to have a resize folder created; however, you must only create the first level (my_datasets/resized_images, as opposed to my_datasets/resized_images/resized_images, which is what Gym suggests needs to exist, as shown above). Gym will then create a second folder with the same name within it containing all the resized files.

hmm.. I'm a bit curious as to why there are duplicate `resized_images` directories?

FileNotFoundError: The directory 'C:\Users\sbosse\segmentation_gym\my_seggym_datasets\2022_DE_Coast\FromDoodler\resized_images\resized_images' does not exist

and, for that matter, resized image and label directories..

C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/labels/labels
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/images/images

can you confirm that your directory structure looks exactly like this:
https://github.com/Doodleverse/segmentation_gym/wiki/3_Directory-Structure-and-Tests

oh, I see your comment above...

just to clarify - you should not need any folder named resized_images (or nested folders) for Gym to work.. but if you got it working, then that is good ...
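For what it's worth, a robust pattern for output directories like these (a general illustration, not a patch to Gym) is `os.makedirs` with `exist_ok=True`, which sidesteps both failure modes seen above:

```python
import os
import tempfile

tmp = tempfile.mkdtemp()
outdir = os.path.join(tmp, "resized_images", "resized_images")

# Creates all intermediate levels, and is safe to call repeatedly --
# avoiding both 'already exists: skipping' and 'directory does not exist'
os.makedirs(outdir, exist_ok=True)
os.makedirs(outdir, exist_ok=True)
print(os.path.isdir(outdir))  # True
```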

I will close this issue if it's working for you...

C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/labels/labels
C:/Users/sbosse/segmentation_gym/my_seggym_datasets/2022_DE_Coast/FromDoodler/images/images

These are how my folders are organized; nesting the image/label folders was the suggested structure for previous versions of Gym, as well as for the sample dataset provided on GitHub.
[screenshot attached]

FWIW: no nesting is needed now, in any directories... the Wiki link I sent has the current suggested directory structure.

But yes, I have a full dataset of augmented and non-augmented npz's.
Thanks, y'all! Might want to update that downloadable sample dataset zip folder.

[screenshot attached]

This one

I'm catching up with this thread. Thanks for working through this!

I know I still need to redo the sample dataset. I'll try to get to it in the next few days.

What exactly was the problem here? That @sbosse12 (a) wasn't using the latest version of Gym, and (b) had already created the 'resized' folders (even though there is nothing in the instructions that says to do this)?

Sorry, just catching up and want to make sure I understand .... thinking about what changes need to be made to the documentation

I believe the largest issue was the nested folders: using them was giving Gym a problem when attempting to create the resized files.