Error in training script

Question

Closed this issue 9 months ago · 2 comments

Hello Benjamin,

I encountered an error while running the example training script. The error message is as follows:

ValueError Traceback (most recent call last)
in
117 steps_per_epoch=steps_per_epoch,
118 regression_metric=regression_metric,
--> 119 work_with_residual_channel=work_with_residual_channel)

~/SynthSR/SynthSR/training.py in training(labels_dir, model_dir, prior_means, prior_stds, path_generation_labels, segmentation_label_list, segmentation_label_equivalency, segmentation_model_file, fs_header_segnet, relative_weight_segmentation, prior_distributions, images_dir, path_generation_classes, FS_sort, batchsize, input_channels, output_channel, target_res, output_shape, flipping, padding_margin, scaling_bounds, rotation_bounds, shearing_bounds, translation_bounds, nonlin_std, simulate_registration_error, data_res, thickness, randomise_res, downsample, blur_range, build_reliability_maps, bias_field_std, bias_shape_factor, n_levels, nb_conv_per_level, conv_size, unet_feat_count, feat_multiplier, dropout, activation, lr, lr_decay, epochs, steps_per_epoch, regression_metric, work_with_residual_channel, loss_cropping, checkpoint, model_file_has_different_lhood_layer)
339 batch_norm=-1,
340 activation=activation,
--> 341 input_model=labels_to_image_model)
342 print("PASA UNET")
343
~/SynthSR/ext/neuron/models.py in unet(nb_features, input_shape, nb_levels, conv_size, nb_labels, name, prefix, feat_mult, pool_size, use_logp, padding, dilation_rate_mult, activation, skip_n_concatenations, use_residuals, final_pred_activation, nb_conv_per_level, add_prior_layer, add_prior_layer_reg, layer_nb_feats, conv_dropout, batch_norm, input_model)
182 conv_dropout=conv_dropout,
183 batch_norm=batch_norm,
--> 184 input_model=input_model)
185
186 print("PASA ENCODER")

~/SynthSR/ext/neuron/models.py in conv_enc(nb_features, input_shape, nb_levels, conv_size, name, prefix, feat_mult, pool_size, dilation_rate_mult, padding, activation, layer_nb_feats, use_residuals, nb_conv_per_level, conv_dropout, batch_norm, input_model)
420 print("A VER: ",convL(nb_lvl_feats, conv_size, **conv_kwargs, name=name).__dict__)
421 print("LAST TENSOR: ",last_tensor)
--> 422 print("prueba 2: ",convL(nb_lvl_feats, conv_size, **conv_kwargs, name=name)(last_tensor))
423
424 last_tensor = convL(nb_lvl_feats, conv_size, **conv_kwargs, name=name)(last_tensor)

/mnt/workspace/bllancao/miniconda3/envs/py36/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py in symbolic_fn_wrapper(*args, **kwargs)
73 if _SYMBOLIC_SCOPE.value:
74 with get_graph().as_default():
---> 75 return func(*args, **kwargs)
76 else:
77 return func(*args, **kwargs)

/mnt/workspace/bllancao/miniconda3/envs/py36/lib/python3.6/site-packages/keras/engine/base_layer.py in __call__(self, inputs, **kwargs)
461 'You can build it manually via: '
462 'layer.build(batch_input_shape)')
--> 463 self.build(unpack_singleton(input_shapes))
464 self.built = True
465
/mnt/workspace/bllancao/miniconda3/envs/py36/lib/python3.6/site-packages/keras/layers/convolutional.py in build(self, input_shape)
130 channel_axis = -1
131 if input_shape[channel_axis] is None:
--> 132 raise ValueError('The channel dimension of the inputs '
133 'should be defined. Found None.')
134 input_dim = input_shape[channel_axis]

ValueError: The channel dimension of the inputs should be defined. Found None.

Upon investigating the code, I found that the tensor "last_tensor" has the following shape in the example script: Tensor("image_out/Identity:0", shape=(1, 128, 128, 128, 4), dtype=float32). The channel dimension (4) looks defined here, so I suspect a compatibility issue between libraries or something similar. I have tried Python 3.6.2, 3.6.5, and 3.7, but the error persists.

Have you encountered a similar issue before, or do you have any insights into resolving this matter? Your assistance would be greatly appreciated.

Thank you for your time and consideration.

Best regards,

Benjamín

Answer 1 · 2024-03-25T16:20:01.000Z

Dear Dr. Billot,
I am running into the same error when trying to train my own model. I have tried to fix it, but to no avail. Are there any plans to fix this issue soon? Otherwise I will have to use another SR model.
Does anybody know a workaround?
Thank you very much and best regards!

Answer 2 · 2024-03-28T23:15:36.000Z

Hi,
First of all, thank you for your patience!

I was able to pinpoint the issue. It was caused by Keras, which sometimes has a hard time keeping track of tensor shapes and, out of laziness, replaces them with None. So I had to remind Keras of the actual tensor shapes by adding lines like this one:

image._keras_shape = tuple(image.get_shape().as_list())

You may have seen this line a lot in the rest of labels_to_image.
Anyway, it was not related to library versions after all.
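
For anyone who wants to apply the same workaround elsewhere, here is a minimal sketch of the fix described above wrapped in a helper. The helper name restore_keras_shape and the FakeTensor/FakeShape stand-ins are hypothetical and not part of SynthSR or Keras; they only mimic the get_shape().as_list() interface so the snippet runs without TensorFlow. It assumes standalone Keras 2.x, where the backend prefers a tensor's _keras_shape attribute when it is present.

```python
def restore_keras_shape(tensor):
    """Copy the tensor's static shape onto `_keras_shape`.

    Hypothetical helper (not part of SynthSR or Keras). It applies the
    one-line fix from this thread to any object exposing
    `get_shape().as_list()`, as TensorFlow tensors do.
    """
    tensor._keras_shape = tuple(tensor.get_shape().as_list())
    return tensor


class FakeShape:
    """Stand-in for a TensorFlow TensorShape (illustration only)."""

    def __init__(self, dims):
        self._dims = list(dims)

    def as_list(self):
        return list(self._dims)


class FakeTensor:
    """Stand-in for a tensor whose Keras-side shape was lost (illustration only)."""

    def __init__(self, dims):
        self._shape = FakeShape(dims)
        # Mimic the stale state: every dimension unknown on the Keras side.
        self._keras_shape = (None,) * len(dims)

    def get_shape(self):
        return self._shape


image = restore_keras_shape(FakeTensor((1, 128, 128, 128, 4)))
print(image._keras_shape)
```

In a real model you would call the helper on the actual tensor (for example the output of the generation model) right before passing it to the next layer, instead of on the stand-in.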

Let me know if you run into any more problems!
Best,
benjamin