rishizek/tensorflow-deeplab-v3

Troubles after training on my own data set

Lokyr opened this issue · 11 comments

Lokyr commented

Hello,

First, thank you for your implementation. It's very helpful.

I am trying to train the network with my own images, but I am facing several problems:

  • Predictions do not have the same size as the input; I observe a reduction. For example, my RGB input is 512x512(x3) but my output is only 430x418. Is this normal?
  • Predictions are very strange. When I train with only 1 or 2 epochs, I get a result (a bad one, maybe random). When I train with more epochs, I get a black image with a white frame as output (and the final train_mean_IoU is 1). It is as if the network fails to learn from my ground truths and treats an all-black image as the target.

I checked my data format and it doesn't seem to have issues. My images are 8-bit RGB images of size 512x512x3, exactly like the PASCAL VOC 2012 images. My ground truths are single-channel, 8-bit images with 3 labels (0, 1 and 2). I used create_pascal_tf_record.py to create my record files from these images, as explained.

For now I am training the network with only 10 images, but that doesn't matter because I want to overfit first. A further problem with my data is that the classes are very unbalanced, and I didn't find a way to weight them in DeepLab.

I had exactly the same problem with a TensorFlow U-Net implementation, but not with the Caffe implementation of FCN-8s. I have been looking for a solution for two weeks without success.

Do you have any idea?

Thank you.

Hi @Lokyr, thank you for your interest in the repo.

Regarding your first question (predictions do not have the same size as the input): if you are talking about the output image produced by my inference.py code, this is due to the implementation. The predictions of DeepLab, like those of other semantic segmentation models, should have the same size as the input image. In the inference.py code specifically, the predicted image has the same size as the input up to here, but when the image is saved using plt.imshow() and plt.savefig() here, its size is automatically changed. You can refer to here to learn how to save the prediction at exactly the input size.
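One way to do it is a minimal sketch like the following, assuming the prediction is an HxW numpy array of class ids (the dummy data and file name are placeholders):

```python
import numpy as np
from PIL import Image

# pred: HxW array of class ids as produced by the model (dummy data here).
pred = np.zeros((512, 512), dtype=np.uint8)
pred[128:384, 128:384] = 1

# Saving through PIL writes one output pixel per prediction, so the PNG
# has exactly the same size as the network input, unlike plt.savefig().
Image.fromarray(pred).save('prediction.png')
```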

Regarding your second question, I first thought you were using incorrect ground truths with 3-channel images, like this issue, but it seems you are correctly using 1-channel ground truths (still, I suspect this could be the cause). Did you try training on the PASCAL VOC dataset? If you have the same issue with the PASCAL VOC dataset, you may have wrong data preprocessing somewhere.
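A quick way to rule this out is to inspect a ground-truth file directly (a minimal sketch; the path is a placeholder):

```python
import numpy as np
from PIL import Image

gt = np.array(Image.open('SegmentationClass/example.png'))
print(gt.shape)       # should be (H, W), i.e. a single channel
print(np.unique(gt))  # should contain only your label ids (e.g. 0, 1, 2)
```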

Lokyr commented

Thank you for your reply.

Sorry, I didn't check the closed issues. My apologies.

After testing, I had the same issue with images from PASCAL VOC 2012. Reading your link, I think I found my problem. As you suspected, it probably comes from my ground truths.

Like the SegmentationClass images from PASCAL VOC, they are labeled on one channel, but I used a color table to visualize them more easily. It seems this caused trouble during training despite the correct image format, even if I don't understand exactly why. Still, I haven't managed to get correct results yet.
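In case it helps someone else, here is a sketch of how to check for a color table and strip it while keeping the label ids (the paths are placeholders; this assumes 8-bit PNG ground truths):

```python
import numpy as np
from PIL import Image

gt = Image.open('SegmentationClass/example.png')
print(gt.mode)  # 'P' = indexed PNG with a color table, 'L' = plain 8-bit grayscale

# np.array() on a palette ('P') image yields the raw palette indices, i.e. the
# label ids, so re-saving that array drops the color table without changing labels.
if gt.mode == 'P':
    labels = np.array(gt)
    Image.fromarray(labels).save('example_flat.png')
```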

My problem may also come from unbalanced data. Is there already a way to weight the various class during the training ?

Hi @Lokyr,

I'm glad to hear that you may have found a hint for your problem.

Regarding unbalanced data, I've never tried it myself, but there is a way to give a different weight to each class. You may refer to here and here.

Lokyr commented

Hi @rishizek,

I succeeded in weighting my classes during the training process. Unfortunately, the conclusion is always the same: a black image as the prediction. Moreover, the final mean_iu is always 0.4815225, no matter how many epochs I use or which parameters I choose (batch size, momentum, initial learning rate, ...). This suggests the ground truths are correct (the IoU is computed properly), but I don't know why the training fails. Besides, there is a huge difference between the mean_iu at training iterations (0.80 or more) and at the global evaluation steps (0.4815225). I don't understand why.

I checked the data in the record file and everything seems fine (the labels are not empty). I also tested without preprocessing (random scale and crop), and the result doesn't change. Because of the limits of my laptop, I can't train on all the PASCAL VOC data, but I have the same problem when I try to train on a single image from it (I used the augmented data as you explained in another thread).
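For reference, this is roughly how I inspected the record file (the file name is a placeholder, and the feature key is an assumption; check create_pascal_tf_record.py for the exact names it uses):

```python
import io

import numpy as np
import tensorflow as tf
from PIL import Image

for i, record in enumerate(tf.python_io.tf_record_iterator('voc_train.record')):
    example = tf.train.Example()
    example.ParseFromString(record)
    # 'label/encoded' is an assumption; use the key from create_pascal_tf_record.py.
    label_bytes = example.features.feature['label/encoded'].bytes_list.value[0]
    label = np.array(Image.open(io.BytesIO(label_bytes)))
    print(i, label.shape, np.unique(label))  # the stored label ids should not be empty
    if i >= 4:  # only look at the first few examples
        break
```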

Maybe my TensorFlow installation is broken and I should reinstall it.

Update: I tried reinstalling TensorFlow, even version 1.6, and nothing changed.

I have the same problem as you.

@Lokyr
I have the same problem, and I guess it is because my data is heavily unbalanced towards the background with label 0.
But I don't know how to apply a weighted loss to adjust the weights for each class.
Have you found any solution?

Lokyr commented

@Sam813
Unfortunately, no. I had to give up on the DeepLab architecture and chose the U-Net architecture, which seems to work better. I would like to understand what went wrong so I can use DeepLab in the future, but I failed to find out why and have no time for this right now. Next month I will work on deep learning segmentation again; I will update this thread if I manage to solve the issue.

If you want to try weighting your classes, you can use the following code from a link rishizek posted (here):

```python
# Add weights for unbalanced classes in the cross-entropy function
# (adapt directly to your number of classes and their weights).
class_weights = tf.constant([0.03, 0.97])  # background (label 0) weighted 0.03, the other class 0.97
weights = tf.gather(class_weights, valid_labels)

cross_entropy = tf.losses.sparse_softmax_cross_entropy(
    logits=valid_logits, labels=valid_labels, weights=weights)
```

Replace the cross-entropy calculation in the deeplabv3_model_fn function of deeplab_model.py with this.
But since I never managed to get good results myself, I cannot say whether it works or not. Can you let me know if you succeed?

@Lokyr
Thanks for the answer.
I am getting back to work with DeepLab; I will try this and update you here.

@Lokyr I have the very same problem as your number 2. I have two classes that were originally labeled 0 and 255, and the network kept predicting all black with a slight frame of white (some padding issue?). Pixel accuracy was very close to 1, and mean IoU was exactly half of the accuracy (because the network somehow kept predicting everything as black). At first I thought the issue appeared because I was using 255 as one of the labels, so there might have been a conflict with _IGNORE_LABEL=255. However, I later changed 255 to 200 to avoid this, and I still get the very same result :( I have even converted all files to match the extensions mentioned in this repo (jpg for images and png for labels), but still no luck. If I find where my problem is, I'll mention it here.
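For what it's worth, the one-half factor is exactly what an all-background prediction gives with two classes: the background IoU is close to the pixel accuracy and the foreground IoU is 0, so their mean is about half the accuracy. A tiny worked example with dummy arrays:

```python
import numpy as np

target = np.zeros((100, 100), dtype=np.int64)
target[40:60, 40:60] = 1      # a small foreground patch
pred = np.zeros_like(target)  # the network predicts all background

acc = (pred == target).mean()  # 0.96
iou_bg = ((pred == 0) & (target == 0)).sum() / ((pred == 0) | (target == 0)).sum()
iou_fg = ((pred == 1) & (target == 1)).sum() / ((pred == 1) | (target == 1)).sum()
print(acc, (iou_bg + iou_fg) / 2)  # mean IoU == acc / 2 because iou_fg == 0
```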

P.S. Everything with the provided VOC dataset seems to work just fine, so I guess I messed up the preprocessing step at some point. Maybe something with the TF record? I'm not too proficient with it, so something might have slipped past me.

@Lokyr @Sam813 For me, the issue was that my labels were not consecutive: e.g., if you have a two-class segmentation, your labels should be 0 and 1, but I had them as 0 and 200. Changing the labels solved my problem; maybe it can help you as well.
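A sketch of the kind of remapping I mean, applied before building the records (the ids 0 and 200 match my case; the file names are placeholders):

```python
import numpy as np
from PIL import Image

mapping = {0: 0, 200: 1}  # old label id -> new consecutive id

gt = np.array(Image.open('label.png'))
remapped = np.zeros_like(gt)
for old, new in mapping.items():
    remapped[gt == old] = new
Image.fromarray(remapped).save('label_remapped.png')
```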

Lokyr commented

@AlexDenis Thank you for your help and your suggestion. Unfortunately, I have checked my data several times (both the raw data and the record data built from it) and everything is okay. In addition, it works with other networks such as U-Net.

I tried DeepLab again and I now have the same problem as before. The training goes well (I get a mean IoU of 0.98 on my training set, probably heavy overfitting), but when I run inference on my evaluation or test set, I get only black images. I tried using the training set as the evaluation and test sets (which should produce very good segmentations), but the result is the same.

Does anyone have the same problem?