hellochick/PSPNet-tensorflow

I trained a new model with NUM_CLASSES changed from 19 to 3, but after running inference the result images are filled entirely with pink. Why?

sainttelant opened this issue · 10 comments

As mentioned above, I retrained the model on 1280x720 input images with 3 classes (NUM_CLASSES), starting from the pretrained model. Training produced new *.ckpt files, but inference outputs an all-pink image for every test image, with no segmentation edges at all. I don't know why. Does anyone know?

I modified train.py here:
restore_var = [vv for vv in restore_var if not (vv.name.startswith("conv6/weights") or vv.name.startswith("conv6/biases"))]

and set:
BATCH_SIZE = 1
DATA_DIRECTORY = ''
DATA_LIST_PATH = './list/train_seg.txt'
IGNORE_LABEL = 255
INPUT_SIZE = '1280,720'
LEARNING_RATE = 1e-3
MOMENTUM = 0.9
NUM_CLASSES = 3
NUM_STEPS = 60001
POWER = 0.9
RANDOM_SEED = 1234
WEIGHT_DECAY = 0.0001
RESTORE_FROM = './'
SNAPSHOT_DIR = './model/'
SAVE_NUM_IMAGES = 4
SAVE_PRED_EVERY = 50


I also removed the labels I don't need and kept only the first three in tools.py, namely:
label_colours = [
    (128, 64, 128), (244, 35, 231), (69, 69, 69)
    # 0 = road, 1 = sidewalk, 2 = building
]

The rest of the code is unchanged from the original.
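One way to narrow a single-colour output down is to check whether the network predicts the same class id for every pixel, or whether the colour mapping is at fault. The sketch below is plain Python and assumes the prediction is available as a 2-D list of class ids; `decode_labels` and `class_histogram` are illustrative stand-ins written for this example, not the repository's implementations, and the `pred` array is made up.

```python
from collections import Counter

# Palette copied from the post above; the list index is the class id.
label_colours = [(128, 64, 128), (244, 35, 231), (69, 69, 69)]

def decode_labels(mask, palette):
    """Map a 2-D list of class ids to a 2-D list of RGB tuples."""
    return [[palette[c] for c in row] for row in mask]

def class_histogram(mask):
    """Count how many pixels were assigned to each class id."""
    return Counter(c for row in mask for c in row)

# Hypothetical 2x3 prediction in which every pixel is class 1.
pred = [[1, 1, 1], [1, 1, 1]]
rgb = decode_labels(pred, label_colours)
hist = class_histogram(pred)
print(hist)        # a single key means the model predicts one class everywhere
print(rgb[0][0])   # the colour that fills the whole image
```

If the histogram of a real prediction has only one key, the uniform colour comes from the model output itself rather than from the palette.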

@hellochick
The loss finally stopped at 0.091, and training finished at 60000.ckpt.

@hellochick I'm confused about this change:
restore_var = [vv for vv in restore_var if not (vv.name.startswith("conv6/weights") or vv.name.startswith("conv6/biases"))]
Here the conv6 layers are excluded from the restored parameters. Does that mean I trained the model from scratch, or did I partly retrain the pretrained checkpoint, with only some of the parameters restored from the pretrained model?
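Read on its own, the filter splits the variables into two groups: everything except conv6 is restored from the pretrained checkpoint, and conv6 keeps its fresh initialisation, so this is a partial restore rather than training from scratch. A minimal sketch with stand-in objects (the `Var` tuple and the variable names other than the conv6 ones are hypothetical, chosen only to illustrate the split):

```python
from collections import namedtuple

# Stand-in for a TensorFlow variable; only the `name` attribute matters here.
Var = namedtuple("Var", ["name"])

# Hypothetical variable names; a real graph has many more.
all_vars = [
    Var("conv1_1_3x3_s2/weights:0"),
    Var("conv5_4/weights:0"),
    Var("conv6/weights:0"),
    Var("conv6/biases:0"),
]

# str.startswith accepts a tuple of prefixes.
skip = ("conv6/weights", "conv6/biases")
restore_var = [v for v in all_vars if not v.name.startswith(skip)]  # loaded from the checkpoint
fresh_var = [v for v in all_vars if v.name.startswith(skip)]        # trained from their initial values

print([v.name for v in restore_var])
print([v.name for v in fresh_var])
```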

@hellochick, hi, could you please remind me how to configure the following network parameters now that my input images are 1280x720 (width 1280)? I checked the original training images, which are 720x720 (SIZE). I think avg_pool and the code below should be modified to match each training dataset. Is that correct?

However, I trained on my dataset with the original code, without modifying any of the code below, and training still ran; the loss finally converged to 0.092. When I then ran the inference code on the test images, I got an unexpected result: segmentation failed and each output was a single pink image (the colour corresponding to the Road label). I don't know why. Any help would be much appreciated, thanks in advance.

        (self.feed('conv5_3/relu')
             .avg_pool(40, 40, 40, 40, name='conv5_3_pool1')
             .conv(1, 1, 512, 1, 1, biased=False, relu=False, name='conv5_3_pool1_conv')
             .batch_normalization(relu=True, name='conv5_3_pool1_conv_bn')
             .resize_bilinear(shape, name='conv5_3_pool1_interp'))

        (self.feed('conv5_3/relu')
             .avg_pool(30, 30, 30, 30, name='conv5_3_pool2')
             .conv(1, 1, 512, 1, 1, biased=False, relu=False, name='conv5_3_pool2_conv')
             .batch_normalization(relu=True, name='conv5_3_pool2_conv_bn')
             .resize_bilinear(shape, name='conv5_3_pool2_interp'))
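To make the question about avg_pool concrete, here is a rough sketch of how the pooling kernel sizes could be derived from the input size. It rests on assumptions that are not taken from this repository's code: that the conv5_3 feature map is 1/8 of the input resolution, and that the pyramid levels use the 1, 2, 3 and 6 bins of the PSPNet design. `pyramid_pool_kernels` is a hypothetical helper, and the floor division is a simplification, so the numbers should be checked against the feature-map shape the network actually produces.

```python
def pyramid_pool_kernels(height, width, stride=8, bins=(1, 2, 3, 6)):
    """Return one (kernel_h, kernel_w) pair per pyramid level.

    Assumes the feature map is ceil(input / stride) in each dimension
    and that each level pools it into roughly bins x bins cells.
    """
    feat_h = -(-height // stride)  # ceiling division
    feat_w = -(-width // stride)
    return [(feat_h // b, feat_w // b) for b in bins]

square = pyramid_pool_kernels(720, 720)   # square training images
wide = pyramid_pool_kernels(720, 1280)    # height 720, width 1280
print(square)
print(wide)
```

Under these assumptions a non-square input needs different kernel heights and widths at every level, which is why values fixed for square images would not carry over unchanged.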

#52 (comment)
I believe this is wrong, @sainttelant. I tried the same thing with a larger number of classes for my model. This code loads the pretrained weights, except the conv6 parameters, into your computational graph and trains from there. However, at test time, after loading the fine-tuned checkpoint, my model still expects the original number of classes, i.e. 19 instead of 22. I get the error "Key conv6/biases not found in checkpoint". So I wonder how exactly you are able to generate a segmented image in this case.
I have changed num_classes in the inference.py file too.
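The "Key conv6/biases not found in checkpoint" error indicates that the restore step asked for a variable the checkpoint file does not contain. A quick way to confirm this is to compare the graph's variable names with the keys stored in the checkpoint. The sketch below does the comparison on plain lists; `missing_keys` and the example names are hypothetical. In TensorFlow the checkpoint keys could be obtained with `tf.train.list_variables`, which returns (name, shape) pairs.

```python
def missing_keys(graph_var_names, checkpoint_keys):
    """Return the graph variables that have no entry in the checkpoint."""
    ckpt = set(checkpoint_keys)
    # Graph variable names carry an output suffix such as ":0"; checkpoint keys do not.
    return [n for n in graph_var_names if n.split(":")[0] not in ckpt]

# Hypothetical names for illustration only.
graph_vars = ["conv5_4/weights:0", "conv6/weights:0", "conv6/biases:0"]
ckpt_keys = ["conv5_4/weights"]
missing = missing_keys(graph_vars, ckpt_keys)
print(missing)
```

If conv6 shows up as missing from the fine-tuned checkpoint, the checkpoint was probably written by a saver that only covered the restored variables; that is a guess to verify, not something the thread confirms.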

@narendoraiswamy, yes. I think if you increase num_classes to 22, your annotation images must use 22 matching colour labels. What is the size of your input images for training?

@sainttelant Yes, I have made that change too. I don't think the input size has any effect on the problem you are facing, but just in case, I am using images of size [1080, 1980, 3]. Have you made any changes other than the ones mentioned above? If you followed the same process, your model is not supposed to have conv6 parameters at all, and inference should fail with an error instead of running.

@sainttelant Could you please tell me what flags you used when training? My loss is around 2.0.