j96w/DenseFusion

poor segmentation results

Using this training script and training for 102 epochs, I am getting the output below from my SegNet model when it is plugged into the eval_ycb.sh script.

[image: test_copy]
where the bounding boxes are calculated as

            # assumes cv2 is imported; rmin/rmax (rows) and cmin/cmax (columns)
            # hold the ground-truth bounding box (DenseFusion get_bbox convention)
            image = cv2.imread(label_img)
            copy = image.copy()
            gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
            # Otsu threshold to binarize the mask image
            thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY)[1]
            # cv2.findContours returns (contours, hierarchy) in OpenCV 4 and
            # (image, contours, hierarchy) in OpenCV 3
            cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
            cnts = cnts[0] if len(cnts) == 2 else cnts[1]

            # ground-truth box: rows rmin:rmax, columns cmin:cmax
            ROI_number = 0
            ROI = image[rmin:rmax, cmin:cmax]
            cv2.imwrite('ROI_{}.png'.format(ROI_number), ROI)
            cv2.rectangle(copy, (cmin, rmin), (cmax, rmax), (100, 36, 12), 4)
            ROI_number += 1
            # one green box per detected contour
            for c in cnts:
                x, y, w, h = cv2.boundingRect(c)
                ROI = image[y:y+h, x:x+w]
                cv2.imwrite('ROI_{}.png'.format(ROI_number), ROI)
                cv2.rectangle(copy, (x, y), (x+w, y+h), (36, 255, 12), 2)
                ROI_number += 1

The desired bounding box is shown in blue, and the detected bounding boxes are in green.
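
As an aside, if only one object region is expected in the mask, keeping just the largest contour would produce a single green box instead of many. A small sketch under that assumption (not part of the original code):

# sketch: keep only the largest contour by area, assuming one object region
if cnts:
    c = max(cnts, key=cv2.contourArea)
    x, y, w, h = cv2.boundingRect(c)
    cv2.rectangle(copy, (x, y), (x + w, y + h), (36, 255, 12), 2)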

The expected mask image YCB_Video_Dataset/data/0048/000001-label.png is
[image: 000001-label]

So there is a large difference between the two images, and there are many bounding boxes, none of which makes sense. My code for running the segmentation model is

# assumes the DenseFusion eval setup: `norm` is the torchvision Normalize
# transform and `segmentor` is the trained SegNet model
rgb = np.transpose(rgb, (2, 0, 1))  # HWC -> CHW
rgb = norm(torch.from_numpy(rgb.astype(np.float32))).cuda()
# per-pixel class probabilities, shape (1, n_classes, H, W)
img_out = torch.nn.functional.softmax(segmentor(rgb.unsqueeze(0)), dim=1)
img_out_2 = img_out.cpu().data.numpy()

@j96w I would appreciate any advice that you may have. Thank you very much.

I figured it out. I needed to take the argmax over the class channels and keep the first object class:

# batch element 0; per-pixel argmax over classes, then keep class 1 (first object class)
seg_results = np.argmax(img_out_2[0, :, :, :], axis=0) == 1

I'm not sure why, but I need to save the image and load it back:

# write the boolean mask as 0/1 uint8; cv2.imread returns it as a 3-channel image
cv2.imwrite(seg_img, seg_results.astype(np.uint8))
image = cv2.imread(seg_img)
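
For reference, the round-trip through disk can likely be avoided: the bounding box can be computed straight from the boolean mask, similar in spirit to DenseFusion's get_bbox. A minimal sketch, assuming seg_results is the mask from above (mask_u8 is an illustrative name):

# sketch: bounding box directly from the boolean mask (no imwrite/imread needed)
rows, cols = np.where(seg_results)
if rows.size > 0:
    rmin, rmax = rows.min(), rows.max() + 1  # +1 so rmax/cmax work as slice ends
    cmin, cmax = cols.min(), cols.max() + 1
# scaling to 0/255 makes the saved mask visible if you still want to write it
mask_u8 = seg_results.astype(np.uint8) * 255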

[image: test_copy]

Hi,

Could you share your trained segmentation model? Due to limited computing power, I trained a model to a loss of 0.1189 over 3 days, and I am not sure whether that is good enough for this application. Thank you.

Regards,