jzbontar/mc-cnn

What does the preprocessing do?

sooyeonshin opened this issue · 3 comments

Can I ask you about your work?

I don't understand the preprocessing for the KITTI data set.

There are several outputs, such as x0.bin, x1.bin, metadata, nnz_tr, nnz_te...

It looks like x0.bin is the image pixel data and metadata is the image information,
but I can't understand what tr, te, nnz_tr and nnz_te are.

nnz_tr = torch.FloatTensor(23e6, 4)

What does 23e6 mean here?

  • te: a list of size 40. Contains the indices of the validation set.
  • tr: a list of size 154 (194 - 40). Contains the indices of the training set.
  • nnz_{tr, te}: Contains the coordinates at which the ground truth is defined (ground truth disparity is not available at all locations of an image). The array is populated in the make_dataset2 function. 23e6 is just a big number used as an upper bound on the array size: I know there are no more than 23e6 ground truth disparities (I have an assert for that in the C code).
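
To make the idea concrete, here is a minimal NumPy sketch of how such an array could be filled. It is not the repository's make_dataset2 code. The function name collect_nnz, the toy disparity maps, the convention that 0 marks a missing ground truth value, and the column order (image index, row, column, disparity) are all assumptions made for illustration.

```python
import numpy as np

def collect_nnz(disparities, capacity):
    """Gather one row per pixel that has a ground truth disparity.

    Assumed (hypothetical) column layout: image index, row, column, disparity.
    """
    # Preallocate a buffer that is known to be large enough (the "23e6" idea).
    nnz = np.zeros((capacity, 4), dtype=np.float32)
    count = 0
    for img_idx, disp in enumerate(disparities):
        # Assumption: a disparity of 0 means "no ground truth at this pixel".
        rows, cols = np.nonzero(disp)
        n = len(rows)
        # Same role as the assert mentioned above: never overflow the buffer.
        assert count + n <= capacity, "preallocated buffer too small"
        nnz[count:count + n, 0] = img_idx
        nnz[count:count + n, 1] = rows
        nnz[count:count + n, 2] = cols
        nnz[count:count + n, 3] = disp[rows, cols]
        count += n
    # Keep only the rows that were actually filled.
    return nnz[:count]

# Two tiny made-up disparity maps with sparse ground truth.
disp_a = np.array([[0.0, 5.0], [0.0, 0.0]], dtype=np.float32)
disp_b = np.array([[7.5, 0.0], [0.0, 2.0]], dtype=np.float32)
nnz = collect_nnz([disp_a, disp_b], capacity=8)
```

The capacity only has to be an upper bound, which is why a round "big number" is enough as long as an assert guards against overflow.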
Qurey commented

@sooyeonshin @jzbontar
I don't understand the preprocessing for the Middlebury data set.
In preprocess_mb.py:
mask = cv2.imread('tmp/mask.png', 0)
disp0[mask != 255] = 0
y, x = np.nonzero(mask == 255)
What does the mask do, and what does it represent?

The mask stores information about which areas are occluded. Try opening the generated mask.png and it will make sense.
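
As a rough illustration of what the three quoted lines do, here is a small self-contained sketch. The toy arrays stand in for the real mask.png and disparity map, and their values are made up. Reading 255 as "usable (non-occluded) pixel" and every other value as "discard" is an assumption based on how the snippet is written.

```python
import numpy as np

# Toy stand-ins for the arrays in preprocess_mb.py; the values are invented.
# Assumption: 255 marks pixels to keep, anything else marks pixels to discard.
mask = np.array([[255, 0],
                 [128, 255]], dtype=np.uint8)
disp0 = np.array([[10.0, 20.0],
                  [30.0, 40.0]], dtype=np.float32)

# Zero out the disparity wherever the mask is not 255.
disp0[mask != 255] = 0

# Row (y) and column (x) coordinates of the pixels that were kept.
y, x = np.nonzero(mask == 255)
```

After this, disp0 holds a disparity only at the kept pixels, and (y, x) lists where those pixels are, much like the coordinate lists used for the KITTI data.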