alibaba/cascade-stereo

Too many background pixels on Tanks and Temples

Opened this issue · 18 comments

Hi. Thanks very much for your great work and for releasing the code. I used your pretrained model to test on the Tanks and Temples dataset, and the final fused point cloud contains too many background pixels, such as the sky in Lighthouse. In your paper, however, the model is quite clean. How can I remove such background pixels?


Hi. Do you also fuse the depth maps using the fusibile method provided by Gipuma? For the scenes in Tanks and Temples, it seems to take more than 11 GB of GPU memory to fuse depth maps at 1920x1080 resolution.

tatsy commented

For reference, fusibile consumed about 22 GB of memory when processing Francis, Lighthouse, and some other image sets in Tanks and Temples.

@tatsy I used fusibile to produce the 3D models on Tanks and Temples. However, for some scenes such as Lighthouse, Francis, and Playground, there is too much noise in the background. Did you have the same problem? Can you tell me your settings for CasMVSNet and fusibile that produce a good result?

Thanks,
Khang

tatsy commented

Hi @TruongKhang,

I tested both the fusibile and the python fusion. In my tests, the python fusion produces somewhat less noise than fusibile. I also found that the noise can be reduced by increasing the number of views for the MVSNet part, and by increasing the number of consistent views required for the depth fusion.
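For context, the "consistent views" filter is the geometric consistency check used in MVSNet-style fusion: a reference depth is kept only if reprojecting it into a source view and back lands near the original pixel at a similar depth. A minimal NumPy sketch of that test, written per pixel and with hypothetical helper names (the repo's actual filter is vectorized and its thresholds may differ):

```python
import numpy as np

def reproject(depth, pixel, K_from, K_to, R, t):
    """Lift `pixel` (u, v) to 3D at `depth` in the 'from' camera and
    project it into the 'to' camera. R, t map 'from' coords to 'to'."""
    u, v = pixel
    xyz_from = depth * (np.linalg.inv(K_from) @ np.array([u, v, 1.0]))
    xyz_to = R @ xyz_from + t
    uvw = K_to @ xyz_to
    return uvw[:2] / uvw[2], uvw[2]  # projected pixel and its depth

def is_consistent(pixel, d_ref, depth_src, K_ref, K_src, R, t,
                  pix_thresh=1.0, depth_thresh=0.01):
    # Reference pixel -> source view.
    (us, vs), _ = reproject(d_ref, pixel, K_ref, K_src, R, t)
    ui, vi = int(round(us)), int(round(vs))
    h, w = depth_src.shape
    if not (0 <= ui < w and 0 <= vi < h):
        return False
    # Source pixel -> back into the reference view.
    R_inv, t_inv = R.T, -R.T @ t
    (ur, vr), d_back = reproject(depth_src[vi, ui], (us, vs),
                                 K_src, K_ref, R_inv, t_inv)
    # Consistent if it lands near where it started, at a similar depth.
    reproj_err = np.hypot(ur - pixel[0], vr - pixel[1])
    return reproj_err < pix_thresh and abs(d_back - d_ref) / d_ref < depth_thresh
```

A pixel then survives fusion only if this check passes for at least N source views, so raising N trades completeness for less background noise.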

However, I have not yet completely solved the problem of heavy noise in the background, and I have not yet succeeded in reproducing the Tanks & Temples results reported in their paper.

I'd really appreciate it if you would share any knowledge that could solve the problem.

Hi @tatsy ,
I've been testing with both the provided python fusion and the fusibile fusion. As you said, the python fusion is much better than the fusibile fusion on the Tanks & Temples dataset. Also, we need to run with the short depth range that is already provided with the dataset. I think the result is acceptable now.
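For anyone looking for where that range lives: in the MVSNet-style camera files, the last non-empty line stores `DEPTH_MIN DEPTH_INTERVAL [DEPTH_NUM DEPTH_MAX]`, and the short-range files simply ship a tighter min/max. A minimal sketch for reading it (the function name is made up here):

```python
def read_depth_range(cam_path):
    """Return (depth_min, interval, depth_max) from an MVSNet-style cam file."""
    with open(cam_path) as f:
        lines = [line.strip() for line in f if line.strip()]
    vals = [float(x) for x in lines[-1].split()]  # last line holds the range
    depth_min, interval = vals[0], vals[1]
    depth_max = vals[3] if len(vals) > 3 else None  # DEPTH_NUM/DEPTH_MAX are optional
    return depth_min, interval, depth_max
```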

tatsy commented

Hi @TruongKhang,

Thank you for sharing your knowledge. Yes, I got almost the same results as yours. I used the short depth range provided by the MVSNet repo, too. The mean score for the intermediate image set will probably be around 50 or slightly higher. Is that true for you as well?

@tatsy , did you try to submit those results to the Tanks & Temples website? Did you get lower accuracy than reported in the paper? I haven't tried to submit yet.

I get terrible point cloud results on the Tanks and Temples dataset using the pretrained model; almost nothing is reconstructed. I used:

```
--dataset=general_eval --batch_size=1 --testpath=$TESTPATH --testlist=$TESTLIST --loadckpt $CKPT_FILE --outdir $save_results_dir --interval_scale 1.06 --max_h=2048 --max_w=2048 "${@:2}"
```

What other parameters should I set?

@YuhsiHu , which fusion method did you use? I suggest you change the number of consistent views.

> Hi @TruongKhang,
>
> Thank you for sharing your knowledge. Yes, I got almost the same results as yours. I used the short depth range provided by the MVSNet repo, too. The mean score for the intermediate image set will probably be around 50 or slightly higher. Is that true for you as well?

Hi @tatsy , how do you set the short depth range?

> @YuhsiHu , which fusion method did you use? I suggest you change the number of consistent views.

@TruongKhang For the number of consistent views, how many works better?

@ChenLiufeng I changed it manually for each scene in my testing; it was higher than 5 views. And I think in their submission they used both the fusibile fusion and the basic fusion provided in their source code.

> @ChenLiufeng I changed it manually for each scene in my testing; it was higher than 5 views. And I think in their submission they used both the fusibile fusion and the basic fusion provided in their source code.

@TruongKhang Thank you for your reply. I changed `--max_h=2048 --max_w=2048` and tested on Family, but I can't see the statue in the .ply file. Can you show me the details? What can I do to improve my results?

@ChenLiufeng Did you use the camera parameters with the short depth range? Can you show me a picture of your result?
There are some parameters you should pay attention to (see the example commands after this list):

1. `--filter_method`: set it to "gipuma" to use the fusibile fusion. If you want to use the basic fusion, just leave this parameter alone.
2. If you use the basic fusion, try changing the parameter `--thres_view`. I recommend setting it to 5; you can test other values as well.
3. If you use the fusibile fusion, pay attention to the two parameters `--disp_threshold` and `--num_consistent`. When you increase `--disp_threshold`, you should also increase `--num_consistent`. You can start with `--disp_threshold=0.4` and `--num_consistent=6`.

I have also run several other MVS methods, and the confidence parameter (`--prob_threshold` or `--conf`) sometimes does a good job of removing outliers in the background, but in CasMVSNet it doesn't have much effect.
Hope it helps. Let me know if you have any problems!
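Putting those flags together, a sketch of the two invocations (assuming the repo's test script is test.py and using placeholder paths; all other options are left at their defaults):

```bash
# Basic fusion: keep a depth only if it is consistent in at least 5 views.
python test.py --dataset=general_eval --batch_size=1 \
    --testpath=$TESTPATH --testlist=$TESTLIST --loadckpt $CKPT_FILE \
    --outdir $OUTDIR --thres_view 5

# Fusibile fusion: raise disp_threshold and num_consistent together.
python test.py --dataset=general_eval --batch_size=1 \
    --testpath=$TESTPATH --testlist=$TESTLIST --loadckpt $CKPT_FILE \
    --outdir $OUTDIR --filter_method gipuma \
    --disp_threshold 0.4 --num_consistent 6
```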

@tatsy Sorry to bother you. I used the preprocessed data that MVSNet offered; is it different from yours? What is the short depth range data that you mentioned?
Thanks a lot.

I followed the CasMVSNet pipeline to test on the T&T dataset and got the results below. How can I fix this effectively? filter_method is the default "normal".
Final fused point cloud:
[image]
Point cloud in the ply_local folder:
[image]
Depth map:
[image]


Hello, I ran into the same problem. Have you solved it?