Kitware/Danesfield

VisSat Error-after bundle cameras

Closed this issue · 12 comments

Hi! I'm trying to run the generate_point_cloud section of the code using VisSat for the Jacksonville Core3D data, but it errors out in self.run_colmap_sfm_perspective() because the after_bundle_cameras data never populates. I'm using the Docker image. Here's the full error:

Traceback (most recent call last):
  File "/VisSatSatelliteStereo/stereo_pipeline.py", line 511, in <module>
    pipeline.run()
  File "/VisSatSatelliteStereo/stereo_pipeline.py", line 115, in run
    self.run_colmap_sfm_perspective()
  File "/VisSatSatelliteStereo/stereo_pipeline.py", line 346, in run_colmap_sfm_perspective
    colmap_sfm_perspective.run_sfm(work_dir, sfm_dir, init_camera_file, weight)
  File "/VisSatSatelliteStereo/colmap_sfm_perspective.py", line 95, in run_sfm
    after_bundle_params = after_bundle_cameras[img_name]
KeyError: '0000_WV03_14OCT05_160138-P1BS-500648062040_01_P001.png'
Traceback (most recent call last):
  File "/danesfield/tools/generate_point_cloud.py", line 51, in <module>
    main(sys.argv[1:])
  File "/danesfield/tools/generate_point_cloud.py", line 32, in main
    subprocess.run(cmd_args, check=True)
  File "/opt/conda/envs/core3d/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/bin/bash', '-c', 'source /opt/conda/etc/profile.d/conda.sh && conda activate vissat                 && python3 /VisSatSatelliteStereo/stereo_pipeline.py --config_file /home/derrickbonafilia/config.json']' returned non-zero exit status 1.
ERROR:root:---- Error on step: VisSat. Aborting! ----

and here's the config for the VisSat:

{'dataset_dir': '/home/derrickbonafilia/spacenet/WV3/PAN/', 'work_dir': './', 'bounding_box': {'zone_number': 17, 'hemisphere': 'N', 'ul_easting': 435525, 'ul_northing': 3355525, 'width': 1402.0, 'height': 1448.0}, 'steps_to_run': {'clean_data': True, 'crop_image': True, 'derive_approx': True, 'choose_subset': True, 'colmap_sfm_perspective': True, 'inspect_sfm_perspective': False, 'reparam_depth': True, 'colmap_mvs': True, 'aggregate_2p5d': True, 'aggregate_3d': True}, 'alt_min': -30.0, 'alt_max': 120.0}

Thanks for any help you can give!

Are you running it with nvidia-docker?

Yes, I'm running it with:

nvidia-docker run -it --rm --gpus all --shm-size 8G\
     -v /$DATA/CORE3D:/mnt -v $HOME:/home/$USER -v /$WORK:/work\
     kitware/danesfield /bin/bash

I haven't been running this myself, so I'm not sure if others have encountered this. @jacobderosa or @borovik135, have you seen this issue before?

I focused mostly on the imageless pipeline, so I have no recollection of this or a similar error.

I don't know exactly what the issue is, but I suspect something went wrong during the derive_approx step or during the SIFT matching early in the colmap_sfm step. Check the log files.

Thanks for the log file pointer. Looks like I get through the derive_approx Feature Extraction and Exhaustive Feature Matching stages without error, but then I get to the "Loading database" stage and get a SIGSEGV. Do you have any insight into what might be happening here? I'm feeding the data inside Satellite-Images/Jacksonville/WV3/PAN to the pipeline, is that correct?

Here's the log output leading up to the error.
`Running subprocess: colmap feature_extractor
--database_path /home/derrickbonafilia/work/colmap/sfm_perspective/database.db
--image_path /home/derrickbonafilia/work/colmap/sfm_perspective/images
--ImageReader.camera_model PERSPECTIVE --SiftExtraction.max_image_size 10000
--SiftExtraction.estimate_affine_shape 0 --SiftExtraction.domain_size_pooling 1
--SiftExtraction.max_num_features 25000 --SiftExtraction.num_threads 32
--SiftExtraction.use_gpu 1 --SiftExtraction.gpu_index 0,1,2,3

==============================================================================

Feature extraction

==============================================================================

Subprocess finished
Running subprocess: colmap exhaustive_matcher
--database_path /home/derrickbonafilia/work/colmap/sfm_perspective/database.db
--SiftMatching.guided_matching 1 --SiftMatching.num_threads 6
--SiftMatching.max_error 3 --SiftMatching.max_num_matches 30000
--SiftMatching.gpu_index 0,1,2,3

==============================================================================

Exhaustive feature matching

==============================================================================

Elapsed time: 0.013 [minutes]

Subprocess finished
Running subprocess: colmap point_triangulator --Mapper.ba_refine_principal_point 1
--database_path /home/derrickbonafilia/work/colmap/sfm_perspective/database.db
--image_path /home/derrickbonafilia/work/colmap/sfm_perspective/images
--input_path /home/derrickbonafilia/work/colmap/sfm_perspective/tri
--output_path /home/derrickbonafilia/work/colmap/sfm_perspective/tri
--Mapper.filter_min_tri_angle 4.99 --Mapper.init_max_forward_motion 1e20
--Mapper.tri_min_angle 5.00 --Mapper.tri_merge_max_reproj_error 32.0
--Mapper.tri_complete_max_reproj_error 32.0 --Mapper.filter_max_reproj_error 32.0
--Mapper.extract_colors 1 --Mapper.ba_refine_focal_length 0
--Mapper.ba_refine_extra_params 0 --Mapper.max_extra_param 1e20
--Mapper.ba_local_num_images 6 --Mapper.ba_local_max_num_iterations 100
--Mapper.ba_global_images_ratio 1.0000001
--Mapper.ba_global_max_num_iterations 100 --Mapper.tri_ignore_two_view_tracks 1

==============================================================================

Loading database

==============================================================================

Loading cameras... 26 in 0.000s

Loading matches... 0 in 0.000s

Loading images... 0 in 0.000s (connected 0)

Building correspondence graph... in 0.000s (ignored 0)

Elapsed time: 0.000 [minutes]

*** Aborted at 1633535662 (unix time) try "date -d @1633535662" if you are using GNU date ***

PC: @ 0x557e8b25afbd (unknown)

*** SIGSEGV (@0x0) received by PID 1997 (TID 0x7f85b5ae7900) from PID 0; stack trace: ***

@     0x7f85b408c980 (unknown)

@     0x557e8b25afbd (unknown)

@     0x557e8b233140 (unknown)

@     0x557e8b2200dd (unknown)

@     0x7f85b0e09bf7 __libc_start_main

@     0x557e8b229f7a (unknown)

`

Could it be that we are running out of RAM with --SiftExtraction.max_image_size 10000? The default is 3200. Exhaustive feature matching is O(N^2) in the number of images, so the more economical vocab_tree_matcher may be better on both RAM and speed.
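
To put the O(N^2) in numbers for this run: the log above reports 26 cameras, and exhaustive matching considers every unordered image pair, i.e. N(N-1)/2 of them. A tiny sketch (the function name num_pairs is just for illustration):

```python
def num_pairs(n_images):
    """Number of unordered image pairs that exhaustive matching has to consider."""
    return n_images * (n_images - 1) // 2

# 26 is the camera count reported in the "Loading cameras" line of the log.
print(num_pairs(26))  # 325
```

This only counts pairs; it says nothing about how much memory each pair needs at a given max_image_size.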

I'm not convinced that it's finding the images to process at all. Is there anything in /home/derrickbonafilia/work/colmap/sfm_perspective/images after processing? Can you confirm the names of the images you put in Satellite-Images/Jacksonville/WV3/PAN?
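
Another way to check this is to look inside database.db directly, since the log reports 26 cameras but 0 matches and 0 images. The sketch below assumes the standard COLMAP SQLite schema (tables named cameras, images, keypoints and matches) is what this build uses; count_rows is a hypothetical helper, not part of VisSat or COLMAP.

```python
import sqlite3

# Table names assumed from the standard COLMAP database schema.
TABLES = ("cameras", "images", "keypoints", "matches")

def count_rows(db_path):
    """Return {table name: row count} for a COLMAP database file."""
    conn = sqlite3.connect(db_path)
    try:
        return {
            table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
            for table in TABLES
        }
    finally:
        conn.close()

# Example call (path taken from the log above):
# print(count_rows("/home/derrickbonafilia/work/colmap/sfm_perspective/database.db"))
```

If keypoints or matches come back as 0, the problem is upstream of point_triangulator, in feature extraction or matching.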

Here's everything in /home/derrickbonafilia/work/colmap/sfm_perspective/images :

0000_WV03_14OCT05_160138-P1BS-500648062040_01_P001.png  0013_WV03_15APR26_162435-P1BS-501504472050_01_P001.png
0001_WV03_14OCT05_160149-P1BS-500648061080_01_P001.png  0014_WV03_15MAY01_160357-P1BS-500648062030_01_P001.png
0002_WV03_14OCT11_155720-P1BS-500648061020_01_P001.png  0015_WV03_15MAY02_161943-P1BS-500648061030_01_P001.png
0003_WV03_14OCT18_160722-P1BS-500648062090_01_P001.png  0016_WV03_15MAY14_160906-P1BS-501504473050_01_P001.png
0004_WV03_14OCT30_155732-P1BS-500648061040_01_P001.png  0017_WV03_15MAY21_161849-P1BS-500648061010_01_P001.png
0005_WV03_14DEC14_160402-P1BS-500648062060_01_P001.png  0018_WV03_15JUN15_161248-P1BS-500648061100_01_P001.png
0006_WV03_14DEC27_161109-P1BS-500648062070_01_P001.png  0019_WV03_15JUL05_162954-P1BS-500648062020_01_P001.png
0007_WV03_15JAN21_161243-P1BS-500648061050_01_P001.png  0020_WV03_15SEP25_163525-P1BS-501504473070_01_P001.png
0008_WV03_15JAN21_161253-P1BS-500648062050_01_P001.png  0021_WV03_15NOV01_161954-P1BS-500648062080_01_P001.png
0009_WV03_15JAN21_161308-P1BS-501504474040_01_P001.png  0022_WV03_15NOV01_162034-P1BS-500648061060_01_P001.png
0010_WV03_15JAN27_160845-P1BS-500648062010_01_P001.png  0023_WV03_15DEC21_161108-P1BS-501504473080_01_P001.png
0011_WV03_15FEB15_161208-P1BS-500648061070_01_P001.png  0024_WV03_16FEB11_163042-P1BS-501504472070_01_P001.png
0012_WV03_15APR19_161439-P1BS-501504474050_01_P001.png  0025_WV03_16FEB18_164007-P1BS-501504472090_01_P001.png

The types of files in the folder are:

27JAN15WV031100015JAN27160845-P1BS-500648062010_01_P001_________AAE_0AAAAABPABS0.NTF
27JAN15WV031100015JAN27160845-P1BS-500648062010_01_P001_________AAE_0AAAAABPABS0.rm
27JAN15WV031100015JAN27160845-P1BS-500648062010_01_P001_________AAE_0AAAAABPABS0.tar
27JAN15WV031100015JAN27160845-P1BS-500648062010_01_P001_________AAE_0AAAAABPABS0.tif
27JAN15WV031100015JAN27160845-P1BS-500648062010_01_P001_________AAE_0AAAAABPABS0.vrt
27JAN15WV031100015JAN27160845-P1BS-500648062010_01_P001_________AAE_0AAAAABPABS0_lv1.tif

for each of the various dates/images in the Jacksonville repo

Wondering if maybe there's a CUDA issue happening somewhere. Is there a suggested/required CUDA version for this? Or are there some implied hardware requirements I might not be meeting?

Inside the container we are using CUDA 10.0. I'm not sure what sort of requirements that puts on the version of CUDA or the Nvidia driver running on the host.

It looks like the issue is this --SiftExtraction.num_threads 32 parameter in the Feature extraction. Deleted that and now it seems to be working.
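
For reference, a minimal sketch of what dropping that flag amounts to, assuming the command is assembled as a Python argument list before being handed to subprocess. The thread doesn't show how VisSat actually builds the command, so drop_flag is a hypothetical helper and the argument list is abbreviated from the log above.

```python
def drop_flag(args, flag):
    """Return a copy of args without `flag` and the single value that follows it."""
    out = []
    skip_next = False
    for arg in args:
        if skip_next:
            # This is the value belonging to the removed flag.
            skip_next = False
            continue
        if arg == flag:
            skip_next = True
            continue
        out.append(arg)
    return out

# Abbreviated version of the feature_extractor call from the log.
cmd = [
    "colmap", "feature_extractor",
    "--SiftExtraction.max_image_size", "10000",
    "--SiftExtraction.num_threads", "32",
    "--SiftExtraction.use_gpu", "1",
]
cmd = drop_flag(cmd, "--SiftExtraction.num_threads")
print(" ".join(cmd))
```

With the flag removed, the thread count is left to COLMAP's default instead of being forced to 32.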

Thanks to all of you for all the help on this and your responsiveness.