Name: Arvind Kanesan Rathna (ak4728) and Kyle Coelho (kc3415)
The implementation uses CUDA, so make sure the CUDA environment is activated before running anything:
$ cudaEnv
- 3D and 2D RANSAC benchmarking uses 3d_python/python_ransac_3d.py, 3d_python/ransac_pycuda_3d_level4.py, python_ransac.py, and ransac_pycuda_level4.py. Readers who want to understand the implementation should start with these files.
- To reproduce the benchmark graphs for the 3D version of RANSAC shown in the presentation and report, run:
$ cd 3d_python/
$ python 3d_benchmarking.py
  - The first graph is named ransac_3d_samples_cuda.png by default. It plots the execution time of the serial and CUDA versions as the size of the dataset varies.
  - The second graph is named ransac_3d_models_cuda.png by default. It plots the execution time of the serial and CUDA versions as the number of RANSAC models varies.
  - The third graph is named ransac_3d_const_mem_cuda.png by default. It evaluates the effect of constant memory on execution time.
  - Setting the boolean plot_cuda_mem to True also prints the CUDA execution time split into memory transfer time and computation time.
  - The outputs of the serial and CUDA versions have been verified to match perfectly. Randomization is accounted for by setting an initial seed value.
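For readers who want a feel for what is being benchmarked, here is a minimal serial sketch of RANSAC plane fitting on 3D points, with a fixed seed so that repeated runs draw the same candidate models. It is not the code from 3d_python/python_ransac_3d.py: the function names (fit_plane, ransac_plane, make_data), the plane model, the inlier threshold, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def fit_plane(p0, p1, p2):
    """Return (unit normal, offset d) of the plane through three points.

    Returns None when the points are collinear and define no unique plane.
    """
    normal = np.cross(p1 - p0, p2 - p0)
    norm = np.linalg.norm(normal)
    if norm < 1e-12:
        return None
    normal = normal / norm
    return normal, -np.dot(normal, p0)


def ransac_plane(points, n_models=200, threshold=0.05, seed=0):
    """Hypothetical serial RANSAC: keep the candidate plane with most inliers."""
    rng = np.random.default_rng(seed)  # fixed seed -> same candidates every run
    best_count, best_model = -1, None
    for _ in range(n_models):
        idx = rng.choice(len(points), size=3, replace=False)
        model = fit_plane(*points[idx])
        if model is None:
            continue
        normal, d = model
        dist = np.abs(points @ normal + d)  # point-to-plane distances
        count = int(np.sum(dist < threshold))
        if count > best_count:
            best_count, best_model = count, model
    return best_model, best_count


def make_data(n_inliers=300, n_outliers=100, seed=1):
    """Synthetic test set (an assumption): a noisy plane plus uniform outliers."""
    rng = np.random.default_rng(seed)
    xy = rng.uniform(-1, 1, size=(n_inliers, 2))
    z = 0.5 * xy[:, 0] - 0.2 * xy[:, 1] + 0.1 + rng.normal(0, 0.01, n_inliers)
    inliers = np.column_stack([xy, z])
    outliers = rng.uniform(-1, 1, size=(n_outliers, 3))
    return np.vstack([inliers, outliers])


if __name__ == "__main__":
    pts = make_data()
    (normal, d), count = ransac_plane(pts, seed=42)
    print("normal:", normal, "d:", d, "inliers:", count)
```

Because every random draw comes from one seeded generator, two runs with the same seed evaluate identical candidate planes. That property is what makes a serial result directly comparable with a parallel one, provided both consume the same pre-drawn sample indices.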
- To profile the code using NSight, first set the flags n_samples_test and ransac_iterations_test to False in 3d_benchmarking.py. Then run:
$ cd 3d_python/
$ nv-nsight-cu-cli -o metrics python 3d_benchmarking.py > output.txt
$ nv-nsight-cu metrics.nsight-cuprof-report
  - In NSight, we can view the profiling information (SM%, Mem%, etc.) as the block size varies.
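When the block size is varied for profiling, the grid size usually has to change with it so that every work item still gets a thread. The sketch below shows the usual ceiling-division calculation. The helper name launch_config and the item count of 100,000 are hypothetical; the repository's actual launch parameters are not shown in this README.

```python
MAX_THREADS_PER_BLOCK = 1024  # limit on current NVIDIA GPUs


def launch_config(n_items, block_size):
    """Return (grid_size, block_size) with grid_size * block_size >= n_items."""
    if not 0 < block_size <= MAX_THREADS_PER_BLOCK:
        raise ValueError("block_size must be in 1..1024")
    grid_size = (n_items + block_size - 1) // block_size  # ceiling division
    return grid_size, block_size


if __name__ == "__main__":
    n_items = 100_000  # assumed workload size, for illustration only
    for block_size in (32, 64, 128, 256, 512, 1024):
        grid, block = launch_config(n_items, block_size)
        idle = grid * block - n_items  # threads in the last block with no work
        print(f"block={block:>4} grid={grid:>5} idle_threads={idle}")
```

The surplus threads in the last block are the reason a kernel needs a bounds check on its global index before touching memory.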
- In a similar vein, we can run the benchmarks for the 2D version of RANSAC and generate similar plots as shown in the report:
$ python benchmarking.py
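The benchmarks compare execution time as the dataset size varies. A minimal sketch of such a sweep is given below, assuming a generic callable under test. The helper time_over_sizes and its parameters are hypothetical and are not taken from benchmarking.py.

```python
import time


def time_over_sizes(fn, make_input, sizes, repeats=3):
    """Return the best wall-clock time in seconds of fn(make_input(n)) per n."""
    results = []
    for n in sizes:
        data = make_input(n)  # build the input outside the timed region
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn(data)
            best = min(best, time.perf_counter() - start)
        results.append(best)
    return results


if __name__ == "__main__":
    sizes = [1_000, 10_000, 100_000]
    # Stand-in workload: summing a list; swap in the function to be measured.
    timings = time_over_sizes(sum, lambda n: list(range(n)), sizes)
    for n, t in zip(sizes, timings):
        print(f"n={n:>7}: {t * 1e3:.3f} ms")
```

Taking the best of several repeats reduces noise from other processes. When the timed function launches GPU work, the device should be synchronized before the clock is stopped, since kernel launches return before the kernel finishes.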
- It is possible to individually run any of the .py files below and view the sample outputs:
# For example,
$ python ransac_pycuda_level4.py
- 3d_python/python_ransac_3d.py: Serial implementation of RANSAC for 3D data points.
- 3d_python/ransac_pycuda_3d_level4.py: Fully parallelized version of RANSAC for 3D data points. This version is called level 4 in the report and presentation.
- 3d_python/kernel_3d_ransac.cu: The CUDA kernels for 3D RANSAC.
- 3d_python/ransac_pycuda_3d_level4_constant.py: Fully parallelized version of RANSAC for 3D data points that also uses constant memory.
- 3d_python/ransac_pycuda_3d_level2.py: Level 2 parallelized version of RANSAC for 3D data points. See the report/presentation for details.
- 3d_python/ransac_pycuda_3d_level1.py: Level 1 parallelized version of RANSAC for 3D data points. See the report/presentation for details.
- python_ransac.py: Serial implementation of RANSAC for 2D data points.
- ransac_pycuda_level4.py: Fully parallelized version of RANSAC for 2D data points. This version is called level 4 in the report and presentation.
- kernel_ransac.cu: The CUDA kernels for 2D RANSAC.
- ransac_pycuda_level3.py: Level 3 parallelized version of RANSAC for 2D data points. See the report/presentation for details.
- ransac_pycuda_level2.py: Level 2 parallelized version of RANSAC for 2D data points. See the report/presentation for details.
- ransac_pycuda_level1.py: Level 1 parallelized version of RANSAC for 2D data points. See the report/presentation for details.