neka-nat/cupoch

The result of using ClusterDBSCAN() is wrong, it treats each point as a cluster

six9326 opened this issue · 4 comments

Hello author, when I run cupoch::geometry::PointCloud::ClusterDBSCAN(), I run into the following problems:

  1. DBSCAN does not work correctly. When running examples/python/basic/clustering.py, it treats each point as its own cluster:

ginger@cv-nx16g:~/tools/cupoch/cupoch-git-jetson_nano/examples/python/basic$ python3 clustering.py
cube has 10000 points
[2023-06-25 11:42:43.086] [debug] GLFW init.
[2023-06-25 11:42:43.569] [debug] Add geometry and update bounding box to [(-0.9961, -0.9983, -0.9984) - (0.9980, 0.9986, 0.9999)]
[2023-06-25 11:43:21.937] [debug] Precompute Neighbours
Precompute Neighbours[========================================] 100%
cube has 10000 clusters
[2023-06-25 11:43:25.660] [debug] Add geometry and update bounding box to [(-0.9961, -0.9983, -0.9984) - (0.9980, 0.9986, 0.9999)]
torus has 10000 points
[2023-06-25 11:43:57.621] [debug] Add geometry and update bounding box to [(-7.4783, -7.4570, -2.4999) - (7.4883, 7.4562, 2.4999)]
[2023-06-25 11:44:29.363] [debug] Precompute Neighbours
Precompute Neighbours[========================================] 100%
torus has 10000 clusters
[2023-06-25 11:44:33.077] [debug] Add geometry and update bounding box to [(-7.4783, -7.4570, -2.4999) - (7.4883, 7.4562, 2.4999)]
shapes has 100000 points
[2023-06-25 11:44:49.955] [debug] Add geometry and update bounding box to [(-5.4992, -5.4916, -1.6180) - (5.6177, 1.6180, 1.6179)]
[2023-06-25 11:45:17.007] [debug] Precompute Neighbours
Precompute Neighbours[========================================] 100%
shapes has 100000 clusters
[2023-06-25 11:45:55.474] [debug] Add geometry and update bounding box to [(-5.4992, -5.4916, -1.6180) - (5.6177, 1.6180, 1.6179)]
[2023-06-25 11:46:44.137] [debug] Read geometry::PointCloud: 196133 vertices.

[2023-06-25 11:46:44.138] [debug] [RemoveNoneFinitePoints] 196133 nan points have been removed.
fragment has 196133 points
[2023-06-25 11:46:44.587] [debug] Add geometry and update bounding box to [(0.5586, 0.8320, 0.5666) - (3.9661, 2.4275, 2.5586)]
[2023-06-25 11:47:59.574] [debug] Precompute Neighbours
Precompute Neighbours[========================================] 100%
fragment has 196133 clusters
[2023-06-25 11:49:33.930] [debug] Add geometry and update bounding box to [(0.5586, 0.8320, 0.5666) - (3.9661, 2.4275, 2.5586)]
[2023-06-25 11:50:02.710] [debug] GLFW destruct.

  2. I also tried ClusterDBSCAN from C++ (linking the compiled static libraries libcupoch_geometry.a, libcupoch_knn.a, etc.), and it also treats each point as its own cluster:

I20230625 10:51:57.111202 25880 libs2.h:904] [235] -0.0129876, 0.164009, 0.402
I20230625 10:51:57.111220 25880 libs2.h:904] [236] -0.0391514, -0.0884331, 0.568
I20230625 10:51:57.111239 25880 libs2.h:904] [237] -0.0215065, 0.159529, 0.404
I20230625 10:51:57.111258 25880 libs2.h:904] [238] -0.131906, 0.12261, 0.419
I20230625 10:51:57.111276 25880 libs2.h:904] [239] -0.235723, 0.0184304, 0.481
I20230625 10:51:57.111295 25880 libs2.h:904] [240] -0.229842, 0.0364145, 0.469
I20230625 10:51:57.111315 25880 libs2.h:904] [241] 0.12799, 0.0398112, 0.496
I20230625 10:51:57.111333 25880 libs2.h:904] [242] -0.0607308, -0.127435, 0.59
I20230625 10:51:57.111351 25880 libs2.h:904] [243] -0.22705, 0.0557289, 0.456
I20230625 10:51:57.111369 25880 libs2.h:904] [244] -0.215828, -0.00704491, 0.499
I20230625 10:51:57.111388 25880 libs2.h:904] [245] -0.188384, 0.0246328, 0.479
I20230625 10:51:57.111407 25880 libs2.h:907] dbscan_cupoch, minPoints = 5, epsilon = 0.05, points_cpu.size(0) = 246
I20230625 10:52:00.948673 25880 libs2.h:1421] [DEBUG] dbscan_cupoch_labels = 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 ...

But with Open3D's ClusterDBSCAN function, using the same data and the same arguments, the result is correct:

I20230625 10:52:13.390372 25880 libs2.h:980] [235] -0.0129876, 0.164009, 0.402
I20230625 10:52:13.390399 25880 libs2.h:980] [236] -0.0391514, -0.0884331, 0.568
I20230625 10:52:13.390426 25880 libs2.h:980] [237] -0.0215065, 0.159529, 0.404
I20230625 10:52:13.390453 25880 libs2.h:980] [238] -0.131906, 0.12261, 0.419
I20230625 10:52:13.390480 25880 libs2.h:980] [239] -0.235723, 0.0184304, 0.481
I20230625 10:52:13.390508 25880 libs2.h:980] [240] -0.229842, 0.0364145, 0.469
I20230625 10:52:13.390535 25880 libs2.h:980] [241] 0.12799, 0.0398112, 0.496
I20230625 10:52:13.390563 25880 libs2.h:980] [242] -0.0607308, -0.127435, 0.59
I20230625 10:52:13.390589 25880 libs2.h:980] [243] -0.22705, 0.0557289, 0.456
I20230625 10:52:13.390615 25880 libs2.h:980] [244] -0.215828, -0.00704491, 0.499
I20230625 10:52:13.390642 25880 libs2.h:980] [245] -0.188384, 0.0246328, 0.479
I20230625 10:52:13.390668 25880 libs2.h:982] dbscan_open3d, minPoints = 5, epsilon = 0.05, points_cpu.size(0) = 246
I20230625 10:52:13.408399 25880 libs2.h:1451] [DEBUG] dbscan_open3d_labels = -1 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
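The difference between the two outputs above can be quantified with a few lines of Python. This is a minimal sketch, assuming the labels are available as a plain list of integers and that -1 marks noise (as in the Open3D output); `summarize_labels` is a hypothetical helper, not part of cupoch or Open3D, and the inputs are short illustrative lists rather than the full data.

```python
from collections import Counter

def summarize_labels(labels):
    """Return (n_clusters, n_noise, all_singletons) for a DBSCAN label list.

    Assumes -1 marks noise and every other integer is a cluster ID.
    """
    counts = Counter(l for l in labels if l != -1)
    n_noise = sum(1 for l in labels if l == -1)
    # Symptom from this report: every point ends up in its own cluster.
    all_singletons = bool(counts) and all(c == 1 for c in counts.values())
    return len(counts), n_noise, all_singletons

# Illustrative inputs shaped like the two logs (hypothetical, shortened).
print(summarize_labels([0, 1, 2, 3, 4, 5]))    # (6, 0, True)
print(summarize_labels([-1, 0, -1, 0, 0, 0]))  # (1, 2, False)
```

A result where the cluster count equals the point count and `all_singletons` is true matches the "N points, N clusters" symptom in the clustering.py output.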

  3. When running examples/python/basic/benchmarks.py, DBSCAN on the GPU takes much longer than DBSCAN on the CPU:

ginger@cv-nx16g:~/user/lz/tools/cupoch/cupoch-git-jetson_nano/examples/python/basic$ python3 benchmarks.py
read_point_cloud (CPU) [sec]: 0.12378454208374023
PointCloud with 196133 points.
read_point_cloud (GPU) [sec]: 0.12287688255310059
geometry::PointCloud with 196133 points.
transform (CPU) [sec]: 0.007102251052856445
transform (GPU) [sec]: 0.0012631416320800781
estimate_normals (CPU) [sec]: 1.6187536716461182
estimate_normals (GPU) [sec]: 0.25005269050598145
voxel_down_sample (CPU) [sec]: 0.26772594451904297
voxel_down_sample (GPU) [sec]: 0.044162750244140625
remove_radius_outlier (CPU) [sec]: 17.790186643600464
remove_radius_outlier (GPU) [sec]: 0.09036445617675781
remove_statistical_outlier (CPU) [sec]: 1.132619857788086
remove_statistical_outlier (GPU) [sec]: 0.1464691162109375
registration_icp (CPU) [sec]: 153.40958786010742
registration_icp (GPU) [sec]: 0.18308281898498535
cluster_dbscan (CPU) [sec]: 2.1327126026153564
cluster_dbscan (GPU) [sec]: 92.28520059585571
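To make the outlier easier to spot, the timings can be turned into CPU/GPU ratios. The following is a sketch only: it assumes the benchmark prints lines in exactly the `name (CPU) [sec]: value` form shown above, and `gpu_speedups` is a hypothetical helper name. In the numbers above, cluster_dbscan is the only operation where the GPU is slower, by a factor of roughly 43.

```python
import re

# Matches lines such as "cluster_dbscan (GPU) [sec]: 92.28520059585571".
BENCH_LINE = re.compile(r"^(\w+) \((CPU|GPU)\) \[sec\]: ([0-9.eE+-]+)$")

def gpu_speedups(log_text):
    """Map each benchmark name to CPU time / GPU time (< 1 means GPU is slower)."""
    times = {}
    for line in log_text.splitlines():
        m = BENCH_LINE.match(line.strip())
        if m:
            name, device, secs = m.group(1), m.group(2), float(m.group(3))
            times.setdefault(name, {})[device] = secs
    # Keep only benchmarks that reported both a CPU and a GPU timing.
    return {
        name: t["CPU"] / t["GPU"]
        for name, t in times.items()
        if "CPU" in t and "GPU" in t
    }

# Two pairs of lines copied from the benchmark output in this report.
log = """\
remove_radius_outlier (CPU) [sec]: 17.790186643600464
remove_radius_outlier (GPU) [sec]: 0.09036445617675781
cluster_dbscan (CPU) [sec]: 2.1327126026153564
cluster_dbscan (GPU) [sec]: 92.28520059585571
"""
for name, ratio in gpu_speedups(log).items():
    flag = "GPU slower" if ratio < 1 else "GPU faster"
    print(f"{name}: {ratio:.3f}x ({flag})")
```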

  4. I get the results above in both of the following environments:
    (1). Ubuntu 20.04, NVIDIA GeForce GTX 1070, cupoch-0.2.7, open3d-devel-linux-x86_64-cxx11-abi-cuda-0.16.0, cuda 11.6
    (2). Ubuntu 20.04, Nvidia Jetson NX 16g, cupoch-0.2.7.3, open3d-0.16.0, cuda 11.4

  5. When using the C++ library, which Open3D version corresponds to which cupoch version? I couldn't find an exact match. I got it to compile by using the same spdlog and fmt versions for both, and by manually making a few small changes to open3d/utility/logging.h.

Thanks for reporting!
It looks like a bug had crept in; it has now been fixed.
Please try the latest master.

I tried the latest master, but it looks like there is still a bug.

  1. I generated some simulated data, and dbscan_open3d works correctly. There is only one cluster, labeled 0; -1 is noise:

[Open3D DEBUG] Precompute neighbors.
[Open3D DEBUG] Done Precompute neighbors.
[Open3D DEBUG] Compute Clusters
[Open3D DEBUG] Done Compute Clusters: 1
I20230626 14:58:15.528476 829179 libs2.h:1481] [DEBUG] dbscan_open3d_labels.size = 238
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 -1 0 0

  2. But dbscan_cupoch returns wrong labels:

I20230626 14:57:48.525573 829179 libs2.h:1429] [DEBUG] dbscan_cupoch_labels.size = 238
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 6 0 0

It seems that each noise point is given a new, incrementing cluster ID; these points should be labeled -1.
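Until a fix is available, one possible workaround is to post-process the labels. This is a heuristic sketch, not the maintainer's fix: it assumes any cluster with fewer than min_points members is really noise, which holds for the output above but can misclassify a genuine cluster in rare cases where border points are claimed by a neighbouring cluster. `mark_small_clusters_as_noise` is a hypothetical helper name.

```python
from collections import Counter

def mark_small_clusters_as_noise(labels, min_points):
    """Relabel clusters with fewer than min_points members as noise (-1).

    Surviving clusters are renumbered 0..k-1 in order of first appearance.
    """
    sizes = Counter(l for l in labels if l != -1)
    remap = {}
    out = []
    for l in labels:
        if l == -1 or sizes[l] < min_points:
            out.append(-1)
        else:
            # First time a surviving cluster is seen, give it the next free ID.
            out.append(remap.setdefault(l, len(remap)))
    return out

# Hypothetical, shortened example in the shape of the cupoch output above.
print(mark_small_clusters_as_noise([0, 0, 0, 0, 0, 1, 0, 2, 0], min_points=5))
# [0, 0, 0, 0, 0, -1, 0, -1, 0]
```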

Thanks!
I made a change so that noise points are labeled -1.
Please try the latest master again.

Please support this repository with a star if you like.

Thanks very much! The results of dbscan_cupoch are now the same as those of dbscan_open3d.
But dbscan_cupoch (13.8 ms) is slower than dbscan_open3d (7.7 ms):

dbscan_open3d, minPoints = 5, epsilon = 0.05, points.size = 221, time_used = 13.8 ms
dbscan_open3d_labels.size = 221
0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

dbscan_cupoch, minPoints = 5, epsilon = 0.05, points.size = 221, time_used = 7.7 ms
dbscan_cupoch_labels.size = 221
0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
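For larger inputs with several clusters, comparing the two label lists element by element may be too strict, since the two libraries might number the clusters in a different order (an assumption; this is not confirmed for cupoch). A minimal sketch of a comparison that ignores cluster numbering but requires noise points to match, with `same_clustering` as a hypothetical helper name:

```python
def same_clustering(a, b):
    """True if label lists a and b describe the same partition.

    Cluster IDs may be numbered differently, but noise (-1) must match exactly.
    """
    if len(a) != len(b):
        return False
    fwd, bwd = {}, {}
    for x, y in zip(a, b):
        # A point must be noise in both results or in neither.
        if (x == -1) != (y == -1):
            return False
        # Enforce a one-to-one mapping between the two sets of cluster IDs.
        if fwd.setdefault(x, y) != y or bwd.setdefault(y, x) != x:
            return False
    return True

# Hypothetical examples: same partition with swapped IDs, then a real mismatch.
print(same_clustering([0, 0, -1, 1], [1, 1, -1, 0]))  # True
print(same_clustering([0, 0, 1], [0, 1, 1]))          # False
```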