KDTree built with MSVC (Microsoft Visual C++) does not achieve 100% accuracy
Closed this issue · 3 comments
Environment
- crvs/KDTree (latest)
- gcc 9.4.0
- MSVC 14.40.33810
Issue
KDTree built with gcc is 100% accurate, but when built with MSVC (Visual Studio 2022), the accuracy somehow drops.
I reproduced this phenomenon in error_test.cpp.
(The original code aborts the test if 100% accuracy is not achieved, but I modified it to continue.)
> . \bin\Release\error_test.exe
Total number of iterations ran: 30
Accuracy (tested with 50 datasets per iter): 92.9333%. Total Number of correct queries: 46 / 50
Total query time: { bruteForce: 0.0052009, kdTree: 0.0023472 }
Accuracy (tested with 500 datasets per iter): 96.8733%. Total Number of correct queries: 484 / 500
Total query time: { bruteForce: 0.491479, kdTree: 0.0470063 }
Accuracy (tested with 1000 datasets per iter): 97.17%. Total Number of correct queries: 971 / 1000
Total query time: { bruteForce: 1.98447, kdTree: 0.120078 }
Minimal reproducible example
I modified toy_test.cpp to create a simple example
Strangely, the error does not always occur on the same input. That is, a single query may give the correct result, but some of the results may be wrong after ~100 runs.
(In the lucky case, the test program does not produce errors. If you want to reproduce it, please try running it several times.)
#include <iostream>
#include <vector>
#include "KDTree.hpp"
int main() {
pointVec points;
points.push_back(point_t{ 0.0, 0.0 });
points.push_back(point_t{ 1.0, 0.0 });
points.push_back(point_t{ 0.0, 1.0 });
points.push_back(point_t{ 1.0, 1.0 });
points.push_back(point_t{ 0.5, 0.5 });
KDTree tree(points);
point_t pt { 0.8, 0.2 };
point_t ground_truth { 1.0, 0.0 };
std::cout << "query point: " << pt[0] << ", " << pt[1] << std::endl;
std::cout << "ground truth: " << ground_truth[0] << ", " << ground_truth[1] << std::endl;
int correct = 0, incorrect = 0;
for(int i = 0; i < 100; ++i)
tree.nearest_point(pt) == ground_truth ? ++correct : ++incorrect;
std::cout << "correct: " << correct << ", incorrect: " << incorrect << std::endl;
}
> .\bin\Release\toy_test.exe
correct: 79, incorrect: 21
I am quite confused because the KDTree implementation will unlikely include any random elements.
I will try to figure out why by reading your code.
I think I have finally found the possible cause.
In toy_test, the program starts with (0.5, 0.5) and then considers whether (0.5, 0.5) or (1.0, 0.0) is closer to (0.8, 0.2).
Of course the latter is closer, so the program tries to insert the data for (1.0, 0.0) into k_nearest_buffer
.
Whether the insert is actually done or not is determined by insert_it == k_nearest_buffer.end() || insert_it->first ! = std::next(insert_it)->first
. In my understanding, this condition prevents the same point from being added.
Lines 137 to 140 in a1baa5b
I printed out the memory addresses of insert_it->first
and std::next(insert_it)->first
and found something interesting. Both of the following outputs are at the point where there is only (0.5, 0.5) in k_nearest_buffer
and the program is trying to insert (1.0, 0.0).
- correct output(MSVC)
insert_it->first: 0000021F11CAEFC0
std::next(insert_it)->first: 0000021F11CAEF00
k_nearest_buffer.end()->first: 0000021F11CAEF00
insert_it->first != std::next(insert_it)->first: true
- incorrect output(MSVC)
insert_it->first: 0000021F11CAEFC0
std::next(insert_it)->first: 0000021F11CAEFC0
k_nearest_buffer.end()->first: 0000021F11CAEFC0
insert_it->first != std::next(insert_it)->first: false
std::next(insert_it)->first
changes with each execution and in the unlucky case matches insert_it->first
. As a result, although (1.0, 0.0) should be added, the program did not do so and output an incorrect answer.
Since k_nearest_buffer
is 1 in length, next
of the 0th element matches k_nearest_buffer.end()
(lines 2 and 3 of the output above). Attempting to access iterator end() cause undefined behavior (from cppreference). Thus when you compare something to std::next(insert_it)->first
without checking if it is an end iterator, there is no guarantee that you will get the correct result.
For your reference, here is a similar output by a program compiled with gcc. The result of k_nearest_buffer.end()->first
is very different from the normal memory address. This may be the reason why 100% accuracy was obtained with gcc.
insert_it->first: 0x561817160720
std::next(insert_it)->first: 0x1
k_nearest_buffer.end()->first: 0x1
insert_it->first != std::next(insert_it)->first: true
In summary, if insert_it
points to the last element of k_nearest_buffer
, the contents of std::next(insert_it)
should not be accessed. Adding a condition to check this particular condition will ensure that KDTree won't error if built with MSVC.
@@ -135,6 +135,7 @@ void KDTree::node_query_(
std::upper_bound(k_nearest_buffer.begin(), k_nearest_buffer.end(),
node_distance, detail::compare_node_distance);
if (insert_it == k_nearest_buffer.end() ||
+ std::next(insert_it) == k_nearest_buffer.end() ||
insert_it->first != std::next(insert_it)->first) {
k_nearest_buffer.insert(insert_it, node_distance);
}
Thanks for the bug report!
On closer inspection, i think the condition is just wrong insert_it->first
and std::next(insert_it)->first
should never be the same so that should not do anything. the condition that should be there is instead
if ((insert_it != k_nearest_buffer.end()) ||
(k_nearest_buffer.size() < num_nearest))
Can you try to replace it and check that it solves your problem? I unfortunately don't have access to MSVC.