tzaeschke/phtree-cpp

VC 2019 : Vector iterators incompatible when trying to relocate

Closed this issue · 25 comments

Hi,
I have the following small piece of code:

#include "phtree_multimap.h"
#include <iostream>
#include <chrono>

using namespace improbable::phtree;

int main() {
	auto tree = PhTreeMultiMapD<2, int>();
	std::vector<PhPointD<2>> vecPos;
	int dim = 1000;

	int num = 10;
	for (int i = 0; i < num; ++i) {
		PhPointD<2> p = { (double)(rand() % dim), (double)(rand() % dim) };
		vecPos.push_back(p);
		tree.emplace(p, i);
	}

	for (int i = 0; i < num; ++i) {
		PhPointD<2> p = vecPos[i];
		PhPointD<2> newp = { (double)(rand() % dim), (double)(rand() % dim) };
		tree.relocate(p, newp, i);
	}

	return 0;
}

When I try to run it I get the following:
image

Call Stack is:
PHTree.exe!std::_Vector_const_iterator<std::_Vector_val<std::_Simple_types<std::pair<unsigned int,int>>>>::_Compat(const std::_Vector_const_iterator<std::_Vector_val<std::_Simple_types<std::pair<unsigned int,int>>>> & _Right) Line 190 C++
PHTree.exe!std::_Vector_const_iterator<std::_Vector_val<std::_Simple_types<std::pair<unsigned int,int>>>>::operator==(const std::_Vector_const_iterator<std::_Vector_val<std::_Simple_types<std::pair<unsigned int,int>>>> & _Right) Line 154 C++
PHTree.exe!improbable::phtree::operator==(const improbable::phtree::b_plus_tree_hash_set<int,std::hash,std::equal_to>::bpt_iterator & left, const improbable::phtree::b_plus_tree_hash_set<int,std::hash,std::equal_to>::bpt_iterator & right) Line 740 C++

PHTree.exe!improbable::phtree::PhTreeMultiMap<2,int,improbable::phtree::SimplePointConverter<2,double,__int64,improbable::phtree::ScalarConverterIEEE>,improbable::phtree::b_plus_tree_hash_set<int,std::hash,std::equal_to>,1,improbable::phtree::QueryPoint>::relocate(const std::array<double,2> & old_key, const std::array<double,2> & new_key, const int & value, bool always_erase) Line 446 C++
PHTree.exe!main() Line 25 C++
[External Code]

Let me know if you need more info.

Thanks for reporting.
I just checked and it works fine with GCC. Is that an option for you?
I will now try testing this with VC 2019.

As an aside, there will be a new release soon (next week?) that has numerous performance improvements, including a complete rewrite of relocate() which should be considerably faster.

Hi,

Thanks for replying. Unfortunately gcc is not an option at this point. There may be an edge case (undefined behavior? copy elision?) that a compiler handles well where another doesn't. Please do let me know how it went with VC2019. Even a quick workaround can help as I am stumped at the moment.

This spatial indexing method you have been working on looks very promising, especially in scenarios where lots of position updates occur in realtime. I devised a 2D spatial hashing scheme where a key is computed from all the quadrants/subquadrants a point belongs in, and object lists are stored at each level. In essense, there is no data structure, just a map of sets. However, in the worst case scenario, when an object moves far, far away, a lot of insert/erase operations must be called, so that was a big bottleneck.

I wish you good luck on all your projects and I hope you may tolerate a few questions down the road, there is little documentation (on a global level) on this at the moment.

Apologies, misclicked 'Close'

I admit I have trouble compiling this with VS 2019. Did you make any changes to the CMake files?

I am specifically stuck with fatal error LNK1104: cannot open file 'phtree\phtree.lib'. I cannot even find the library file, I assume it never gets created.
However, the output looks like it does get created:

>------ Build started: Project: CMakeLists, Configuration: Debug ------
  [1/3] Linking CXX static library phtree\phtree.lib
  [2/3] Building CXX object examples\CMakeFiles\Example.dir\example.cc.obj
  ...   (lots of warnings
    [3/3] Linking CXX executable examples\Example.exe
  FAILED: examples/Example.exe 
  cmd.exe /C "cd . && "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.exe" -E vs_link_exe --intdir=examples\CMakeFiles\Example.dir --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100190~1.0\x64\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100190~1.0\x64\mt.exe --manifests  -- C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\MSVC\1430~1.307\bin\Hostx64\x64\link.exe /nologo examples\CMakeFiles\Example.dir\example.cc.obj  /out:examples\Example.exe /implib:examples\Example.lib /pdb:examples\Example.pdb /version:0.0 /machine:x64 /debug /INCREMENTAL /subsystem:console  phtree\phtree.lib  kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib && cd ."
  LINK Pass 1: command "C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\MSVC\1430~1.307\bin\Hostx64\x64\link.exe /nologo examples\CMakeFiles\Example.dir\example.cc.obj /out:examples\Example.exe /implib:examples\Example.lib /pdb:examples\Example.pdb /version:0.0 /machine:x64 /debug /INCREMENTAL /subsystem:console phtree\phtree.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib /MANIFEST /MANIFESTFILE:examples\CMakeFiles\Example.dir/intermediate.manifest examples\CMakeFiles\Example.dir/manifest.res" failed (exit code 1104) with the following output:
C:\work\githubVS\phtree-cpp\out\build\x64-Debug\LINK : fatal error LNK1104: cannot open file 'phtree\phtree.lib'
  ninja: build stopped: subcommand failed.

or sometimes just

>------ Build started: Project: CMakeLists, Configuration: Debug ------
  [1/2] Linking CXX static library phtree\phtree.lib
  [2/2] Linking CXX executable examples\Example.exe
  FAILED: examples/Example.exe 
  cmd.exe /C "cd . && "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.exe" -E vs_link_exe --intdir=examples\CMakeFiles\Example.dir --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100190~1.0\x64\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100190~1.0\x64\mt.exe --manifests  -- C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\MSVC\1430~1.307\bin\Hostx64\x64\link.exe /nologo examples\CMakeFiles\Example.dir\example.cc.obj  /out:examples\Example.exe /implib:examples\Example.lib /pdb:examples\Example.pdb /version:0.0 /machine:x64 /debug /INCREMENTAL /subsystem:console  phtree\phtree.lib  kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib && cd ."
  LINK Pass 1: command "C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\MSVC\1430~1.307\bin\Hostx64\x64\link.exe /nologo examples\CMakeFiles\Example.dir\example.cc.obj /out:examples\Example.exe /implib:examples\Example.lib /pdb:examples\Example.pdb /version:0.0 /machine:x64 /debug /INCREMENTAL /subsystem:console phtree\phtree.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib /MANIFEST /MANIFESTFILE:examples\CMakeFiles\Example.dir/intermediate.manifest examples\CMakeFiles\Example.dir/manifest.res" failed (exit code 1104) with the following output:
C:\work\githubVS\phtree-cpp\out\build\x64-Debug\LINK : fatal error LNK1104: cannot open file 'phtree\phtree.lib'
  ninja: build stopped: subcommand failed.

I already tried adding set(CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS 1) but that did not help.

If you have more general questions feel free to ask them here: https://discord.gg/h5EWEg7G

To be honest, cmake didn't produce any libraries and generally I couldn't make it work. So as a last resort, I just copied the entire PH-tree folder to my project and included the files. Since there were not any complaints from VS and basic operations like point insertion seemed to work properly, I never went back to cmake.

Please do let me know if there is a way to get things working in VS.

I usually develop with Bazel. I provide cmake only as concession do other development environments. It works fine on command line, but I should also make it work with Visual Studio or CLion, otherwise it's pointless.

There is a possible workaround you could try. The PhTreeMultimap requires a collection type a "buckets" to store multiple entries with the same coordinate. The workaround is to try using a different container type as buckets:

PhTreeMultiMapD<DIM, payload_t, ConverterIEEE<DIM>, std::set<payload_t>>;

or

PhTreeMultiMapD<DIM, payload_t, ConverterIEEE<DIM>, std::unordered_set<payload_t>>;

I think std::map may be faster.

I also noticed that you are using the tree for 2D coordinates. This should work perfectly fine but (depending on your dataset!) the PhTree works better with 3D or more.

Thanks for taking time to tackle this. I updated the declaration with what you posted above, but now there is the following error:
image
Which, incidentally, had also popped up when I tried to do a simple point removal.

It is true that I intend to use the tree for 2D coordinates but I would really like to benchmark its efficiency concerning real-time updates. If this is indeed a strong point for this specific data structure, it will definitely be worth it.

I think I found the culprit.
In node.h, line 196, in the function ReplaceNodeWithDataFromEntry(Entry&& other), the line

node.~unique_ptr();

needs to be replaced with

node.reset();

Could you give this a try?

It seems to be working now. I will keep you updated with my findings over at Discord. Thanks!

Thanks for reporting this and thanks for testing!
I will actually keep this open until I have provided a fix.

Small note, as it is right now, there is a memory leak (probably due to the quick fix above). Memory usage climbs up too fast.

Just a heads up: I have (more or less) identifies a potential leak.

Two points though:

  • Independent of the leak that I am now looking at: the current implementation exposes a memory leak if the relocation fails, e.g. if the source coordinate does not exist. This is already fixed in the new implementation that has not been released yet.
  • Performance: the current implementation of relocate() (and especially the new one) work faster if the coordinate moves only by a small amount, i.e. if the neighbor coordinates stay more or less the same. A relocate() across the whole area is basically as fast (or as slow) as separate emplace()/erase(). This is something to consider when doing performance tests.

Another possibly VS2019 related issue, simply quering for a closest neighbor:

for (auto it = tree.begin_knn_query(1, { 100.0, 100.0 }); it != tree.end(); ++it) {
    std::cout << it << std::endl;
}

Causes this build fail:

Build started...
1>------ Build started: Project: PHTree, Configuration: Release x64 ------
1>Source.cpp
1>C:\Users\Fagota\source\repos\PHTree\PHTree\Source.cpp(21,21): error C2672: 'improbable::phtree::PhTreeMultiMap<2,int,improbable::phtree::SimplePointConverter<2,double,improbable::phtree::scalar_64_t,improbable::phtree::ScalarConverterIEEE>,std::unordered_set<int,std::hash<int>,std::equal_to<int>,std::allocator<int>>,true,improbable::phtree::QueryPoint>::begin_knn_query': no matching overloaded function found
1>C:\Users\Fagota\source\repos\PHTree\PHTree\Source.cpp(21,21): error C2783: 'auto improbable::phtree::PhTreeMultiMap<2,int,improbable::phtree::SimplePointConverter<2,double,improbable::phtree::scalar_64_t,improbable::phtree::ScalarConverterIEEE>,std::unordered_set<int,std::hash<int>,std::equal_to<int>,std::allocator<int>>,true,improbable::phtree::QueryPoint>::begin_knn_query(size_t,const std::array<SCALAR_EXTERNAL,2> &,DISTANCE &&,FILTER &&) const': could not deduce template argument for 'DISTANCE'
1>        with
1>        [
1>            SCALAR_EXTERNAL=double
1>        ]
1>C:\Users\Fagota\source\repos\PHTree\PHTree\include\phtree\phtree_multimap.h(556): message : see declaration of 'improbable::phtree::PhTreeMultiMap<2,int,improbable::phtree::SimplePointConverter<2,double,improbable::phtree::scalar_64_t,improbable::phtree::ScalarConverterIEEE>,std::unordered_set<int,std::hash<int>,std::equal_to<int>,std::allocator<int>>,true,improbable::phtree::QueryPoint>::begin_knn_query'
1>C:\Users\Fagota\source\repos\PHTree\PHTree\Source.cpp(21,63): error C3536: 'it': cannot be used before it is initialized
1>C:\Users\Fagota\source\repos\PHTree\PHTree\Source.cpp(21,76): error C2678: binary '!=': no operator found which takes a left-hand operand of type 'int' (or there is no acceptable conversion)
1>C:\Users\Fagota\source\repos\PHTree\PHTree\include\phtree\v16\iterator_base.h(49,24): message : could be 'bool improbable::phtree::v16::IteratorBase<improbable::phtree::v16::Entry<2,T,__int64>>::operator !=(const improbable::phtree::v16::IteratorBase<improbable::phtree::v16::Entry<2,T,__int64>> &,const improbable::phtree::v16::IteratorBase<improbable::phtree::v16::Entry<2,T,__int64>> &) noexcept' [found using argument-dependent lookup]
1>        with
1>        [
1>            T=std::unordered_set<int,std::hash<int>,std::equal_to<int>,std::allocator<int>>
1>        ]
1>C:\Users\Fagota\source\repos\PHTree\PHTree\include\phtree\phtree_multimap.h(81,17): message : or       'bool improbable::phtree::`anonymous-namespace'::IteratorBase<improbable::phtree::PhTreeMultiMap<2,int,improbable::phtree::SimplePointConverter<2,double,improbable::phtree::scalar_64_t,improbable::phtree::ScalarConverterIEEE>,std::unordered_set<int,std::hash<int>,std::equal_to<int>,std::allocator<int>>,true,improbable::phtree::QueryPoint>>::operator !=(const improbable::phtree::`anonymous-namespace'::IteratorBase<improbable::phtree::PhTreeMultiMap<2,int,improbable::phtree::SimplePointConverter<2,double,improbable::phtree::scalar_64_t,improbable::phtree::ScalarConverterIEEE>,std::unordered_set<int,std::hash<int>,std::equal_to<int>,std::allocator<int>>,true,improbable::phtree::QueryPoint>> &,const improbable::phtree::`anonymous-namespace'::IteratorBase<improbable::phtree::PhTreeMultiMap<2,int,improbable::phtree::SimplePointConverter<2,double,improbable::phtree::scalar_64_t,improbable::phtree::ScalarConverterIEEE>,std::unordered_set<int,std::hash<int>,std::equal_to<int>,std::allocator<int>>,true,improbable::phtree::QueryPoint>> &) noexcept' [found using argument-dependent lookup]
1>C:\Users\Fagota\source\repos\PHTree\PHTree\Source.cpp(21,76): message : while trying to match the argument list '(int, improbable::phtree::`anonymous-namespace'::IteratorNormal<improbable::phtree::v16::IteratorBase<improbable::phtree::v16::Entry<2,T,__int64>>,improbable::phtree::PhTreeMultiMap<2,int,improbable::phtree::SimplePointConverter<2,double,improbable::phtree::scalar_64_t,improbable::phtree::ScalarConverterIEEE>,std::unordered_set<int,std::hash<int>,std::equal_to<int>,std::allocator<int>>,true,improbable::phtree::QueryPoint>>)'
1>        with
1>        [
1>            T=std::unordered_set<int,std::hash<int>,std::equal_to<int>,std::allocator<int>>
1>        ]
1>C:\Users\Fagota\source\repos\PHTree\PHTree\Source.cpp(22,20): error C2563: mismatch in formal parameter list
1>Done building project "PHTree.vcxproj" -- FAILED.
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========

The example is missing a distance function, can you try the following?

    for (auto it = tree.begin_knn_query(1, { 100.0, 100.0 }, DistanceEuclidean<2>()); it != tree.end(); ++it) {
        std::cout << *it << std::endl;
    }

I'll try it, by the way does this Euclidean distance include the square root? If so, I'll try to provide a distance function that returns just the squared distance.

Yes, it returns the square root. It should be easy to provide your own distance function. I doubt it makes a big difference though, the sqrt instruction has become quite cheap on modern CPUs, I usually found to effect to be negligible on memory heavy workloads such as a ph-tree.

Regarding the memory leak: this is again a one-liner: In entry.h in line 179, in the function ExtractValue() remove the line saying union_type_ = EMPTY;, it should just be:

    [[nodiscard]] ValueT&& ExtractValue() noexcept {
        assert(IsValue());
        return std::move(value_);
    } 

Unfortunately the change suggested above did not alleviate the memory leak, there is still rapid memory consumption and an increased relocation time:
image

Could you post the test code again? Also, so we are on the same page, is this still based on 1.2.0 or on master? And can you double check that you fixed ExtractValue() (and not ExtractNode() which looks very similar)?

image

Code:

#include "phtree_multimap.h"
#include <iostream>
#include <chrono>
#include <unordered_set>

using namespace improbable::phtree;

int main() {
	auto tree = PhTreeMultiMapD<2, int, ConverterIEEE<2>, std::unordered_set<int>>();
	std::vector<PhPointD<2>> vecPos;
	int dim = 1000000000;

	int num = 30000;
	for (int i = 0; i < num; ++i) {
		PhPointD<2> p = { (double)(rand() % dim), (double)(rand() % dim) };
		vecPos.push_back(p);
		tree.emplace(p, i);
	}

	while (true) {
		auto t1 = std::chrono::high_resolution_clock::now();
		for (int i = 0; i < num; ++i) {
			PhPointD<2> p = vecPos[i];
			PhPointD<2> newp = { (double)(rand() % dim), (double)(rand() % dim) };
			tree.relocate(p, newp, i);
		}
		auto t2 = std::chrono::high_resolution_clock::now();
		auto s = std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1);
		std::cout <<  s.count() << std::endl;
	}

	return 0;
}

You are working with 1.2.0, right?

Ah, this is the bug I mentioned at some point (I hope I did), 1.2.0 has a memory leak that shows when relocate() fails. The loop is trying to move points that do not exist.

Can you try changing the inner loop as follows:

        for (int i = 0; i < num; ++i) {
            PhPointD<2> &p = vecPos[i];   // <------------ 'p' becomes a reference
            PhPointD<2> newp = { (double)(rand() % dim), (double)(rand() % dim) };
            tree.relocate(p, newp, i);
            p = newp;    // <------------ New line
        }

Apologies, I am a bit of a caveman when it comes to github. I was operating on Master. I just installed v1.2.0, and, after applying all the proposed changes, the leak seems to have been sealed.

A note of concern though. Consider the following relocation code:

for (int i = 0; i < num; ++i) {
	PhPointD<2>& p = vecPos[i];
	PhPointD<2> newp = { p[0]+1.0, p[1] };
	tree.relocate(p, newp, i);
	p = newp;
}

Essentially this moves an instance 1px to the right. This code now performs about 3x as slow as it does in Master, its performance is approaching the expected behavior of randomly teleporting.

-- I am still using ConverterIEEE though, didn't get around using the other yet.

Actually I myself need to apologize :-)
What I said earlier is only half correct, the memory leak in case of failed relocate() is indeed present in 1.2.6 but also and in master, the only place where it is fixed is in the #52 branch, however this branch still has some (unrelated) quirks that need fixing.

I assumed you were on 1.2.6. The master branch has several improvements so it is expected to be faster (I wouldn't have expected a factor of 3 though).

In short, if you were on master and it gave you good results then just stay on master. Once PR #52 is merged things should improve further, but that needs some work. Actually, you can try out #52 if you want, I think it may work just fine, just be aware that this is under development.