virtual_vec
is a type compatible with std::vector<T>
. The implementation targets large vectors that (1) are pushed to frequently and (2) the size is not known, (3) have a fixed memory upper bound. This guarantees O(1)
for appending to the end. Do note the purpose of this implementation is experimentation, and it has not been tested in a production system.
When the vector is instantiated, it reserves a large chunk of virtual memory addresses from the operating system. Memory from this address space is committed to our process as the vector grows. We never have to copy the entire vector to a new address space.
This implementation is hardcoded to use 4GB virtual address for each vector, so you can have thousands of these per process. The number was chosen arbitrarily and can be made bigger or smaller depending on your use case. For obvious reasons this is only applicable to 64-bit systems.
This implementation is just for fun. Your mileage will vary platform to platform. If you are on a system that overcommits memory, like Linux, then you should use std::vector<T>::reserve(4ULL << 30)
and you will get the best possible performance. If not, you should always run benchmarks, because you can incur large page fault penalties when bypassing the user space allocator (this is why reserve on Linux with std::vector
is the best of both worlds).
NOTE: Right now this only builds on Linux (would gladly accept patches to support other OSes). To build this you just need to include virtual_vec.h
and virtual_vec.cpp
and a c++17 (or newer) compiler.
gtest must be installed.
To run the tests:
./excute test
To run the tests under valgrind:
./execute test-valgrind
Google benchmark must be installed.
./execute bench
Below is a table of benchmarks comparing the performance of virtual_vec<int64_t>
to std::vector<int64_t>
. The test calls push_back
until # elems / sizeof(int64_t)
exceeds Storage size
.
All benchmarks are performed on a Intel i7-8665U (8) @ 4.800GHz
. Time is recorded in CPU time.
One thing to keep in mind is these benchmarks were done with -O2
. With -O3
std::vector gets 10% faster.
Storage size | std::vector (ns) | virtual_vec (ns) | diff (%) |
---|---|---|---|
512B | 1221 | 755 | 38 |
1KB | 2248 | 1627 | 27 |
8KB | 16635 | 12635 | 24 |
32KB | 85878 | 38748 | 54 |
131KB | 375485 | 234833 | 37 |
524KB | 1534855 | 798632 | 47 |
1MB | 2940128 | 1276827 | 56 |
4MB | 13290644 | 6129833 | 53 |
8MB | 29162281 | 12560318 | 56 |
10MB | 39694352 | 13500909 | 65 |