microsoft/ALEX

ALEX crashes in non-batch mode

alihadian opened this issue · 4 comments

I ran ALEX against all of the SOSD benchmark's uint64 datasets (download link). I tried to insert the first 100M records, but ALEX crashed for all datasets.

When I insert the first few thousand records (e.g., 100K records), the code does not crash, but ALEX crashes for a larger number of insertions across all datasets. It's most likely a case that went unnoticed. The error message is as follows:

alex_nodes.h:540: int alex::AlexDataNode<T, P, Compare, Alloc, allow_duplicates>::num_keys_in_range(int, int) const [with T = long unsigned int; P = int; Compare = alex::AlexCompare; Alloc = std::allocator<std::pair<long unsigned int, int> >; bool allow_duplicates = true]: Assertion `left >= 0 && left < right && right <= data_capacity_' failed. Aborted (core dumped)

image

I have plenty of memory (50GB+) so it is certainly not a memory issue.

The code is simple. I just insert the records one by one, up to MAX_INSERTS

#include <iostream>
#include <vector>
#include <algorithm>
#include <time.h>

#include "alex.h"

#include "load.hpp"

using namespace std;

#define KEY_TYPE uint64_t
#define PAYLOAD_TYPE int

#define MAX_INSERTS 100000000

template <typename T>
void test_alex(vector<T> &data, string filename)
{
	alex::Alex<KEY_TYPE, PAYLOAD_TYPE> index;

	int cnt = 0;

	for (typename vector<T>::iterator it = data.begin(); it != data.end(); it++)
		if (cnt <= MAX_INSERTS)
			index.insert(*it, ++cnt);

	cout << cnt << endl;
}

int main(int argc, char **argv)
{
	string filename = argv[1];

	vector<uint64_t> data;
	load_bin<uint64_t>(filename, data);
	sort(data.begin(), data.end());
	data.erase(unique(data.begin(), data.end()), data.end());

	test_alex(data, filename);
}

Thanks for bringing this our attention Ali. I'll look into it.

The latest commit should fix the issues you're seeing. Let me know if it still doesn't work.

Hi Jialin,

Thanks for your updates. We test the insertion of 100M with SOSD datasets (uint64), it works on fb, normal, osm, however, doesn't work for books, lognormal, uniform_dense, uniform_sparse. And the error message is the same as before.

Thanks for pointing that out. It turns out the assert itself was too restrictive. The latest commit should now work on all uint64 datasets from SOSD.