mackstann/hash-table-shootout

google_dense_hash_map doesn't compare string key

wcy123 opened this issue · 0 comments

Hi,
Great work!
I test it on my platform, and google_dense_hash_map is the fastest one. But it is a little bit misleading. When we use const char * as the key type, google_dense_hash_map doesn't look at the content of string, just comparing pointers, that is one reason why it is so fast, but semantic is wrong, for example,

#include <iostream>
#include <cstring>
#include <vector>
#include <algorithm>
#include <iterator>
#include <functional>
#include <deque>
using namespace std;

#include <google/dense_hash_map>

int main(int argc, char *argv[])
{
  typedef google::dense_hash_map<const char *, int64_t> str_hash_t;
  str_hash_t str_hash; str_hash.set_empty_key(""); str_hash.set_deleted_key("d");
  const char * key;
  key = strdup("hello");
  str_hash.insert(str_hash_t::value_type(key, 1));
  key = strdup("hello");
  str_hash.insert(str_hash_t::value_type(key, 2));
  for( str_hash_t::iterator i = str_hash.begin();
       i != str_hash.end();
       ++i){
    cout << i->first << ":" << i->second << endl;
  }
  return 0;
}

compile it and run it

 g++ -ggdb mytest.cc && ./a.out
 hello:1
 hello:2

we can see the strings hello have different address, but google_dense_hash_map<const char*, int> regards them as two distinct keys.

If google_dense_hash_map<const char*,int> doesn't invoke something like strcmp, it will be faster.

I try google_dense_hash_map<string, int> which gives the right answer. but it doesn't fit your testing framework.