Add iteration timing to time_hash_map.cc
Closed this issue · 5 comments
GoogleCodeExporter commented
I'm interested in comparing the time to iterate through a hash map to perform
an operation on each entry. Can this timing method be added to
time_hash_map.cc?
{{{
template<class MapType>
static void time_map_iterate(int iters) {
  MapType set;
  Rusage t;
  int r;
  int i;

  for (i = 0; i < iters; i++) {
    set[i] = i+1;
  }

  r = 1;
  t.Reset();
  for (typename MapType::const_iterator it = set.begin(), it_end = set.end();
       it != it_end;
       ++it) {
    r ^= it->second;
  }
  double ut = t.UserTime();

  srand(r);   // keep compiler from optimizing away r (we never call rand())
  report("map_iterate", ut, iters, 0, 0);
}
}}}
Original issue reported on code.google.com by blair-ol...@orcaware.com
on 25 Oct 2011 at 3:46
GoogleCodeExporter commented
Sure, will do.
I'm not sure how much it will tell you, though, since iterating over such
tightly bunched keys is probably 'easy' for every hashtable implementation. In
general, I feel that time_hash_map is not testing a very real-world situation.
I may try to revamp the tests a bit. In any case, this will be in the next
release.
Original comment by csilv...@gmail.com
on 25 Oct 2011 at 9:50
- Changed state: Started
- Added labels: Priority-Medium, Type-Enhancement
GoogleCodeExporter commented
btw, can you sign the CLA at
http://code.google.com/legal/individual-cla-v1.0.html
? Thanks!
Original comment by csilv...@gmail.com
on 25 Oct 2011 at 10:19
GoogleCodeExporter commented
I signed the iCLA.
I'm not an expert in map implementations, but does the bunching of keys even
matter? Won't an iterator just walk its internal data structure in the most
efficient manner and return the pairs to the caller in whatever order they
appear in? For std::map, the pairs are automatically ordered because the keys
are kept sorted, and for a std::unordered_map, it'll just return them as they
appear in each hash bucket?
I was interested in the performance anyway, since if one hash map is always
faster than another, it's a no-brainer to use it. However, if one hash map is
faster for some operations than another, then one needs to weigh the most
common operations on the map to pick an implementation, and iteration is one
that we commonly use.
Regards,
Blair
Original comment by blair-ol...@orcaware.com
on 28 Oct 2011 at 6:56
GoogleCodeExporter commented
Some implementations have iterator overhead that shows up in some contexts
but not others. dense_hash_set, for instance, stores everything in a big
array. Usually the array is half full; if all the keys are small integers,
they'll all be next to each other at the beginning of the array. This
potentially makes iterating faster than if the empty buckets were spread out.
But you're right, it probably won't matter a great deal.
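To make that concrete, here is a toy sketch (not sparsehash's actual code; `kEmpty` and `sum_occupied` are invented names) of why layout matters for an open-addressing iterator, which must skip empty buckets as it walks the array:

{{{
#include <cassert>
#include <vector>

static const int kEmpty = -1;   // hypothetical empty-bucket marker

// Visit every occupied bucket, skipping empties, the way a
// dense-array hashtable iterator has to.
int sum_occupied(const std::vector<int>& buckets) {
  int sum = 0;
  for (size_t i = 0; i < buckets.size(); ++i) {
    if (buckets[i] != kEmpty)
      sum += buckets[i];
  }
  return sum;
}

int main() {
  // Half-full table with the occupied slots clustered at the front
  // (as with small integer keys hashing to small bucket indices).
  std::vector<int> clustered(8, kEmpty);
  for (int i = 0; i < 4; ++i) clustered[i] = i + 1;

  // The same four entries spread across the table.
  std::vector<int> spread(8, kEmpty);
  for (int i = 0; i < 4; ++i) spread[2 * i] = i + 1;

  // Same answer either way; the clustered layout just touches
  // contiguous memory, which can make iteration cheaper in practice.
  assert(sum_occupied(clustered) == sum_occupied(spread));
  return 0;
}
}}}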
Original comment by csilv...@gmail.com
on 28 Oct 2011 at 7:10
GoogleCodeExporter commented
This is done in sparsehash 1.12, just released.
Original comment by csilv...@gmail.com
on 21 Dec 2011 at 5:27
- Changed state: Fixed