Add iteration timing to time_hash_map.cc
Closed this issue · 5 comments
GoogleCodeExporter commented
I'm interested in comparing the time to iterate through a hash map to perform
an operation on each entry. Can this timing method be added to
time_hash_map.cc?
{{{
template<class MapType>
static void time_map_iterate(int iters) {
  MapType set;
  Rusage t;
  int r;
  int i;

  for (i = 0; i < iters; i++) {
    set[i] = i+1;
  }

  r = 1;
  t.Reset();
  for (typename MapType::const_iterator it = set.begin(), it_end = set.end();
       it != it_end;
       ++it) {
    r ^= it->second;
  }
  double ut = t.UserTime();

  srand(r);   // keep compiler from optimizing away r (we never call rand())
  report("map_iterate", ut, iters, 0, 0);
}
}}}
Original issue reported on code.google.com by blair-ol...@orcaware.com
on 25 Oct 2011 at 3:46
GoogleCodeExporter commented
Sure, will do.
I'm not sure how much it will tell you, though, since iterating over such
tightly bunched keys is probably 'easy' for every hashtable implementation. In
general, I feel that time_hash_map is not testing a very real-world situation.
I may try to revamp the tests a bit. In any case, this will be in the next
release.
Original comment by csilv...@gmail.com
on 25 Oct 2011 at 9:50
- Changed state: Started
- Added labels: Priority-Medium, Type-Enhancement
GoogleCodeExporter commented
btw, can you sign the CLA at
http://code.google.com/legal/individual-cla-v1.0.html
? Thanks!
Original comment by csilv...@gmail.com
on 25 Oct 2011 at 10:19
GoogleCodeExporter commented
I signed the iCLA.
I'm not an expert in map implementations, but does the bunching of keys even
matter? Won't an iterator just walk its internal data structure in the most
efficient manner and return the pairs to the caller in whatever order they
appear in? For std::map, the pairs are automatically ordered because the keys
are kept sorted, and for a std::unordered_map, it'll just return them as they
appear in each hash bucket?
I was interested in the performance anyway, since if one hash map is always
faster than another, it's a no-brainer to use it. However, if one hash map is
faster for some operations than another, then one needs to weigh the most
common operations on the map to pick an implementation, and iteration is one
that we commonly use.
Regards,
Blair
Original comment by blair-ol...@orcaware.com
on 28 Oct 2011 at 6:56
GoogleCodeExporter commented
Some implementations have iterator overhead that shows up in some contexts
but not others. dense_hash_set, for instance, stores everything in a big
array. Usually the array is half full; if all the keys are small integers,
they'll all be next to each other at the beginning of the array. This
potentially makes iterating faster than if the empty buckets were spread out.
But you're right, it probably won't matter a great deal.
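To make that concrete, here is a toy sketch (not sparsehash's actual code; `kEmpty` and `sum_occupied` are invented names) of why layout matters for an open-addressing iterator, which must skip empty buckets as it walks the array:

{{{
#include <cassert>
#include <vector>

static const int kEmpty = -1;   // hypothetical empty-bucket marker

// Visit every occupied bucket, skipping empties, the way a
// dense-array hashtable iterator has to.
int sum_occupied(const std::vector<int>& buckets) {
  int sum = 0;
  for (size_t i = 0; i < buckets.size(); ++i) {
    if (buckets[i] != kEmpty)
      sum += buckets[i];
  }
  return sum;
}

int main() {
  // Half-full table with the occupied slots clustered at the front
  // (as with small integer keys hashing to small bucket indices).
  std::vector<int> clustered(8, kEmpty);
  for (int i = 0; i < 4; ++i) clustered[i] = i + 1;

  // The same four entries spread across the table.
  std::vector<int> spread(8, kEmpty);
  for (int i = 0; i < 4; ++i) spread[2 * i] = i + 1;

  // Same answer either way; the clustered layout just touches
  // contiguous memory, which can make iteration cheaper in practice.
  assert(sum_occupied(clustered) == sum_occupied(spread));
  return 0;
}
}}}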
Original comment by csilv...@gmail.com
on 28 Oct 2011 at 7:10
GoogleCodeExporter commented
This is done in sparsehash 1.12, just released.
Original comment by csilv...@gmail.com
on 21 Dec 2011 at 5:27
- Changed state: Fixed