Rust WritableHashMap based on a faster hash function and Python dictionaries with Read and Write traits in order to write and read hash tables from files without serialization/deserialization and rehashing.
This is a modification of tmmcguire/fasthashmap with Read and Write traits.
The hash function used by this map is not cryptographically strong. This map is open to denial of service attacks.
CAUTION IS ADVISED.
The implementation in this module currently supports many of the same methods as std::collections::HashMap, but not all.
let map = WritableHashMap::new();
Constructs a map with a default capacity, currently 8. The map will expand as necessary.
let map = WritableHashMap::with_capacity( 30 );
Constructs a map with a capacity of at least 30.
The implementation in this module currently supports many of the same methods as std::collections::HashSet, but not all.
let set = WritableHashSet::new();
Constructs a set with a default capacity, currently 8. The set will expand as necessary.
The hashing function currently used is DJB2,
unsigned long
hash(unsigned char *str)
{
unsigned long hash = 5381;
int c;
while (c = *str++)
hash = (hash * 33) ^ c;
return hash;
}
It may not be the best, but it is short.
For more information, see http://cr.yp.to/cdb/cdb.txt and http://www.cse.yorku.ca/~oz/hash.html.
The hashtable code is loosely based on Python's dictionaries. Since the HashMap implementation in the Rust standard library has been updated to be Robin Hood Hashing (i.e. bucket stealing), this implementation may not be faster than the standard libraries', although using the DJB2 hash function makes it likely.
It should be possible to mix-n-match hashing function implementations, though, so it might be worth experimenting further. Soon.
For more information on this implementation, see
-
How are Python's Built In Dictionaries Implemented. [StackOverflow]
For information on Robin Hood hashing, see the papers/articles mentioned in the Rust docs:
- Pedro Celis. "Robin Hood Hashing".
- Emmanuel Goossaert. "Robin Hood hashing".
- Emmanuel Goossaert. "Robin Hood hashing: backward shift deletion".