/rust-writablehashmap

Rust hashmap based on a faster hash function and Python dictionaries

Primary LanguageRustApache License 2.0Apache-2.0

WritableHashMap

Rust WritableHashMap based on a faster hash function and Python dictionaries with Read and Write traits in order to write and read hash tables from files without serialization/deserialization and rehashing.

This is a modification of tmmcguire/fasthashmap with Read and Write traits.

WARNING

The hash function used by this map is not cryptographically strong. This map is open to denial of service attacks.

CAUTION IS ADVISED.

WritableHashMap Use

The implementation in this module currently supports many of the same methods as std::collections::HashMap, but not all.

Constructing a WritableHashMap

let map = WritableHashMap::new();

Constructs a map with a default capacity, currently 8. The map will expand as necessary.

let map = WritableHashMap::with_capacity( 30 );

Constructs a map with a capacity of at least 30.

WritableHashSet Use

The implementation in this module currently supports many of the same methods as std::collections::HashSet, but not all.

Constructing a WritableHashSet

let set = WritableHashSet::new();

Constructs a set with a default capacity, currently 8. The set will expand as necessary.

Hashing Function

The hashing function currently used is DJB2,

unsigned long
hash(unsigned char *str)
{
    unsigned long hash = 5381;
    int c;
    while (c = *str++)
        hash = (hash * 33) ^ c;
    return hash;
}

It may not be the best, but it is short.

For more information, see http://cr.yp.to/cdb/cdb.txt and http://www.cse.yorku.ca/~oz/hash.html.

Hashtable

The hashtable code is loosely based on Python's dictionaries. Since the HashMap implementation in the Rust standard library has been updated to be Robin Hood Hashing (i.e. bucket stealing), this implementation may not be faster than the standard libraries', although using the DJB2 hash function makes it likely.

It should be possible to mix-n-match hashing function implementations, though, so it might be worth experimenting further. Soon.

For more information on this implementation, see

For information on Robin Hood hashing, see the papers/articles mentioned in the Rust docs: