carrotsearch/hppc

SortedHashMap?

bruno-roustant opened this issue · 4 comments

There are cases where we want a HashMap with iteration in sorted order. In those cases we can use the JDK TreeMap, but it uses more memory and provides slower O(log(N)) #get as well as slower iteration (2x slower than JDK HashMap in my benchmark).
Often the TreeMap is built once and is not modified afterward, in this case TreeMap is not really advantageous.

There could be an opportunity to have a SortedHashMap based on HPPC HashMap (open-addressing). It would require to extend HPPC HashMap and have a int[] orderArray sorted using HPPC IndirectSort (it could even sort on values). If the orderArray is built once and reused (for example discarded only if the map is modified, which does not happen in the use-case I describe), then such a map requires much less memory and provides O(1) #get and fast iteration.

But such a map has a fixed iteration order! So we get into big perf issue if we iterate it to add to another HPPC HashMap.

So here is my question:
Would it fit into HPPC with a warning that it must not be used to copy to another map (the same warning as ScatterMap had), or is it too dangerous or too exotic to fit?

Hi Bruno. I was wondering about adding sorted maps (and insertion-order-preserving maps) a few times but I just couldn't find the implementation scenario that would be generic and fast enough. The "iteration order" array seems more like a helpful specific façade - maintaining the entry order array is impossible while adding/removing elements so it's not like a general-purpose replacement for a TreeMap (and I think it would confuse people).

What you suggest makes sense to me as a lightweight, read-only view over a delegate map object, with the ability to create (or recreate) the iteration order array. Something like this, for example:

{code}
var delegate = new KTypeVTypeMap<Object, Object>();
var sortedMap = new SortedIterationKTypeVTypeMap<>(delegate, (Comparator) kvPair-> {...});
// sortedMap follows the order of the comparator. In assertions-mode it should also attempt to detect modifications to the delegate and fail on such modifications?
{code}

I like this idea of a read-only view with a delegate. It's even better than an ImmutableSortedMap builder because it offers wider use cases, while still being clear on the intended usage.
I'll try to sketch something.

Try it! I think you can try with one of the concrete classes first, the template can follow-up as it's more tedious to write.

Actually it could be nice to also have a KTypeVTypeImmutableSortedMap with a builder and taking a parameter to select performance or memory.
The immutable sorted map for performance would be a read-only sorted-iteration view on an internal KTypeVTypeHashMap (inaccessible thus immutable).
The immutable sorted map for memory would be composed of only 2 arrays for keys and values, of exactly the size required, and with a binary search for the get/contains. It would provide a very fast iteration.