Special Ordered Dict that automtically swaps from LittleDict to another type when too full

Question

Special Ordered Dict that automtically swaps from LittleDict to another type when too full

oxinabox opened this issue 4 years ago · 2 comments

Originally posted by @PallHaraldsson in #54 (comment)

Since LittleDict is faster, is it a reasonable default?

"It is reasonable to expect it to outperform an OrderedDict,
with up to around 30 elements in general;
or with up to around 50 elements if using a LittleDict backed by Tuples"

[I proposed OrderedDict as replacement for Dict at julialang.] I mean LittleDict is maybe not a good default as you can have more elements (how likely for Julia itself?). For a lot of code you will not have more, while the possibility should give you pause. Does it make sense to build a new Dict on this one for the first 30-50 elements, and have a fallback to another, OrderedDict (or some newer Swiss or OrderedRobinDict?).

Answer 1 · 2020-09-29T17:11:37.000Z

Yes, I think that would be neat. MagicOrderedDict

The exact time to swap depends a lot on the type of data and how expensive it's hash function is,
but swapping over at about 30 should be generally safe. (which is why the docs recommend it).

In theory we could even be more magic by timing hash and isequal on the first 2 elements added then decide what to do for all others.

Answer 2 · 2020-09-29T20:48:47.000Z

Thinking more about it, and what I had in mind with fallback, is not move stuff to another Dict but leave stuff unchanged in the LittleDict that handles say first 30, then add rest in an "overflow-dict". When you lookup, if at least count is twice the former size (maybe somewhat more as the latter is slower), 60+ then you look first in the LittleDict. If count is higher than the threashold (e.g. huge), then you would look first in the latter.

I'm not sure it's practical to look in both "sub"-dicts at the same time (threads have some overhead).

[I was also thinking, maybe scan the LittleDict from the front and back (similar to quicksort), that can be done in one loop. Maybe it helps a bit, maybe you're more likely to look up recent additions. Maybe just scan backwards?]