enhancement or clarification request: fetch Range[Set] keys in RangeDict overlapping given Range[Set]

Question

enhancement or clarification request: fetch Range[Set] keys in RangeDict overlapping given Range[Set]

Closed this issue 3 years ago · 2 comments

Suppose a RangeDict contains a great many Range keys, perhaps many millions:

... # millions of entries
Range(2300,2305) -> value
Range(2400,2800) -> value
... # millions of entries

Given Range(2302,2500), I'd like to fetch the above two Range keys in O(log n) time complexity, because they overlap the given range.

If the requested Range is entirely contained by an existing Range in the dictionary, such that __getitem__ would have succeeded, the single containing Range is returned.

The method I'm thinking of would therefore return a list of 0-N Range(Set) entries, with the len(result)==0 case occurring when __getitem__ would have thrown KeyError and there were no overlapping ranges.

This method would never return any dictionary values, it is simply for key discovery. Perhaps the existing ranges() function could take an optional query parameter? Perhaps an analogous feature could be added to RangeSet's ranges() for discovering certain Range entries in a large RangeSet?

Is this currently possible or easily added given the architecture? I'm currently calling ranges() and traversing it, but this is not efficient for large dictionaries.

Answer 1 · 2021-12-03T17:12:14.000Z

Hmm, maybe the value->range mapping is for the purpose of not needing to be able to quickly visit ranges that are near to one another, yet still being able to combine ranges? If there is no red-black tree or other ordered data struct for the ranges then perhaps this is a big architectural lift?

Answer 2 · 2021-12-24T21:08:42.000Z

Feature added, with the methods RangeDict.getoverlap() and etc.

It's true that the backend data structure that powers the RangeDict is a bit architecturally inefficient, but I lack the energy to improve it to a more efficient data structure and fine-tune the algorithms. If you're willing to make a pull request of your own with those improvements I'd be happy to incorporate it.