enhancement or clarification request: fetch Range[Set] keys in RangeDict overlapping given Range[Set]
Closed this issue · 2 comments
Suppose a RangeDict contains a great many Range keys, perhaps many millions:
... # millions of entries
Range(2300,2305) -> value
Range(2400,2800) -> value
... # millions of entries
Given Range(2302,2500), I'd like to fetch the above two Range keys in O(log n) time complexity, because they overlap the given range.
If the requested Range is entirely contained by an existing Range in the dictionary, such that __getitem__
would have succeeded, the single containing Range is returned.
The method I'm thinking of would therefore return a list of 0-N Range(Set) entries, with the len(result)==0 case occurring when __getitem__
would have thrown KeyError and there were no overlapping ranges.
This method would never return any dictionary values, it is simply for key discovery. Perhaps the existing ranges()
function could take an optional query parameter? Perhaps an analogous feature could be added to RangeSet's ranges()
for discovering certain Range entries in a large RangeSet?
Is this currently possible or easily added given the architecture? I'm currently calling ranges()
and traversing it, but this is not efficient for large dictionaries.
Hmm, maybe the value->range mapping is for the purpose of not needing to be able to quickly visit ranges that are near to one another, yet still being able to combine ranges? If there is no red-black tree or other ordered data struct for the ranges then perhaps this is a big architectural lift?
Feature added, with the methods RangeDict.getoverlap()
and etc.
It's true that the backend data structure that powers the RangeDict is a bit architecturally inefficient, but I lack the energy to improve it to a more efficient data structure and fine-tune the algorithms. If you're willing to make a pull request of your own with those improvements I'd be happy to incorporate it.