indexmap-rs/indexmap

Index-based mutable splitting

Closed this issue · 3 comments

SOF3 commented

User story

I am using indexmap to maintain a topologically sorted set, where map.get_index(i) may depend on map.get_index(j) for j < i. Thus, when I operate mutably on an element, I need access to its preceding elements.

Basically, I want to write this:

for i in 0..map.len() {
    let value = map.get_index_mut(i).expect("i < map.len()");
    for dep_key in value.deps() {
       let dep = map.get(dep_key);
       value.feed(dep);
    }
}

This doesn't compile because map is mutably borrowed for value and cannot be reused for map.get(). This problem could be solved if indexmap provides an equivalent slice.slice_at_mut:

for i in 0..map.len() {
    let (left, right) = map.split_at_mut(i); // split map into two slices first
    let value = right.get_index_mut(i).expect("i < map.len()"); // map -> right, not sure if i should reset to 0 here
    for dep_key in value.deps() {
       let dep = left.get(dep_key); // map -> left
       value.feed(dep);
    }
}

left.get(dep_key) can still be implemented as usual because the keys cannot be borrowed mutably by the user.

Guide-level explanation

Add new split_at_mut method on IndexMap:

fn split_at_mut(&mut self, i: usize) -> (map::Slice<'_, K, V>, map::Slice<'_, K, V>) { ... }

Add a new Slice struct that exposes most of the methods from map equivalently. Err(NotInSliceError) is returned if a parameter is resolved to be in the other slice.

I am not sure if index parameters/output should preserve the original value as in the full map, or if it should off-set at the split to be inconsistent with the slices API. Personally I prefer the former since the implementation needs to access the full slice anyway and users would probably cache the index somewhere else anyway (so the index is as important as the key), but I'm open to suggestions here.

Implementation details

The type of IndexMapCore entries changes to Vec<Bucket<K, UnsafeCell<V>>>. Interior mutability is only active when split_at_mut(&mut self) is called (which is exclusively borrowed anyway), so it is safe for all other functions to &*cell.get() directly.

When a map/slice is split into (sub)slices, all slices hold the reference to the whole map, with their own range bounds assigned. It is safe for all slices to access all parts of the map immutably except for the values (which are behind UnsafeCells). This allows performing probing in other slices to eventually reach their own slice if possible.

Slices are not allowed to perform structure-mutating operations (i.e. insertion, removal and draining) as this may cause buckets to cross slice boundaries [citation needed].

The master branch (not yet published) does have a Slice type, added in #177. However, that only acts like an indexable sequence of items, without any hashing operations -- it's not clear to me whether you need that when you're mentioning both get and get_index. I also had an older "views" experiment in #47 that did keep hashing, but we never pushed that idea further.

Could you see if that will meet your needs?

SOF3 commented

@cuviper Thanks, I didn't notice there is already a slice type. I could separately store the key-to-index mapping for now to workaround the issue.

indexmap v2 has been published -- I wonder if you've had any chance to work with that?