[Question] How to get previous node of bucket sentinel?
Closed this issue · 8 comments
In SplitOrderedList::lookup_buckets()
, we should be able to skip "finding" if we have a reference to the sentinel in self.buckets
.
AFAIK, self.buckets
gives us &Atomic<Node>
, which are references to the growable array's internal pointer to the sentinel node. At first, I thought of using these to create the Cursor
object to return, but this doesn't make sense, since these references are not the next
pointers of list nodes. "The Art of Multiprocessor Programming" doesn't uses cursors when looking up buckets, so doesn't seem to be experiencing this problem.
As long as we are returning a Cursor
for lookup_buckets()
, it seems like we should have a way to find the previous node of the sentinel. How should that be done? Am I missing something?
You probably want to take a look at
#938 (comment)
and
#939
let atomic = self.buckets.get(index,guard); let shared = atomic.load(SeqCst,guard); let cursor = Cursor::new(atomic,shared);The
V
inCursor::new()
is a generic parameter. For the split ordered list we instantiate it withMaybeUninit<V>
.
Copied from #938.
In the code above, atomic
is a reference to a pointer in self.buckets
, not a reference to a next
pointer of the node before shared
. The Cursor
API seems like it assumes that cursor.prev
is a reference to a next
pointer of a node in the list, and violating this assumption might lead to unexpected behaviors.
For example, if this cursor
is later used to insert a new node, cursor.prev
might be CASed to a pointer to a new node, changing the pointer in the growable array, which clearly isn't what we want.
In the code above, atomic is a reference to a pointer in self.buckets, not a reference to a next pointer of the node before shared.
atomic
is a reference to a node of self.list
, not to a pointer in the growable array; this is blocked at the type level, which returns a Atomic<T>
, not Atomic<Segment<T>>
. If that is what self.buckets.get()
returned, one has an incorrect growable array implementation.
The intention of the Cursor::new()
API is to literally use it like the example code I gave you. For example, see:
https://github.com/kaist-cp/cs431/blob/main/src/lockfree/list.rs#L288-L290
It simple loads the head pointer which has type Atomic<Node<K,V>>
, essentially same as atomic
, and creates a cursor saving those two values. There is nothing special with the head node here, and using a &'g Atomic<Node<K,V>>
that references any node in the list will do, including atomic
.
In the code above, atomic is a reference to a pointer in self.buckets, not a reference to a next pointer of the node before shared.
atomic
is a reference to a node ofself.list
, not to a pointer in the growable array; this is blocked at the type level, which returns aAtomic<T>
, notAtomic<Segment<T>>
. If that is whatself.buckets.get()
returned, one has an incorrect growable array implementation.
I thought atomic
is not "a reference to a node of self.list
", but "a reference to a Atomic<Node>
that points to a node of self.list
". If atomic
was "a reference to a node of self.list
", its type must've been &Node
. GrowableArray<T>
internally stores Atomic<T>
s, and give access to them through get()
.
In the Cursor
insert API, we create a new node, make the node point to cursor's curr
, and CAS the cursor's prev
to point to our new node. This makes sense when we have a cursor created from List::head()
, since the cursor's prev
is part of the list. Using atomic
, returned from GrowableArray<Node>::get()
, as a cursor's prev
is thus going to cause a problem, since a CAS to atomic
will not change self.list
, but corrupt self.buckets
.
BTW, at least on my implementation, this doesn't make the hash table buggy, since even if lookup_buckets()
returns a Cursor
with an invalid prev
, we never insert()
a new node into the cursor without a find_XX()
before it, and find_XX()
s will fix the prev
if they encounter a "curr
node with a valid tag, but has a smaller key than the target". Our curr
is pointing to the bucket sentinel, which is never removed (thus have a valid tag) and is guaranteed to have a smaller key than the target. (We calculate the bucket index to be smaller than the target node anyway.)
TL;DR: This problem won't cause the code to fail.
The return type for 'self.buckets.get()' for the split ordered list is '&Atomic<Node<K,MaybeUninit>', which means the returned value is a reference to the next field of some node in 'self.list'. I'm really not sure why you keep on saying using it will cause corruption of the array, which can only happen if it's an internal (children) node of the growable array, which it isn't.
This is my current understanding on GrowableArray<T>
:
A GrowableArray<T>
doesn't manage the value of type T
, but only stores a pointer to it. Leaf Segment<T>
s store Atomic<T>
pointers that point to the value of type T
. On get()
calls, the API returns a reference to the Atomic<T>
stored in leaf node segments. This is why GrowableArray<T>
doesn't have 'insert()' calls: the 'get()' call already returns a reference to an element (of type Atomic<T>
of its leaf segment, and users are expected to modify the pointer to point to a data of type T
. (If it returned Atomic<T>
, users wouldn't be able to modify the array at all.)
In my understanding, GrowableArray<Node>
can't possibly return references to the next field of a node in self.list
, because it is not designed to do so. GrowableArray<Node>
is designed to point to Node
s, not Atomic<Node>
s. If someone wanted to have self.buckets
point to the next field of some node in self.list
, they would've used self.buckets
of type GrowableArray<Atomic<Node>>
.
Ah ok I see your point. I agree on your points and think that the current Cursor
API is a bit fragile. I think the reason it's ok for this is due to the usage of sentinel nodes that won't be tagged and stuff, but in general it should be fixed. I'll try to fix it ASAP.