MRChemSoft/mrcpp

Buffer overflow?

susilehtola opened this issue · 6 comments

The random crash in MRChemSoft/mrchem#389 occurs at

posIsAvailable = (this->stackStatus[sIdx] == 0);

This line looks suspect to me, since looking at the loop structure

while (posIsAvailable and not(endOfStack)) {
sIdx++;
posIsAvailable = (this->stackStatus[sIdx] == 0);
endOfStack = (sIdx >= this->topStack);
}

the index may have already run over the length of the array.

My quick fix would be

diff --git a/src/trees/NodeAllocator.cpp b/src/trees/NodeAllocator.cpp
index 0443213..ec0e99f 100644
--- a/src/trees/NodeAllocator.cpp
+++ b/src/trees/NodeAllocator.cpp
@@ -341,8 +341,8 @@ template <int D> int NodeAllocator<D>::findNextOccupied(int sIdx) const {
     bool endOfStack = (sIdx >= this->topStack);
     while (posIsAvailable and not(endOfStack)) {
         sIdx++;
-        posIsAvailable = (this->stackStatus[sIdx] == 0);
         endOfStack = (sIdx >= this->topStack);
+        posIsAvailable = (!endOfStack) || (this->stackStatus[sIdx] == 0);
     }
     assert(sIdx >= 0);
     assert(sIdx < this->stackStatus.size());

but I'm not sure if this is how the function is supposed to work

Now that this has been fixed by #190, and compile issues and tests were fixed in #188, would it not make sense to make a bugfix release?

Yes, coming up. Starting the year with some linux kernel problems on my laptop, though 😛

Actually, looking at the code once more, I don't think there was a problem with the highlighted code. The "stack" in question is not a proper stack, in that positions other than the top might get deleted along the way. Then to avoid the number of holes getting too large we occasionally do a compression of the stack, which is the only time this function findNextOccupied is called. The this->topStack variable is supposed to point to the last occupied position at any given time, so it should never point to anything greater than the overall size of the stack (which is actually a std::vector). Then the original test for endOfStack = (sIdx >= this->topStack) should be sufficient to avoid overflow.

This however means that there must be something else wrong somewhere, perhaps that topStack for whatever reason do point out-of-bounds of the stackStatus vector.

Could it be that it's the initial bool posIsAvailable = (this->stackStatus[sIdx] == 0); that was the problem, not the one in the loop? If the sIdx input parameter was out-of-bounds there would indeed be a problem. Then the reason your fix solved the issue could be that the initial stackStatus[idx] evaluation is now optimized away since it's value is not used before it's overwritten inside the loop?

Edit: should not happen as there is a corresponding break statement before calling findNextOccupied(sIdx) in case sIdx >= topStack. Then the only reason I can think of is that topStack >= stackStatus.size(), which would mean something more fundamental has gone wrong.

Turns out topStack points to last last occupied plus one. Now everything makes sense, since we can get in the situation sIdx == topStack == stackStatus.size() which leads to an overflow in a subsequent stackStatus[sIdx] evaluation. Another fix coming up, including it's sister function findNextAvailable().

Bugfix release v1.4.1 out