celestiaorg/nmt

NamespacedMerkleTree.foundInRange should not narrow uint64 to int

Closed this issue ยท 5 comments

In

nmt/nmt.go

Lines 238 to 245 in 6274243

func (n *NamespacedMerkleTree) foundInRange(nID namespace.ID) (bool, int, int) {
// This is a faster version of this code snippet:
// https://github.com/celestiaorg/celestiaorg-prototype/blob/2aeca6f55ad389b9d68034a0a7038f80a8d2982e/simpleblock.go#L106-L117
foundRng, found := n.namespaceRanges[string(nID)]
// XXX casting from uint64 to int is kinda crappy but nebolousLabs'
// range proof api requires int params only to convert them to uint64 ...
return found, int(foundRng.start), int(foundRng.end)
}

the XXX'ed comment correctly identifies the narrowing cast from uint64 to int as problematic. I suggest either (1) removing the casts and changing the return types to uint64, or (2) add a range check and panic on overflow. Given #15, and the fact that return types tend to spread to callers, I recommend (1) even if it's a bit more work.

The choice of the index data type (the type of the return values of the foundInRange() as well as start and end in the leafRange) has implications for the maximum possible size of the tree. If int is used, the maximum tree size is limited to math.IntMax. On the other hand, if uint64 is used, the maximum size increases to math.UintMax, which is almost twice as large as int (assuming int = int64). In Celestia, the tree size maps to the square size i.e., the number of shares in each row/column which does not seem to get too high anyway, hence there should not be much difference between using int or uint64 from Celestia PoV. Other trade-offs regarding int vs uint64 usage are implementation-related:

If we opt for uint64:

  • We would likely need to create a custom type, such as Index, and define custom arithmetic operations for that type to prevent overflow/underflow in lines like these.
  • We should avoid using range loops with the index because the built-in range loop in Go uses the int type for the index, which cannot represent the maximum value of uint64. For example, refer to this line.

On the other hand, if we choose int:

  • We don't have to deal with overflow and underflow conditions. Consequently, there is no need to introduce a new type with custom arithmetic operations.
  • We are free to use range loops in Go without any concerns.

Considering the above, I am inclined toward int, please let me know your thoughts @liamsi @evan-forbes @rootulp @cmwaters

int SGTM

If you do settle on a signed type, I suggest being specific about the type (int32 or int64), or make the build fail if compiled on platforms where int is not at least 64 bits wide.

I have created a tracking issue for the latest comment within this thread to prevent it from being lost and overlooked. You can find it here: #239.

Considering this, we can mark this issue as resolved with reference to issue #198. Any additional actions can be coordinated through the tracking issue.