For what is operator `| 0`?
Opened this issue · 6 comments
crdt-example-app/shared/merkle.js
Line 23 in 3acd310
This is a quick way of converting the floating-point value to an integer (the bitwise operators only work on 32-bit integers, so it causes the 64-bit float to be converted to an integer).
In this case, it ensures the base-3 encoded timestamp strings end up looking like "1211121022121110"
instead of "1211121022121110.11221000121012222"
.
If it's helpful to anyone, I forked James' repo and have an "annotated" version that includes a bunch of code comments for stuff like this (and a NOTES.md
)--basically study notes. For example: https://github.com/clintharris/crdt-example-app_annotated/blob/master/shared/merkle.js#L43
@clintharris Thank you!
Do you know the reason why does merkle tree use 3 nodes instead of 2?
https://github.com/clintharris/crdt-example-app_annotated/blob/master/shared/merkle.js#L36
The base-3 number system used for timestamps means the tree will be ternary.
In other words: each char in the timestamp string can only be "0", "1", or "2". So each node in the tree can have 0..3 child nodes accessible via those keys.
Hi! As I see at
crdt-example-app/shared/merkle.js
Line 23 in 3acd310
If so, several edits that occur during one minute will have the same path in Merkle tree - 012 and 012 for example. How does a sync engine deal with a lot of edits during one minute? 🤔
I’m looking at this for the first time, so apologies if my interpretation is incorrect, but here’s how I see it working.
The Merkle tree isn’t storing the actual values that are set, only a hash. When inserting a value into the tree, the insertion process walks down the tree to the leaf node for the timestamp and XORs each existing node’s hash with the new hash along the way. This way, two trees can be compared quickly by finding the node with the earliest timestamp where the hashes differ.
So the tree doesn’t tell you exactly what is different, but it does give you a lower bound on which messages need to be sent to completely update a client with the latest state.
As for why a ternary tree instead of a binary tree, it’s probably just to reduce the depth of the tree to make traversal faster. I actually wonder if a higher base might work even better but I haven’t thought through it enough yet and it probably doesn’t matter much.
EDIT: It also simplifies traversal of the tree and somewhat reduces rebalancing concerns which could otherwise become an issue with monotonically increasing keys, since the "trie" representation means that the key is embedded in the tree structure itself. In this case, the tree starts out very unbalanced but becomes more balanced over time.