How to handle orphans?
Add ideas here
Currently, are orphans just left inside of Mongo? Are there any reorg recovery mechanisms in place to remove or flag orphans so that they are no longer considered part of the main chain? I've been digging into this project quite a bit over the past couple of days and really think what you've done is interesting. I wouldn't mind brainstorming or helping, potentially.
Thank you!
Yes, BitDB maintains the mempool and block databases as separate collections, and each operates in a different manner.

The mempool collection (`unconfirmed`) is always in sync; it gets refreshed to a blank slate whenever there's a new block.
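For illustration, that refresh could look something like this sketch in TypeScript with the official `mongodb` driver (the database name and the trigger wiring are assumptions; only the `unconfirmed` collection name comes from the description above):

```typescript
import { MongoClient } from "mongodb";

// Sketch: called whenever a new-block notification arrives.
// "bitdb" is a hypothetical database name.
async function resetMempool(client: MongoClient): Promise<void> {
  // Wipe the unconfirmed collection back to a blank slate;
  // the mempool is then re-synced from the node.
  await client.db("bitdb").collection("unconfirmed").deleteMany({});
}
```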
The block collection (`confirmed`) is append-only. This part is important for security reasons: theoretically we could try to keep the state as "accurate" as possible by updating past additions whenever something changes, but I think that would open a whole can of worms. So the design decision is to:
1. Append only, with no deletes (just like Bitcoin).
2. Add a unique index to `tx.h` so that the same transaction hash can never be added twice (see the index sketch after this list).
3. Try to come up with a second layer on top of the raw BitDB for resolving conflicts, orphans, or any other kind of anomaly.
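As a rough sketch of point 2 (same assumptions as above, with `tx.h` taken from the description):

```typescript
import { MongoClient } from "mongodb";

async function ensureTxHashIndex(client: MongoClient): Promise<void> {
  const confirmed = client.db("bitdb").collection("confirmed");
  // Unique index on the transaction hash: a second insert with
  // the same tx.h is rejected, keeping the collection append-only.
  await confirmed.createIndex({ "tx.h": 1 }, { unique: true });
}
```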
Points 1 and 2 are already how BitDB works, and I think it would be great to come up with 3.
I think this can take one of two forms:
- Do it on the bitd side: add an additional storage layer that tracks which records are accurate and which are no longer accurate.
- Do it on the bitqueryd side: just keep appending, but add a module to bitqueryd so that it filters out everything but the final version when returning query responses (see the sketch after this list).
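For the bitqueryd-side option, one possible shape of such a filter module, as a sketch (the record layout and the `orphanedBlocks` set are assumptions; some separate reorg-detection component would have to maintain that set):

```typescript
// Hypothetical record shape; the actual BitDB schema may differ.
interface TxRecord {
  tx: { h: string };              // transaction hash
  blk?: { h: string; i: number }; // containing block hash and height
}

// Drop records whose containing block is known to be orphaned,
// so query responses only ever expose the main-chain version.
function filterOrphans(
  records: TxRecord[],
  orphanedBlocks: Set<string>
): TxRecord[] {
  return records.filter((r) => !r.blk || !orphanedBlocks.has(r.blk.h));
}
```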
Decentralization is my top priority at the moment, but I'm always thinking about this, so if you have good ideas, feel free to contribute. Thanks!
I definitely like your idea of keeping it append-only. However, I don't see how you can guarantee the data is not part of a fork at the time you query the node. In general, one hour's worth of confirmations is considered safe (approximately 6 blocks on BTC and BCH), but there have been cases of much longer forks (20-40 blocks), so you can't bank on 6 blocks being 100% safe. So in my mind, at a minimum, anything less than 6 blocks deep should be considered somewhat mutable.
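To put that rule in code, a tiny sketch (heights are whatever the indexer recorded; the threshold of 6 is just the heuristic above):

```typescript
// Confirmations for a tx mined at txHeight, given the current tip height.
function confirmations(tipHeight: number, txHeight: number): number {
  return tipHeight - txHeight + 1;
}

// Anything shallower than 6 confirmations should still be treated
// as mutable, i.e. it may yet land on the losing side of a fork.
function isMutable(tipHeight: number, txHeight: number): boolean {
  return confirmations(tipHeight, txHeight) < 6;
}
```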
The filtering idea on the bitqueryd side is interesting, but the rule that the same transaction hash can't be added again creates an issue. Suppose we add a TX found in our saved block 1000000, and later find out that our block 1000000 was orphaned, and the new block 1000000 contains some of the same transactions that were in the orphaned block. The TX hashes will be identical because the transactions are the same, but the block in which the TX was mined is now a different block. Thus the TX we saved in our DB actually has the wrong block information associated with it!
I have seen a few implementations that keep an extra field called either "double spend" or "orphaned" and adjust it to reflect changes to the longest chain. To me, it feels like you either have to flag items in the DB as orphaned and add duplicates, or update items with the new information in the case of a fork. One other thought: we could have a third collection, called something like `mutable`, and once TXs reach around 100 confirmations we merge them into our immutable `confirmed` collection. Does this make sense?
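A sketch of that third-collection idea (the collection names, the `blk.i` height field, and the 100-confirmation threshold are all assumptions):

```typescript
import { MongoClient } from "mongodb";

const MATURITY = 100; // confirmations before a record is treated as immutable

// Promote sufficiently deep records from the mutable staging
// collection into the append-only confirmed collection.
async function promoteMatured(
  client: MongoClient,
  tipHeight: number
): Promise<void> {
  const db = client.db("bitdb");
  const matured = await db
    .collection("mutable")
    .find({ "blk.i": { $lte: tipHeight - MATURITY + 1 } })
    .toArray();
  if (matured.length === 0) return;

  try {
    // ordered: false inserts every non-duplicate even if some
    // documents violate the unique tx.h index.
    await db.collection("confirmed").insertMany(matured, { ordered: false });
  } catch {
    // Duplicate-key errors are expected for already-promoted TXs.
  }
  await db.collection("mutable").deleteMany({
    _id: { $in: matured.map((d) => d._id) },
  });
}
```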
I might not be understanding 100% how you're envisioning this, or maybe I'm overthinking it. But the real trick is that the data we get from the node is, in a strange way, "mutable" toward the tip, because our data could always end up on the wrong side of a fork when we're ingesting blocks near the head of the chain.
https://bitcoin.stackexchange.com/questions/52891/how-to-detect-reorganization-from-bitcoind-via-zmq
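Following that link's approach, here's a minimal detection sketch using bitcoind's `rawblock` ZMQ feed (zeromq v6 API; the endpoint address is an assumption and must match your `zmqpubrawblock` setting). A block header carries its parent's hash in bytes 4-36, so a new block whose parent isn't our current tip means a reorg (or a missed block) just happened:

```typescript
import * as zmq from "zeromq";
import { createHash } from "crypto";

// Previous-block hash: header bytes 4..36, little-endian on the wire.
function prevBlockHash(rawBlock: Buffer): string {
  return Buffer.from(rawBlock.subarray(4, 36)).reverse().toString("hex");
}

// Block hash: double SHA-256 of the 80-byte header, byte-reversed.
function blockHash(rawBlock: Buffer): string {
  const h1 = createHash("sha256").update(rawBlock.subarray(0, 80)).digest();
  const h2 = createHash("sha256").update(h1).digest();
  return Buffer.from(h2).reverse().toString("hex");
}

async function watchForReorgs(): Promise<void> {
  const sock = new zmq.Subscriber();
  sock.connect("tcp://127.0.0.1:28332"); // assumed zmqpubrawblock endpoint
  sock.subscribe("rawblock");

  let tip: string | null = null;
  for await (const [, msg] of sock) {
    const raw = msg as Buffer;
    if (tip !== null && prevBlockHash(raw) !== tip) {
      // The new block doesn't build on our tip: rows appended under
      // the old tip may now be orphaned and need flagging/filtering.
      console.log(`possible reorg: tip was ${tip}`);
    }
    tip = blockHash(raw);
  }
}
```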