CloakProject/codename-phoenix

Potential PoS Attacks


The following information was received from a student research team:

Dear _________ developers,

We are writing to disclose a resource exhaustion vulnerability affecting several cryptocurrencies based on chain-style proof of stake. An attacker who connects to a victim node as a peer can send invalid blocks, which are stored on disk without being fully validated. As a consequence, the attacker can exhaust the victim's disk space. The attack is relatively cheap to carry out, since it only requires the attacker to possess a small amount of stake.

We’ve made a demonstration in the form of a test script for the regtest / Python test framework on a similar PoSV3 coin. The _________ codebase shares the affected components with the coin on which we performed our demo. In particular, we believe that all Bitcoin Core based UTXO-model PoS coins suffer from this problem. Although this has not been tested on _________ coin, we believe that it would likely also affect your implementation. We encourage you to study this attack and determine whether it does.

This vulnerability affects several (>30) cryptocurrencies, so we are making a coordinated disclosure to the developers responsible for the affected cryptocurrencies before publishing it.

Description of vulnerability:
The _________ code inherits many of its design elements from the Bitcoin codebase. These made sense for a Proof-of-Work chain but do not provide adequate security in a Proof-of-Stake chain. In essence, Bitcoin checks Proof-of-Work headers before committing disk resources, while in _________ resources are claimed even without the analogous proof-of-stake checks. In more detail, when a node receives a proof-of-stake block that is not on the main chain, it only performs a preliminary check that the coinstake transaction's input exists in TxDB. The attacker can pass this preliminary check with invalid PoS blocks (e.g., blocks using already-spent stake), and these invalid blocks are then stored on disk. The stake transaction is only fully verified when there is a reorg. As a result, it is possible to get nodes to store data on disk with only a small amount of stake.
The relevant methods are:
AcceptBlock(): Performs partial validation of a block. It calls CheckProofOfStake() for the preliminary checks and stores the block on disk in /blocks/disk files.
ConnectBlock(): Performs full validation of the block, including whether the coinstake input is actually unspent. It is only called when there is a reorg to the fork chain.
CheckProofOfStake(): Performs a partial check on the coinstake transaction by checking that the coinstake input exists in TxDB. It also checks that the coinstake signature is valid and that it meets the desired PoS target. Note, however, that it does not check whether the coinstake input is unspent.
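
The gap between the partial and the full check can be illustrated with a toy model. This is a minimal Python sketch rather than the coin's actual C++ code: `ToyNode`, `tx_db`, `utxo_set`, `accept_block` and `connect_block` are hypothetical names chosen for illustration, and the signature and PoS-target checks are left out.

```python
class ToyNode:
    """Toy model of partial vs. full coinstake validation (hypothetical names)."""

    def __init__(self):
        self.tx_db = set()      # every output ever created, spent or not
        self.utxo_set = set()   # outputs that are currently unspent
        self.disk = []          # stake inputs of blocks written to disk

    def add_output(self, outpoint):
        self.tx_db.add(outpoint)
        self.utxo_set.add(outpoint)

    def spend_output(self, outpoint):
        # Spending removes the output from the UTXO set but not from tx_db.
        self.utxo_set.discard(outpoint)

    def accept_block(self, stake_input):
        """Partial check: the stake input only has to exist in tx_db."""
        if stake_input not in self.tx_db:
            return False
        self.disk.append(stake_input)   # stored before any full validation
        return True

    def connect_block(self, stake_input):
        """Full check: the stake input must still be unspent."""
        return stake_input in self.utxo_set


node = ToyNode()
node.add_output("tx1:0")
node.spend_output("tx1:0")             # the output is now spent
stored = node.accept_block("tx1:0")    # partial check passes, block hits disk
valid = node.connect_block("tx1:0")    # the full check would reject it
```

In this model the block built on the spent output is stored even though the full check, which only runs on a reorg, would have rejected it.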
Exploiting the vulnerability:

The idea behind the exploit is to send invalid blocks that pass the AcceptBlock() check even though they don’t have valid stake. Instead, the corresponding coinstake transaction is built on an already-spent output. The attacker’s block is stored on disk, bypassing any check that the coinstake transaction spends a valid UTXO.

To carry out the attack starting from a small amount of stake, the attacker must amplify their apparent stake. Apparent stake refers to the total of all candidate stake outputs, including those that are already spent. Starting with a UTXO of amount k, we can create multiple transactions spending the coins back to ourselves, as shown in the figure below. Only UTXO (n+1) should be allowed for staking, but we are able to stake with all UTXOs from 1 through n+1, thereby increasing the apparent stake by n*k. This increases the chance of finding a PoS block, and the attacker can keep repeating this to increase the apparent stake further.
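
The amplification step can be sketched as a chain of self-spends. This is an illustrative Python model only: the `amplify` helper, the outpoint names and the values of `k` and `n` are assumptions, and transaction fees are ignored.

```python
def amplify(start_outpoint, amount, n):
    """Spend a single output back to ourselves n times.

    Returns (candidates, unspent): every output in the chain, and the one
    output that is still unspent at the end.
    """
    candidates = [(start_outpoint, amount)]
    for i in range(1, n + 1):
        # Each self-spend consumes the previous output and creates a new one.
        candidates.append((f"selfspend{i}:0", amount))
    unspent = candidates[-1]
    return candidates, unspent


k = 10   # hypothetical starting stake
n = 4    # number of self-spend transactions
candidates, unspent = amplify("funding:0", k, n)

real_stake = unspent[1]                                    # only UTXO (n+1) is valid
apparent_stake = sum(amount for _, amount in candidates)   # UTXOs 1 through n+1
extra = apparent_stake - real_stake                        # equals n * k
```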

The only check on the coinstake transaction in AcceptBlock is whether the transaction output is present in TxDB. The check that a proof-of-stake transaction's input is actually unspent only occurs later, in the ConnectTip method, which builds on chainActive.Tip, the current best known chain. Because this check occurs later, we can reuse already-spent outputs to create the coinstake transactions in the malformed blocks. These blocks are stored permanently on disk and are never validated, because they never form part of the longest chain.

Refer to the attached encrypted image, which is called illustration.png.

[image: illustration.png]

Even with 0.01% of the stake in the system, the attacker needs only 5000 transactions to mine blocks with 50% apparent stake power. Note that the attacker pays only the transaction fees for those 5000 transactions, which is an insignificant amount. After collecting a large amount of apparent stake, the attacker proceeds to mine PoS blocks at a past time using the freshly collected apparent stake outputs. Finally, the attacker fills the disk of the victim peer with the invalid blocks, as these pass the CheckProofOfStake() and AcceptBlock() checks. These blocks never undergo full validation because ConnectBlock() is never called.
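
The 5000-transaction figure follows from each self-spend adding one more spent output worth the attacker's real share. A rough check in Python, assuming shares are measured against the total stake in the system and fees are ignored (`self_spends_needed` is a hypothetical helper):

```python
from fractions import Fraction
from math import ceil


def self_spends_needed(real_share, target_share):
    """Self-spends needed so that the added apparent stake reaches target_share.

    n self-spends add n * real_share of apparent stake (fees ignored).
    """
    return ceil(Fraction(target_share) / Fraction(real_share))


real_share = Fraction(1, 10000)   # 0.01% of the stake in the system
target_share = Fraction(1, 2)     # 50% apparent stake
n = self_spends_needed(real_share, target_share)
```

Exact fractions are used instead of floats so that the division does not pick up rounding error.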

The only cost incurred by the attacker is the transaction fees in the amplification step. The attacker can also sell off the coins at an exchange before carrying out the attack, meaning that the attacker only needs to have possessed stake at some point in history, not at the present moment. A real attack might work as follows: an attacker buys some coins from an exchange and amplifies the stake by creating transactions that spend the coins back to himself. He then sells the coins back and only then performs the attack.

Mitigations:

We think that this is a fundamental problem of UTXO-based PoS and are not aware of a complete and thorough mitigation for it. However, in this section we discuss the range of possible mitigations that we know of, along with their respective tradeoffs:

Accept some risk of chainsplit by not processing reorgs beyond a fixed length. For forks under that fixed length, do a full validation of the blocks. Assign the maximum ban score to a peer that sends invalid blocks from past history. A possible approach might be to keep a pcoinstip data structure per peer so that we can easily validate the blocks given by a particular peer.
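
The fixed-length rule could look roughly like the following Python sketch. The limit of 500 blocks, the function name and the treatment of the boundary case are assumptions made for illustration and do not come from any existing codebase.

```python
MAX_REORG_DEPTH = 500   # hypothetical limit, in blocks


def classify_fork_block(fork_height, tip_height, max_depth=MAX_REORG_DEPTH):
    """Decide what to do with a block whose fork starts at fork_height."""
    depth = tip_height - fork_height
    if depth > max_depth:
        return "reject"          # too deep: never reorg this far, penalize the peer
    return "full_validate"       # within the limit: validate fully before storing


deep = classify_fork_block(fork_height=100, tip_height=1000)
shallow = classify_fork_block(fork_height=900, tip_height=1000)
```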

Heuristically detect attacking peers and disconnect them by observing some fingerprint of attack.

Have a way to remove blocks completely. Don't keep more than one orphan fork per peer. The challenge with removing blocks from the block files is that the files are append-only, so this would require some kind of background compaction process.

Enable full validation of blocks even outside of doing a reorg. This is costly as it would require making a copy of the coins database, potentially one for each peer. This can be made more efficient through techniques such as UTXO commitments or UTXO snapshots. With UTXO snapshots we’d take rolling snapshots of the chainstate database at some prior block. With UTXO commitments, the merkle tree proofs associated with new transactions would be transmitted alongside each block/header.

Although this potential issue pertains to the existing legacy codebase, it is important to bear this information in mind (and ideally mitigate attacks as needed) while porting the Cloak code to the new BTC codebase.

A number of these issues appear to be specific to PoS.v3 implementations, and retaining coin-age in staking calculations should prevent stake-amplification attacks.

The spent-stake disk attack, however, is likely to affect PoS systems prior to PoS.v3. I have traced through the legacy Cloak code and confirmed that a PoS block is indeed stored to disk before its staking inputs are checked to confirm they are unspent. This means that historic orphan stake blocks can indeed end up stored on disk without ever having their inputs checked, as they are never a valid new top block and so never get added to the chain (and have their inputs connected).

The suggested mitigation steps for the attack are either very involved or akin to a sticking-plaster solution, so not ideal. The immediate solution that comes to mind is to move the ConnectBlock() check ahead of the call that saves the block to disk. This will likely just create a new attack vector, as peers could CPU-bind the node by sending junk blocks that require expensive checks, and could continue to do so until banned.

The above method could be adjusted to provide something of a working solution though. One thing that needs to be kept in mind is that any potential attackers will have access to source code, so any potential fixes should not provide scope for 'gaming' the system and working round any mitigation code.

A less CPU-intensive solution may be to track orphan blocks per peer. At startup, the client would generate a random bounding number that determines the maximum number of successive PoS orphan blocks a peer can send before being switched into 'Verification Mode' (VerMode). For example, when peer X connects to our node, we generate a bounding number for peer X (in this example, we're using a bounding range of 5-10, and peer X has been assigned 7 as a random value within that range). If peer X sends us 7 non-connecting orphan blocks, it is automatically switched into VerMode. When a peer is flagged as being in VerMode, any new PoS blocks received from it are thoroughly checked before a corresponding DiskBlockIndex object is created for the block and committed to disk storage. If a PoS block is found to be invalid (spent stake inputs etc.), peer X is treated as a DoS attacker.
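
A rough Python sketch of the per-peer tracking described above is given here. `PeerState` and its fields are hypothetical names, the thorough check is reduced to a boolean supplied by the caller, and resetting the counter when a block connects is an assumed reading of "successive".

```python
import random


class PeerState:
    """Per-peer orphan tracking for the 'Verification Mode' idea (hypothetical)."""

    def __init__(self, low=5, high=10, rng=random):
        # Random bound, so an attacker cannot tell when checks will start.
        self.bound = rng.randint(low, high)
        self.orphan_count = 0
        self.ver_mode = False
        self.banned = False

    def on_pos_block(self, connects_to_tip, stake_unspent):
        """Return True if the block may be written to disk."""
        if self.banned:
            return False
        if self.ver_mode:
            # Thorough check before any index entry is committed to disk.
            if not stake_unspent:
                self.banned = True      # treated as a DoS attacker
                return False
            return True
        if connects_to_tip:
            self.orphan_count = 0       # assumption: a connecting block resets the streak
            return True
        self.orphan_count += 1
        if self.orphan_count >= self.bound:
            self.ver_mode = True
        return True


# A peer that only ever sends orphan blocks built on spent stake.
peer = PeerState(rng=random.Random(0))
stored = 0
while not peer.banned:
    if peer.on_pos_block(connects_to_tip=False, stake_unspent=False):
        stored += 1
```

In this model an attacking peer can still get as many junk blocks stored as its bound allows before it is banned, so the approach limits the damage per connection rather than removing it.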