zawy12/difficulty-algorithms

A Simple DAG

zawy12 opened this issue · 2 comments

This describes a DAG-based blockchain that uses Nakamoto consensus to get permanent ordering in only 1x the propagation delay if there's no increase or shift in network hashrate. Other consensus methods require an exchange of messages which requires at least 2x the propagation delay. Tight timestamp limits can prevent <50% selfish mining from disrupting the fast consensus.

Timestamps and Difficulty Algorithms (not just for DAGs)

The following are the clock and timestamp requirements in Nakamoto consensus on which my DAG concepts are focused.

  1. Monotonic timestamps: Secure distributed consensus isn't possible without monotonic timestamps on messages. BTC's median of past 11 timestamps is an ugly, inefficient, problematic patch that is the result of a lack of knowledge of how Nakamoto consensus works. It leaves an existing security hole open in BTC, LTC, et al., and continues to cause tricky, hard-to-catch problems in alts and in any code that attempts to use blocks that come after the MTP as if they have even partial probabilistic consensus. It wastes the most recent 5 blocks. Block height & the chaining of block hashes do not enforce Lamport ordering because timestamps affect difficulty, which determines chain work, which is the real consensus ordering and can re-order the heights and chain to a different tip.
  2. Local time: It's important for miners to keep accurate local time without consulting any other nodes or an NTP service that other miners use. This prevents Sybil attacks. Nakamoto consensus is only as decentralized and secure as node clocks. Peer time in BTC is a mistake because it can be subject to Sybil & eclipse attacks. Nakamoto consensus (in a sense) works better than other consensus mechanisms because no node has to ask any other node which chain has the most work or what time it is. Local time must be a lot more precise than the block time for Nakamoto consensus to be optimized. For a DAG attempting to achieve consensus close to the theoretical minimum of 1x propagation delay after a tx is sent, block times must be a lot faster than the propagation delay. For example, if propagation delay is 1 second, then use 0.1 second block times (~10-block DAG width).
  3. Timestamp limits (on parent blocks as viewed and enforced by current miners): These need to be tighter (closer to local time plus propagation delays) than the block time to optimize Nakamoto consensus. Enforcing accurate timestamps enables honest miners to estimate current network hashrate more accurately. This rule is difficult to achieve & probably not needed in a DAG with block times faster than propagation delays, so I will not require it in this article. In BTC-type chains, following this rule prevents selfish mining with <50% of the network hashrate from getting excess profit. Honest miners shouldn't accept newly-seen blocks with timestamps outside a reasonably accurate honest clock (plus expected propagation delays) unless there are enough blocks to indicate the newly-seen tip has higher chain work instead of higher luck. In other words, honest miners assume a reasonable level of synchrony (little to no unexpected partitioning of the network) to block selfish mining but allow sufficient PoW to override the assumption. This works because a selfish miner can't predict when he will need to release blocks, so he can't assign an accurate timestamp. If selfish mining isn't a concern and if an RTT is not being used (next item), the timestamp limits appear to only need to be a fraction of the difficulty averaging window to prevent an artificial reduction in difficulty.
  4. Real-time targeting (RTT): This changes a miner's difficulty during hashing, based on the timestamp he uses. This prevents stuck chains and unusually fast blocks after sudden drops or increases in hashrate. It can also safely change the exponential distribution of solvetimes to be flatter or more centered on the target block time. See Tom Harding's RTT paper and my older article. This usually requires the tighter timestamp limits (see previous requirement). This requirement does not have popular support, so I'll not require it in this DAG.

To repeat, I'll ignore requirements 3 & 4 for this article.

Most people argue against all of the above requirements, especially the 1st one, based on good intentions that pave a road to attacks on distributed consensus. I came up with these after investigating the source of every problem that occurs in difficulty algorithms. It turns out that almost no problems are caused by the difficulty algorithms themselves, but by the clock and timestamp rules. I then looked back at Lamport's famous 1978 Clocks paper and thought carefully about how Nakamoto consensus works (e.g. each hash is a "vote") and can firmly claim the above are requirements for distributed consensus. Median of Time Past is just an ugly patch that reduces the speed of reasonably secure consensus and causes problems in unwary, disparate code.

I'll assume for this article that no one wants to use requirement 4 (RTT) and not discuss it except to say there's no harm in using it as the most recent timestamp in the following algorithms, as long as requirement 3) is applied with sufficiently tight timestamp limits.

Difficulty Algorithms in a DAG

[I'm currently working on this article. ]
I'll discuss here everything I know about difficulty algorithms and how they relate to PoW block-based DAGs. In comparing algorithms, "best" or "better" means the algorithm that has the fastest response to hashrate changes for a given level of stability when the hashrate is constant. The parameter "N" (or N*M in Digishield) in all of these is 2x the "mean lifetime" of how long it takes the algorithm to respond, so it's a filter on random variation. All of these except BCH's Absolute ASERT require monotonic timestamps to prevent attack, and I'm not sure it's safe (I vetted and approved its use in BCH, with a consensus-theory-based philosophical objection. Our other options at the time would have meant an additional and potentially complicated consensus change, or implementing kyuupichan's timestamp handling from Tom Harding's WT-144 in ASERT, which requires going through the previous 11 timestamps (if BTC's MTP ~ 6 is used) and assigning MTP+1 to any timestamp that is before the MTP timestamp, before using it in the difficulty algorithm).

SMA algorithm
The simplest difficulty algorithm to use in a DAG of blocks is the simple Moving Average (SMA):

Simple Moving Average (SMA) difficulty algorithm
T = desired block time
N = number of ancestor blocks in averaging window
target = 2^256/difficulty in a given block.

target = (2^256/sum_N_ancestor_difficulties)  * (most_recent_timestamp - oldest_timestamp) / T
or
difficulty = sum_N_ancestor_difficulties * T / (most_recent_timestamp - oldest_timestamp)  

Notice the sum of difficulties is the chain work as seen by the block whose target is being cited, which doesn't include siblings etc. that the block hasn't seen. The 1st term in parentheses in the 1st eq can be replaced by the average of N target values, with T replaced by N*T (this is what the Kaspa DAG does based on my recommendations), but more properly, a target-based version would need to multiply each target by its solvetime, sum them up, and divide by N. The "oldest_timestamp" makes the same "2015/2016" error that Satoshi made if it's not actually the Nth-oldest ancestor's most recent (or average or oldest?) parent's timestamp, or the "most_recent_timestamp" could follow the RTT rule above. All this is not important for large N.
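
As a concreteness check, here's a minimal Python sketch of the SMA above, assuming hypothetical Block objects with difficulty and timestamp fields (not any particular client's code; real consensus code also has to deal with nBits encoding):

from collections import namedtuple

# Hypothetical block representation for illustration only.
Block = namedtuple("Block", "difficulty timestamp")
MAX_TARGET = 2**256

def sma_target(ancestors, T):
    # ancestors: the N most recent ancestor blocks seen by the new block, oldest first
    sum_difficulty = sum(b.difficulty for b in ancestors)
    timespan = ancestors[-1].timestamp - ancestors[0].timestamp  # most recent - oldest
    # target = (2^256 / sum_N_ancestor_difficulties) * timespan / T, done as one
    # integer division to reduce truncation; equivalently,
    # difficulty = sum_difficulty * T // timespan
    return MAX_TARGET * timespan // (sum_difficulty * T)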

Digishield algorithm
The SMA above can have oscillations if a large percentage of miners come and go based on difficulty changes. A better algorithm to prevent this is Digishield, provided you don't include its error of using BTC's forced timespan limits (4x and 1/4 the measured timespan, which allows attacks & slows adjustment at genesis) and don't use its "MTP delay" in choosing which timestamps it uses (assuming you require monotonic timestamps to prevent the attacks that come from not using an MTP delay), because that delay causes a persistent oscillation. A target version of Digishield is:

Corrected Digishield:
N = 1/M of the N above to have the same stability as the SMA
timespan = Max - Min timestamps which are N+1 blocks apart.
T = targetBlockTime
M = 4 = "filter", "buffer", "tempering", or "dilution" factor.  Notice M=1 makes it an SMA.
target = (2^256 / avg_N_ancestor_difficulties) * (1 + timespan/(N*T*M) - 1/M)

Again, avg_N_targets can be used to replace the 1st part in parentheses. It's presented as a calculation with real numbers, which shouldn't be used in consensus code because every validator has to agree on the exact value after the division, so it should be expanded to remove the 2nd parentheses so that integer division works with little error.
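
A sketch of this corrected Digishield with the tempering factor multiplied out over a common denominator so only integer math is needed (same hypothetical Block fields as the SMA sketch above):

from collections import namedtuple

Block = namedtuple("Block", "difficulty timestamp")  # hypothetical fields
MAX_TARGET = 2**256

def digishield_target(ancestors, T, M=4):
    # ancestors: the N+1 most recent ancestor blocks, oldest first
    N = len(ancestors) - 1
    avg_difficulty = sum(b.difficulty for b in ancestors[1:]) // N  # N most recent
    timespan = ancestors[-1].timestamp - ancestors[0].timestamp     # Max - Min, N+1 apart
    # (1 + timespan/(N*T*M) - 1/M)  ==  (N*T*(M-1) + timespan) / (N*T*M)
    return (MAX_TARGET // avg_difficulty) * (N * T * (M - 1) + timespan) // (N * T * M)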

WTEMA algorithm
A significantly better algorithm ("WTEMA") can simply use a parent's target and solvetime, but I'm not sure which parent to use (oldest, newest, average target and solvetime, the one with the shortest or longest solvetime, or the one with the highest or smallest target?), and this slightly affects my descendant ordering. Notice how similar it is to Digishield.

WTEMA difficulty algorithm
N = 2x the N for SMA to have the same level of stability
target = parent_target * (1 + parent_solvetime/(T*N) - 1/N)
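
A sketch of WTEMA multiplied out for integer math; which parent's target and solvetime to pass in is the open question mentioned above:

def wtema_target(parent_target, parent_solvetime, T, N):
    # (1 + parent_solvetime/(T*N) - 1/N)  ==  (T*(N-1) + parent_solvetime) / (T*N)
    return parent_target * (T * (N - 1) + parent_solvetime) // (T * N)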

Absolute ASERT algorithm
An even better algorithm is BCH's "Absolute ASERT". I prefer the simpler WTEMA, but it and a Relative ASERT use only 1 parent target and timestamp, and this causes a slight imprecision in how descendant ordering works. They also increasingly (linearly with N) lose accuracy in getting the correct average block time (5% at N=1000?). BCH's Absolute ASERT does not have these problems, nor does it require kyuupichan's timestamp handling if monotonic timestamps are not enforced. It still has the problem of which parent's timestamp to use, but it doesn't use a parent's target. This is solved by following timestamp consensus requirement number 4 above and using the timestamp in the miner's block to determine his difficulty. BCH's integer-based code needs to be used, but here's the real-valued math.

Absolute ASERT
N = 2x the N for SMA to have the same level of stability
t = time since genesis
h = "height" = number of ancestors, not generations
target = genesis_target * e^((t/T-h)/N)
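
A floating-point sketch of the math above; production code should use BCH's integer/fixed-point implementation so every validator gets the identical value:

import math

def asert_target(genesis_target, t, h, T, N):
    # t = time since genesis, h = number of ancestors (not generations)
    # t/T is how many blocks "should" exist by now; h is how many actually do
    return int(genesis_target * math.exp((t / T - h) / N))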

Effect of propagation delay changes

Propagation delays may change, which will affect the minimum possible confirmation times, and it's best and easiest to have a limit on DAG width, if not a fixed width, despite the changes. A way to automatically optimize confirmation time and maintain a certain DAG width is to measure the DAG width and change the desired block time in the difficulty algorithm. Rewards that target a coin emission rate would need to decrease if the block time decreases while the DAG width is constant. Reasoning through the effect of a propagation delay change: delay decreases => DAG width decreases => difficulty per generation decreases if the difficulty algorithm targets a number of ancestors in a given amount of time instead of generations. WTEMA is the only algorithm above that doesn't target either a specific number of blocks or generations per timespan.
[ I'm currently working on this article ]

Why use a DAG?

  • It can enable faster confirmation times without increasing orphans. By having more blocks per unit time, DAGs can quickly confirm the network hashrate is intact, to be sure a tx is not subject to a double spend. [see footnote 1] The theoretically fastest confirmation time is 2x propagation delays, which is roughly equal to 2x the DAG width times the block time. This is if arbitrary ordering is used (usually based on block hash values) and ancestors older than 2x the propagation delay are not allowed to affect ordering, which is a really hard "close out" period to achieve without forcing forks. If descendant work ordering (described below) is used and a double spender is not adding to network hashrate to secretly support the double spend, then a tx has evidence of being 1st in future ordering in only 1x the propagation delay, which is the time it takes to confirm 50% of the network hashrate has seen that tx but not a duplicate (1/2 the network saw the tx in 1/2 a delay and took 1/2 a delay to respond, if block solvetimes are much faster than the delays). Again, this requires an unreasonably fast "close out" period, but it shows descendant ordering has an advantage over other ordering methods because it measures hashrate support. BTW, Avalanche uses "chits" to count descendant blocks, which is the same idea but based on a summation of stake instead of hash votes.
  • DAG width can be used as a metric of network delays which can be used to dynamically increase block size or decrease block time as node bandwidth and/or centralization increase (this is just a different way of saying it allows faster confirmation times).

A DAG for minimizing confirmation times needs to average multiple parents per block. This gives tx recipients more "samples of solvetimes" in a shorter time, on the scale of propagation delays, to better estimate how much hashrate support a tx has. If the number of parents per block is not limited, the average number of parents is also the DAG width, and the block time multiplied by the DAG width is the propagation delay. But because of the difficulty in "closing out" how old a "previously unseen ancestor" a block can include in its history without causing any significant change in ordering, it may be best to target a small average DAG width of say 2 or 3, especially if we adjust target solvetimes to achieve the desired DAG width, which will keep solvetimes a specific fraction (after dividing by the targeted DAG width) of propagation delays. A smaller DAG width can also help in reducing the following duplicate tx problem.

The Duplicate Transaction Inclusion Problem

A hard problem in a DAG of blocks instead of a DAG of txs is selecting which txs to include in a block. If a tx propagates faster than blocks, network bandwidth & block space will be wasted by multiple blocks containing the same tx even if the DAG speed is being fully utilized (i.e. miners empty their mempool in every block they produce). If a sizeable mempool is allowed to build up, pools can randomly select txs and hope they don't "collide" with other sibling block selections, but that would require a mempool that contains >10x the DAG width to have <10% chance of collision, which means txs are slowed to 1/10th the potential speed before they can get included in a block, eliminating the purpose of a DAG. Pools could agree outside the protocol (or it could be coded if they have a long-lasting destination address for rewards) on which txs they will include, based on a hash of each tx in their mempool (to sort them) and on each pool's portion of the total hashrate (this can be estimated in the protocol) to determine how many txs they get. This loses some efficiency and reduces the permissionlessness of Nakamoto consensus. A block-based solution is for wallets to "sign transactions over" (in some sense) to a selected pool (maybe wallet owners simply choose a pool to directly send the tx to and the pool does not tell other pools about the tx). The problem might be an argument for each tx being its own block, even letting wallets collect parent txs and mine their own tx.
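
One way to read the ">10x the DAG width for <10% collisions" estimate is per block: if W sibling blocks each pick one tx at random from a mempool of m txs, a given block collides with probability roughly (W-1)/m. A quick Monte Carlo sketch of that simplified one-tx-per-block model (my simplification, not a protocol spec):

import random

def per_block_collision_rate(W, m, trials=100_000):
    # W sibling blocks each pick one tx uniformly at random from a mempool of m txs;
    # a block "collides" if at least one sibling picked the same tx.
    total = 0.0
    for _ in range(trials):
        picks = [random.randrange(m) for _ in range(W)]
        total += sum(picks.count(p) > 1 for p in picks) / W
    return total / trials

# e.g. per_block_collision_rate(10, 100) is roughly 0.086, close to (W-1)/m = 0.09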

Minimum Wait Times for Evidence of No Double Spend

An idealized network for simple analysis would be N equal-sized pools. An ideal DAG would have a target block time that changes as network propagation delays change so that the DAG width of N stays constant (N = typical number of parents). The network propagation delay is equal to N*blocktime. I'll assume the descendant ordering method described below is used, which is the most precise & correct order if there are two competing blocks containing a double spend. Descendant work ordering can show a block has 50% of the hashrate support in only 1 propagation delay if the sender of the tx is not adding hashrate to the network to secretly support a double spend. The recipient has good evidence that 50% of the network hashrate has prioritized his tx over a potential (unseen) double spend in N block times (after the tx is included in a block) multiplied by 1.96/sqrt(N) for 95% confidence (the Poisson distribution gets closer to Gaussian with larger N). This is an acceptable wait time for small-valued txs. The minimum time to show there's not a double spend even if the attacker has 49% of the network hashrate is 2*N*1.96/SQRT(2*N) block times. If the recipient sees any double spend in that time period, he has to wait until the txs are too old to be allowed (by the protocol) in new blocks as a "newly-seen ancestor" and then possibly wait a few more blocks for the sum of descendant work to more precisely determine the winner. The "newly-seen ancestor" rule has to be 1x (or 2x?) propagation delays to make confirmation as fast as possible, but it causes orphans or tip races between 2 competing DAGs. To prevent various timestamp manipulations that can help an attacker, these minimum times to confirmation require honest miners to enforce tight timestamp limits on parent blocks (to allow only reasonable local time error between honest miners, plus propagation delays), accurate local times (no peer or NTP time), monotonic timestamps, and a procedure for properly handling blocks that fall near the timestamp limits (a future article will address a current timestamp attack that exists in all PoW chains). It also assumes a difficulty algorithm that changes every block (as described below) to help descendant work sort out the winner.
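
The two wait-time formulas above, in units of block times (a sketch of the arithmetic, not consensus code):

import math

def wait_small_tx(N):
    # ~95% confidence that 50% of the hashrate prioritized the tx (no hidden hashrate)
    return N * 1.96 / math.sqrt(N)

def wait_vs_49_percent_attacker(N):
    # minimum wait to rule out a double spend by a ~49% attacker
    return 2 * N * 1.96 / math.sqrt(2 * N)

# e.g. for DAG width N = 10: about 6.2 and 8.8 block times respectively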

A DAG Design

  1. Block headers include the hash of all valid parents a miner sees.
  2. A "no incest" (aka "no triangles") rule. This means a child can't reference a parent who is already an ancestor, e.g. a grandparent can't also be a parent.
  3. Duplicate transaction problem solution: Wallets send txs to a selected mining pool. The pool doesn't relay it to other pools. To optimize everything, pools should have equal hashrate and the number of pools should be equal to the width of the DAG. Small miners might independently choose pools in order to assist these two goals. To the extent all this is not done, bandwidth and confirmation times are harmed.
  4. If the DAG begins to get too wide/narrow from the above, it means propagation delays have increased/decreased, so the block time should be decreased/increased. There should be a maximum block time in case there are too-few pools of similar mining strength to get the target DAG width.
  5. Ordering of blocks is achieved by a "descendant work" rule. The block whose descendants have the greatest sum of difficulties is the earliest. See "Ordering".
  6. Closure rule: If an ancestor "first seen" by a descendant is older, in terms of a generation count, than 2x the current target DAG width (2x the estimated propagation delay for fastest confirmation), that particular descendant's work doesn't count towards that ancestor's descendant work total, and a block can't cite a parent whose newest parent is older than that limit.
  7. Difficulty algorithm should have a large enough averaging period such that miners are not penalized in subsequent blocks for including all parent blocks they see. WTEMA is best but nBits needs 4 more bits of accuracy if the "mean lifetime" setting (~1/2 SMA's averaging window) is from 300 to 5000. Above 5000, it needs more bits to have less than 1% error (makes blocks slower). Error in solvetime caused by nBits at its minimum 2^(-15) accuracy with WTEMA is (1-2^(-15))^mean_lifetime (see the sketch after this list).
  8. Use monotonic timestamps.
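
A quick check of the nBits error formula in item 7 (the precision_bits argument is my parameterization of the "4 more bits"):

def nbits_solvetime_error(mean_lifetime, precision_bits=15):
    # worst-case relative nBits truncation per block is ~2^-precision_bits,
    # compounding over the filter's mean lifetime
    return 1 - (1 - 2 ** -precision_bits) ** mean_lifetime

# e.g. ~0.9% at mean_lifetime=300 with BTC's minimum 2^(-15) precision,
# and ~0.95% at mean_lifetime=5000 with 4 extra bits (2^(-19))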

Ordering

We normally think of the earliest blocks as the ones with the least amount of work, but this isn't correct. The winning tips in POW, the ones whose txs were in blocks when the fork first occurred, are declared "first" by having the most descendant work up until the current time, not by having the least ancestor work (otherwise a miner could choose by some trickery which tx is first by "virtue" of having less hashrate). The following Descendant Work rule is the only method I could think of that would work. In the event of a double spend, the first (aka "oldest") block is the one with the valid txn and it has the most descendant work. You simply add up the difficulties of all its descendants. This can be seen as the basis for ordering when sibling chains in normal POW are orphaned. The "oldest" winning blocks in competing chains in normal POW are not the ones with the least chain work, lowest block number, or earliest valid timestamp, because subsequent hashing can void them. The hashes that come after are votes that tell us which blocks were seen first.

TL; DR: The number of hashes that come after a block is how early the block occurred. The scarcity of hashes before the block does not prove it was earlier because fewer hashes has less cost.

The following image shows Phantom/GhostDAG (February 2020 updated paper) ordering in the small circles, which is based on ancestor work, which is what everyone (including me) initially assumes. My suggested correction is to use the descendant work shown in magenta. (Note: this is not a correction to how GhostDAG selects blue blocks, but to how final ordering should be done.) There is some parallel between GhostDAG and descendant work, as can be seen by B and F being the "excluded" blocks in GhostDAG and also having the lowest rank out of their siblings in descendant work. There's no descendant work for J, M, L so they're not ranked, but it's possible to use each block's own difficulty as part of the descendant-work sum. We should include that difficulty if we know all the txs in the block came before mining on it started. BTW, C comes before D despite both of them having the same number of descendant blocks because C's descendants had a higher sum of difficulties because they had more ancestors, so even though we're only looking at descendants it's including ancestors, provided the difficulty algorithm is not changing too slowly.

[image: GhostDAG example showing GhostDAG's ordering in the small circles and descendant-work ranks in magenta]

If a tie occurs such as when two blocks have the same ancestors and descendants, an arbitrary rule can decide the winner, like the block with the lowest hash.
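
A minimal sketch of descendant-work ordering on a toy DAG. It ignores the closure rule (item 6), treats the block id as a stand-in for the lowest-hash tie-breaker, and uses hypothetical `parents` and `difficulty` mappings rather than any client's data model:

from collections import defaultdict

def descendant_work_order(parents, difficulty):
    # parents: dict block_id -> list of parent block_ids
    # difficulty: dict block_id -> difficulty (work) of that block
    children = defaultdict(list)
    for blk, ps in parents.items():
        for p in ps:
            children[p].append(blk)

    def descendants(blk):
        seen, stack = set(), list(children[blk])
        while stack:
            d = stack.pop()
            if d not in seen:
                seen.add(d)
                stack.extend(children[d])
        return seen

    # earlier blocks have MORE descendant work; break exact ties by lowest block id
    score = {b: sum(difficulty[d] for d in descendants(b)) for b in parents}
    return sorted(parents, key=lambda b: (-score[b], b))

# e.g. parents = {"G": [], "A": ["G"], "B": ["G"], "C": ["A", "B"]} with equal
# difficulties orders G first, then A, B (tie broken by id), then C.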

footnote [1] There is a common misconception that fast finality is cheaper to attack. If the worldwide hashrate using your POW is not dedicated (non-"rentable") enough to your coin to prevent double spends, even a slow, big coin like BTC would be subject to them, costing only 6x12.5 BTC to get 500 BTC in double spends.

Footnote: How To Do Monotonicity
Only a node's local time, determined independently by each node operator (no previously-agreed-upon oracles like NTP, GPS, or his phone company) should be used because peer time is subject to Sybil attacks. Distributed consensus systems like POW are only as secure as the "truth-without-another-consensus mechanism" of available UTC time.

A miner accepts only parents whose timestamps are < FTL+local_time.
If newest_parent_timestamp >= local_time then temp_local_time = newest_parent_timestamp+1
else temp_local_time = local_time
All code everywhere needs to respect temp_local_time as a constant until local_time exceeds it.
This enforces a strict ordering for the node on the consensus messages (blocks) being sent and received in the distributed system. It might be called a Lamport timestamp (see Wikipedia). This 1978 paper by Lamport seems to say this is necessary for all distributed consensus mechanisms. Here's my forest-level interpretation of that paper: conditions IR2 and IR2' require sequential stamps on messages, and the paragraph before the conclusion says clocks can't be set backwards for "our situation". In the case of blockchain, this means clocks can't be set to something before a validated timestamp.
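
A minimal sketch of those rules (FTL = future time limit; local_time comes only from the node operator's own clock, and the function names are mine):

def accept_parent(parent_timestamp, local_time, ftl):
    # a miner accepts only parents whose timestamps are < local_time + FTL
    return parent_timestamp < local_time + ftl

def effective_local_time(newest_parent_timestamp, local_time):
    # temp_local_time: never use a time at or before the newest validated parent
    if newest_parent_timestamp >= local_time:
        return newest_parent_timestamp + 1
    return local_time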

Nice!

This is a copy-paste of Kaspa issue 1691 that I replied to. It discusses issues surrounding the necessary difficulty algorithm.

The current Kaspa DAA is

T = targetTimePerBlock
N = windowSize
target = avgTarget * ((MaxTimestamp - MinTimestamp) / (T * N))

The options to make it faster in computation are to use WTEMA or a variation of Digishield.

WTEMA is "relative" ASERT's very close cousin, which only requires the parents and grandparents. It is very close to "relative" ASERT by using the e^x =~ 1+x approximation for small x. "Relative" means using the grandparent instead of a reference block in the "absolute" ASERT at the bottom of this comment. Here's WTEMA for a normal POW chain.

ST = solvetime between parent and grandparent
next_target = prior_target * (1+ ST/T/N - 1/N)

To see how it works, if ST = T there is no adjustment; then notice the outcomes if ST > T or ST < T. ETH uses a variation of this. You can expand the above equation out to use integer math. A high N = 1320 will accumulate error if nBits is not higher precision than in BTC. BTC's nBits lowest precision range is 2^(-15) and the error is then N/2 * 2^(-15) = 2% in the solvetime for N=1320.

N=1320 for ASERT and WTEMA have the same stability as N=2640 in Kaspa's current SMA and they are faster to respond to hashrate changes.

If the ST in WTEMA is too small to have good accuracy, then WTEMA can be modified in a way that makes it like Digishield. This also happens to be a way to modify the current algo that requires a lot less computation and greatly reduces the possibility of oscillations. If you have to go N blocks in the past for WTEMA to get a large enough ST or want to make the current SMA 10x more efficient by going from N=2640 to N=2640/10 = 264, then the equation would be (leaving out PPB and PD adjustments):

N = 264
timespan = Max - Min timestamps which are N+1 blocks apart.
M = 10 = "filter", "buffer", "tempering", or "dilution" factor.  Notice M=1 makes it an SMA.
target = avgTarget * (1+timespan/(N*T*M) - 1/M)

Again, multiply it out to use integer math. This has the same stability as Kaspa's current SMA.
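
Multiplied out over the common denominator N*T*M, a sketch of the integer-math form (real code still has to handle nBits encoding):

def tempered_sma_target(avg_target, timespan, T, N=264, M=10):
    # (1 + timespan/(N*T*M) - 1/M)  ==  (N*T*(M-1) + timespan) / (N*T*M)
    return avg_target * (N * T * (M - 1) + timespan) // (N * T * M)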

WTEMA and Kaspa above are not targeting solvetimes after a miner 1st sees a block, but the propagation delay plus the solvetime. But that's not correct either because propagation delay has a non-linear effect on DAG width which depends on how close it is to the miner's solvetime, and DAG width has a linear effect on the difficulty algorithm in the opposite direction from the propagation delay. Propagation delay alone decreases difficulty because of the longer apparent solvetime, but as it gets closer to the solvetime (a DAG is needed precisely because propagation delays are "close" to solvetimes), the extra DAG width makes the averaging window in Kaspa go back less in time (smaller Max-Min timespan), which increases difficulty (decreases target), which pulls propagation delays back away from the total solvetime, decreasing DAG width. It's a negative feedback that seems to work out OK towards a balance, except you are not controlling DAG width or the apparent solvetime (aka generation time), but their combination. Does it risk oscillations?

There are other ways to do a DAA for a DAG but they require measuring DAG width. You could target solve time after a block's arrival, the ratio of propagation time to solvetime, or target a DAG width.

If you want to target solve time after a miner sees a block, the average parents per block (PPB) in the window might be a good-enough estimate of siblings per generation, which you can use to adjust the equation. Another way to get PPB might be to count generations by counting the number of median-blue-parent to median-blue-parent steps in the window so that PPB = N/count. Here's what I have to adjust for this and propagation delays (PD). PD can be estimated from an estimate of siblings per generation (see the GhostDAG papers for the equation). Notice that PD and PPB can be artificially lowered (but not raised), which makes the difficulty harder. The only potential drawback to this is that a block-withholding attacker can make the public difficulty go high to discourage public hashrate before working on his private chain.

target = avgTarget * ((Max - Min - PD * N/PPB) * PPB / (T * N))
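
A sketch of that adjustment with PD and PPB treated as given estimates (floating point for readability; consensus code would need an agreed-upon integer form):

def ppb_adjusted_target(avg_target, max_ts, min_ts, PD, PPB, T, N):
    # subtract one propagation delay per generation; the window spans ~N/PPB generations
    effective_timespan = (max_ts - min_ts) - PD * (N / PPB)
    return int(avg_target * effective_timespan * PPB / (T * N))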

Another possibility instead of using PPB like the above is to let the window size refer to generations based on a count of "median blue parents" and use only the Min and Max timestamps from those (but still determine avgTarget from all reds and blues). This probably has the same effect as using PPB = N/count in the above.

If the DAG width is being controlled and set to a constant, then you can use "median" timestamp, target, and height of the block's parents as the input to BCH's new ASERT. It's the best DAA. It only requires looking at the parents. It looks like it could cause serious problems if the DAG width is not set to a constant. In ASERT's case, there is a reference block at some point in the past that the DAA uses as "initialization" of target and height. It knows how to set the difficulty by counting blocks after that reference height and knowing (for a given DAG width) how many blocks there should have been since then.

"Absolute" ASERT in normal POW is

T = block time 
ts = timespan between parent and reference block
dH = difference between parent and reference blocks' heights
next_target = reference_target * e^((ts/T - dH)/N)