nucypher/protocol

Number of workers in the network – determining max_stake_size, min_stake_size & default m/n

Opened this issue · 2 comments

Stimulated by a discussion with @derekpierre, @mswilkison, @vepkenez and @szotov, plus an interchange between Gaia and @cygnusv & @michwill on the #staking channel.

The protocol cannot fully control the number of individual workers/Ursulas in the network. However, it can push this towards desirable bounds through careful parametrization; in particular via max_stake_size and min_stake_size, and, less directly, via the default and/or recommended threshold.

There are many incentivizing forces underpinning how a staker chooses to split their stake. Here are some possible objectives:
(1) Maximise exposure to sharing policies and therefore fees. In general, due to the threshold scheme, the more workers a staker runs, the more policies it will be selected for. This holds if: a) 5 workers with x/5 the stake earn exactly the same as 1 node with x stake, b) there is nothing preventing multiple workers controlled by the same staker from being selected for the same policy, and c) some policies have an n such that larger stakes are excluded (even if these are rare).
(2) Minimise downtime. In general, the more workers a staker runs, the more redundant their set-up is and the fewer work opportunities they will squander. If high-throughput policies become common, this may trade off with the capacity of each individual worker, and is therefore bounded by overheads – see (6).
(3) Maximise exposure to policy issuers optimising for latency. Although this isn't yet a formal feature, it's probable that certain use cases (e.g. Staker KMS) will necessitate this user optionality, and furthermore, the protocol may not be able to prevent network users from testing latency with various nodes and selecting accordingly. The more spread out (and strategically placed) a staker's workers are, the better.
(4) Minimise slashing downside. The more workers, the smaller the amount of stake at risk of being ceded in the case of an incorrect re-encryption.
(5) Minimise attack downside. The more workers, the smaller the amount of stake at risk if a worker is compromised and forced to re-encrypt incorrectly.
(6) Minimise overheads. Generally, the more workers, the greater the cost – both in terms of time+effort and cloud/hardware overheads.
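The selection-exposure claim in (1) can be sanity-checked with a small Monte Carlo sketch. The model below assumes stake-weighted sampling without replacement, which may not match the protocol's actual sampling algorithm, and all stake figures are hypothetical:

```python
import random

def expected_selections(my_workers, my_total_stake, other_stakes, n, trials=10000):
    """Estimate how many of a policy's n workers belong to one staker
    when their total stake is split evenly across my_workers workers.
    Model: stake-weighted sampling without replacement (hypothetical)."""
    stakes = [my_total_stake / my_workers] * my_workers + list(other_stakes)
    mine = set(range(my_workers))
    total = 0
    for _ in range(trials):
        pool = list(range(len(stakes)))
        weights = [stakes[i] for i in pool]
        for _ in range(n):
            # weighted draw, then remove the drawn worker from the pool
            pick = random.choices(range(len(pool)), weights=weights)[0]
            if pool.pop(pick) in mine:
                total += 1
            weights.pop(pick)
    return total / trials

random.seed(42)
others = [100_000] * 50  # 50 other workers with 100K NU each (hypothetical)
one = expected_selections(1, 500_000, others, n=10)
five = expected_selections(5, 500_000, others, n=10)
print(f"1 worker of 500K:   {one:.2f} selections/policy")
print(f"5 workers of 100K:  {five:.2f} selections/policy")
```

Because a single worker can be selected at most once per policy under sampling without replacement, the split configuration captures more selections per policy, illustrating the pressure towards running many workers.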

The net economic pressure might be to run as many workers as possible, increasing over the long term as fees overtake rewards. Hence min_stake_size must be set to counter-balance this, or there may be a race to the bottom – compromising a key protocol objective: that workers handling policies and kfrags have sufficient collateral attached to incentivise a minimum level of competence and reliability.
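To make the counterweight concrete: min_stake_size directly caps how finely a given total stake can be split, since each worker must hold at least the minimum. A trivial sketch, using hypothetical figures:

```python
def max_workers(total_stake, min_stake_size):
    # Each worker must hold at least min_stake_size, so the split is bounded.
    return total_stake // min_stake_size

# Hypothetical figures: a 30M NU staker against a 15K NU minimum stake.
print(max_workers(30_000_000, 15_000))  # 2000
```

Raising min_stake_size shrinks this bound, directly limiting how far the "as many workers as possible" pressure can go.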

There are other constraining factors on the max number of workers, such as the scalability of the learning/discovery loop and worker sampling before policy creation, and the capacity of dependencies – but I'll leave it to others to elaborate on these!

As usual, your issues are extremely stimulating @arjunhassard !

It's late here and I just want to add something wrt this:

> There are other constraining factors on the max number of workers, such as the scalability of the learning/discovery loop and worker sampling before policy creation, and the capacity of dependencies – but I'll leave it to others to elaborate on these!

I don't know about min_stake_size, but I ran some simulations for sampling with different max_stake_size values and I think we can safely increase it from 4M to ~30M NU without affecting the expected sampling distribution. Another way to see it is to define a new parameter max_stake_ratio = max_stake_size / min_stake_size, which tells us the maximum potential ratio between the largest and smallest stakes. The sampling algorithm works best with a ratio of 1, but it still performs well for a max_stake_ratio up to 2000 (i.e., 30M / 15K). That means that, as far as sampling is concerned, we can change min_stake_size and max_stake_size to any amounts we want as long as max_stake_ratio is kept below 2000.

Perhaps in that regard we could prepare a more extensible tool to help us study the effect of different parameters.
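A minimal version of such a tool might parameterize the stake distribution and report how far observed selection frequencies deviate from stake-proportional ones. The sketch below models sampling as weighted draws without replacement, which may not match the actual sampling algorithm, and all stake distributions are hypothetical:

```python
import random

def sampling_bias(stakes, n, trials=5000):
    """Ratio of observed to stake-proportional selection frequency for the
    largest staker, under weighted sampling without replacement.
    1.0 means selection is perfectly proportional to stake."""
    counts = [0] * len(stakes)
    for _ in range(trials):
        pool = list(range(len(stakes)))
        weights = [stakes[i] for i in pool]
        for _ in range(n):
            pick = random.choices(range(len(pool)), weights=weights)[0]
            counts[pool.pop(pick)] += 1
            weights.pop(pick)
    big = max(range(len(stakes)), key=stakes.__getitem__)
    observed = counts[big] / (trials * n)
    proportional = stakes[big] / sum(stakes)
    return observed / proportional

random.seed(1)
# Hypothetical: 100 minimum-size stakers plus one staker at the given ratio.
for ratio in (1, 100, 2000):
    stakes = [15_000] * 100 + [15_000 * ratio]
    print(f"max_stake_ratio={ratio}: bias={sampling_bias(stakes, n=10):.3f}")
```

A tool along these lines could sweep min_stake_size, max_stake_size, n, and the shape of the stake distribution, and flag parameter regions where the deviation becomes unacceptable.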

@arjunhassard: Very interesting stuff here. Can you expand on how you view default m/n in terms of economic modeling?