nucypher/protocol

Calculating the genesis slashing penalty

arjunhassard opened this issue · 18 comments

I'd like to open a discussion on this crucial component of the slashing protocol, and indeed our overall economic design. Our current logic and parameters originate from back-of-the-envelope reasoning in the nucryptoeconomics compilation doc:

penalty = basePenalty.add(penaltyHistoryCoefficient.mul(penaltyHistory[_miner]));
penalty = Math.min(penalty, _minerValue.div(percentagePenaltyCoefficient));

&

BASE_PENALTY = 100
PENALTY_HISTORY_COEFFICIENT = 10
PERCENTAGE_PENALTY_COEFFICIENT = 8

(see PR nucypher/nucypher#507 for more context )
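The Solidity above can be paraphrased as a minimal Python sketch (token amounts are illustrative; `//` mirrors Solidity's integer `div`):

```python
# Sketch of the genesis slashing penalty, mirroring the Solidity snippet above.
BASE_PENALTY = 100
PENALTY_HISTORY_COEFFICIENT = 10
PERCENTAGE_PENALTY_COEFFICIENT = 8  # cap = stake / 8, i.e. 12.5% of stake

def penalty(miner_value: int, penalty_history: int) -> int:
    """Penalty grows linearly with the count of past offences, capped at a
    fixed fraction (1 / PERCENTAGE_PENALTY_COEFFICIENT) of the miner's stake."""
    raw = BASE_PENALTY + PENALTY_HISTORY_COEFFICIENT * penalty_history
    return min(raw, miner_value // PERCENTAGE_PENALTY_COEFFICIENT)

# First offence on a 10,000-token stake: min(100, 1250) -> 100
print(penalty(10_000, 0))   # 100
# Tenth offence: min(100 + 10 * 9, 1250) -> 190
print(penalty(10_000, 9))   # 190
# Tiny stake of 160 tokens: the percentage cap kicks in -> 20
print(penalty(160, 0))      # 20
```

Note how the cap makes this effectively a "fixed amount OR percentage, whichever is lower" rule, which is relevant to the combination ideas discussed below.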

Background

In general, it's not straightforward to design a penalty calculation algorithm (i.e. determining the amount X by which to decrease a stake Y, given offence Z) that yields fair punishments uniformly across all Y and Z, and maximises the likelihood of behaviour aligned with the protocol's goals. It's hence unsurprising that there's some variance in the approaches taken by prominent staking + slashing networks. For example:

Livepeer:

if (del.bondedAmount > 0) {
    uint256 penalty = MathUtils.percOf(delegators[_transcoder].bondedAmount, _slashAmount);
    del.bondedAmount = del.bondedAmount.sub(penalty);
... }

where the _slashAmount coefficient depends on the offence (failed verification, missed verification, double-claiming a segment).

Casper:

fraction_to_slash: decimal = convert(convert(recently_slashed * self.SLASH_FRACTION_MULTIPLIER, "int128"), "decimal") / \
    convert(convert(self.validators[validator_index].total_deposits_at_logout, "int128"), "decimal")

where self.SLASH_FRACTION_MULTIPLIER = 3.
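In plainer terms, Casper scales the slashed fraction with how much stake was slashed network-wide in the recent past. A Python paraphrase of the Vyper above (variable names taken from it; the cap at 1 is an assumption added here for safety):

```python
SLASH_FRACTION_MULTIPLIER = 3

def fraction_to_slash(recently_slashed: float, total_deposits_at_logout: float) -> float:
    """Casper-style slashing: the more stake slashed recently (a signal of a
    coordinated attack), the larger the fraction taken from each offender."""
    return min(1.0, recently_slashed * SLASH_FRACTION_MULTIPLIER / total_deposits_at_logout)

# 10 tokens recently slashed against 3,000 total deposits -> 1% of the deposit
print(fraction_to_slash(10, 3000))
```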

Polkadot's most recent testnet version:

Slashing starts with a 0.001 DOTs penalty and doubles. If your node successfully passes a session, the current penalty is reduced by 50% with a minimum of 0.001 DOTs.
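Under the stated rules (penalty doubles on each fault, halves on each clean session, floored at 0.001 DOTs), the Polkadot scheme can be sketched as:

```python
MIN_PENALTY = 0.001  # DOTs

def next_penalty(current: float, session_passed: bool) -> float:
    """Doubling-on-fault, halving-on-success penalty with a floor,
    as described for the Polkadot testnet above."""
    if session_passed:
        return max(current / 2, MIN_PENALTY)
    return current * 2

p = MIN_PENALTY
for _ in range(3):          # three consecutive faults
    p = next_penalty(p, False)
print(p)                    # 0.008
p = next_penalty(p, True)   # one clean session halves it
print(p)                    # 0.004
```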

And if this post is accurate, then a Tezos baker who screws up will immediately lose their entire deposit, including what they would have earned from the 'offending operation'.

Discussion

Penalties calculated with absolute figures (e.g. a fixed number of tokens) run into issues:

  • Volatile token conversion rate to fiat means punishment in real terms has an unpredictable multiplier
  • Fixed penalties can be overly punitive to small stakeholders and potentially irrelevant to large ones. Making them large enough to deter large stakeholders might wipe out a smaller stakeholder for a single error.

However, calculations involving the percentage of the offenders' stakes are also problematic:

  • They can be overly punitive to large stakeholders. Take a flat 10% stake decrease applied to two nodes that commit the same offence/error, where node A holds $100k worth of tokens and node B $1k: node A could reasonably consider the $9,900 difference in absolute penalties unfair. This could lead to a disproportionate favouring of smaller stakes, depending on each staker's perceived risk of committing a slashable offence. This is especially important if the network ever ventures into attributing offences imperfectly, or punishes offences that could be caused by network issues or censorship (i.e. nodes failing to come online)
  • Traditional crime and punishment does not generally take the perpetrator's wealth into account, as percentage-based penalties do (in the sense that, in theory, the letter of the law treats, or should treat, everyone the same; of course in practice wealthier citizens have many advantages with respect to criminal-justice requirements such as bail and counsel costs).

Combination ideas

A natural solution to balance the tension between absolute and percentage based penalties is to combine them.
Broad ideas:

  • Fixed amount OR Percentage of stake, whichever is lower
    more or less what we have now
  • Fixed amount OR Percentage of stake, whichever is greater
  • Fixed amount AND Percentage of stake

More complex ideas:

  • Percentage of stake input to penalty calculation modified by the size of stake relative to others
    i.e. a tiered system wherein the greater the percentage of the total staked tokens you control, the smaller the percentage of your stake slashed given offence Z.
    Although this sounds slightly unpalatable, it would avoid large nodes abandoning the network when they are punished too severely (in absolute terms).
  • Fixed amount calculated using current fiat value of tokens
    i.e. using a price oracle to calculate a base punishment – e.g. a minimum $500 fine for offence Z.
    This would avoid uneven real-world severity of punishment due to token volatility.
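To make the trade-offs concrete, here is a hypothetical side-by-side of the three broad combination rules, using an illustrative fixed penalty of 100 tokens and a 1% slash (all numbers invented for the comparison):

```python
FIXED = 100   # tokens (illustrative)
PCT = 0.01    # 1% of stake (illustrative)

def whichever_lower(stake: float) -> float:
    return min(FIXED, PCT * stake)

def whichever_greater(stake: float) -> float:
    return max(FIXED, PCT * stake)

def both(stake: float) -> float:
    return FIXED + PCT * stake

for stake in (1_000, 100_000):
    print(stake, whichever_lower(stake), whichever_greater(stake), both(stake))
# A 1,000-token staker pays 10 / 100 / 110 under the three rules;
# a 100,000-token staker pays 100 / 1,000 / 1,100.
```

"Whichever is lower" protects small stakers but caps the deterrent for large ones; "whichever is greater" does the opposite; "AND" always stings both.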

Choosing the exact parameters

e.g. 5% of stake if stake < 0.01% of network + fixed penalty of $100 in NU tokens
This is a to-do following a discussion of the approach. I also think this will require scenario modelling.
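The tiered idea from the example above could be sketched like this; every tier boundary and percentage here is hypothetical and exists only to show the shape of the schedule:

```python
def tiered_penalty(stake: float, total_staked: float, fixed_fine: float = 100) -> float:
    """Hypothetical tiered penalty: the percentage component shrinks as the
    staker's share of total staked tokens grows, plus a flat fine."""
    share = stake / total_staked
    if share < 0.0001:    # < 0.01% of network (as in the example above)
        pct = 0.05        # 5% of stake
    elif share < 0.01:    # < 1% of network
        pct = 0.02
    else:
        pct = 0.005
    return fixed_fine + pct * stake

# Small staker: 500 tokens of a 10M-token network -> 100 + 5% of 500 = 125
print(tiered_penalty(500, 10_000_000))
# Large staker: 500k tokens (5% of network) -> 100 + 0.5% of 500k = 2600
print(tiered_penalty(500_000, 10_000_000))
```

Note the large staker still pays far more in absolute terms, but a far smaller fraction of their stake, which is exactly the trade the tiered idea proposes.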

derekpierre commented

Thanks for initiating this, Arj.

Quick Thought:
I don't necessarily think that the percentage calculation is as "unfair" as it seems on the surface.

Something to keep in mind is that larger stakers receive more work from the network (and rewards) and are therefore relied upon more for the functioning of the network. Therefore, if they are dishonest it may necessitate a harsher slashing punishment than smaller stakers.

  • In the context of your court example it could go the other way where wealthier citizens could receive more punitive bail judgements (higher amounts, loss of passport, or remand) because of their significant means.

The point is basically: in some cases equitable != equal.

That being said, I understand the point of adjusting the calculation, since on its own it is probably not the best solution.

Need to think a bit more about this.

tuxxy commented

I also agree with @derekpierre wrt percentage calculations not being "unfair".

Keep in mind that providing an incorrect re-encryption is pretty much the worst offense a staker can commit. In the example you give of 10%, I could even see this being too low.

Outside faults of our code, for someone to perform an incorrect re-encryption it would mean that they went through the process of modifying the code to do something malicious. It doesn't matter how much they staked, they have now become a significant threat to the network.

Slashing, therefore, has two predominant objectives:

  1. Disincentivize stakers from acting maliciously via incorrect re-encryptions, and;
  2. Remove malicious stakers from the network. (The most important one, imho.)

The second point above is the most important for our network's health. The slashing protocol acts as a recovery measure against fault. If a staker is willing to try to perform incorrect re-encryptions, they should not be staking in the first place. This isn't an act against a mistake (like downtime could be), but a penalty against a deliberate and malicious action.

jMyles commented

It seems that we need to consider the possible underlying causes of Ursula returning an incorrect CFrag.

  • The most obvious one is that she's fucking with Bob. That's a paddlin' for sure.

  • Maybe we pushed a bad update and she applied it. Then we're the assholes.

  • Maybe she got infected with malware. Still her fault, but not as bad in terms of intent (though, as @tuxxy points out, it's probably not in our users' best interest for us to try to parse that).

  • Maybe underlying OS updates caused some weird incompatibility that nobody had considered, but that we can easily and instantly fix with a patch. What then?

cygnusv commented

It seems that we need to consider the possible underlying causes of Ursula returning an incorrect CFrag.

Let's not forget 3rd party dependencies. We're not pinning the underlying cryptographic libraries (umbral, cryptography.io, OpenSSL), so we cannot rule out that an update there can cause some problem.

tuxxy commented

It seems that we need to consider the possible underlying causes of Ursula returning an incorrect CFrag.

Let's not forget 3rd party dependencies. We're not pinning the underlying cryptographic libraries (umbral, cryptography.io, OpenSSL), so we cannot rule out that an update there can cause some problem.

This is true, however, if we deploy code that has such an obvious problem, then it's our fault in general. Hopefully something like this would be caught in CI.

It doesn't really make sense for us to try to enumerate every case where a proof can be faulty. That's not our role here. As we're building an autonomous, decentralized network, it must have the autonomy to decide for itself what the punishment is.

We cannot determine a "soft" penalty over a "hard" penalty. There is simply no way for us to do that.

If a group of stakers gets slashed because of a bug in proofs or a dependency, it's up to the network to choose how to correct it, and my recommendation would be a hard fork.

You're right, such a situation justifies a hard fork.
I'm not very concerned with the rest of the situations that @jMyles describes, btw.

tuxxy commented

@cygnusv You make a great point that exposes some flaws in our testing, though.

We should probably include property-based tests for all of our proofs to verify that they can't be faulty and include as many weird edge cases as possible.

I think the "hard" punishment is indeed appropriate here.
We will need to revisit this question, though, once we start thinking about slashing for inactivity. A "soft" variant could be tied to the fee for creating a policy, or maybe to the total coin supply divided by the total number of policies.
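The two candidate "soft" baselines mentioned above could be compared like so (all figures illustrative; the fee multiplier is an assumption, and fee and supply are in the same token units):

```python
def soft_penalty_fee_based(policy_fee: float, multiplier: float = 1.0) -> float:
    """Soft penalty tied to what the affected policy cost to create."""
    return multiplier * policy_fee

def soft_penalty_supply_based(total_supply: float, total_policies: int) -> float:
    """Alternative baseline: total coin supply divided by total policy count."""
    return total_supply / total_policies

print(soft_penalty_fee_based(50))                          # 50.0
print(soft_penalty_supply_based(1_000_000_000, 200_000))   # 5000.0
```

The fee-based variant scales with the value of the specific policy harmed; the supply-based variant yields one network-wide figure that falls as usage grows.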

tuxxy commented

In the form of "soft" punishment, I think a non-slashing penalty is appropriate. For example, (and this assumes we solve the downtime problem) if we identify a node who is down, we can just prevent them from receiving more KFrags and freeze their rewards until they come back online.

Yeah, and mark the whole time from the last check like that as "inactive". That works.

Given that the node may already be in possession of KFrags, what about the corresponding policies if the node is offline? Yes, the m-of-n scheme allows resiliency for some of the nodes to be down, but if too many nodes for the same policy are down, the policy cannot be enacted. If I understand correctly, in this scenario the 'soft' punishment would mean the nodes incur no real cost other than opportunity cost. Is it possible, then, that the 'soft' punishment is too soft?

@derekpierre makes a good point – from the perspective of adopting platforms and their end-users, the impact of a set of stakers being offline when an access request arrives is roughly equivalent to those stakers re-encrypting incorrectly (intentionally or otherwise) – i.e. it increases the probability that their (potentially critical) data sharing flow will be impeded

tuxxy commented

@derekpierre Great points.

Before I dive in, let's just be clear that we cannot, presently, determine if a node is passively down (without consent of the staker), or if a node is actively down (with the consent of the staker); this is important. Without some accurate method to determine either case in an autonomous manner, we must assume the former (passive downtime). It's better for the health of the network for us to assume this because otherwise, we would be disincentivizing honest stakers. Individuals/Groups would be less willing to run nodes if any downtime meant serious penalties to their stakes.

Furthermore, hard penalties (without an oracle that determines the type of downtime) actually incentivize attackers whose objective is to hurt the NuCypher network.

With that said, I'll get to your points (and tie my response back to the above) below.

Let's go through this:

... what about those corresponding policies if the node is offline? ... if too many nodes for the same policies are down, the policy cannot be enacted.

Understandably, this is very frustrating for the users. My response to this problem, today, is that due to the experimental nature of decentralized networks, our users should expect to take these reliability issues into account. If a developer were to ask me what to do in this case, I'd suggest that they be willing to "re-grant" the policy.

... in this scenario the 'soft' punishment would have the nodes not get charged a real cost other than opportunity cost. Is it possible then, that the 'soft' punishment is too soft?

Instead of asking the following implicit question, "How can we adjust a penalty to be more fair for stakers and users?" We should rather ask the question, "How can we incentivize stakers to be more reliable?" Remember, we can't assume that a node is actively down (unless the staker specifies it).

In an ideal setting (where we know the type of downtime), it's very easy for us to give a hard or soft punishment accordingly. In this setting, we would also be able to penalize stakers accordingly based on the effect they have on the health of the network. If that's possible, then it's relatively straightforward for us to incentivize a healthy network rather than incentivize good behavior.

Since we cannot attain this ideal setting (yet), we can only incentivize good behavior and presume that it will influence the overall health of the network. There is also an implicit assumption of rational actions on behalf of the stakers.

With these points staged, I think we're in a better place to determine a course of action. At the moment, we do perform a "soft" punishment when nodes don't check in via confirm_activity: nodes who don't check in lose the previous day's rewards. This is a very indiscriminate penalty that makes no distinction between the two downtime types.

In my opinion, this is the best incentive for now, until we come up with a better solution to incentivize reliability.

I'd say that confirm_activity is designed around actively being online. But understandably, shit happens. Perhaps, if the staker goes offline and this gets noticed, they at least shouldn't get compensated for the assumed downtime. I wouldn't say there should be no reward for the whole day (otherwise the staker may think "this day is fucked up anyway, I won't fix it before tomorrow"), maybe only for the upper bound on the downtime (the time between the last check-in and the moment the downtime was detected?).
Most importantly: no one should be rewarded for detecting that someone else is offline. Otherwise, you'd have incentivised DDoS attacks.
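The suggestion above — forfeit rewards only for the window between the last check-in and the moment downtime is detected, rather than for the whole day — could be sketched as follows (function and parameter names are hypothetical):

```python
def forfeited_reward(daily_reward: float, last_checkin_ts: int,
                     detection_ts: int, day_seconds: int = 86_400) -> float:
    """Pro-rate the forfeited reward to the assumed-offline window
    (last check-in -> detection), capped at one day's reward."""
    offline = max(0, detection_ts - last_checkin_ts)
    return daily_reward * min(offline, day_seconds) / day_seconds

# Detected 6 hours after the last check-in, with a 100-token daily reward:
print(forfeited_reward(100, 0, 6 * 3600))  # 25.0, not the full 100
```

This keeps the staker's incentive to come back online during the same day intact, since the remaining reward is still recoverable.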

Thanks for the clarification @tuxxy and @michwill.

Understandably, this is very frustrating for the users. My response to this problem, today, is that due to the experimental nature of decentralized networks our users should expect to take into account these issues of reliability.

Such reliability concerns could limit our traction, especially if Alice ends up paying for the policy. End-to-end encrypted data sharing would be great, but if it doesn't work when it is needed most, that may be even less desirable for some projects.

If a developer were to ask me what to do in this case, I'd suggest that they be willing to "re-grant" the policy.

Could be a work-around, but how would that work? Would re-granting the policy incur an additional cost at that moment?

"How can we incentivize stakers to be more reliable?"

I agree that if the reverse ("How can we disincentivize stakers from not being reliable") is not feasible at this point, we really need to think through the above.

("How can we disincentivize stakers from not being reliable") is not feasible at this point

I think it can be feasible – more so than proof of stake on Ethereum. But some careful thinking is required.

Circling back to behaviour we can reliably detect and attribute – i.e. incorrect re-encryptions – it might be useful to establish the nature of the offence itself before working towards the optimum severity of the punishment. Arguably the point of slashing is to automatically/quickly remove threats to the network – as opposed to a light behavioural nudge, after which the staker hopefully improves their conduct. Hence we should all understand what these threats constitute.

What are the motives for re-encrypting incorrectly? Or, why would a staker go through the trouble of modifying NuCypher code? This offense differs from going offline, which you could attribute to incompetence, laziness, disinterest or some other passive decision. Excluding for now the three situations in which a staker is not directly at fault (laid out by @jMyles above), incorrect re-encryptions involve active decision-making, and therefore beg some rational/self-serving explanation.

Also unlike being offline, re-encrypting incorrectly doesn’t directly reduce overheads (or even effort) versus re-encrypting correctly. In both cases, stakers stay online, hold re-encryption keys and respond to access requests. Since each cfrag transformation procedure incurs a near-negligible cost, we can probably rule out configurations where it is tangibly cheaper to produce garbage for Bob. All things equal, and in the absence of a bribe or other economic advantage, re-encrypting incorrectly may actually lose you money – even with no attribution/slashing, repeatedly failing to re-encrypt properly could drive users away from the network as a whole, and therefore eat into the offending staker's own revenue (in terms of overall demand and token price).

There may of course be a motive to mess with specific Alices/Bobs. This is plausible, but at first glance appears difficult and expensive to orchestrate, even without the presence of slashing. Say Bob is a news outlet, Alice is an undercover journalist, and there are a set of stakers controlled by the adversary, an evil regime. They want to prevent this story from getting out. To have a chance of blocking the re-encryption from happening, they would need to control a large proportion of the network – otherwise there’s no way to ensure their stakers will be assigned the relevant sharing policy. In theory, if the first message is blocked, Alice could repeatedly re-issue sharing policies, each with a new, random set of stakers – and eventually the message would be re-encrypted by any honest node that isn't controlled by the adversary [side note: this might be a cool product option – automatic re-granting to a totally different staker set if Bob does not confirm receipt within some specified time period]. If Alice were to set m to 1 and n to near the total staker number, an adversary wishing to guarantee censorship would need to control the whole network. The cost and difficulty of achieving this depends on the network’s stage of growth, but this prerequisite would price out most adversaries.
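The intuition above — that setting m to 1 and n close to the total staker count prices out censorship — can be quantified. A sketch, under the assumptions that the adversary controls a fraction p of stakers and that policy assignment is an independent uniform draw (neither of which the deployed sampling necessarily satisfies):

```python
from math import comb

def censorship_probability(p_adversary: float, n: int, m: int = 1) -> float:
    """Probability that fewer than m of the n assigned stakers are honest,
    i.e. that the adversary can block the policy. With m = 1 the adversary
    must control all n assigned nodes, so this reduces to p**n."""
    honest = 1 - p_adversary
    return sum(comb(n, k) * honest**k * p_adversary**(n - k) for k in range(m))

print(censorship_probability(0.5, 10))   # 0.5**10, about 0.001
print(censorship_probability(0.9, 30))   # ~0.042 even at 90% adversarial control
```

Even an adversary controlling 90% of stakers rarely blocks an m = 1, n = 30 policy on a single attempt, and Alice's repeated re-granting drives their success probability towards zero.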

Instead of running nodes directly, the adversary may attempt to bribe as many stakers as possible. This would be logistically more feasible, as it does not depend on each target staker’s willingness or ability to sell their stake at a given moment (their tokens may be locked for a time period so long as to render the attack pointless). Bribe recipients may demand compensation equal to the loss of expected income (rewards + fees), plus some motivating premium.

Regardless of how the adversary seeks control (staking or bribing), they will need to identify and convince existing stakers to cooperate (selling their stake or accepting the bribe). This is probably easier to achieve earlier in the lifetime of the network, when there are fewer stakers to approach, they are less spread out, and (possibly) the total market cap, and therefore cost of attack, is lower. Importantly, in all these cases, it would be expedient for the attacker to instruct their bribed stakers to go offline, rather than re-encrypt incorrectly, since we can’t (yet) differentiate between lazy, DDOSed and malicious stakers who fail confirm_activity, and it would cost the corrupt stakers far less while achieving the attacker's goal.

The good news is that these attacks seem very, very difficult to orchestrate. The bad news is that slashing for incorrect re-encryptions may be a bit of a strawman defence.

It's worth noting that slashing stakes for misbehaviour originated as a response to specific threats to consensus protocols; for example, validators signing multiple blocks at the same block number or epoch. Slashing under certain conditions has the explicit purpose of preserving the protocol's safety and liveness, such that it is formally provable that for either to be jeopardised, 1/3 of the total deposited balances of active validators must act improperly. It seems the threat model for incorrect re-encryptions is quite different. For one, the impact is far more contained to the owners of the policies that the offenders manage. This means that the rest of the network's users can go about their daily business, unaware the offence took place. Conversely, signing an incorrect block without consequence can cause a dangerous domino effect where others start validating on top of a fraudulent chain. Basically, there's such a palpable downside to under-punishing in the context of a PoS consensus protocol that maximum harshness may indeed be the best choice. The equivalents of safety and liveness for NuCypher, in terms of danger to the network (with the acknowledgment that this analogy is imperfect), do not appear to be as vulnerable. The rest of the network continues on unabated – so 'liveness' is intact. And as discussed, sharing flows are only ineluctably disrupted if the adversary has taken over the network (a '100%' attack?).

Given the differences, there may be some downsides to mimicking base layer punitiveness. Heavy-handedness could alienate stakers from our network – we don't have perfect certainty that an incorrect re-encryption is deliberate, and even if the vast majority of non-deliberate instances imply incompetence (on our part, or theirs), stakers need to take the non-zero possibility of this occurring into account, especially if they’re staking large sums.

This risk may be increased, from a staker's point of view, by the cadence/granularity with which punishments are dealt out. Correctness proofs can be generated per re-encryption, so stakers are on the hook for each individual failure to re-encrypt properly, irrespective of the size, value or sharing regularity of the underlying data. This puts stakers assigned to high-throughput applications, like IoT devices or multi-user communication channels, at risk of suffering harsher punishments – for example, if a high volume of requests occurs exactly when the staker's machine is infected with malware, before they have a chance to fix it. Our current punishment also increases with each subsequent incorrect re-encryption:
penalty = basePenalty.add(penaltyHistoryCoefficient.mul(penaltyHistory[_miner]));
If we ship code containing a bug that leads to a similar outcome, harsh/uneven punishments exacerbate the negative impact on stakers, so they will account for this risk too, even if it's very low.

There may be other classes of threat to the network achievable via incorrect re-encryptions, and we should strive to identify and articulate them – so that when we assess the trade-off between harsh punishments (extinguish threats, but potentially over-punish) and soft punishments (fail to remove threats, but avoid low staking volumes), we know precisely what we are legislating against.

I think incorrect re-encryptions are indeed unlikely to happen. I view this more as the first step towards a protocol which ensures liveness.
For example, when Bob sends an on-chain request to re-encrypt, Ursula has to re-encrypt, but Bob cannot submit more than m - 1 requests (so that he doesn't have everything he needs to decrypt on-chain). The punishment in this case wouldn't go into anyone's pocket, but would be seized in favour of future inflation.
That protocol would require on-chain proofs that re-encryptions were correct (or not). Not that anyone would publish incorrect re-encryptions, but if you don't check re-encryption correctness at all, Ursulas could publish garbage if they want to censor/bullshit Bob.
That said, the "ensure liveness" protocol might not be trivial, so we want to start with something tested, which is the ability to challenge correctness.