WalletWasabi/WabiSabi

Protocol: KVAC based

nothingmuch opened this issue · 15 comments

I had a brain fart and for some reason I put down algebraic MACs in #13 as relying on bilinear groups, and consequentially we did not look at them in much detail. This is not the case. Again I have @jonasnick to thank for prompting me to look again at Keyed-Verification Anonymous Credentials (KVACs).

The gist of KVAC is that it is designed for when the issuer is also the verifier, like in our use case.

A newer scheme: The Signal Private Group System and Anonymous
Credentials Supporting Efficient Verifiable Encryption
allows attributes to be group elements: for our use case the attributes can be Pederesen Commitments to amounts.

Since this scheme allows unlinkable multi-show, this reintroduces the double spending problem. However, this should be easily solvable by introducing serial numbers which are revealed as part of credential showing.

Note that by only using Pedersen commitments as group attributes we don't require the blind issuance protocol of either KVAC scheme, which relies on ElGamal encryption of the requested attributes, which has the nice benefit of perfect hiding of the sub-amounts and serial numbers, so the coordinator can't retrospectively de-anonymize users even if there is a crypto break.

Input Registration

The user submits a request for N credentials, with either 1 or 2 group attributes (not clear which is simpler yet):

  • Two separate Pedersen commitments, one for the sub-amount and one for the serial number.
  • A single Pedersen multi commitment for a requested sub-amount and serial number.

Along with a range proof for each amount commitment, and a proof that the sum of all the amount commitments matches the input amount.

The coordinator verifies the proofs, and responds with N MACs on the requested attributes, along with a proof of knowledge of the secret key.

Output Registration

The user executes the Show protocol for the combination of credentials they want to use, which produces as output randomized commitments to the attributes and a proof of the MAC verification.

The user also proves that the serial number commitments can be opened to their corresponding serial numbers. These serial numbers are revealed for double spending protection, but the commitment opening should be done in zero knowledge to avoid revealing the randomness of the original commitment in the input registration phase or the randomization added in the Show protocol.

The user also proves that the sum of the amount attributes matches the output amount, analogously to input registration (but there is no need for range proofs at this phase).

The user submits these proofs, the randomized attributes, and the serial numbers.

Note that the serial numbers should not be "revealed attributes" as I previously stated, since those would be trivially linkable to input registration.

Note that since all proofs are created with sigma protocol, the 2n+1 proofs should be composable into a proof of their conjunction. I believe this also holds if using bulletproofs for the range proofs at input registration.

Self Issuance

The (sub-)proof of MAC validity could be modified so that a credential is considered valid if it has a valid MAC OR the amount it certifies is exactly 0.

Since users could then generate such credentials without ever contacting the server, this makes it possible to require valid credentials to be shown at input registration, which would allow users merge an arbitrary number of inputs together into a single credential without requiring the output registration phase to accept as many credentials as there are inputs.

As few as 1 spent & newly requested credentials per input/output registration operation are needed if output registrations also produce newly signed credentials for the remaining amount, but ideally a larger number on the order of log(participants) should be used to allow users to make parallel requests for improved privacy.

Since attributes cannot be shown separately for a single credential covering multiple amounts the serial numbers would have to be hidden attributes as far as the credential showing, but then proven equal to some public input.

Anyway, the more I think about it the more needlessly complex the approach of having multiple attributes under a single credential approach seems, but i still think having one credential per amount is strictly an improvement over #16

The main advantage over #16 would be smaller tokens and reduced interactions during issuance, as well, and it also seems like this requires fewer group operations.

  • smaller tokens -> we don't really care.
  • reduced interactions during issuance -> we care, but how can the interactions be reduced while ACL already has the theoretically minimal interactions in the first place?
  • this requires fewer group operations -> what does this mean? does this mean it reduces computational load, in which case we don't really care, or it means it makes the overall scheme simpler, in which case we care?

A better approach might be to request a single KVAC for multiple sub-amounts, and relying on the unlinkable multi-show property of the algebraic MAC scheme to keep these unlinkable.

I'm not sold. This seem to make the scheme more complex and the gain is less things gets communicated over the network, which we don't care as it does not seem to be a significant gain. Is this a fair assessment?

If we can achieve this would be a significant efficiency improvement for input registration over #16 because a single credential can cover many amounts.

I disagree. There's no efficiency gain in the big picture. Empirically I suspect we're fine to transmit data that's <= 100KB without much penalty. FTR I'm going on a bit of tangent: theoretically 512 bytes is a Tor cell size so making data smaller than that makes no difference as Tor would just fill the remaining things up with fake data. In fact, a compression is usually applied before transmitting data, where we can assume something like 1KB of raw data after compressed would get into only a single cell.

A better approach might be to request a single KVAC for multiple sub-amounts, and relying on the unlinkable multi-show property of the algebraic MAC scheme to keep these unlinkable.

I'm not sold. This seem to make the scheme more complex and the gain is less things gets communicated over the network, which we don't care as it does not seem to be a significant gain. Is this a fair assessment?

yes, i think that approach has many downsides and additional complexity, and we should remove it from consideration leaving only the "naive" approach

  • reduced interactions during issuance -> we care, but how can the interactions be reduced while ACL already has the theoretically minimal interactions in the first place?

actually it doesn't, some of the proofs required for the 3 stages are interactive

  • this requires fewer group operations -> what does this mean? does this mean it reduces computational load, in which case we don't really care, or it means it makes the overall scheme simpler, in which case we care?

both senses, but we should mainly care about the simplicity of the protocol, especially in explaining how unlinkability is achieved (i think ACL's main downside is that this aspect is rather complex)

  • reduced interactions during issuance -> we care, but how can the interactions be reduced while ACL already has the theoretically minimal interactions in the first place?

actually it doesn't, some of the proofs required for the 3 stages are interactive

Which proofs are interactive in the ACL-based scheme? I think all of them can be made non-interactive via the Fiat-Shamir heuristic! What am I missing?

Which proofs are interactive in the ACL-based scheme? I think all of them can be made non-interactive via the Fiat-Shamir heuristic! What am I missing?

hmm, you're right about the first one.

Since the Fiat Shamir heuristic is already used in the signature scheme, I thought there was some reason why the proof in the registration phase must be interactive, but reviewing it now I can't find one:

During registration the User and the Signer carry out a standard interactive zero-knowledge proof of knowledge protocol where the User creates a proof π1 to convince the Signer that he knows an opening of the commitment C.

but the sigma protocol for producing the OR proof is still interactive, so even if combining the Preparation and Validation obtaining the signature requires 2 round trips (coordinator first gives rnd, a, a', then user responds with e, and then the coordinator responds with c, r, c', r').

I don't think this is a serious issue since this is per client

Note that there is a third KVAC scheme I was not aware of, which natively supports pseudonymous showing, which removes the requirement for a serial number, but it also supports anonymity revocation using a trusted party, which is undesirable for this use case even if the trusted party's keys are set up with hash-to-curve.

This scheme also supports non interactive ZK proofs without the Fiat-Shamir heuristic, but I don't think this changes anything for our use case since Bitcoin already relies on that in its digital signature scheme, and lacks security proofs for other aspects of its security even in the random oracle model.

i updated the top post to remove the more complicated approach, which has no real advantages (even the ones i speculated about seem invalid).

The coordinator verifies the proofs, and responds with N MACs on the requested attributes, along with a proof of knowledge of the secret key.

Why does the coordinator have to prove he knows the secret key?
Assuming it does need to, then does this mean the coordinator signs some random string the user chooses or there needs to be a more complex zero knowledge protocol to be executed here?

Why does the coordinator have to prove he knows the secret key?

Since the user can't verify the MAC without the secret key, this assures for the user that the parameters are the same for everyone, so that the coordinator can't tag users by using a different key. There may be another reason, I'll need to go over the papers again. It's not required if only using the MAC algorithms as MACs and using VerifyMAC directly, but it is for the key verifiable anonymous credential security.

Assuming it does need to, then does this mean the coordinator signs some random string the user chooses or there needs to be a more complex zero knowledge protocol to be executed here?

Those are kind of the same thing, e.g. a Schnorr digital signature is a zero knowledge proof of knowledge of the discrete log of a group element with respect to some generator, and it's made non-interactive using the Fiat Shamir heuristic, instead of the verifier choosing a challenge randomly, the prover gets one out of a hash function that depends on their initial commitment, forcing them to execute the protocol in the same order.

A bit more formal presentation of the proposed coin splitting and merging is presented below in the context of the KVAC-scheme.

Each registered UTXO has two attributes: a value and a serial number. They are both committed in Pedersen commitments, i.e. $M_v=\mathit{Com}(v,r_v),M_s=\mathit{Com}(s,r_s)$.

  1. Input UTXO splitting
    Since input UTXO values are still committed in Pedersen commitments, hence input UTXO splitting works just like in all the previous schemes.

  2. UTXO merging
    Suppose user wants to merge two UTXOs at output registration with secret values $v,v'$ to a UTXO with public value $v_{\mathit{out}}$. User needs to convince the coordinator that $v_{\mathit{out}}=v+v'$.

User has two Pedersen multi-commitments to the input UTXO values she would like to merge: $C_{y_{v}},C^{'}{y{v}}$. Each commitment has the following structure: $C_{y_{v}}=G^{z}{y_v}M_v=G^{z}{y_{v}}h^{v}g^{r}$. Therefore multiplying the two commitments we'll have $C_{y_{v}}\cdot C^{'}{y{v}}=G^{z}{y_v}M{v}\cdot G^{z'}{y_v}M{v^'}=G^{z+z'}{y{v}}h^{v+v'}g^{r+r'}$. The proof contains of $\pi=(z+z',r+r')$, i.e. coordinator learns $h^{v+v'}$ and checks whether $h^{v+v'}=h^{v_{\mathit{out}}}$.

  • Soundness is achieved as user does not know the discrete logarithms between generator points used in the commitment scheme.

  • Zero-knowledge is ensured as the revealed sums do not leak anything about individual $r,r',z,z'\mod q$.

Hmm, this is exactly equivalent to what we discussed in the call, where we end up with a commitment to 0 and open that as $(z+z'-z'',r+r'-r'')$, right?

Is it worth rewriting these for n credentials instead of 2? there's no loss of generality in assuming n=2, as we did on the board, but I think it would be clearer, and we did free up numerical subscripts by indexing the different attributes as v and s.

Secondly, for each credential M_s commitment is, the user submits $(s, \frac{C_{y_s}}{h^s}) = (s, G_{y_s}^z g^{r_s})$, which the coordinator can then verify without linking to a specific M_s. (edit: also need to prove in zero knowledge DLOGs of these products WRT g and G_{y_s})

Note that after the coordinator has learned s, the original M_s commitments are no longer prefectly hiding, since there is exactly one r_s \in \mathbb{Z}_q where h^s g^{r_s} = M_s.

This is admittedly a contrivance, but I think it's still desirable to preserve unconditional hiding, so that we can be sure that the coordinator can't ever de-anonymize users retrospectively even in case of a crypto break. If I'm not mistaken, I think all we would need is to add another randomness term to the serial number commitment with a 3rd generator, so that $M_s = h^s g_1^{r_{s1}} g_2^{r_{s2}}$.