Prio3 VDAF for vectors with a bounded number of 1s
Closed this issue · 21 comments
Section 4.5 of Samplable Anonymous Aggregation for Private Federated Data Analysis notes that it's possible to efficiently validate that a vector consists of zeros and ones, with at most some constant number of ones. This could be done by generalizing Prio3Histogram. (Open question: does this belong in this document or a separate one?) Here's how I'd implement this:
- The parameters would be the length,
$n$ , the chunk length, and the maximum number of ones,$m$ . - Input length would be
$n+\left\lfloor{log_2(m)}\right\rfloor+1$ . Let$l_m=\left\lfloor{log_2(m)}\right\rfloor+1$ , the number of bits needed to represent$m$ . - Encoding: The first
$n$ field elements are just projected measurement values. Let$c=2^{l_m}-1-m$ , a constant offset to let us achieve the sum limit we want. Then, for input values$x_i$ , compute$c+\sum_i{x_i}$ , encode it as a bit vector, and append it to the encoded measurement. - Truncation: Take the first
$n$ elements. - Decoding:
$id$ - Validity circuit: Evaluate the range check polynomial
$x(x-1)$ on allINPUT_LEN
elements, and combine the results in a random linear combination (using the parallel sum optimization). Separately, evaluate the following, where$x_i$ are the measurement values from the encoded measurement, and$s_j$ are the bits of the$c+\sum_i{x_i}$ from the encoded measurement.$$c + \sum_{i=0}^{n-1} {x_i} - \sum_{j=0}^{l_m-1} {2^j s_j}$$ Add the range check and sum check intermediate values together in another random linear combination, and return the result.
If the generalization is straight-forward, then I think we can put it here.
Can you explain why this range check is sufficient? (Suppose
Can we make the
During encoding, we require the client to commit to the sum of its measurement and the constant c in some auxiliary bits. Because we apply range checks to the auxiliary bits, then
Note that c is equal to zero when m is one less than a power of two. As a concrete example with an "interesting" value of c, let's set m to 5. Five fits in three bits, so
The VDAF described here, with m set to 1, is not semantically equivalent to Prio3Histogram, since it would allow measurements with either zero or one ones, while Prio3Histogram only allows exactly one one. (This additional flexibility is reflected in encoded measurements being one element longer, for
Cool, this sounds like a new Prio3 variant. I don't know how many variants we want in the core document -- I suppose we should discuss this. But if we have concrete use cases in mind for this, I don't see why we couldn't add it here.
One other way I can imagine using this is to for aggregating a Bloom filter: Each measurement would be a contribution to this filter, where the contribution is a bitvector with between
What would you call this thing?
Another observation: There is an implicit generalization of the Sum
circuit here that I think would be useful for Prio3Sum
, and perhaps even Prio3SumVec
. Though for the latter we would want to make sure we don't blow up the communication overhead for the common 0/1-vector use case.
What would you call this thing?
One suggestion, from discussions about Client-side DP mechanisms: NoisyHistogram
?
We could consider renaming Prio3Histogram to Prio3OneHotVector and making this new idea Prio3ManyHotVector
Thanks for putting this together @divergentdave ! I think this new Prio3 variant is useful for checking bounded number of 1s after the one-hot vector goes through randomized response Client-DP.
One clarification is: this assumes
We could consider renaming Prio3Histogram to Prio3OneHotVector and making this new idea Prio3ManyHotVector
+1, I would suggest Prio3OneHotVec
and Prio3MultiHotVec
, or if we want to keep Prio3Histogram, can we call the new one Prio3MultiHotHistogram
.
Alternatively we can keep Prio3Histogram
an alias to Prio3OneHotVec
@junyechen1996 it looks like we're going to want this circuit for the DP draft for PPM, correct? If so, what do you think about at least adding it to the reference code here? If useful enough we could also consider putting it in the spec.
Also, I just wanted to suggest a potential alternative circuit inspired by a discussion with @pag-crypto: If the range is relatively small, then instead of bit-encoding the sum and doing an equality check, an alternative would be to do a kind of look-up table thing. (This isn't exactly Paul's idea, so don't blame him if it turns out to be broken.)
Let
This means more multiplications, but the upside is we don't break wire compatibility. Plus, we might get away with not increasing the number of ParallelSum
-calls. What do you think @divergentdave?
Yeah, that makes sense, I think that's just the "range check polynomial" applied in a new place. One nit: none of the alternatives would be wire compatible, since the proofs wouldn't verify naturally, but this would let us keep the same AFE encoding/decoding, and only change the validity circuit.
There's going to be a complicated performance and proof size tradeoff between the two, since the range check polynomial could have a high degree with certain parameter choices. Both approaches would work better with small ranges of valid Hamming weights, but I'm sure they scale differently as the range of Hamming weights broadens.
Oh hmm, I wouldn't want to implement this if we had to use a high-degree gadget like PolyEval
. I was thinking we could just Mul
for this, which we already have?
Yeah, that makes sense, I think that's just the "range check polynomial" applied in a new place. One nit: none of the alternatives would be wire compatible, since the proofs wouldn't verify naturally, but this would let us keep the same AFE encoding/decoding, and only change the validity circuit.
You're right, the proof would change.
There's going to be a complicated performance and proof size tradeoff between the two, since the range check polynomial could have a high degree with certain parameter choices. Both approaches would work better with small ranges of valid Hamming weights, but I'm sure they scale differently as the range of Hamming weights broadens.
Oh hmmm, I wouldn't want to use a high-degree gadget for the range check. Couldn't we use Mul
, via ParallelSum
?
The two straightforward ways to do this would be to put the entire polynomial in one PolyEval
gadget (one gadget call, arity 1, degree equal to the number of allowable s values) or to build it out of affine gates and many many Mul
gadgets (gadget call count equal to one less than the number of allowable s values, arity 2, degree 2).
ParallelSum wouldn't be easy to apply, because at minimum we'd need some separate multiplication gadget calls upstream of our ParallelSum gadget. Since subcircuit outputs get added up, that means we have to start by expanding the polynomial. We could end up burning gadget arity on constant inputs in order to meet the "identical subcircuit" restriction as well.
The goal of ParallelSum is to balance out the number of gadget calls and the gadget's arity, so we could instead take inspiration from that, define one PolyEval gadget that uses a smaller degree range check polynomial, feed in 's' minus various constants into that gadget, and then finish up the larger range check polynomial by feeding the output of the small polynomials through Mul
.
Before we get too far in the weeds specifying something that complicated, do we have a good idea for how wide the range of allowable Hamming weights would be randomized response applications with typical DP parameters?
Before we get too far in the weeds specifying something that complicated, do we have a good idea for how wide the range of allowable Hamming weights would be randomized response applications with typical DP parameters?
I think that depends a lot on the dimension, for e.g. with 100,000 dimension, I think the range is between dozens to hundreds.
Bit-checks it is then :)
I'd say this is ready for work. Who'd like to take it?
I can take this. I will start with a Python implementation in poc folder. Does that sound good @cjpatton @divergentdave ?
SGTM.
Next step is to decide whether to assign a codepoint for Prio3MultiHotHistogram and, potentially, add it to the draft. It may end up being a dependency for https://github.com/wangshan/draft-wang-ppm-differential-privacy/, in which case we'll at least need to assign the codepoint
Now might be a good time to figure how how this is supposed to work. I'll try to get some feedback at CFRG on how to manage codepoints for VDAFs (cc/ @bifurcation in case you already have a vision for this.)
@divergentdave do you have thoughts about if/how to incorporate this new type into the draft?
My latest thought here is that we should just replace Prio3Histogram with Prio3MultiHotHistogram, since the latter is just a generalization.
Replacing it seems reasonable. For Prio3Histogram use cases, the encoded measurement will get one field element longer, which is negligible. It won't be possible to enforce "exactly one hot" anymore, but only "zero or one hot". I think that should be fine, by and large. Choosing the "wrong" bucket affects the aggregate twice as much as choosing no buckets when you "should" choose one, so this isn't giving malicious clients an advantage. If choosing zero buckets doesn't make sense semantically, like in a multiple-choice survey, I think results will still be meaningful even if some zero bucket measurements get included.
How about Prio3MultiHotVec
for the name? The path dependence of the "histogram" name is starting to wear on me.
Works for me. As for renaming: so far we have tried to align the name of the circuit with the data type of the aggregate result rather than than the type of the measurement. I'm OK breaking this convention if it's annoying :)
Erring on the side of not breaking compatibility, I'm going to add this as a new Prio3 variant and assign it a codepoint. I'll take @divergentdave's suggested name of Prio3MultiHotVec
.