cfrg/draft-irtf-cfrg-vdaf

Prio3Count circuit optimization

Closed this issue · 2 comments

The Count circuit currently computes $x^2-x$ using one Mul gate, and some affine gates. This could be slightly improved by instead using a polynomial evaluation gate, with arity 1 instead of 2, to compute either $x^2$ or the whole polynomial. This would save a little bit of space on the wire, and probably some computation time as well. We should benchmark the potential improvement, and then decide whether it is worth breaking wire compatibility of Prio3Count or not.

See divviup/libprio-rs#1090 for preliminary benchmarking results. I didn't see any CPU performance thus far, though there may still be some room for optimizing the implementation. In terms of message sizes, we would save 8 bytes on the leader input share and 8 bytes on the prepare shares.

Thanks for trying this out! Given that you had to roll a special gadget for this to get competitive CPU time, I would say the implementation complexity outweights the bandwidth gains. What's 8 bytes between friends?