pytorch/opacus

Opacus uses unsafe floating-point noise generation, even with secure_mode=True

TedTed opened this issue · 5 comments

๐Ÿ› Bug

The _generate_noise primitive has a secure_mode parameter. When set to True, the documentation claims that the noise distribution is secure against floating-point attacks. This is wrong for two reasons.

  • The approach taken, which relies on this paper, only defends against one specific attack. It still generates a distribution that has "holes" in the output space, creating distinguishing events that break DP. It merely makes the exact positions of these "holes" somewhat harder to guess, with no precise quantification of how hard this is under standard cryptographic assumptions.
  • Floating-point attacks rely not only on flaws in the noise generation primitive, but also on the addition step. In particular, precision-based attacks succeed even when the noise generation primitive is perfect; for example, diffprivlib, which relies on the same paper, is still vulnerable to such attacks (see the sketch after this list).
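
To make the second point concrete, here is a minimal, self-contained Python sketch (not Opacus code; the two candidate values are deliberately far apart to exaggerate the effect) of how the addition step alone can leak information even when the noise sample itself is ideal. The spacing of representable floats around the true value is data-dependent, so the rounded sum carries a fingerprint of the data that the noise does not hide; published precision-based attacks exploit the same rounding behaviour with genuinely neighbouring inputs and a more careful analysis.

```python
import math
import random

def noisy_release(true_value, sigma=1.0):
    """Add Gaussian noise the 'textbook' way: real-valued noise, float64 addition."""
    return true_value + random.gauss(0.0, sigma)

# Two candidate secrets the attacker wants to tell apart.
SECRET_A = 0.0      # outputs land on a very fine float grid near 0
SECRET_B = 2.0**45  # floats near 2**45 are spaced at least 2**-8 apart

def attacker_guess(output):
    # If the output is an exact multiple of 2**-8, the true value was almost
    # surely the large secret: a Gaussian sample near 0 essentially never
    # lands exactly on such a coarse grid.
    return "B" if math.fmod(output, 2.0**-8) == 0.0 else "A"

# The guess is essentially always correct, so the rounding of the sum leaks
# information that the idealised (real-valued) analysis says is hidden.
for secret, label in [(SECRET_A, "A"), (SECRET_B, "B")]:
    trials = 10_000
    hits = sum(attacker_guess(noisy_release(secret)) == label for _ in range(trials))
    print(label, hits / trials)
```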

Solutions to this vulnerability include approaches based on discretization (like what GoogleDP does) or on interval refining (like what Tumult Analytics does).
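
For readers unfamiliar with the discretization idea, here is a hedged sketch (this is not GoogleDP's actual implementation, which additionally uses a cryptographically secure RNG and exact sampling): round the value onto a fixed power-of-two grid and add integer-valued noise on that grid, so that the noisy sum itself involves no data-dependent rounding.

```python
import math
import random

def discrete_laplace(t):
    """Sample an integer k with probability proportional to exp(-|k| / t),
    as the difference of two i.i.d. geometric variables. A real library
    would use a secure RNG instead of random.random()."""
    q = math.exp(-1.0 / t)

    def geometric():
        # Failures before the first success, success probability 1 - q.
        u = 1.0 - random.random()  # u in (0, 1], avoids log(0)
        return int(math.log(u) // math.log(q))

    return geometric() - geometric()

def discretized_laplace_mechanism(value, sensitivity, epsilon, granularity=2.0**-10):
    """Sketch of noise addition on a fixed grid of width `granularity`.

    The value is rounded to the grid first, and the extra sensitivity that
    the rounding can introduce is accounted for (conservatively) below;
    the noisy integer sum itself is exact.
    """
    scaled_value = round(value / granularity)
    scaled_sensitivity = math.ceil(sensitivity / granularity) + 1
    noise = discrete_laplace(scaled_sensitivity / epsilon)
    # Multiplying by a power-of-two granularity is exact post-processing.
    return (scaled_value + noise) * granularity
```

The key property is that neighbouring inputs produce the same set of attainable outputs (multiples of the granularity), so there are no distinguishing "holes" for an attacker to look for.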

Additional context

As an aside, I do not understand why it makes sense to even have a secure_mode parameter, especially if the default is False. In which context does it make sense to use a DP library but not want the output to be actually DP?

Thanks a lot to @TedTed for raising this to us. We totally agree that the current mitigations in Opacus are just an initial step towards defending against the various types of attacks on differential privacy. Note that floating-point attacks are just one such type; there are many others (e.g., timing attacks). We will try to make Opacus more robust when we have bandwidth.

@TedTed's second point is very interesting, and here is my POV: DP is a captivating theoretical construct, but its practical application necessitates certain concessions. For instance, the privacy amplification for DP-SGD only holds under Poisson subsampling. Despite this, many practical applications employ a different subsampling technique, so the computed epsilon is only an estimate of the theoretical one, though it still serves as a reasonable approximation.
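
To make the subsampling distinction concrete, here is a schematic sketch in plain Python (not the Opacus data-loading code): Poisson subsampling, which the amplification analysis assumes, versus the shuffled fixed-size batches that most training loops actually use.

```python
import random

def poisson_batches(num_examples, sample_rate, num_steps):
    """Scheme assumed by the DP-SGD amplification analysis: every example is
    included independently with probability `sample_rate`, so the batch size
    itself is random (and can even be zero)."""
    for _ in range(num_steps):
        yield [i for i in range(num_examples) if random.random() < sample_rate]

def shuffled_fixed_batches(num_examples, batch_size):
    """What most training loops do instead: shuffle once per epoch and slice
    fixed-size batches. The amplification theorem does not directly cover
    this scheme, which is why the reported epsilon is an approximation."""
    order = list(range(num_examples))
    random.shuffle(order)
    for start in range(0, num_examples, batch_size):
        yield order[start:start + batch_size]
```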

Moreover, Opacus primarily serves as a platform for rapid prototyping and experimentation with novel ideas and algorithms. While it is feasible to integrate all essential mitigations, doing so could lead to other trade-offs, such as diminished running speed, increased memory usage, or strain on developers' bandwidth, which would slow down the development of other important features.

Thanks for the comment. Two notes.

  1. Floating-point vulnerabilities are fundamentally different from timing attacks in that timing attacks only have a practical impact when the attacker can measure computation time. This is only the case for interactive use cases (like someone sending DP queries to data they don't have access to), and I'm not sure this ever makes sense for machine learning use cases. Floating-point vulnerabilities, on the other hand, can lead to real-world impact even if the attacker has no control over the training process.

  2. I'm surprised to hear that Opacus is primarily meant for prototyping and experimentation, and that the (ε, δ) guarantees can be approximations and not upper bounds. None of this seems to be prominently indicated in the documentation, which even suggests otherwise in multiple places. The FAQ says "Importantly, (epsilon, delta)-DP is a conservative upper bound on the actual privacy loss." The introduction blog post mentions "Safety" as a core feature of Opacus. The version number, 1.0, suggests this is mature software that can run in production. As a result of all of this, Opacus ends up being used by other software libraries (like SmartNoise-Synth) and repackaged by vendors of synthetic data generation or private machine learning training software. What are your plans to make sure downstream users of your software are aware of its safety limitations?

Thanks for your comments! I am not persuaded that the floating-point attack is the top thing we should worry about right now, nor that there is a simple mitigation that would incur only minimal memory and QPS degradation. However, I do agree that better documentation is essential and would be helpful to the community. I will write a short post and add inline comments in the next version. Thanks again for raising this issue to us.

I see this is now closed, but I'm not seeing any new post discussing this, the FAQ hasn't changed, and the documentation hasn't changed either. Was this closed by mistake?