Secretness Analysis is potentially incorrect for ops that are not `IsolatedFromAbove`
AlexanderViand-Intel opened this issue · 9 comments
I think that, as written, the analysis might only be correct for operations that are IsolatedFromAbove
(i.e., do not use SSA values from above/outside their own region), which the scf
operations aren't:
%p = ... : i32 // something public
%s = ... : i32 // something that's (potentially only "transitively") secret
%r = scf.if(%cond : i1) -> i32 { // assume a public condition
%sum = arith.addi %s, %s : i32 // secret
%scf.yield %sum : i32 // also secret
} else {
%scf.yield %p : i32 // doesn't really matter for this example
}
%ss = arith.addi %r, %r : i32 // should ALSO be secret, but won't be marked as such?
@MeronZerihun could you come up with a "real" test for this and see if my suspicion is correct? Thanks!
Weirdly enough, I couldn't reproduce this, as the below example DOES recognize that the output of the first if is secret, despite the fact that the intial secretness value is false!?😕
func.func @test_secret_in_region(%p: i1, %s: !secret.secret<i1>, %cond: i1) -> !secret.secret<i1> {
%0 = secret.generic ins(%s : !secret.secret<i1>) {
^bb0(%s_: i1):
%1 = scf.if %cond -> (i1) {
%ss = arith.addi %s_, %s_ : i1
scf.yield %ss : i1
} else {
scf.yield %s_ : i1
}
%2 = scf.if %1 -> (i1) {
%pp = arith.addi %p, %p : i1
scf.yield %pp : i1
} else {
scf.yield %p : i1
}
secret.yield %2 : i1
} -> !secret.secret<i1>
return %0 : !secret.secret<i1>
}
unexpectedly produces the correct thing:
func.func @test_secret_in_region(%arg0: i1, %arg1: !secret.secret<i1>, %arg2: i1) -> !secret.secret<i1> {
%0 = secret.generic ins(%arg1 : !secret.secret<i1>) {
^bb0(%arg3: i1):
%1 = scf.if %arg2 -> (i1) {
%4 = arith.addi %arg3, %arg3 : i1
scf.yield %4 : i1
} else {
scf.yield %arg3 : i1
}
%2 = arith.addi %arg0, %arg0 : i1
%3 = arith.select %1, %2, %arg0 : i1
secret.yield %3 : i1
} -> !secret.secret<i1>
return %0 : !secret.secret<i1>
}
However, something WEIRD is clearly going on here, as the following example causes the pass/compiler to crash:
func.func @test_secret_in_region(%p: i1, %s: !secret.secret<i1>, %cond: i1) -> !secret.secret<i1> {
%0 = secret.generic ins(%s : !secret.secret<i1>) {
^bb0(%s_: i1):
%1 = scf.if %cond -> (i1) {
%ss = arith.addi %s_, %s_ : i1
scf.yield %ss : i1
} else {
scf.yield %s_ : i1
}
%2 = scf.if %1 -> (i1) {
%pp = arith.addi %p, %p : i1
scf.yield %pp : i1
} else {
scf.yield %p : i1
}
%3 = scf.if %2 -> (i1) {
%pp = arith.muli %p, %p : i1
scf.yield %pp : i1
} else {
scf.yield %p : i1
}
secret.yield %3 : i1
} -> !secret.secret<i1>
return %0 : !secret.secret<i1>
}
The relevant section of the stack trace:
#5 0x000055d5bd619cac std::_Optional_base_impl<bool, std::_Optional_base<bool, true, true>>::_M_is_engaged() const /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/optional:433:58
#6 0x000055d5bd81e025 std::optional<bool>::has_value() const /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/optional:945:9
#7 0x000055d5bd81df75 mlir::heir::Secretness::isInitialized() const /proc/self/cwd/./lib/Analysis/SecretnessAnalysis/SecretnessAnalysis.h:32:32
#8 0x000055d5bd81df99 mlir::heir::Secretness::getSecretness() const /proc/self/cwd/./lib/Analysis/SecretnessAnalysis/SecretnessAnalysis.h:25:5
#9 0x000055d5bda521fc mlir::heir::IfToSelectConversion::matchAndRewrite(mlir::scf::IfOp, mlir::PatternRewriter&) const /proc/self/cwd/lib/Transforms/ConvertIfToSelect/ConvertIfToSelect.cpp:41:14
#10 0x000055d5bda5210b mlir::detail::OpOrInterfaceRewritePatternBase<mlir::scf::IfOp>::matchAndRewrite(mlir::Operation*, mlir::PatternRewriter&) const /proc/self/cwd/external/llvm-project/mlir/include/mlir/IR/PatternMatch.h:331:12
I think perhaps the problem is that the pattern is replacing the second if, but then the resulting (newly created) SSA value does not have a secretness value in the lattice. I seem to recall discussing this with Meron, and she had a means to set the secretness value after replacing the op, but I cannot find where that was added to the pass.
Thanks for looking into this! I guess there's technically two issues there: 1) the secretness should be set on the newly created if and 2) the pass shouldn't crash when called on something with secretness not set. @MeronZerihun and I will have a look in our meeting today :)
Regarding the original bug: In our meeting yesterday, we found out why the false negative issue doesn't show up directly for scf.if
: The sparse forward data flow analysis framework handles control flow operations (specifically, anything with the RegionBranchOp Interface, I think) itself, so the buggy region-ignoring code in the analysis itself isn't called. I'll still PR a bugfix for this later today, as we might call the analysis on other region-bearing OPs that aren't handled by the analysis framework.
Btw, here's a "working" example of the region bug this issue started out with, with a region-carrying operation that does not implement RegionBranchOpInterface
and therefore does not benefit from "magically correct" handling by the analysis framework:
func.func @secret_region(%p : i1, %s: !secret.secret<i1>) -> !secret.secret<i1> {
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%0 = secret.generic ins(%s: !secret.secret<i1>) {
^bb0(%s_: i1):
%t = tensor.generate %c1 {
^bb1(%i : index):
%sp = arith.muli %p, %s_ : i1
tensor.yield %sp : i1
} : tensor<?xi1>
%t0 = tensor.extract %t[%c0] : tensor<?xi1>
%if = scf.if %t0 -> (i1) {
%pp = arith.addi %p, %p : i1
scf.yield %pp : i1
} else {
scf.yield %p : i1
}
secret.yield %if : i1
} -> !secret.secret<i1>
return %0 : !secret.secret<i1>
}
produces
func.func @secret_region(%arg0: i1, %arg1: !secret.secret<i1>) -> !secret.secret<i1> {
%c0 = arith.constant {secretness = false} 0 : index
%c1 = arith.constant {secretness = false} 1 : index
%0 = secret.generic ins(%arg1 : !secret.secret<i1>) attrs = {secretness = true} {
^bb0(%arg2: i1):
%generated = tensor.generate %c1 {
^bb0(%arg3: index):
%2 = arith.muli %arg0, %arg2 {secretness = true} : i1
tensor.yield %2 : i1
} {secretness = false} : tensor<?xi1>
%extracted = tensor.extract %generated[%c0] {secretness = false} : tensor<?xi1>
%1 = scf.if %extracted -> (i1) {
%2 = arith.addi %arg0, %arg0 {secretness = false} : i1
scf.yield %2 : i1
} else {
scf.yield %arg0 : i1
} {secretness = false}
secret.yield %1 : i1
} -> !secret.secret<i1>
return %0 : !secret.secret<i1>
}
despite the %generated
tensor clearly needing to be "secret".
Working on creating a fix PR for both this and the crashes that came up
While exploring this further, I actually think I found a bug in the upstream data flow analysis.
For RegionBranchOpInterface operations such as scf.if
, it only seems to consider the yields of each branch for the final output of the analysis, essentially ignoring the condition. So the result of an scf.if
with two public branches but a secret condition is incorrectly marked non-secret:
func.func @region_branch_op(%arg0: i1, %arg1: !secret.secret<i1>) -> !secret.secret<i1> {
%c0 = arith.constant {secretness = false} 0 : index
%c1 = arith.constant {secretness = false} 1 : index
%0 = secret.generic ins(%arg1 : !secret.secret<i1>) attrs = {secretness = true} {
^bb0(%arg2: i1):
%1 = scf.if %arg2 -> (i1) {
%2 = arith.addi %arg0, %arg0 {secretness = false} : i1
scf.yield %2 : i1
} else {
scf.yield %arg0 : i1
} {secretness = false}
secret.yield %1 : i1
} -> !secret.secret<i1>
return %0 : !secret.secret<i1>
Because the if-to-select pass replaces the if with a new op that's marked as secret, this doesn't break this pass, but it's still a bug. I'll see if there's some way the pass can override the RegionOpBranch interface handling.... This "quick fix" is really sending me down a rabbit hole xD
Because the if-to-select pass replaces the if with a new op that's marked as secret, this doesn't break this pass, but it's still a bug. I'll see if there's some way the pass can override the RegionOpBranch interface handling.... This "quick fix" is really sending me down a rabbit hole xD
Unfortunately, this kind of behavior seems to be baked into how the analysis framework deals with control-flow operations. Since our use case is pretty odd, I'm not sure if it's worth filing an upstream issue for this...
I agree, though we could always try adding it as an opt-in callback/virtual function. Why don't you pose the question on the MLIR Discourse?