why in C?
Closed this issue ยท 25 comments
is it for the constant time
? for performance? wouldn't a pure OCaml implementation (see below) work in the same fashion?
let eqaf a b =
let len = min (String.length a) (String.length b) in
let res = ref 0 in
let get_int idx s = int_of_char (String.get s idx) in
for i = 0 to pred len do
res := !res land ((get_int i a) lxor (get_int i b))
done ;
!res = 0
I'm waiting a long benchmark to check the constant time of the C implementation. However, from my tests, this ML function does not check the equality with a constant time - I'm not sure if it's about the GC noise or not but I will look the assembly code generated to understand why thea ML version of equal
does not have a constant time execution.
I clearly want a OCaml version (for portability) but if I can not check the constant time execution on it, I prefer to use a C function. I will update this issue with my results ๐ !
########## Random ##########
---------- eqaf ----------
min: 295.563355.
max: 451.895365.
mean: 331.880177.
median: 324.179808.
deviation: 5.453890.
deviation: 18.100379%.
---------- eqst ----------
min: 25.332532.
max: 402.526954.
mean: 32.466916.
median: 27.256772.
deviation: 6.122530.
deviation: 1.987797%.
---------- eqml ----------
min: 4115.224663.
max: 7396.156073.
mean: 4862.692974.
median: 4599.915675.
deviation: 27.496528.
deviation: 1337.071723%.
########## Equal ##########
---------- eqaf ----------
min: 311.025058.
max: 612.734348.
mean: 384.195579.
median: 337.415476.
deviation: 8.980862.
deviation: 34.504076%.
---------- eqst ----------
min: 387.414471.
max: 733.811892.
mean: 432.752349.
median: 415.002861.
deviation: 7.110980.
deviation: 30.772933%.
---------- eqml ----------
min: 4082.366589.
max: 9207.859649.
mean: 5259.984148.
median: 4733.766429.
deviation: 35.377266.
deviation: 1860.838571%.
########## Total ##########
---------- eqaf ----------
min: -235.383571.
max: 83.966068.
mean: -52.315402.
median: -16.787965.
deviation: 8.791549.
deviation: -4.599334%.
---------- eqst ----------
min: -706.617198.
max: -31.485017.
mean: -400.285433.
median: -386.676459.
deviation: 7.962065.
deviation: -31.870984%.
---------- eqml ----------
min: -4738.452043.
max: 3084.672993.
mean: -397.291174.
median: -108.030682.
deviation: 40.194553.
deviation: -159.689412%.
Some explanations needed.
eqaf
is the current implementation.eqst
is the OCaml's standard library implementation.eqml
is the OCaml implementation (see the snippet of @hannesm).
We run 2 kinds of test:
random
: 2 lists of random strings. One random element of these list is exactly the same to see behavior specially witheqst
(in this case, and only in this case, time needed byeqst
should be the biggest than all others entries)equal
: 2 lists and all entries are equal each other.
From these entries, expected outputs are:
eqaf
forrandom
entries andequal
entries should be the sameeqst
forrandom
should be the faster but one entry (the only one equal element) should take a long timeeqst
forequal
should take the same timeeqml
should have the same behavior ofeqaf
forrandom
andequal
Then, we do a subtraction between equal
entries and random
entries and outputs expected are:
- diff of
eqaf
should have a very low standard deviation first but the most important is median of entries should close to0
- if we consider than we have a constant-time execution, time needed to executeeqaf
on random inputs with time needed to executeeqaf
on equal inputs should be equal. - diff of
eqst
should have a middle standard deviation and median of entries should be far from0
- indeed time needed onequal
inputs should be the same but should be far from time needed onrandom
input.
Behavior on eqml
is not clear ...
@dinosaure So how do the runtime stats here compare to your expectations?
PS @hannesm your implementation fails to detect this inequality: ๐ฏโโ๏ธ ;) (you should fail and return false
early if lengths differ)
utop # eqaf "ab" "abCDEF";;
- : bool = true
EDIT: For those reading this thread and lacking context, it sprung out of this issue: mirage/digestif#34
IMHO, eqaf
has expected result. time needed to compare equal values or not is the same (a diff between series of times when we compare equal values and series of times when we compare different values is close to 0
- see Total/eqaf/mean = ~50
).
eqst
(so caml_string_equal
) has expected result too. It is the faster implementation when it compares different values but took the same time than eqaf
when values are equal (Equal/eqaf/mean = Equal/eqst/mean = ~400
). A diff between them shows eqst
is not a constant time equal function Total/eqst/mean = ~400
.
Finally, the OCaml implementation eqml
is ... incomprehensible.
And again, I don't like to put another C stub on projects for this trivial function but benchmarks don't help me to avoid this way. If someone can show me an OCaml implementation which is close to what eqaf
has, I will happy to use it.
Good news, Travis CI is happy and standard derivation on eqaf
has an expected result (between -10% and 10% - eqst
exceeds this internal).
Another good new is about eqml
which, only on Travis CI, has expected result (standard derivation is close to 0
). So I'm able to externalize on the API eqml
but I will keep eqaf
and let the user to choose between them.
Finally, I will delete the support on 4.03.0
because Core
...
I think this makes sense. I am still kind of confused about the deviance that the OCaml implementation displays, but hopefully we can work that out. In the meantime, I think it makes sense to go ahead with this C implementation.
I would be curious to see a comparison with these two other implementations:
let eqaf_native a b =
let len = min (String.length a) (String.length b) in
let res = ref 0n in
let get_int idx s = Nativeint.of_int (int_of_char (String.unsafe_get s idx)) in
for i = 0 to pred len do
res := Nativeint.logor !res (Nativeint.logxor (get_int i a) (get_int i b))
done ;
(String.length a = String.length b) && !res = 0n
external unsafe_get64 : string -> int -> int64 = "%caml_string_get64u"
let eqaf_int64 a b =
let len = min (String.length a) (String.length b) in
let res = ref 0L in
let get_int64 idx s = unsafe_get64 s idx in
for i = 0 to pred (len / 8) do
res := Int64.logor !res (Int64.logxor (get_int64 (8*i) a) (get_int64 (8*i) b))
done;
let get_int idx s = Int64.of_int (int_of_char (String.unsafe_get s idx)) in
for i = len - len mod 8 to pred len do
res := Int64.logor !res (Int64.logxor (get_int i a) (get_int i b))
done;
(String.length a = String.length b) && !res = 0L
Beware that if the Travis CI tests run in a VM, you might see different runtimes/behavior in a non virtualized computer.
@UnixJunkie indeed, outputs on this issue come from my server (and can be replicated easily on your computer). And, as I said, problem on Mac OSX about eqml
should be about noise of virtualization.
@gasche I will prepare some benchmarks about that ๐
Of course my first eqaf_int64
version was wrong (didn't handle the modulo-8 leftover at the end of the inputs etc.), I edited it. I'm still not fully sure that it is correct, but it looks better now.
@gasche eqaf_native
and eqaf_int64
can leak the length of the unknown string (which may or may not be an issue depending on the code), you should do the length comparison before running through the for loop.
EDIT: The point being that then you can exit early if the length differ.
Cannot the attacker also guess the length of the unknown string by calling the exit-early version many times with strings of each length, and see which exit early? I tried to avoid that, but of course you are right that running min s1 s2
iterations of the loop also leaks information. I guess that a safe formulation should be explicit into which argument is the secret and which is attacker-provided, and always run as many steps as the attacker-provided length.
(Apologies if I say something silly, I know nothing about timing attacks.)
@gasche In theory, yes, but that depends on the application. If the attacker passes a long string they could (if they have a baseline for your string comparisons) provide one very long string and get the length of the compared-against string with only one invocation of eq
. If it exits early, they at least need multiple chances to compare against the same sensitive value.
On the other hand this makes it easier to determine the length (since the diff between a 128 vs 127 string would be much easier to distinguish if you exit early than if the attacker has to time the diff between 127/128 invocations), so you have a point.
I would still prefer exiting early, also because the performance of the neq
case would be impacted less, leading to less overhead compared to vanilla OCaml String.equal
in that case.
So, some news about previous benchmarks. I rewrote the benchmark to avoid dependency with the world^Wcore_bench
and I have good results now about eqml
. You can see the new implementation on #4 which follows mostly what core_bench
do but the biggest change is about time where we use now a monotonic timer.
Then, I integrated snippets of @gasche and results on my computer (X1 Carbon, virtualization) is:
########## Random ##########
---------- eqaf ----------
min: 0.001152.
max: 0.002241.
mean: 0.001906.
median: 0.001954.
deviation: 0.015425.
deviation: 0.000000%.
---------- eqst ----------
min: 0.002506.
max: 0.027956.
mean: 0.025222.
median: 0.025612.
deviation: 0.058866.
deviation: 0.000015%.
---------- eqml ----------
min: 0.000080.
max: 0.000162.
mean: 0.000133.
median: 0.000136.
deviation: 0.004615.
deviation: 0.000000%.
---------- eqnt ----------
min: 0.000057.
max: 0.000176.
mean: 0.000109.
median: 0.000097.
deviation: 0.005999.
deviation: 0.000000%.
---------- eqbg ----------
min: 0.000736.
max: 0.001512.
mean: 0.001254.
median: 0.001305.
deviation: 0.014299.
deviation: 0.000000%.
########## Equal ##########
---------- eqaf ----------
min: 0.001005.
max: 0.002599.
mean: 0.001691.
median: 0.001458.
deviation: 0.022845.
deviation: 0.000000%.
---------- eqst ----------
min: 0.000904.
max: 0.002425.
mean: 0.001603.
median: 0.001576.
deviation: 0.020863.
deviation: 0.000000%.
---------- eqml ----------
min: 0.000069.
max: 0.000161.
mean: 0.000129.
median: 0.000138.
deviation: 0.005324.
deviation: 0.000000%.
---------- eqnt ----------
min: 0.000096.
max: 0.000167.
mean: 0.000135.
median: 0.000138.
deviation: 0.004099.
deviation: 0.000000%.
---------- eqbg ----------
min: 0.000020.
max: 0.001221.
mean: 0.000913.
median: 0.000998.
deviation: 0.016835.
deviation: 0.000000%.
########## Total ##########
---------- eqaf ----------
min: -0.000838.
max: 0.001058.
mean: 0.000215.
median: 0.000343.
deviation: 0.023630.
deviation: 0.000000%.
---------- eqst ----------
min: 0.001392.
max: 0.026118.
mean: 0.023619.
median: 0.024057.
deviation: 0.058371.
deviation: 0.000014%.
---------- eqml ----------
min: -0.000051.
max: 0.000075.
mean: 0.000004.
median: 0.000003.
deviation: 0.005738.
deviation: 0.000000%.
---------- eqnt ----------
min: -0.000092.
max: 0.000067.
mean: -0.000026.
median: -0.000023.
deviation: 0.006447.
deviation: -0.000000%.
---------- eqbg ----------
min: -0.000410.
max: 0.001049.
mean: 0.000341.
median: 0.000333.
deviation: 0.018763.
deviation: 0.000000%.
Your results are a bit hard to interpret for me. Correct me if I'm wrong:
-
the
deviation
is lower foreqml
andeqnt
than foreqaf
on Random (and all are sensibly lower thaneqst
of course), so the OCaml implementations are even more constant-time than the C version, excepteqbg
which is about the same? -
assuming we can use
mean
to draw performance conclusion, the numbers suggest that the OCaml-based solutions are sensibly faster (when called from an OCaml program) than the C version, doesn't that suggest just getting rid of the C code? -
in the Equal case there is essentially no performance difference between
eqml
(the naive version) andeqnt
(using native machine integers), so this suggests using the simpler/clearereqml
version from @hannesm. -
The OCaml constant-time functions also seem much faster than
eqst
(usingString.compare
) on equal strings, which is somewhat strange. Does this suggest thatString.equal
should be reimplemented as OCaml code rather than a C primitive? -
I don't understand why the
mean
value is different between Random and Equal for the same function, aren't the inputs of the same length in both cases, so shouldn't we get the same mean values? Are you testing enough rounds? -
The mean of
eqbg
is ten times smaller in the Equal case than in the Random case, isn't that wrong?
deviation(eqml) = 0,004615 < deviation(eqnt) = 0,005999 < deviation(eqbg) = 0,014299 < deviation(eqaf) = 0,015425 < deviation(eqst) = 0,058866
From these results, as you said, yes, the OCaml implementation (eqml
) is more stable than the C implementation (eqaf
). Then, we get expected results about eqst
, this implementation about random inputs is not stable.
-
It's not a surprise than the OCaml implementation (
eqml
) is faster than the C implementation (eqaf
). I think, we can get this kind of results (instead previous version of the benchmark) only because I decided to use a monotonic clock to benchmark them (which is less subject to noise). And from these results, I completely agree to use the OCaml implementation now. -
Again, I completely agree with this. The code of
eqml
is more readable thaneqnt
. -
Indeed
eqml
is more faster thaneqst
when we compare 2 samestring
(but physically different). I need to figure out why and see the ASM code produced of both, I think. -
mean(eqml) = 0,000133
on Random inputs is sensibly equal tomean(eqml) = 0,000129
on Equal inputs. It's the same foreqnt
(mean(eqnt) = 0,000109
andmean(eqnt) = 0,000135
). But, you are right, meaneqaf
andeqbg
differ between Random and Equal inputs. Mean ofeqst
differs too but it's an expected result. -
For your last point, I need to figure out why we can have this situation.
Currently, size of inputs is 4096
and we produce 50 string
. For each string
we run the equal function until 1 second. Then, we see how many runs we did. I think, I will run a bigger benchmark.
Yes, it's not surprising that the OCaml version is faster because it can avoid the (small but noticeable) FFI costs at the boundary of the call, saving the registers to bridge the different calling conventions etc. This also explains why String.equal (which I think is C code) is slower. This is a per-C-call cost so the difference should be less noticeable on large strings. What is the size of the strings you use to benchmark?
I will provide a test on Windows and then make a new release of it with the OCaml implementation and all will happy ๐ !
I guess with the FFI version it also becomes harder to filter out jitter from the OCaml runtime (since we are looking to be constant time
in the sense that execution time does not depend on the data [apart from the length]; not in the sense that each execution should take a constant amount of time to execute).
We can extrapolate that if something does take a constant time to execute for all inputs of a given length, the execution time is also constant-time (independent of the data), but we cannot go the other way.
So the FFI version, while it might be correct, is harder to measure correctly.
thanks to all who contributed, since this repository now uses a purely OCaml-based equal I'll close this issue.