modpow implementation is not constant-time
phayes opened this issue · 59 comments
Hi there,
I'm the author of sidefuzz (https://github.com/phayes/sidefuzz) and I have found what appears to be variable-time behavior in the rsa::internals::encrypt()
function. Specifically, rsa::internals::encrypt()
appears to be variable-time in relation to the message. Note that I haven't worked this up into an actual exploit, but merely demonstrated that this function isn't constant-time in relation to the message inputed.
Specifically, the message 20d90c8af42aac9b1ee53dc9a0187201
takes 549894 instructions to encrypt, while the message 5a28cec68d47f6fe3b1df54c9f320f6d
takes 552427 instruction to encrypt. This is a difference of 2533 instructions, or about 0.5%. So it's a slight difference, but probably exploitable with sufficient sampling.
I have crated a fuzzing targets here: https://github.com/phayes/sidefuzz-targets
You can confirm this difference with the sidefuzz tool like so:
sidefuzz check ./target/wasm32-unknown-unknown/release/rsa_encrypt_message.wasm 5a28cec68d47f6fe3b1df54c9f320f6d 20d90c8af42aac9b1ee53dc9a0187201
samples: 20000, t-value: 219771.0790572351, confidence: 100%
Found timing difference of 2533 instructions between these two inputs with 100% confidence:
input 1: 5a28cec68d47f6fe3b1df54c9f320f6d (552427 instructions)
input 2: 20d90c8af42aac9b1ee53dc9a0187201 (549894 instructions)
My first suspicion was that this was due to num_bigint_dig::BigUint::from_bytes_be()
being variable-time, but fuzzing that function specifically results in what appears to be constant-time behavior. So I'm not actually sure where the problem is.
IIRC RSA blinding (which I believe is used in this implementation) uses random number, so while execution time is still variable, it does not correlate with a protected secret.
Hi @newpavlov ,
I don't think that's the case here. The fuzzing target is directly testing this function:
/// Raw RSA encryption of m with the public key. No padding is performed.
#[inline]
pub fn encrypt<K: PublicKey>(key: &K, m: &BigUint) -> BigUint {
m.modpow(key.e(), key.n())
So the problem is going to be in the implementation of modpow
.
I've also got another fuzzing target that does full pkcs1v15 padding with a statically seeded CPRNG. It displays the same variable-time behavior (with a statically seeded deterministic PRNG mind-you), but the first fuzzing target was the more minimal case, so that's what I reported. (sidefuzz also accounts for RNGs by repeated sampling and taking a t-value, but this doesn't apply here)
Admittedly, this could also be an artifact of the fuzzer (which would be a bug in the fuzzer), but I don't think that's the case here either.
- https://github.com/rust-num/num-bigint/blob/7562ab24330792817e42b808f60b0cac51ca261a/src/monty.rs#L69-L73
- https://github.com/rust-num/num-bigint/blob/7562ab24330792817e42b808f60b0cac51ca261a/src/monty.rs#L121
- https://github.com/rust-num/num-bigint/blob/7562ab24330792817e42b808f60b0cac51ca261a/src/monty.rs#L207-L219
may be causing this.
crypto-bigint
now has a constant-time modpow implementation in the form of DynResidue::pow
, however it uses Montgomery form internally so it may not be the fastest option for straight modpow since converting in/out of Montgomery form is costly, and it still monomorphizes around a fixed number of limbs.
For the purposes of rsa
(and dsa
) we probably need to add a new Uint
type like UintVec
which uses a number of limbs chosen at runtime as a proper num_bigint_dig::BigUint
replacement.
Edit: work on crypto_bigint::BoxedUint
has started.
Until this is addressed it seems like something we should more prominently highlight in security-related documentation.
Edit: opened #373 (and merged!)
Hi! 👋
I've recently been working on RSA side-channel attacks, and found that many implementations are vulnerable. That's described in the Marvin Attack.
Thanks to help by @ueno, who contributed the test harness, I was able to run the test against rust-crypto (exact versions of all packages are in the PR).
Unfortunately I have to report that the side-channel leakage from the numerical library is very substantial and easily detectable over the network. As such, I consider RustCrypto RSA to be vulnerable and exploitable.
Test results from a run with 100k repeats per probe on an i9-12900KS @ 5.225GHz:
Sign test mean p-value: 0.2189, median p-value: 0.06354, min p-value: 3.946e-60
Friedman test (chisquare approximation) for all samples
p-value: 8.827139162990364e-108
Worst pair: 2(no_padding_48), 4(signature_padding_8)
Mean of differences: -8.07298e-07s, 95% CI: -9.06918e-07s, -7.019407e-07s (±1.025e-07s)
Median of differences: -8.44500e-07s, 95% CI: -9.78000e-07s, -7.200000e-07s (±1.290e-07s)
Trimmed mean (5%) of differences: -7.27307e-07s, 95% CI: -8.13648e-07s, -6.316737e-07s (±9.099e-08s)
Trimmed mean (25%) of differences: -8.48652e-07s, 95% CI: -9.17708e-07s, -7.769631e-07s (±7.037e-08s)
Trimmed mean (45%) of differences: -9.56301e-07s, 95% CI: -1.05768e-06s, -8.603802e-07s (±9.865e-08s)
Trimean of differences: -4.72000e-07s, 95% CI: -5.76750e-07s, -3.635625e-07s (±1.066e-07s)
pairwise results are in report.txt
confidence intervals for the measurements are as follows:
legend:
ID,Name
0,header_only
1,no_header_with_payload_48
2,no_padding_48
3,no_structure
4,signature_padding_8
5,valid_0
6,valid_48
7,valid_192
8,valid_246
9,valid_repeated_byte_payload_246_1
10,valid_repeated_byte_payload_246_255
11,zero_byte_in_padding_48_4
explanations of that are in the step2.py script in marvin-toolkit repo.
In other words, the protection from adding blinding is not sufficient, and Rust Crypto has at least the same issue as the CVE-2022-4304 in OpenSSL.
@tomato42 indeed that's using PKCS#1v1.5 without random blinding, so with modpow being non-constant-time, it's to be expected. We should definitely get rid of any APIs which permit private key use without random blinding.
Also just a note for the future, per our SECURITY.md we would've appreciated an advisory for this opened under a private security disclosure.
@tarcieri If the de-blinding isn't performed using constant-time code, then use of blinding won't remove the side-channel signal, that's what the bug in OpenSSL was about: removal of the blinder and conversion to the constant-length bytes needs to be side-channel free too.
Also just a note for the future, per our SECURITY.md we would've appreciated an advisory for this opened under a private security disclosure.
ah, sorry about that; since this was linked as a security issue, I've assumed that the effect of it on security of RSA decryption was assumed public too.
Our answer until now has been that the application of random blinding prevented such sidechannels, so the fact that isn't the case is news to us
sorry, I'm confused, on one hand you say that the the privkey.decrypt(Pkcs1v15Encrypt, &buffer);
won't use blinding, yet that's the public API that you say is protected by use of blinding in README.md
...?
The README.md is wrong in that case.
(Also I'm just discovering this and haven't had time to look over any code yet to confirm specific details)
ok, I'm still running test with a larger sample, to see if there aren't additional sources of leakage. If you have any additional questions, feel free to ask
I'd be curious if you saw similar sidechannels in non-PKCS#1v1.5 constructions like OAEP decryption or PSS signatures
if you could propose a change on top of that PR that adds an OAEP decryption, I can run tests for that too (there are scripts in marvin-toolkit for testing OAEP too)
but since the leak is clearly from the numerical library, that means OAEP is also vulnerable, as that leak happens before any OAEP checks
even if PSS leaks like that, it doesn't make the implementation as easily exploitable, as both PKCS#1v1.5 an OAEP attacks are chosen ciphertext attacks, with PSS (or with PKCS#1 v1.5 signatures) the attacker can only reasonably affect the hash being signed, not the whole plaintext
(in other words, it's a fundamentally different attack, so the test for it needs to be different too)
Can you confirm you see the sidechannel against the raw blinded modpow operation (when used with e.g. rand_core::OsRng
)? That's really what I'm curious about.
Sorry, I'm not a rust developer, you will need to hand-hold me (provide the complete code to execute) if you want me to run any tests
The branch containing that code has disappeared: https://github.com/ueno/marvin-toolkit/tree/wip/rust-crypto
You can test the low-level "hazmat" API: https://docs.rs/rsa/latest/rsa/hazmat/fn.rsa_decrypt.html
It would look something like:
...replaced with:
loop {
let mut buffer = vec![0; args.size];
let n = infile.read(&mut buffer)?;
if n == 0 {
break;
}
assert!(n == buffer.len());
let now = Instant::now();
let c = rsa::BigUint::from_bytes_be(&buffer);
let _ = rsa::hazmat::rsa_decrypt(Some(&mut rand_core::OsRng), &privkey, &c);
writeln!(outfile, "{}", now.elapsed().as_nanos())?;
}
Note you'll need to add rand_core
as a dependency and enable the hazmat
feature of rsa
:
[dependencies]
anyhow = "1"
clap = { version = "4", features = ["derive"] }
rand_core = { version = "0.6", features = ["getrandom"] }
rsa = { version = "0.9", features = ["hazmat"] }
(will test rsa_decrypt
from hazmat later)
With 1 million observations per sample (same setup) I'm getting:
Sign test mean p-value: 0.09834, median p-value: 2.633e-08, min p-value: 0.0
Friedman test (chisquare approximation) for all samples
p-value: 0.0
Worst pair: 2(no_padding_48), 4(signature_padding_8)
Mean of differences: -2.06196e-06s, 95% CI: -2.09851e-06s, -2.022175e-06s (±3.817e-08s)
Median of differences: -3.22400e-06s, 95% CI: -3.24600e-06s, -3.205000e-06s (±2.050e-08s)
Trimmed mean (5%) of differences: -1.99461e-06s, 95% CI: -2.02364e-06s, -1.965729e-06s (±2.895e-08s)
Trimmed mean (25%) of differences: -2.08608e-06s, 95% CI: -2.10654e-06s, -2.067761e-06s (±1.939e-08s)
Trimmed mean (45%) of differences: -3.09282e-06s, 95% CI: -3.11547e-06s, -3.070944e-06s (±2.227e-08s)
Trimean of differences: -2.22769e-06s, 95% CI: -2.25800e-06s, -2.198494e-06s (±2.975e-08s
and pairwise results:
report.txt
Which show that the errors in padding checks also leak (compare valid_48
to valid_246
: sign test p-value of 5.13e-7)
ran the tests against this:
let c = rsa::BigUint::from_bytes_be(&buffer);
let _ = rsa::hazmat::rsa_decrypt(Some(&mut rand_core::OsRng), &privkey, &c);
and have confirmed that it's leaking too.
Same overall configuration but with 2.5 million observations per probe:
Sign test mean p-value: 0.1915, median p-value: 0.008017, min p-value: 1.428e-15
Friedman test (chisquare approximation) for all samples
p-value: 6.50448690941627e-27
Worst pair: 0(no_padding_48), 3(too_short_payload_0_1)
Mean of differences: -1.11105e-07s, 95% CI: -1.42260e-07s, -7.775684e-08s (±3.225e-08s)
Median of differences: -9.50000e-08s, 95% CI: -1.19000e-07s, -7.300000e-08s (±2.300e-08s)
Trimmed mean (5%) of differences: -9.89875e-08s, 95% CI: -1.25208e-07s, -7.384255e-08s (±2.568e-08s)
Trimmed mean (25%) of differences: -9.23689e-08s, 95% CI: -1.18946e-07s, -6.803432e-08s (±2.546e-08s)
Trimmed mean (45%) of differences: -9.30858e-08s, 95% CI: -1.17708e-07s, -7.039496e-08s (±2.366e-08s)
Trimean of differences: -9.47500e-08s, 95% CI: -1.20250e-07s, -7.056250e-08s (±2.484e-08s)
Since for raw RSA padding doesn't matter I've executed a slightly different set of tests:
ID,Name
0,no_padding_48
1,no_structure
2,signature_padding_8
3,too_short_payload_0_1
4,too_short_payload_0_3
5,too_short_payload_0_7
6,too_short_payload_0_15
7,valid_0
8,valid_repeated_byte_payload_245_0
and pairwise statistical test results:
report.txt
I'm not sure what solution we can provide here short of a fully constant time implementation (which was the planned long-term solution for this problem, using e.g. a Barrett reduction as provided by crypto-bigint
, but that will require a substantial amount of work)
I guess I need to figure out how to run your reproducer.
Can you explain a bit more what those charts are showing? Are you actually able to recover keys?
I'm not sure what solution we can provide here short of a fully constant time implementation
what's necessary, is ensuring that the unblinding happens in constant time with respect to the output; that means you need to convert the whatever arbitrary precision format is in use for the modulo exponentiation into constant length integers, then multiply them using real constant-time code, and finally do constant time reduction modulo. Something like what's in https://github.com/tomato42/ctmpi but in rust, not in C. Leakage in conversion from one format to the other is not a problem as that value is uncorrelated with the RSA plaintext, so leakage of it doesn't provide useful information to the attacker (as long as the blinding/unblinding factors remain secret).
That being said, if your mod_exp is leaky, you really have to employ both base blinding and exponent blinding. Details of that, as well as links to papers showing attacks against implementations that used just one kind of blinding are in the https://datatracker.ietf.org/doc/html/draft-kario-rsa-guidance-02
But to fix raw RSA implementation against Bleichenbacher/Marvin you just need to fix the last multiplication and reduction modulo.
Can you explain a bit more what those charts are showing?
they show the estimation of the difference in timing for different inputs. Like in the last one, no_padding_48
(probe 0) is processed about 83 ns slower than no_structure
(probe 1): that's why it's marked 1-0 (or "probe 1 time subtract probe 0 time")
Are you actually able to recover keys?
those graphs show that it's possible, using time of the decryption, to execute one instance of the Bleichenbacher oracle. To decrypt a RSA ciphertext (or forge a signature) the attacker would need to run the test good few thousand times. See the details on the Marvin vulnerability page and in the associated papers. Since my goal is proving that there is no side-channel, not actually attacking implementations, I don't have good tooling to actually perform the attack (but then there's 25 years of literature on the topic, so once you have a working oracle, there are ready solutions for the rest).
Another run for rsa_decrypt
, this time with 10.5 million observations per sample:
Sign test mean p-value: 0.1359, median p-value: 2.134e-07, min p-value: 1.433e-60
Friedman test (chisquare approximation) for all samples
p-value: 9.61782206869971e-128
Worst pair: 0(no_padding_48), 8(valid_repeated_byte_payload_245_0)
Mean of differences: -8.28395e-08s, 95% CI: -9.87044e-08s, -6.762672e-08s (±1.554e-08s)
Median of differences: -9.10000e-08s, 95% CI: -1.02000e-07s, -8.000000e-08s (±1.100e-08s)
Trimmed mean (5%) of differences: -8.62429e-08s, 95% CI: -9.87246e-08s, -7.492710e-08s (±1.190e-08s)
Trimmed mean (25%) of differences: -8.64124e-08s, 95% CI: -9.79358e-08s, -7.520178e-08s (±1.137e-08s)
Trimmed mean (45%) of differences: -9.07003e-08s, 95% CI: -1.00972e-07s, -7.931308e-08s (±1.083e-08s)
Trimean of differences: -9.07500e-08s, 95% CI: -1.02000e-07s, -8.025000e-08s (±1.087e-08s)
and pairwise statistical results: report.txt
Which suggests that for the leakage to happen, the whole most significant word needs to be equal 0. This will make attacks against OAEP with 2048 bit keys rather impractical against 64 bit architectures. But it does mean that both 32 bit architectures, and less common key sizes, like 2049 or 2056 bit keys, are rather realistic to attack.
Note: I haven't excluded the possibility of leakage happening with probe 3 or probe 4 (16 and 32 most significant bits being zero respectively), but it does look exactly the same as the leakage pattern for CVE-2022-4304, where I did do that.
what's necessary, is ensuring that the unblinding happens in constant time with respect to the output; that means you need to convert the whatever arbitrary precision format is in use for the modulo exponentiation into constant length integers, then multiply them using real constant-time code, and finally do constant time reduction modulo. Something like what's in https://github.com/tomato42/ctmpi but in rust, not in C.
The core modpow
operation in num-bigint-dig
should already implemented in terms of Montgomery multiplication: https://github.com/dignifiedquire/num-bigint/blob/6f73f0a4025164325e85f8645dad6712b3110c51/src/monty.rs#L129
That alone is insufficient, however, per the OP. I'd generally worry that num-bigint
is a general-purpose arbitrary precision bignum library that wasn't designed for constant-time operation.
I could potentially attempt to completely sidestep num-bigint
for this, extracting the raw limbs out of the BigUint
s and performing modpow in code which has been carefully written to avoid any secret-dependent branching, similar to what we have already in crypto-bigint
: https://github.com/RustCrypto/crypto-bigint/blob/a02c505/src/uint/modular/reduction.rs
Rather than try to hunt down where timing sidechannels are occurring in num-bigint
, I've been attempting to build out BoxedUint
in crypto-bigint
, which now has partial support for sharing the Montgomery multiplication implementation we already provide for the stack-allocated Residue
and DynResidue
types.
It's designed around a fixed (albeit dynamic) precision rather than arbitrary precision, and supports decoding BoxedUint
s with a precision specified a priori, allowing us to ensure that the number of limbs is constant and always equal to the limbs of the modulus. This should eliminate sidechannels due to fewer limbs because zeros are being stripped, which I suspect is what's happening per the earlier measurements.
If we go that route, it will be a somewhat invasive, breaking change, but also one we've been wanting to do for awhile (see #51). I'm open to alternative options which work with num-bigint-dig
, but I'm not sure we'll actually solve this problem without switching to code which has been carefully written from the ground-up to operate in constant-time.
I am fully in favor of finally switching 😅 thanks @tarcieri for pushing crypto-bigint forward!
FWIW it looks like Go used to have similar problems due to their use of big.Int
and solved the issue in a similar manner, by ripping out big.Int
and replacing it with a carefully written internal cryptographic integer library: golang/go@8a81fdf
I've opened a WIP @rustsec advisory here: rustsec/advisory-db#1825
@tarcieri I can probably spend some time helping on this, can you point me at the missing pieces/where you are currently at?
@dignifiedquire great! I've been grinding away at crypto_bigint::{BoxedUint, BoxedResidue}
and they should have most of the functionality necessary at this point: just added Montgomery Ladder-based modpow and modular inversion support. There are probably a bunch of little gaps but they'll mostly be things like adding core::ops
impls that are missing.
I can cut a prerelease of crypto-bigint
and then I think it's ready to start integrating. The trickiest thing will be ensuring constant precision throughout (i.e. that all values match the modulus in precision), although for now several operations have no widening support and will panic on a precision mismatch, so hopefully that "helps".
I was thinking a first step might be to reimplement rsa_decrypt
and then attempt to validate that the new implementation closes the sidechannel.
It's also using subtle
now which should assist in implementing constant-time operations, although it's stimeyed a bit by a Copy
bound on ConditionallySelectable
, which prevents the CtOption
constant-time combinators from working: dalek-cryptography/subtle#94
I've filed RUSTSEC-2023-0071 for this vulnerability.
@tomato42 not sure if you want to add that to https://people.redhat.com/~hkario/marvin/ or wait for a CVE assignment
Thanks for the mention, yes, I've seen it. If you want, I can add it, but I planned to wait for CVE assignment.
To consumers of the rsa crate
While the attack requires the attacker to measure precisely the processing time of the ciphertext, one of the main contributions of the Marvin Attack paper is that it's is possible to measure differences of just few nanoseconds (billionth's of a second) even over the network. The ability to do that depends only of the number of measurements the attacker is able to perform, not on the size of the side channel.
The old wisdom of "side channels smaller than 100ns are not attackable over the network" is incorrect, see the vulnerability page for details.
@tomato42 after we're able to mitigate the core rsa::hazmat::rsa_decrypt
API, it seems like we'll need to start handling padding failures via implicit rejection (producing a pseudorandom output)? My only familiarity with that approach is from Kyber
we'll need to start handling padding failures via implicit rejection (producing a pseudorandom output)? My only familiarity with that approach is from Kyber
if you consider deprecation of PKCS#1 v1.5 encryption padding to be not a realistic option, then yes, that's the best way to provide an API that's about as safe to use as one for OAEP
it seems like we'll need to start handling padding failures via implicit rejection (producing a pseudorandom output)? My only familiarity with that approach is from Kyber
This same approach is also used in most TLS implementations that support RSA encryption. You generate a fake 48 byte pre master secret, then attempt RSA decryption; if the padding is invalid, or if the padding is valid and the PMS is not 48 bytes, or if the embedded TLS version number in the PMS is incorrect, you return the pseudorandom PMS [*] and allow the Finished
checks to fail, since at that point the attacker can not distinguish if the padding was correct or not.
[*] And of course all of this must be in constant time as well.
Yes, but that (generating new random 48 byte long pre master secret to use in case of padding check errors) works in TLS only because the random PMS is later mixed with ServerHello.random before it's used to decrypt the Finished message.
only because the random PMS is later mixed with ServerHello.random
Unfortunate it true since that suggests that for other protocols, which do not perform any mixing but only attempt to use the key directly, there is no possible fix. Since the only two possible options are returning some explicit indicator (which immediately leaks the padding validity, due to the application performing a different action based on if the padding is valid or not) or returning a randomly generated string as a rejection symbol.
(TBH though, I don't at all see why this would be the case or how mixing in the server random would change the situation.)
due to the application performing a different action based on if the padding is valid or not) or returning a randomly generated string as a rejection symbol.
and that's why with implicit rejection on the PKCS#1v1.5 level, the one described in https://datatracker.ietf.org/doc/draft-kario-rsa-guidance/ and in the Marvin paper, the API call always returns a message, and for every key-ciphertext pair, it will return the same ciphertext over and over, even if in actuality the PKCS#1v1.5 padding check failed. Thus the attacker has no oracle to tell which ones are good and which ones are bad, as all "look good".
(TBH though, I don't at all see why this would be the case or how mixing in the server random would change the situation.)
because we have two cases:
- good padding, decrypts to a message, unexpected value (e.g. not matching the AES ciphertext), but the same value every time
- bad padding, decrypts to a random message on every try
so the behaviour of application in case 1 will be consistent for every try, while in case 2 it will be more random (in case of aes-cbc there's 1 in 256 chance that that padding check passes, which will provide huge difference in behaviour)
in case of TLS, those two different situations are also combined with the ServerHello.random, which is different for every try, so the subsequent actions based on that random value will also be random, independent if the decryption returns the same value every time or if it returns random value every time
how stable is the rfc draft? is it at a stage where it makes sense to start implementing?
The algorithm is already implemented and shipped in NSS and OpenSSL, so bearing something catastrophic discovered during review, I consider the algorithm final.
FWIW ring
that is used by rustls
also implements constant-time bigint arithmetic for RSA purposes: https://github.com/briansmith/ring/blob/main/src/arithmetic/
This could be worth a look for potential similar vulnerabilities and/or code sharing.
@Shnatsel contributions to marvin-toolkit welcome (see example/
directory)
For posterity, here's one source of timing variability: calls to "normalize" within Karatsuba
- https://github.com/rust-num/num-bigint/blob/1f6b80fa65e81bd5622ca44d6f4f8522c9e338fc/src/biguint/multiplication.rs#L183-L184
- https://github.com/dignifiedquire/num-bigint/blob/6f73f0a4025164325e85f8645dad6712b3110c51/src/algorithms/mac.rs#L149-L150
The comment says "the adds go faster if we drop any unneeded 0s from the end".
Similar problems exist within the implementation of Almost Montgomery Multiplication, which not only calls normalize
repeatedly but also does some data-dependent branching/operations:
Any update on this?
Hello. I would like to code an exploit for a responsible vulnerability disclosure. I am a beginner in crypto. How much noise time is acceptable in miliseconds to have a reliable exploit please?
If you want I can provide the exploit to the project.
The rsa crate/RUSTSEC-2023-0071 was mentioned on a Debian mailing list:
https://lists.debian.org/debian-rust/2024/08/msg00017.html
This bug is marked as "release critical" in Debian terms and stopping the crate from migrating to testing (and all crates that depend on it, directly or transitively):
@gogo2464 the Marvin toolkit already contains the necessary code to exploit the vulnerability if you'd like to experiment with that.
As it were, it would be quite helpful if we could find some way to run it in CI, if only on a scheduled basis.
Regarding the Debian announcement, it would be good if we could move #394 forward /cc @dignifiedquire
@tarcieri I am actively digging!!!!
I used the marvin toolkit. I have the data. But I need a tool to recover message from the identified vulnerability from the previously extracted data. Do you know a such tool please?
I don't offhand, although I believe it's part of the toolkit.
@gogo2464 it would probably help if you asked these questions on Zulip so we can keep this already quite busy issue on topic for resolving the core issue: https://rustcrypto.zulipchat.com/#narrow/stream/260047-RSA
No, the toolkit doesn't have the exploit as part of it, there are multiple reasons for it, but the main one is that it's hard to create one that's universal. In practice, Marvin toolkit is able to run a single instance of the Bleichenbacher oracle, you need to create a separate piece of code that will call this oracle.
the Marvin paper does give a high level overview, for details you'll need to look at the past papers
How much noise time is acceptable in miliseconds to have a reliable exploit please?
I'm not exactly sure what you're asking here... a reliable exploit requires you to be able to run statistical tests that give highly significant results (p-value below 1e-6 at least) for values we're interested in (i.e. ciphertexts that decrypt to plaintexts starting with specific bytes) or ones that consistently rejects values we're not interested in (i.e. ciphertexts that decrypt to plaintexts that don't start with specific bytes). That's a product of everything: size of the sample, size of the side-channel, size of the noise, nature of the noise (is it changing in time?), amount of irrelevant code measured together with the leaky code, resolution, precision and accuracy of the time source used... just as an example of the most significant parts of that.