Question: what is "memLimit" ?
Closed this issue · 12 comments
I noticed this:
int memLimit = byteToInt(getBytes(i, 8));
if (memLimit < 0 || memLimit > 0x20000000) {
throw new Exception("too much memory aaah");
}
What is this? That's 536,870,912 bytes...
Does it mean the library can't handle an APKM file that's larger than that?
This is taken from https://libsodium.gitbook.io/doc/password_hashing/default_phf and defined in https://github.com/jedisct1/libsodium/blob/master/src/libsodium/include/sodium/crypto_pwhash_argon2id.h
And the memLimit is checked not to be too high, but it's only used in the key derivation bit, the actual decryption is handled by the cryptoSecretStream*
stuff
But can it be an issue?
Reading on the link, it says:
"
memlimit is the maximum amount of RAM that the function will use, in bytes. This number must be between crypto_pwhash_MEMLIMIT_MIN and crypto_pwhash_MEMLIMIT_MAX.
"
Searching the Internet, crypto_pwhash_MEMLIMIT_MAX is 4,398,046,510,080 .
Does it mean that the whole APKM file is read into memory? And why the added limit here of 536,870,912 bytes ?
What does it mean about the input file?
The pwhash functions are only applied to a 32 byte segment in the file to derive the encryption key (so that each file has a different encryption key, based around a hardcoded key).
The encryption key is then used to decrypt the rest of the file, in chunkSize
sized chunks (~1M?) so as not to read the whole file into memory.
The limit itself is actually also read straight from the file, and is present as a long
- the checks there just make sure the values aren't too ridiculous!
I don't think I understand. Can you please just answer this (yes/no, with optional explanation) :
- Can it be an issue? Reading what you wrote, my guess is that no, unless something weird is going on.
- Can an input file that's too large cause an issue (such as OOM) ? Is the entire content of the file being read into memory? My guess according to what you wrote is no, because I think you meant that it always use up to 1M of memory.
However, if it's up to 1M, why even add this check... - Is there any importance to the check you've added to it with this special number limit on it? Shouldn't it be based on the available memory, or even some parameter that is sent to the library (from whoever called it) ?
- Probably not
- No because the decryption happens later in the function. The memLimit is only used for the single call here but is necessary for libsodium
- Yes, the memLimit and opsLimit set the bounds on a
memory-hard, CPU-intensive hash function
so needs to be not too high and the special number limit is there as a quick sanity check to make sure it's not super duper high. The libsodium docs have several different values ranging from
crypto_pwhash_MEMLIMIT_INTERACTIVE
crypto_pwhash_MEMLIMIT_MAX
crypto_pwhash_MEMLIMIT_MIN
crypto_pwhash_MEMLIMIT_MODERATE
crypto_pwhash_MEMLIMIT_SENSITIVE
(see https://libsodium.gitbook.io/doc/password_hashing/default_phf#guidelines-for-choosing-the-parameters and isn't very cut-and-dry to exactly what the limit should be)
- I see
- So suppose I run the app, the max memory size it uses should be around 1MB , as you said?
I've ran a profiler, and seems you are right. It doesn't seem to affect memory usage:
- Why this number though? The number I've found is way larger...
- What happens if it is larger, and I avoid this check? What could occur?
- Does it matter how large the APM file is? Does this number
memLimit
change based on it?
- It's fairly arbitrary :p If it's larger your computer will probably end up using more memory...
- No the
memLimit
is read from the header of the file, see https://github.com/souramoo/unapkm/blob/master/src/main/java/org/example/UnApkm.java#L81
- I see.
- Yes I know it is. But whatever that has made the file - does this part of the header depend on the file size? And what happens if I ignore it and continue, even though it's quite large? Just might use more memory?
But according to my tests, it doesn't seem to use any special size of memory. In fact it's un-noticeable .
Do you think that the memory usage is only theoretical?
I checked on an input APKM file that is 46,083,349 bytes, and the memLimit that it got is 268,435,456 .
That's quite a lot. The condition you got is 536,870,912
If I assume the relation is the same (meaning that memLimit is about x5 times larger than the file size) , then it means that if I find a file that is 125MB , it could reach this limit, and then the library will fail to extract it, even though in practice there is enough memory, and even though I don't even see it using much memory (and in fact almost nothing).
I think you should have a setting of this limit, to avoid such cases.
I think you misunderstand the part of the code the memLimit is for - it is for applying to the key generation function that is only a few bytes long (24-30), not to the ciphertext (several MB). Thus the relation you mention is not going to scale linearly, as the password is the same length regardless of file size. Plus the memLimit is the maximum RAM to use for this function not the actual amount of memory it will use, and I suspect it will stay the same regardless of if this was different each file.
But I didn't write any of the code that generates these APKM files so you might need to experiment with different files and report back what you find ;)
OK I tested back on the smaller APKM file.
It got me 268,435,456 again.
So maybe it stays this way.
When you say "I suspect it will stay the same regardless of if this was different each file" , you mean the memLimit ? Or the actual memory usage?
I know that you try to convince me that memLimit is probably always the same, and according to my tests, I think that it doesn't even need this much memory and will use much less in practice.
However, the point I'm trying to make:
Regardless of the explanation of what memLimit is (which because I'm not an expert on those things, I can't talk about it) : What if you get to an APMK file that memLimit is too large, yet there is enough memory? In this case it will fail to handle the file, even though the PC/device can still handle it.
And, even if it's larger than the memory that the PC/device has, it doesn't always mean it won't be able to handle it. Maybe you could provide a way to override it, because as you told, the memory being used is in practice much lower than what's written here.
I think the memLimit will stay the same and the actual memory usage will stay the same, but happy to be proven wrong!
Have now provided a parameter to override it just in case it is necessary ;)
Fair enough. Thank you