idrassi/HashCheck

Peculiarities of Blake3


Hi,

Today I've been switching my hashes from md5 and sha1 to Blake3. To do so, I first verified my data against the previously calculated hashes to make sure it was intact, and then calculated the Blake3 hashes.

My general impression is that Blake3 is faster, but I've noticed a few "peculiarities" that I wanted to mention in case something is wrong:

  1. While calculating my hashes, it felt (I didn't time it) that Blake3 was generally faster than md5 and sha1, but in a few cases it was dramatically faster. I've been unable to identify a pattern, such as a type of file where Blake3 performs so much better. For example, I observed this difference in speed with video files: in some cases the hash was calculated at a normal speed; in others, at a very high speed. Therefore, I can't say Blake3 performs better with video files than with other types of files.

  2. In my experience with hashing, the calculation speed remains constant during the whole process. However, with Blake3, I've noticed a few cases where, after 20-30% of the process had completed at a particular speed, the rest of the calculation finished almost instantaneously.
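A rough way to check point 2 would be to log throughput while a file is being hashed. The sketch below is only an illustration and not what HashCheck does internally: `hash_with_progress` and its parameters are made-up names, and it uses `sha1` from Python's `hashlib` as a stand-in because Blake3 is not in the standard library.

```python
import hashlib
import time


def hash_with_progress(path, algo="sha1", chunk_size=1 << 20, report_every=64):
    """Hash a file and record read+hash throughput per group of chunks.

    Returns (hex_digest, samples), where each sample is a tuple of
    (bytes_processed_so_far, MiB_per_second_for_that_group).
    """
    h = hashlib.new(algo)
    samples = []
    done = 0            # total bytes hashed so far
    window_bytes = 0    # bytes hashed since the last sample
    chunks = 0
    window_start = time.perf_counter()

    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            h.update(chunk)
            done += len(chunk)
            window_bytes += len(chunk)
            chunks += 1
            if chunks % report_every == 0:
                elapsed = max(time.perf_counter() - window_start, 1e-9)
                samples.append((done, window_bytes / (1 << 20) / elapsed))
                window_bytes = 0
                window_start = time.perf_counter()

    # Record whatever is left over after the last full group.
    if window_bytes:
        elapsed = max(time.perf_counter() - window_start, 1e-9)
        samples.append((done, window_bytes / (1 << 20) / elapsed))

    return h.hexdigest(), samples
```

If the throughput in `samples` jumps sharply partway through a file, that would suggest the reads became faster at that point (for example, because part of the file was already in memory) rather than the hash function itself changing speed.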

Is everything as expected? Thanks!

I've noticed something similar. I think it's because the file being hashed has been cached by Windows in some way. For example, I hashed a multi-gigabyte file using BLAKE3 via HashCheck, and then hashed the same file again using XXH3 via OpenHashTab. The XXH3 checksum was generated in mere seconds, whereas the BLAKE3 checksum had taken many minutes to generate moments earlier. I think things are functioning as expected. I do not know how to clear this "cache", or whatever it may be, to properly compare the hashing algorithm speeds.
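One possible workaround, instead of clearing the cache, is to make sure every algorithm sees the file in the same state: read it once without timing, then time each algorithm. This is only a sketch under assumptions: `compare_warm` is a made-up helper, the file is assumed to be small enough to stay in memory between passes, and the algorithms are ones available in Python's `hashlib` (BLAKE3 and XXH3 are not in the standard library, so they are not shown).

```python
import hashlib
import time


def compare_warm(path, algos=("md5", "sha1", "blake2b"), chunk_size=1 << 20):
    """Time several hash algorithms on one file after an untimed warm-up read.

    Returns a dict mapping algorithm name to (hex_digest, seconds).
    """
    # Warm-up pass: read the whole file once so later passes are more
    # likely to be served from the OS file cache rather than from disk.
    with open(path, "rb") as f:
        while f.read(chunk_size):
            pass

    results = {}
    for algo in algos:
        h = hashlib.new(algo)
        start = time.perf_counter()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        results[algo] = (h.hexdigest(), time.perf_counter() - start)
    return results
```

With the warm-up in place, the differences between entries in the result should mostly reflect the algorithms rather than which one happened to read the file from disk first.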

I've also noticed large files that I had not previously hashed being hashed more quickly than expected. Perhaps the file's content is easy to hash for some reason, or, more likely, this "cache" is still at work. It's not a problem, however; it just means checksums are sometimes generated faster.