poc-quick-zip-sha256-hash

Calculate the SHA256 hash over a Zip file's 'Central Directory' record (which contains the list of files + their CRC32 checksums)

Primary language: Python. License: MIT.

Concept

This concept was originally described in a PR (ref: 1, 2) on AUTOMATIC1111/stable-diffusion-webui:

@dfaker: @raefu had a nice and consistently fast solution to this: hashing the zip directory section at the end of the file so it's a hash of the attributes and crcs of all the contents.

@dfaker: it's a small section at the end of the file, starting with the signature 0x02014b50, taking the last MB of the .pt should capture it.
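
As a rough illustration of the "last MB" idea quoted above, hashing the tail of the file could look like the sketch below. This is not the approach this repo implements. The function name `tail_sha256` and the choice of 1 MiB for "the last MB" are assumptions made for the example. The tail will usually also contain some compressed file data, and a central directory larger than the tail would only be partly covered.

```python
import hashlib
import os

TAIL_BYTES = 1024 * 1024  # "the last MB", taken here as 1 MiB (an assumption)


def tail_sha256(path, tail_bytes=TAIL_BYTES):
    """Return the hex SHA256 of the last `tail_bytes` bytes of the file.

    Hypothetical helper: files shorter than `tail_bytes` are hashed whole.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        # Jump to the start of the tail (or the start of the file if it is short).
        f.seek(max(0, size - tail_bytes))
        return hashlib.sha256(f.read()).hexdigest()
```

Only the tail is read, so the cost stays roughly constant however large the checkpoint is.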

Implementation Overview

  • Open the zip file (model.ckpt) in binary read mode
  • Seek to 22 bytes before the end of the file (22 bytes is the length of the 'end of central directory' (EOCD) record, so long as it contains no comment)
  • Read in the EOCD record, extract various parts of it (including the 'central directory' (CD) record offset + length), and do some rudimentary error checking
  • Seek to the CD record offset
  • Read the number of bytes given by the CD record length, extract the CD signature, and do some rudimentary error checking
  • Calculate the SHA256 hash of the CD record bytes
  • Display the hash + write it to model.sha256-cd
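
The steps above can be sketched roughly as follows. This is a minimal illustration, not necessarily the code in this repo. The function names `central_directory_sha256` and `write_hash_file` are made up for the example, and the output name is assumed to be the input name with its extension replaced by `.sha256-cd` (so `model.ckpt` gives `model.sha256-cd`). The sketch assumes the archive has no comment and is not Zip64, and it raises `ValueError` otherwise.

```python
import hashlib
import struct
from pathlib import Path

EOCD_SIZE = 22                # EOCD length when the archive comment is empty
EOCD_SIGNATURE = 0x06054B50   # stored on disk as b"PK\x05\x06"
CD_SIGNATURE = b"PK\x01\x02"  # 0x02014b50 as stored on disk (little-endian)
EOCD_FORMAT = "<IHHHHIIH"     # little-endian layout of the 22-byte EOCD record


def central_directory_sha256(path):
    """Return the hex SHA256 of the central directory bytes of a zip file."""
    with open(path, "rb") as f:
        size = f.seek(0, 2)  # seek to the end; returns the file size
        if size < EOCD_SIZE:
            raise ValueError("file is too small to be a zip archive")

        # Read the EOCD record from the last 22 bytes of the file.
        f.seek(size - EOCD_SIZE)
        (signature, _disk, _cd_disk, _entries_here, _entries_total,
         cd_size, cd_offset, comment_len) = struct.unpack(
            EOCD_FORMAT, f.read(EOCD_SIZE))

        # Rudimentary error checking on the EOCD fields.
        if signature != EOCD_SIGNATURE or comment_len != 0:
            raise ValueError("no EOCD record in the last 22 bytes")
        if cd_offset == 0xFFFFFFFF or cd_size == 0xFFFFFFFF:
            raise ValueError("Zip64 archives are not handled by this sketch")
        if cd_offset + cd_size > size - EOCD_SIZE:
            raise ValueError("central directory lies outside the file")

        # Read the central directory itself.
        f.seek(cd_offset)
        cd = f.read(cd_size)

    if len(cd) != cd_size or (cd_size and cd[:4] != CD_SIGNATURE):
        raise ValueError("central directory signature mismatch")
    return hashlib.sha256(cd).hexdigest()


def write_hash_file(path):
    """Hash the central directory and save the digest next to the input file."""
    digest = central_directory_sha256(path)
    out_path = Path(path).with_suffix(".sha256-cd")  # assumed naming scheme
    out_path.write_text(digest + "\n")
    return digest, out_path
```

Calling `write_hash_file("model.ckpt")` returns the digest together with the path it was written to, so the caller can print the hash. Checkpoints over 4 GB are typically stored as Zip64, which this sketch rejects rather than parses.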

References

Zip File Format