/quickxorhash

Primary LanguageCMIT LicenseMIT

This python library is a Cython wrapper for the original C implementation. quickxorhash is a C library (libqxh) implementing Microsoft's QuickXORHash algorithm.

Algorithm

QuickXORHash is a non-cryptographic hash function that XORs the input bytes in a circular shift pattern, then XORs the least significant bits of the hash with the input length. The canonical representation of the resulting hash is a Base64 encoded string, because hexadecimal is too plebeian.

Usage

>>> import quickxorhash
>>> h = quickxorhash.quickxorhash()
>>> h.update(b'hello world')
>>> print(h.digest())
b'h(\x03\x1b\xd8\xf0\x06\x10\xdc\xe1\rrk\x03\x19\x00\x00\x00\x00\x00'
>>> import base64
>>> print(base64.b64encode(h.digest()))
b'aCgDG9jwBhDc4Q1yawMZAAAAAAA='

for files, you can use below script

import quickxorhash, base64, sys, os

inf = sys.argv[1]
h = quickxorhash.quickxorhash()
with open(inf, 'rb') as f:
    bufsize = 64 * 1024
    tempsize = os.stat(inf).st_size
    tsize = tempsize
    while tempsize > 0:
        if tempsize < bufsize:
            bufsize = tempsize
        buf = f.read(bufsize)
        tempsize -= bufsize
        h.update(buf)
        print(f'\r{int(100 - ((100 * tempsize) / tsize))}%', end='')

print(f'\r{base64.b64encode(h.digest()).decode()}')