A Python Perceptual Image Hashing Module

Primary language: Python. License: BSD 2-Clause "Simplified" (BSD-2-Clause).

ImageHash

An image hashing library written in Python. ImageHash supports:

  • average hashing (aHash)
  • perceptual hashing (pHash)
  • difference hashing (dHash)
  • wavelet hashing (wHash)
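
To give a feel for how the two simplest variants work, here is a minimal pure-Python sketch. It is not the library's implementation: it assumes the image has already been converted to grayscale and shrunk to a small grid of brightness values, and the function names average_hash_bits and difference_hash_bits are hypothetical.

```python
def average_hash_bits(pixels):
    """Return one bit per pixel: 1 if the pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def difference_hash_bits(pixels):
    """Return one bit per horizontal neighbour pair: 1 if the right
    pixel is brighter than the pixel to its left."""
    return [
        1 if row[i + 1] > row[i] else 0
        for row in pixels
        for i in range(len(row) - 1)
    ]


# A tiny 2x3 stand-in for a downscaled grayscale image (values 0-255).
grid = [
    [10, 200, 30],
    [220, 40, 250],
]

print(average_hash_bits(grid))     # [0, 1, 0, 1, 0, 1]
print(difference_hash_bits(grid))  # [1, 0, 0, 1]
```

Because each bit depends only on coarse brightness relationships, small edits such as recompression or slight resizing tend to flip only a few bits.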

Rationale

Why can we not use md5, sha-1, etc.?

Unfortunately, cryptographic hashing algorithms are unsuitable here. They are designed so that even a tiny change in the input file produces a substantially different hash. For image fingerprinting we want the opposite: similar inputs should produce similar output hashes.
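
The effect is easy to observe with the standard library. The sketch below flips a single bit in some arbitrary input bytes and counts how many bits of the digest change. SHA-256 is used as a stand-in for the algorithms named above, and the helper bit_difference is a hypothetical name introduced for this illustration.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Arbitrary input data standing in for the contents of an image file.
original = bytes(range(256)) * 4

# Copy the data and flip exactly one bit in the first byte.
modified = bytearray(original)
modified[0] ^= 1
modified = bytes(modified)

digest_a = hashlib.sha256(original).digest()
digest_b = hashlib.sha256(modified).digest()

print("input bits changed: ", bit_difference(original, modified))
print("digest bits changed:", bit_difference(digest_a, digest_b), "of 256")
```

A one-bit change in the input typically changes roughly half of the digest bits, so the distance between two cryptographic digests says nothing about how similar the inputs were.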

Requirements

Based on PIL/Pillow Image, numpy and scipy.fftpack (for pHash). Easy installation through PyPI.

Basic usage

>>> from PIL import Image
>>> import imagehash
>>> hash = imagehash.average_hash(Image.open('test.png'))
>>> print(hash)
d879f8f89b1bbf
>>> otherhash = imagehash.average_hash(Image.open('other.bmp'))
>>> print(otherhash)
ffff3720200ffff
>>> print(hash == otherhash)
False
>>> print(hash - otherhash)
36
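
Subtracting two hashes yields the number of bit positions in which they differ (the Hamming distance), so smaller values mean more similar images. The following is a minimal sketch of that idea operating on hex strings rather than on the library's hash objects. The function name hamming_distance and the example strings are hypothetical.

```python
def hamming_distance(hex_a: str, hex_b: str) -> int:
    """Count differing bits between two hex strings of equal length."""
    if len(hex_a) != len(hex_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 in every position where the two hashes disagree.
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


print(hamming_distance("ff00", "ff00"))  # 0
print(hamming_distance("ff00", "f0f0"))  # 8
```

A suitable cut-off for "similar enough" depends on the hash size and the image collection, so it is best chosen by experimenting on representative data.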

The demo script find_similar_images illustrates how to find similar images in a directory.

Source hosted at GitHub: https://github.com/JohannesBuchner/imagehash

Changelog

  • 4.0: Changed the binary-to-hex implementation, because the previous one was broken for various hash sizes. This change breaks compatibility with previously stored hashes; to convert them from the old encoding, use the "old_hex_to_hash" function.
  • 3.5: image data handling speed-up
  • 3.2: whash now also handles smaller-than-hash images
  • 3.0: dhash had a bug: it computed pixel differences vertically, not horizontally.
    It has been modified to follow dHash. The old function is available as dhash_vertical.
  • 2.0: added whash
  • 1.0: initial ahash, dhash, phash implementations.