SkeLLLa/node-object-hash

What is the reason for default to sha256?

gajus opened this issue · 5 comments

gajus commented

sha256 is a cryptographic hash. The most likely use case for a library such as this is object fingerprinting, which doesn't require cryptographic safety. Something like https://github.com/Cyan4973/xxHash would likely be a much better choice.

@gajus the short story: there's no special reason for it, I just picked one of the built-in algorithms in node.

The long story: this module was born when I tried to remove duplicates from a database. First I tried the object-hash lib (which implemented sha1 and sha256 hashes at that time) and it was too slow. It also had a memory leak, so it failed completely. So I decided to write this one. I just took a plain and simple approach and picked the first thing that came to mind. It worked almost 100 times faster than the old version of object-hash and it completely solved my task. Before publishing it to npm I just added default values, because most people are lazy and want the thing to just work :).

Ideally, I think it would be good to let users pass a hasher into the "constructor", falling back to the default if none is provided.
The only requirement is that the hasher should implement an interface like the one in node's crypto library (or at least one that looks like it).

interface Hasher {
  create(alg?: string): Hasher;
  update(data: any): Hasher;
  digest(encoding?: string): Buffer | string;
}

It can then be called like this:

hasher.create().update(data).digest(encoding)
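To make the idea a bit more concrete, here is a minimal sketch of how node's built-in crypto module could be wrapped to satisfy the Hasher interface above. The `cryptoHasher` factory is a hypothetical name used only for illustration; it is not part of this library, and passing a hasher into the "constructor" is only a proposal at this point. The sketch also assumes the data is (or can be converted to) a string, which is what the object sorter produces.

```typescript
import { createHash } from "node:crypto";

// The interface proposed above, repeated so the sketch is self-contained.
interface Hasher {
  create(alg?: string): Hasher;
  update(data: any): Hasher;
  digest(encoding?: string): Buffer | string;
}

// Hypothetical adapter around node's crypto module (illustration only).
function cryptoHasher(defaultAlg: string = "sha256"): Hasher {
  const make = (alg: string): Hasher => {
    const hash = createHash(alg);
    const self: Hasher = {
      // Each create() call returns a fresh hasher with its own state.
      create: (nextAlg: string = defaultAlg) => make(nextAlg),
      // Assumption: data is stringified before hashing.
      update: (data: any) => {
        hash.update(String(data));
        return self;
      },
      // Without an encoding node returns a Buffer, otherwise a string.
      digest: (encoding?: string) =>
        encoding ? hash.digest(encoding as "hex" | "base64") : hash.digest(),
    };
    return self;
  };
  return make(defaultAlg);
}

// Called exactly as described above.
const hex = cryptoHasher().create().update("abc").digest("hex");
```

A browser-friendly hash lib could be adapted the same way, as long as its wrapper exposes the same three methods.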

If you want to make a PR, or if you have any ideas on how to do this better (ideally as a non-breaking change), you're welcome.

Such an approach may also help to use this in browsers without any browserify magic. Users will just need to pick a hash lib that works in the browser and implement the Hasher interface if that lib has some differences.

gajus commented

For what it is worth, I ended up implementing hashing using https://github.com/lovell/farmhash. The only downside is that it is a native module (and so there is an extra step at npm install time).

Yes. Just picking the "object sorter" and passing its output to any hash function will also do the trick.

gajus commented

@SkeLLLa Did this change?

Nope. It remains the same to keep backward compatibility with hashes produced since version 1.0.0. But in general, as I've mentioned before, you can use any hash lib on top of the results produced by the object sorter.

In general, the hash here is used to provide a short representation of the string that's produced, in order to fit database limits.

In other words, the default hash algorithm will likely not be changed, since that would break backward compatibility. However, if you need to use a different library, it's easy to create a hash on top of the "string" representation of objects that this library provides.
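As a rough illustration of hashing on top of the sorted string, the sketch below pairs a stand-in sorter with a small non-cryptographic hash. Both `sortedString` and `fnv1a32` are hypothetical helpers written for this example: `sortedString` only imitates what the library's object sorter does (it is much simpler and does no type coercion), and FNV-1a is used here just because it fits in a few lines. In practice you would pass the library's sorter output to xxHash, farmhash, or whatever hash lib you prefer.

```typescript
// Stand-in for the object sorter: serializes a value with object keys in
// sorted order, so key order does not affect the resulting string.
function sortedString(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map(sortedString).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const parts = Object.keys(obj)
      .sort()
      .map((key) => JSON.stringify(key) + ":" + sortedString(obj[key]));
    return "{" + parts.join(",") + "}";
  }
  // Primitives; String() covers the case where JSON.stringify returns undefined.
  return String(JSON.stringify(value));
}

// 32-bit FNV-1a over the UTF-8 bytes of the text, returned as 8 hex digits.
// Non-cryptographic: suitable for fingerprinting only.
function fnv1a32(text: string): string {
  const bytes = new TextEncoder().encode(text);
  let h = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < bytes.length; i++) {
    h ^= bytes[i];
    h = Math.imul(h, 0x01000193); // FNV prime, 32-bit multiplication
  }
  return ("00000000" + (h >>> 0).toString(16)).slice(-8);
}

// Fingerprint = any hash function applied to the sorted string.
function fingerprint(value: unknown): string {
  return fnv1a32(sortedString(value));
}

// Same content, different key order: both calls yield the same fingerprint.
const first = fingerprint({ a: 1, b: [2, 3] });
const second = fingerprint({ b: [2, 3], a: 1 });
```

Keep in mind that a 32-bit hash collides far more easily than sha256, so for deduplicating a large database a wider non-cryptographic hash would be the safer pick.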