ngerakines/commitment

Use arbitrary hash to select commit message.

emallson opened this issue · 3 comments

Git uses SHA-1 hashes to identify content. It would be nice if we could request messages based on those hashes (or on any other content, really) in a deterministic way.

Because the hashes used by commitment are generated based on message content, what would be needed is an inexpensive deterministic way to map from the set of all strings to the set of MD5 hashes used by this project.

A deterministic, time-bounded way to do this would be by initializing a random number generator with the numeric representation of the SHA1 / MD5 hash, then sampling from messages with that RNG. Given that it'd always be seeded in the same way, this would be deterministic and O(1). Since we don't need crypto-level hash guarantees, this ought to be sufficient.
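To make the idea concrete, here is a minimal sketch of the seeding approach. It assumes the messages live in an in-memory list; `MESSAGES` and `pick_message` are hypothetical names for illustration, not taken from the project's code.

```python
import hashlib
import random

# Hypothetical stand-in for the project's list of commit messages.
MESSAGES = [
    "fixed a bug",
    "it works on my machine",
    "more tests",
    "refactoring",
]


def pick_message(content_hash, messages=MESSAGES):
    """Pick a message deterministically from a hex digest (MD5 or SHA-1)."""
    # Numeric representation of the hash; raises ValueError if not valid hex.
    seed = int(content_hash, 16)
    # A private RNG instance, so the global random state is left untouched.
    rng = random.Random(seed)
    return rng.choice(messages)


digest = hashlib.md5(b"some file contents").hexdigest()
# The same digest always selects the same message.
assert pick_message(digest) == pick_message(digest)
```

One caveat with this sketch: the mapping is only stable while the message list and the RNG implementation stay the same. Adding or reordering messages would change which message a given hash selects.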

The request would look like:

GET http://whatthecommit.com/[index.txt]?content-hash=md5sum

with responses the same as they are presently.
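On the client side, building such a request could look roughly like the sketch below. The `content-hash` parameter is the one proposed here and does not exist yet; `build_request_url` is a hypothetical helper, and the code only constructs the URL without sending anything.

```python
import hashlib
from urllib.parse import urlencode

# Plain-text endpoint from the request format above.
BASE_URL = "http://whatthecommit.com/index.txt"


def build_request_url(content, base_url=BASE_URL):
    """Build a URL carrying the MD5 of `content` (bytes) as the proposed content-hash parameter."""
    digest = hashlib.md5(content).hexdigest()
    return base_url + "?" + urlencode({"content-hash": digest})


# Example: derive the URL from the bytes of a diff.
url = build_request_url(b"diff --git a/README b/README")
```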

Thoughts on adding this? I'm going to look into submitting a PR for it this weekend.

So, you are saying that you want to be able to provide a seed (referred to as "content-hash") to the random number generator used to select which message to display. This would allow the same message to be returned with subsequent calls using the same seed.

👍

@ngerakines: yes, exactly!

Yeah, I'm cool with that. Currently it supports a line hash ( http://whatthecommit.com/d06b941bc4eba888ff7e3d4219809c65 and http://whatthecommit.com/d06b941bc4eba888ff7e3d4219809c65/index.txt ), but those are based on the hash of the line/message displayed. That works fine, but it doesn't provide a deterministic way of getting the same name/number with subsequent calls.