typicode/lodash-id

Asynchronous inserts

artiee opened this issue · 4 comments

Just a thought.
The insert function finds the max id value and adds 1. When multiple values are written asynchronously, two newly inserted documents could end up with the same id. Maybe use a lock or randomize the id? Using a random id (e.g. a hash) would also make it possible to distribute the db over multiple files and operate on each one separately, allowing greater scalability...
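
To make the concern concrete, here is a minimal sketch of how a "max id + 1" scheme can hand out the same id twice when two inserts interleave. The collection and the `nextId` and `commit` helpers are hypothetical and written only for illustration; they are not the library's actual code.

```javascript
// Hypothetical in-memory collection, for illustration only.
const collection = [{ id: 1 }, { id: 2 }];

// Step 1 of an insert: compute the next id as max(id) + 1.
function nextId(coll) {
  return coll.reduce((max, doc) => Math.max(max, doc.id), 0) + 1;
}

// Step 2 of an insert: store the document under the id computed earlier.
function commit(coll, id, doc) {
  coll.push(Object.assign({}, doc, { id: id }));
}

// Two inserts whose steps interleave, as can happen when a write waits
// on I/O between computing the id and storing the document.
const idA = nextId(collection);
const idB = nextId(collection); // A is not committed yet, so this is the same id
commit(collection, idA, { name: 'a' });
commit(collection, idB, { name: 'b' });

const ids = collection.map((doc) => doc.id);
console.log(ids); // [ 1, 2, 3, 3 ]
```

A lock around both steps, or an id that does not depend on the current contents of the collection, avoids the duplicate.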

Interesting thought I must say.
It's true that readability was preferred over scalability for id generation, and that the current approach doesn't scale well with big collections.

At the moment, I guess a simple solution to that could be to override _.createId like this:

var _ = require('underscore');
_.mixin(require('underscore.db'));
_.createId = function() { return new Date().getTime(); }

It creates a timestamp instead of an incremental id.

I think the timestamp-based approach is simple and effective. I don't know what the odds are of two inserts landing in exactly the same millisecond, but it certainly isn't a production-ready solution. Adding a short random string after the timestamp would help in theory. The problem, however, is that Math.random() may itself be seeded from the current time in some engines, so a more elegant solution is required.
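
One way to sketch the "timestamp plus random suffix" idea without relying on Math.random() is to take the suffix from Node's `crypto.randomBytes`. The `createId` function below is a hypothetical replacement written for illustration, and the 6-byte suffix length is an arbitrary assumption.

```javascript
const crypto = require('crypto');

// Hypothetical id generator: millisecond timestamp in hex, followed by
// 6 random bytes (12 hex characters) from Node's crypto module.
function createId() {
  const time = Date.now().toString(16).padStart(12, '0');
  const suffix = crypto.randomBytes(6).toString('hex');
  return time + '-' + suffix;
}

// Generate ids in a tight loop, where many calls share a millisecond,
// and count how many distinct values come out.
const ids = [];
for (let i = 0; i < 1000; i++) {
  ids.push(createId());
}
const unique = new Set(ids);
console.log(ids[0], unique.size, 'distinct ids out of', ids.length);
```

The ids still sort roughly by creation time because the timestamp comes first. Uniqueness is only probabilistic here, which is why a counter-based scheme is still worth considering.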

Here's how MongoDB does it:

A BSON ObjectID is a 12-byte value consisting of a 4-byte timestamp (seconds since epoch), a 3-byte machine id, a 2-byte process id, and a 3-byte counter. Note that the timestamp and counter fields must be stored big endian unlike the rest of BSON. This is because they are compared byte-by-byte and we want to ensure a mostly increasing order. Here's the schema:

bytes:  0 1 2 3 | 4 5 6   | 7 8 | 9 10 11
field:  time    | machine | pid | inc

Traditional databases often use monotonically increasing sequence numbers for primary keys. In MongoDB, the preferred approach is to use Object IDs instead. Object IDs are more synergistic with sharding and distribution.

http://docs.mongodb.org/manual/reference/object-id/
http://stackoverflow.com/questions/4650334/mongodb-custom-and-unique-ids

So adding a counter that starts at a random value would probably solve the problem. After that, welcome sharding and distribution :)
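
A rough sketch of that idea in Node, following the 12-byte layout quoted above: the `createObjectIdLike` name is hypothetical, the machine id is replaced by 3 random bytes chosen once per process, and the pid is written big endian for simplicity. This is not MongoDB's or puid's actual implementation.

```javascript
const crypto = require('crypto');

// Values fixed once per process (assumptions for this sketch).
const machine = crypto.randomBytes(3);                // stand-in for a 3-byte machine id
const pid = process.pid & 0xffff;                     // low 2 bytes of the process id
let counter = crypto.randomBytes(3).readUIntBE(0, 3); // counter starts at a random value

// Build a 12-byte id: 4-byte timestamp, 3-byte machine, 2-byte pid, 3-byte counter.
// nowMs is optional and only exists to make the function easy to test.
function createObjectIdLike(nowMs) {
  const ms = nowMs === undefined ? Date.now() : nowMs;
  const buf = Buffer.alloc(12);
  buf.writeUInt32BE(Math.floor(ms / 1000) >>> 0, 0); // bytes 0-3: seconds since epoch
  machine.copy(buf, 4);                              // bytes 4-6: machine id
  buf.writeUInt16BE(pid, 7);                         // bytes 7-8: process id
  counter = (counter + 1) % 0x1000000;               // wrap at 3 bytes
  buf.writeUIntBE(counter, 9, 3);                    // bytes 9-11: counter
  return buf.toString('hex');
}

const first = createObjectIdLike();
const second = createObjectIdLike();
console.log(first, second);
```

Two ids created in the same second by the same process differ only in the counter bytes, so they stay unique until the counter wraps (2^24 ids within one second).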

After some research, I found some really nice libraries and gists that solve the problem much better than what I proposed :)

In particular, https://github.com/pid/puid implements something similar to what MongoDB does. So if Underscore.db runs in a Node environment, puid could well be used to generate ids.

I've also released a new version without the auto-increment id for faster inserts and added a section about this in the README.

Nice solution! puid seems to be a perfect fit for this.