/couchbase-channels

Manages Couchbase sync for apps where each user has many mobile devices. Experimental.

This code is one implementation of a process I expect will be common among mobile CouchApps: coordinating sync for millions of mobile devices.

The first target is just registration of devices via an email confirmation loop. The second target is automatic synchronization of all of a user's channels across devices.

A channel is just a CouchDB database, except it's potentially distributed across multiple users and synchronized via Couchbase CouchSync. Couchbase uses standard CouchDB replication as the communication backbone. All messages between the cloud and devices take place over replication, so the device can always configure its relation to the cloud, even when it is offline. And the cloud can continue to process device updates even if they come in via a nonstandard route (for example, the photo galleries you created on vacation got synced to your laptop before they got synced to the cloud, but it doesn't matter: all three endpoints can still manage the same dataset).

Couchbase Channels takes the guesswork out of managing lots of databases per user. As of today, Aug 22nd, the code barely works; I'm just happy to have a procedural framework for writing them in place, and some meaningful tests. The tests are in a weird style. Feel free to outcompete them with other tests in an even weirder style.

This repo really contains an application. The application has already spawned an extension to some other frameworks I started on some time ago. So here is how it works.

The app code looks like this:

control.safe("channel", "ready", function(doc) {
    var channel_db = urlDb(doc.syncpoint);
    channel_db.insert({
        _id : 'description',
        name : doc.name
    }, errLog);
});

control.unsafe("device", "new", function(doc) {
  // Random digits for the confirmation email.
  var confirm_code = Math.random().toString().split('.').pop(); // todo better entropy
  sendEmail(doc.owner, confirm_code, function(err) {
    if (err) {
      errLog(err);
    } else {
      // Email sent: record the code and move the device to the next state.
      doc.state = "confirming";
      doc.confirm_code = confirm_code;
      db.insert(doc, errLog);
    }
  });
});

So basically it listens to the Couch _changes feed and pattern matches against doc.type and doc.state. It runs your code on the documents that match it.
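To make the matching step concrete, here is a minimal sketch of a dispatcher keyed on doc.type and doc.state. It is not this repo's implementation: makeDispatcher, on, and dispatch are hypothetical names, and hand-made objects stand in for rows from the _changes feed.

```javascript
// Hypothetical sketch: a tiny dispatcher keyed on doc.type and doc.state.
// makeDispatcher, on, and dispatch are illustrative names, not this repo's API.
function makeDispatcher() {
  var handlers = {}; // "type/state" -> array of callbacks

  return {
    on: function (type, state, fn) {
      var key = type + "/" + state;
      (handlers[key] = handlers[key] || []).push(fn);
    },
    // Call this for each document that arrives on the _changes feed.
    // Returns the number of handlers that ran.
    dispatch: function (doc) {
      if (!doc || !doc.type || !doc.state) return 0;
      var matched = handlers[doc.type + "/" + doc.state] || [];
      matched.forEach(function (fn) { fn(doc); });
      return matched.length;
    }
  };
}

// Usage with hand-made documents standing in for _changes rows.
var seen = [];
var dispatcher = makeDispatcher();
dispatcher.on("device", "new", function (doc) { seen.push(doc._id); });

var ranNew = dispatcher.dispatch({ _id: "d1", type: "device", state: "new" });
var ranConfirming = dispatcher.dispatch({ _id: "d2", type: "device", state: "confirming" });
var ranChannel = dispatcher.dispatch({ _id: "c1", type: "channel", state: "new" });
```

Only the first document matches both type and state, so only its handler runs; the other two are ignored.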

There are two modes: safe and unsafe. Safe mode is for functions which are safe to be run twice by accident. This can happen if you have multiple workers on the same database. Safe functions are safe because if they run in multiple processes concurrently (or they crash in the middle and are re-run) they will not create unwanted side-effects. In fancy talk, they are "idempotent".
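As a rough illustration of why a handler like the "channel ready" one is safe to run twice, here is a sketch using an in-memory stand-in for the channel database. makeStore and describeChannel are hypothetical names, and the store only imitates one CouchDB behavior: refusing to insert a document whose _id already exists.

```javascript
// Stand-in for a channel database: insert fails if the _id already exists,
// roughly what CouchDB does when a document is inserted without a matching _rev.
function makeStore() {
  var docs = {};
  return {
    insert: function (doc, cb) {
      if (docs[doc._id]) return cb(new Error("conflict"));
      docs[doc._id] = doc;
      cb(null);
    },
    count: function () { return Object.keys(docs).length; },
    get: function (id) { return docs[id]; }
  };
}

var errors = [];
function errLog(err) { if (err) errors.push(err.message); }

// Shaped like the "channel ready" handler: fixed _id, content derived only from doc.
function describeChannel(store, doc) {
  store.insert({ _id: "description", name: doc.name }, errLog);
}

var store = makeStore();
var channelDoc = { type: "channel", state: "ready", name: "vacation-photos" };
describeChannel(store, channelDoc);
describeChannel(store, channelDoc); // accidental second run
```

The second run only produces a logged conflict. The store still holds a single description document with the same content, which is the property that makes the function idempotent.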

Unsafe mode is for functions that must not be run twice by racing bots. For instance, if you are sending an email, use unsafe mode; otherwise there's a chance you might send it twice concurrently.

In workloads with a high degree of concurrent activity, where the work each transaction does is expensive, you are more likely to want unsafe mode for everything. But for normal workloads, if you keep your functions idempotent and lightweight, you are just as well off letting some bots duplicate work that gets thrown out as you would be trying to coordinate bots via Couch MVCC.
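One way coordination via MVCC could look is a claim step: each worker first tries to save the document with a new state, and only the worker whose write is accepted performs the side effect. The sketch below is an assumption about the general technique, not a description of how unsafe mode is implemented here. makeFakeDb and claimAndRun are hypothetical, the "claimed" state is invented, and integer revisions stand in for CouchDB's real _rev strings.

```javascript
// In-memory stand-in for a CouchDB database: a write succeeds only when the
// caller's _rev matches the stored one, otherwise it fails like a 409 conflict.
function makeFakeDb() {
  var docs = {};
  return {
    put: function (doc) {
      var current = docs[doc._id];
      if (current && current._rev !== doc._rev) {
        return { ok: false, error: "conflict" };
      }
      var nextRev = (current ? current._rev : 0) + 1;
      docs[doc._id] = Object.assign({}, doc, { _rev: nextRev });
      return { ok: true, rev: nextRev };
    },
    get: function (id) {
      return Object.assign({}, docs[id]);
    }
  };
}

// Hypothetical claim step: mark the doc as taken before doing the side effect.
function claimAndRun(db, doc, worker, sideEffect) {
  var claim = Object.assign({}, doc, { state: "claimed", claimed_by: worker });
  var result = db.put(claim);
  if (!result.ok) return false; // another worker won the race
  sideEffect(doc);
  return true;
}

var db = makeFakeDb();
db.put({ _id: "device-1", type: "device", state: "new", owner: "a@example.com" });

// Two workers read the same revision, then race.
var copyA = db.get("device-1");
var copyB = db.get("device-1");
var emailsSent = [];
function send(doc) { emailsSent.push(doc.owner); }

var ranA = claimAndRun(db, copyA, "worker-a", send);
var ranB = claimAndRun(db, copyB, "worker-b", send);
```

Worker A's write bumps the revision, so worker B's write is rejected and the email goes out once. The cost is an extra write per document, which is the coordination overhead the paragraph above weighs against simply letting idempotent functions repeat.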

It is expected that your functions will, as part of their operation, save their triggering document back to the database, usually with a new state. In this example I'm using nano, a CouchDB client for Node.js.

The goal is to keep your callbacks for each individual state as small as possible. Each should be a transaction. This limits the scope for errors introduced by retries.
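The sketch below shows the one-small-step-per-state idea in isolation. It assumes nothing about the real workflows: only the "new" and "confirming" states come from the example above, while "confirmed", "active", the fixed confirm_code, and runUntilIdle are invented for illustration. A plain loop stands in for saving the document and seeing it again on the _changes feed.

```javascript
// Hypothetical per-state steps for a device document. Only "new" and
// "confirming" appear in the README; "confirmed" and "active" are made up here.
var steps = {
  "new": function (doc) {
    return Object.assign({}, doc, { state: "confirming", confirm_code: "123456" });
  },
  "confirmed": function (doc) {
    return Object.assign({}, doc, { state: "active" });
  }
};

// Stand-in for the save + _changes loop: each saved document is fed back in
// until no step matches its state.
function runUntilIdle(doc) {
  var history = [doc.state];
  while (Object.prototype.hasOwnProperty.call(steps, doc.state)) {
    doc = steps[doc.state](doc); // one small step, one save
    history.push(doc.state);
  }
  return { doc: doc, history: history };
}

var first = runUntilIdle({ _id: "device-1", type: "device", state: "new" });
// The device now waits in "confirming" until something outside this sketch
// (for example, the user following the email link) sets state to "confirmed".
var second = runUntilIdle(Object.assign({}, first.doc, { state: "confirmed" }));
```

Because each step does one thing and ends by changing the state, a retry after a crash can only repeat that one step, not the whole workflow.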

TODO: Basically Everything

  • refactor code for readability
  • package some of the libraries in a proper npm embedded way (maybe docstate just rides with stately)
  • come up with a better way to test it
  • spec out the remaining doc workflows
  • build the parts that depend on being in an app (or document the API the app must meet)

License:

Apache 2.0. Copyright 2011 Couchbase Inc. Author: Chris Anderson, jchris@couchbase.com