/covercouch

Per-document ACL engine for CouchDB.


Cover Couch 0.1β

CoverCouch implements per-document r/w/d ACL for CouchDB. It acts as a proxy: the original CouchDB REST API is kept untouched, but every request to Couch is filtered – r/w/d, the _changes feed, _view, _update, _list and other function calls, and replication.

Document ACL is defined using the creator, owners and acl properties of a doc. Their values, combined by the _design/acl/_view/acl view function, yield the final ACL for the doc.

CoverCouch also implements fine-grained per-method ACL: some paths, like _update/someFnName, can be restricted to certain roles or users. CoverCouch can even restrict on a per-query basis – for example, attachments=true can be allowed only for certain roles.

All these rules, the ACL view function and other ACL-related stuff are stored in the _design/acl design doc. This ddoc defines access rules for a particular CouchDB bucket.

Buckets that have no ACL ddoc behave as native CouchDB.

Other CoverCouch features:

  • multi-worker, workers are independent,
  • has rate-locker, rejects excessive activity early,
  • very fast – atomic ACL resolve is sync and takes <10µs,
  • non-polling replies without attaches are gzipped in most cases,
  • docs can inherit ACL from parent docs,
  • syncs with other CouchDBs and PouchDBs.

Special note: reduce and _list work fine, since they are emulated and ingest only filtered _view feeds.

Quick start

CoverCouch 0.1 is a standalone app; it’s not a module right now. To install and run CoverCouch:

  • CouchDB 1.6–1.7 and node.js 0.10.35+ required, never tested with Couch 2.x
  • $ git clone git://github.com/ermouth/covercouch.git folderName
  • $ cd folderName
  • $ npm install
  • Edit general settings in /cvr/config.js
  • Run $ node covercouch

For buckets listed in the couch.preload section of /cvr/config.js, _design/acl design docs are created automatically (if not present). The default ddoc template is located in /cvr/ddoc.js.

Now you have CouchDB wrapped with r/w/d ACL.

Per-document ACL

The text below describes ACL behavior with the default _design/acl ddoc. You can write your own implementation of it.

Per-document ACL is defined using the creator, owners and acl properties of a doc. Its parent property may also point to a ‘parent’ doc – in this case ACL is inherited from the parent, if any.

All these properties are optional. If the first three are omitted, the doc is assumed to be free for r/w/d by any bucket user.

doc.creator string

Format is "userName" or "u-userName". The user who can perform any action with the doc, unless the requested op is restricted on a path basis.

Creator, once set, cannot be changed by non-admins. A non-admin cannot set the creator of a new doc to anyone other than himself.

doc.owners array

List of users and roles that have the very same permissions as the creator, except that they cannot:

  • delete the doc,
  • modify creator and owners properties.

This property must look like ["u-userName1", "r-role1", "r-role2", "u-userName2", ...].

doc.acl array

List of users and roles that can read the doc or its attaches and call _update functions for the doc, unless those are restricted on a path basis. Format is the same as for owners.

doc.parent string

Pointer from a ‘child’ doc to its ‘parent’: the _id of the ‘parent’ doc. Parent ACL is superimposed on the doc’s own ACL; the most permissive rules win.

Useful for comment-like docs – they may inherit ACL from the parent post. Changes in the parent ACL modify the resulting access rules of children without changing the child docs themselves.

Example docs

{
    "_id": "123abc", "_rev": "1-abcd", 
    "type":       "message",
    "creator":    "u-mom",
    "owners":     ["u-dad"],
    "acl":        ["r-Johnsons", "u-kitchener"],  
    "body":       "What about summer fence? Ain’t it too early?"
}
--
{
    "_id": "234def", "_rev": "1-7390", 
    "type":       "comment",
    "creator":    "u-jim",
    "parent":     "123abc",  
    "body":       "Ok, unboxed it."
}
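The rules described above can be sketched in a few lines of JavaScript. This is only an illustration of the described behavior, not the code of the default _design/acl ddoc: the names principalsOf, ownPermissions and resolveAcl are hypothetical, the user object shape ({name, roles}) is assumed, and only one level of parent inheritance is modeled.

```javascript
// Turn a user into the principals used in ACL arrays:
// "u-<name>" for the user, "r-<role>" for each role.
function principalsOf(user) {
  return ['u-' + user.name].concat((user.roles || []).map(function (role) {
    return 'r-' + role;
  }));
}

// True if any of the principals appears in the list.
function listed(list, principals) {
  return (list || []).some(function (entry) {
    return principals.indexOf(entry) !== -1;
  });
}

// Permissions granted by one doc alone, ignoring its parent.
function ownPermissions(doc, user) {
  var hasAcl = doc.creator !== undefined ||
               doc.owners !== undefined ||
               doc.acl !== undefined;
  if (!hasAcl) return { r: true, w: true, d: true }; // no ACL props: free for all

  var principals = principalsOf(user);
  // creator may be written as "userName" or "u-userName"
  var isCreator = doc.creator === user.name || doc.creator === 'u-' + user.name;
  var isOwner = listed(doc.owners, principals);
  var isReader = listed(doc.acl, principals);
  return {
    r: isCreator || isOwner || isReader,
    w: isCreator || isOwner,   // owners can write...
    d: isCreator               // ...but only the creator can delete
  };
}

// Superimpose parent ACL on the doc's own ACL: the most permissive rule wins.
function resolveAcl(doc, user, parentDoc) {
  var own = ownPermissions(doc, user);
  if (!doc.parent || !parentDoc) return own;
  var inherited = ownPermissions(parentDoc, user);
  return {
    r: own.r || inherited.r,
    w: own.w || inherited.w,
    d: own.d || inherited.d
  };
}
```

With the two example docs, resolveAcl(comment, {name: "kitchener"}, post) grants read only, because u-kitchener is listed in the parent’s acl, while u-jim keeps full r/w/d on his own comment.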

Important edge case

Please note that _update/function/docid requests are validated using READ document permissions, not WRITE.

Updates are assumed safe – in general they change only a few properties of the doc and control the values they receive. Access to the _update functions themselves can be limited using per-bucket restrictions.

This combination allows readers, for example, to mark a doc as read or add some other data to it using an appropriate _update. Compared with a general ‘write document’, which can totally destroy the doc, _update functions modify docs in a controllable way.

The choice between r-w-u-d and r-w-d was made after I analyzed what real sets of these permissions might look like. In nearly every case read permissions were equal to update permissions – so the separate set of update permissions was removed.

Per-bucket ACL and restrictions

Design doc _design/acl may have properties restrict and/or dbacl:

  • Object restrict allows fine-tuning permissions for particular CouchDB REST functions.
  • Object dbacl is superimposed on any doc-defined ACL during access rights resolution.

Example:

{
    "_id": "_design/acl", "_rev": "1-2345",
    "views":{"acl":{"map":"function(doc){...}"}},
    "acl": [],
    "restrict":{
        "*": ["r-marketing", "r-sales", "u-boss", "u-cfo"],
        "get":{
            "*attachments=true": ["u-cfo"]
        },
        "post":{
            "*attachments=true": ["u-cfo"],
            "_update/approveBudget": ["u-cfo"]
        },
        "put":{
            "*": []
        }
    },
    "dbacl":{
        "_r": ["u-cfo", "u-boss"],
        "_w": ["u-boss"]
    }
}

Array restrict.* has a special meaning – it limits the users and roles that have access to the bucket. The main difference between the CouchDB security object and restrict.* is that buckets inaccessible to the user are eliminated from the /_all_dbs reply.

Objects restrict.get, restrict.post and so on limit access to particular CouchDB API functions. Their keys are path fragments. Two wildcards are possible for keys:

  • * is one or more characters;
  • + is one or more characters, other than /.

The restrict object of the example ddoc above means that:

  • only marketing and sales depts, boss and CFO see this bucket;
  • no one (except admins, surely) can put doc or attach into bucket directly;
  • only CFO can call approveBudget update function (from unspecified ddoc);
  • only CFO can fetch data with attaches included.

The example dbacl property means that the CFO and the boss can read any doc from the bucket regardless of the rules in per-doc ACL. The boss can also write into any doc.

Properties acl, creator and owners, when defined for a design doc, only restrict access to the ddoc itself, its body and attaches, not to the functions it exposes. The example ddoc above is marked invisible for all users except admins with "acl":[].
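A minimal sketch of how a restrict object like the one in the example ddoc above might be evaluated. It is an illustration under stated assumptions, not CoverCouch’s method locker: fragmentToRegExp and isAllowed are hypothetical names, keys are assumed to match anywhere in the path plus query string, every matching rule is assumed to have to admit the user, and the admin bypass is not modeled.

```javascript
// Convert a restrict key into a RegExp:
//   *  -> one or more characters
//   +  -> one or more characters other than "/"
// Everything else is matched literally.
function fragmentToRegExp(fragment) {
  var source = fragment.replace(/[.*+?^${}()|[\]\\\/]/g, function (ch) {
    if (ch === '*') return '.+';
    if (ch === '+') return '[^/]+';
    return '\\' + ch;
  });
  return new RegExp(source); // assumption: unanchored, the key is a path fragment
}

// True if any of the principals appears in the list.
function listed(list, principals) {
  return (list || []).some(function (entry) {
    return principals.indexOf(entry) !== -1;
  });
}

// restrict:   the ddoc's restrict object
// method:     lower-case HTTP verb, e.g. "get"
// url:        path plus query string of the request
// principals: ["u-name", "r-role", ...] of the requesting user
function isAllowed(restrict, method, url, principals) {
  restrict = restrict || {};
  // restrict["*"] limits who can access the bucket at all
  if (restrict['*'] && !listed(restrict['*'], principals)) return false;
  var rules = restrict[method] || {};
  return Object.keys(rules).every(function (fragment) {
    var applies = fragmentToRegExp(fragment).test(url);
    return !applies || listed(rules[fragment], principals);
  });
}
```

Under these assumptions, with the example restrict object, isAllowed(restrict, "put", "/budget/doc1", ["u-boss"]) is false because "put":{"*":[]} admits nobody, while a GET with attachments=true passes only for ["u-cfo"].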

How request is processed

Generally, a request is processed by several middlewares. Each processor evaluates some restrictions and either passes the request through, modifies it and then passes it on, or rejects it.

General sequence for bucket-related request:

  1. Rate locker rejects the request if the thread is out of capacity or the remote client makes too many requests.
  2. Session manager checks user creds or session and rejects invalid ones.
  3. DB locker rejects the request if the user has no permissions to deal with the requested bucket.
  4. Method locker rejects the request if the user has no rights to exec the requested method and/or query.
  5. If create/write is requested, input data is filtered. Docs that the user has no permissions to write into are eliminated from the request.
  6. The request is passed to CouchDB.
  7. CouchDB applies its own security rules and validate_doc_update from _design/acl, which denies invalid changes of ACL-related properties.
  8. CouchDB response is filtered: docs that the user is not allowed to read are eliminated.
  9. Response is sent or piped to the user.

Processors and mappings between CouchDB API routes and flow chains are contained in /cvr/router.js and /cvr/restmap.js files.
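The pass / modify / reject flow described above can be sketched as a simple chain runner. This is a synchronous toy model, not the code from /cvr/router.js: runChain, the processor functions, the request fields (recentRequests, writableIds) and the status codes are all hypothetical and chosen only for illustration.

```javascript
// Each processor receives the request and returns either
// { reject: <status> } to stop the chain, or a request object to pass on.
function runChain(processors, request) {
  var current = request;
  for (var i = 0; i < processors.length; i++) {
    var result = processors[i](current);
    if (result.reject) {
      return { status: result.reject, rejectedBy: processors[i].name };
    }
    current = result; // passed through, possibly modified
  }
  return { status: 200, request: current };
}

// Rejects early when the client is too active (threshold is made up).
function rateLocker(req) {
  return req.recentRequests > 100 ? { reject: 429 } : req;
}

// Rejects requests that carry no authenticated user.
function sessionManager(req) {
  return req.user ? req : { reject: 401 };
}

// Modifies the request: drops docs the user may not write.
// writableIds stands in for a real ACL lookup.
function inputFilter(req) {
  var docs = (req.docs || []).filter(function (doc) {
    return req.writableIds.indexOf(doc._id) !== -1;
  });
  return {
    user: req.user,
    recentRequests: req.recentRequests,
    writableIds: req.writableIds,
    docs: docs
  };
}
```

For example, runChain([rateLocker, sessionManager, inputFilter], req) returns a 401 result naming sessionManager when req.user is missing, and otherwise hands on a request whose docs array holds only writable docs.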

Some technical details

RAM

CoverCouch is memory-intensive. The entire bucket ACL is cached in memory on first access or at start. Moreover, each worker has its own ACL cache; caches are not shared.

This approach allows resolving ACL synchronously in microseconds – but it costs ~300–500 bytes of RAM for each doc, and you should multiply the result by the number of workers.

So if you have a 1M-doc DB that needs per-doc ACL (a very rare case in the CouchDB world), you need 500 MB+ of RAM for each worker.

Also, when CoverCouch pipes, it needs about 3 times more RAM than the two consecutive rows being transmitted. Be careful if you inline a 100 MB attach in JSON – you may need ~400 MB of RAM to process that pipe slice.
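The ACL cache estimate above is plain multiplication; a tiny helper makes it easy to try other numbers. aclCacheBytes is a hypothetical name, and the four-worker setup in the second call is just an example value.

```javascript
// Rough RAM needed for ACL caches: every worker keeps its own full copy.
function aclCacheBytes(docCount, bytesPerDoc, workers) {
  return docCount * bytesPerDoc * workers;
}

var MB = 1000 * 1000;
console.log(aclCacheBytes(1e6, 500, 1) / MB); // 500  (1M docs, upper bound, one worker)
console.log(aclCacheBytes(1e6, 300, 4) / MB); // 1200 (1M docs, lower bound, four workers)
```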

Fetch/resend vs pipe

The fetch/resend strategy is used for ‘not very long’ requests that can produce a set of rows. CoverCouch fetches the entire CouchDB response, filters it and resends a gzipped reply to the client.

‘Not very long’ means that no inlined attachments are expected and the request has some range-limiting keys (startkey, endkey, keys or key).

The fetch/resend strategy allows sending the response faster (sometimes much faster) thanks to compression and the removal of unnecessary response fragmentation.

Pipe strategy is used for potentially ‘long’ requests: feeds, or requests with attaches inlined, or with no query limits.

Single-doc and attachment GETs are also piped.
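The choice between the two strategies, as described above, might be sketched like this. chooseStrategy and the request flags (isFeed, isSingleDoc, isAttachment) are hypothetical; how CoverCouch actually classifies requests is defined in its router and is not reproduced here.

```javascript
// Decide how a response should be delivered.
// req.query is the parsed query string (all values are strings).
function chooseStrategy(req) {
  var query = req.query || {};

  // Feeds, single-doc GETs and attachment GETs are always piped.
  if (req.isFeed || req.isSingleDoc || req.isAttachment) return 'pipe';

  // Inlined attachments make the response potentially long.
  if (query.attachments === 'true') return 'pipe';

  // 'Not very long' requires at least one range-limiting key.
  var limited = ['startkey', 'endkey', 'keys', 'key'].some(function (name) {
    return name in query;
  });
  return limited ? 'fetch' : 'pipe';
}
```

So a view query with startkey and endkey is fetched, filtered and resent gzipped, while the same query with attachments=true, or with no limiting keys at all, is piped.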

Auto restart

Each worker restarts daily at the hour defined in the workers.reloadAt conf key. A restart reclaims frozen and leaked memory and terminates hung feeds. Sibling threads never restart simultaneously – the minimum gap is defined in workers.reloadOverlap.

Limitations

List functions

Since _list functions are emulated inside CoverCouch, they do not support provides() calls. Gonna fix it in 0.2.

Authorization methods

Only cookie and basic auth are supported. Requests with user:pwd@domain.name in the URL are treated as if they had no auth.

Length of _id

Length of the _id property is limited to 200 chars by default, to speed up the regexp that digs out doc _ids from the pipe without parsing JSON.

The doc _id length limit is defined in the couch.maxIdLength conf property. This limit does not in any way restrict creation of docs with longer _ids. The limitation means ‘we assume the DB has no docs with ids longer than 200 chars’.
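To show why a length bound helps, here is a sketch of a bounded regexp that pulls ids out of a chunk of view rows without parsing the JSON. It is not the regexp CoverCouch uses: extractIds is a hypothetical name, and the sketch assumes rows carry an "id":"…" member, as CouchDB view rows do.

```javascript
// Extract row ids from a raw chunk of a view response.
// The {1,max} quantifier bounds how far the regexp may scan for the closing quote.
function extractIds(chunk, maxIdLength) {
  var max = maxIdLength || 200;
  // An id body is a run of non-quote, non-backslash chars or escaped chars.
  var re = new RegExp('"id":"((?:[^"\\\\]|\\\\.){1,' + max + '})"', 'g');
  var ids = [];
  var match;
  while ((match = re.exec(chunk)) !== null) {
    ids.push(JSON.parse('"' + match[1] + '"')); // unescape the JSON string body
  }
  return ids;
}
```

Ids longer than the limit are simply not matched by this sketch, which mirrors the ‘we assume the DB has no such docs’ caveat above.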

Weird behavior of limit query param

Since the CouchDB response is filtered, we cannot expect the limit param to work properly in all cases. CouchDB can, for example, send 10 docs – and all of them may be eliminated from the response by ACL.

To avoid this behavior, do not use CoverCouch as an intentional filter; the ACL engine was not designed to be one.

For example, do not use ACL-filtered _all_docs to retrieve all docs of a user. A much better way is to make a special view for it and then tap it with a key range. This approach is also much faster.

Same for limit. Use special views and key ranges, not limit, to fetch a predictable set of docs.
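A minimal sketch of the ‘special view plus key range’ approach. The view name, its location in a ddoc and the helper userDocsQuery are hypothetical; the map function relies on the emit() that CouchDB provides to view functions, and the [key, {}] end key is the usual CouchDB collation idiom for ‘everything under this prefix’.

```javascript
// Hypothetical map function, e.g. for _design/app/_view/byCreator.
// Emits [creator, _id] so one user's docs form a contiguous key range.
function byCreatorMap(doc) {
  if (typeof doc.creator !== 'string') return;
  // creator may be stored as "userName" or "u-userName"; normalize to "u-..."
  var creator = doc.creator.indexOf('u-') === 0 ? doc.creator : 'u-' + doc.creator;
  emit([creator, doc._id], null);
}

// Build the query string that selects all rows of one user.
function userDocsQuery(userName) {
  var creator = 'u-' + userName;
  var startkey = JSON.stringify([creator]);
  var endkey = JSON.stringify([creator, {}]); // {} sorts after any string
  return '?startkey=' + encodeURIComponent(startkey) +
         '&endkey=' + encodeURIComponent(endkey);
}
```

Appending userDocsQuery("jim") to the view URL fetches only rows keyed by u-jim, so the result does not depend on how many unrelated rows ACL filtering removes.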

Futon

Futon is visible only to admins. Also please note that the Logout link in Futon does not work, since it uses the _:_@your.couch.url/_session auth syntax.

No COPY method

COPY request processors are not yet implemented.

Known issues and plans

Test suites and demos are underway. Same for the interactive ddoc JSON editor (see the current version at http://cloudwall.me/etc/json-editor.html).

Also going to implement a precache-free ACL mode – async and slower, but less memory-demanding.

Please feel free to open issues or contribute.


© 2015 ermouth. CoverCouch is MIT-licensed.