/massive-js

A data mapper for Node.js and PostgreSQL.

Primary LanguageJavaScriptOtherNOASSERTION

Massive.js: A Postgres-centric Data Access Tool

node Build Status Coverage Status Greenkeeper badge npm

Massive.js is a data mapper for Node.js that goes all in on PostgreSQL and fully embraces the power and flexibility of the SQL language and relational metaphors. Providing minimal abstractions for the interfaces and tools you already use, its goal is to do just enough to make working with your data as easy and intuitive as possible, then get out of your way.

Massive is not an object-relational mapper (ORM)! It doesn't use models, it doesn't track state, and it doesn't limit you to a single entity-based metaphor for accessing and persisting data. Massive connects to your database and introspects its schemas to build an API for the data model you already have: your tables, views, functions, and easily-modified SQL scripts.

Here are some of the high points:

  • Dynamic query generation: Massive's versatile query builder supports a wide variety of operators, all generated from a simple criteria object.
  • Low overhead: An API built from your schema means no model classes to maintain, super-simple bulk operations, and direct access to your tables without any need to create or load entity instances beforehand.
  • Document storage: PostgreSQL's JSONB storage type makes it possible to blend relational and document strategies. Massive offers a robust API to simplify working with documents: objects in, objects out, with document metadata managed for you.
  • Relational awareness: Massive does not traverse relationships or build model graphs, but deep inserts can create related entities and junctions transactionally, and the decompose option allows you to map the results of complex views and scripts to nested object trees.
  • Transactions: New in v5, use db.withTransaction to execute a callback with full Massive API support in a transaction scope, getting a promise which fulfills if it commits or rejects if it rolls back.
  • Postgres everything: Commitment to a single RDBMS lets us use it to its full potential. Massive supports array fields and operations, regular expression matching, foreign tables, materialized views, and more features found in PostgreSQL but not in other databases.

Full Documentation

Full documentation including API docs is available on GitHub Pages.

Contributing

See CONTRIBUTING.md.

Table of Contents

Installation

npm i massive --save

Starting with version 3.0.0, Massive requires ES6 support and uses Promises exclusively. If you need a callback-based API or are using a pre-6.x release of Node.js, install massive@2 or download version 2.x from the Releases page.

Examples are presented using the standard then() construction for compatibility, but use of ES2017 async and await or a flow control library such as co to manage promises is highly recommended.

Connecting to a Database

Once installed, require the library and connect to your database with a parameter object or connection string:

const massive = require('massive');

massive({
  host: '127.0.0.1',
  port: 5432,
  database: 'appdb',
  user: 'appuser',
  password: 'apppwd'
}).then(db => {...});

The returned db object constitutes an API for your schema, with tables, views, functions, and scripts attached. Read on to walk through using them, or consult the documentation for specifics.

A Brief Example

Let's say we have a database for a software testing application. This database contains a tests table and an issues table, where one test may have many issues. In a separate auth schema, it contains a users table referenced by the others to represent a user running a test and discovering issues. There is a test_stats view which calculates statistics on aggregate issue information for individual tests, and a user_tests view which returns all users with their associated tests; and there is a copy_tests function which clones a test for reuse.

Our testing application can leverage the API Massive builds for almost everything it needs to do, but there is one feature that we haven't been able to integrate as a database function yet: the ability to, in one call, clear a test's issues and update its status to signify that it has been restarted. Eventually, we'll get there, but for now it's a SQL script in our application's /db directory, resetTest.sql.

Persistence

After we initialize and connect Massive, all these entities are available on the instance. First, we need to create a user:

db.auth.users.save({
  username: 'alice',
  password: 'supersecure'
}).then(alice => {...});

alice is a JavaScript object containing the username and password we specified. But our users table has more columns than that: first and foremost, there's a primary key, an id column, which uniquely identifies this user record. A user also has a role, a created_at timestamp defaulting to the current date on insert, and an updated_at timestamp to track when the record was last modified. So in addition to the fields we provided, alice has an id, a role, a created_at, and an updated_at. Since we didn't specify values for role or updated_at, these are null.

When Alice resets her password, we issue a save again, including alice's primary key value:

db.auth.users.save({
  id: 1,
  password: 'evenmoresecure'
}).then(alice => {...});

The save will search by the primary key in the object and modify only those fields we include. Since Alice's username isn't changing, we don't need to include that in the object we're passing. However, it won't hurt anything if we do, so we could have simply modified the original alice object and passed that in instead.

Now that alice exists in the system, she can start a test. However, we don't want to bother with updating those from the same code path. insert will take care of that:

db.tests.insert({
  name: 'application homepage',
  url: 'http://www.example.com',
  user_id: alice.id
}).then(test => {...});

Last bit of housekeeping: alice still doesn't have a role, and we may have added more users without roles as well. Let's perform a bulk update to ensure that we're giving people the right access levels:

db.auth.users.update({
  'role is': null
}, {
  role: 'default'
}).then(users => {...});

Retrieval

Some time later, we want to retrieve that test. But we don't have the object returned from save, so we need to go back to the database with the primary key:

db.tests.findOne(1).then(test => {...});

In the mean time, Alice has been busy and discovered several problems which are now stored in the issues table. We can see how many she's found with count:

db.issues.count({test_id: 1}).then(total => {...});

Since Postgres' count returns a 64-bit integer and JavaScript only handles up to 53 bits, total will actually be a string, although thanks to JavaScript's weak typing this generally doesn't matter. Next, let's actually pull out the issue data:

db.issues.find({
  test_id: 1
}, {
  order: [{field: 'created_at', direction: 'desc'}]
}).then(issues => {...});

The first argument to find is a criteria object. Records are matched against each criterion; a key or takes an array of nested criteria objects, at least one of which must be fully matched to qualify for inclusion in the resultset.

The second argument defines options to modify the query behavior; here, we're sorting the issues most recent first. There are many other options which affect query shape and results processing, and the options object can be used with almost all of the retrieval and persistence functions. The output of our find call is the issues array, which contains all records in that table matching the criteria we passed to find.

There are other retrieval functions: where allows us to write more complex WHERE clauses than those find can generate based on the criteria object, and search performs a full-text search against multiple fields in a table. The documentation has more information on these.

Deletion

After review, it turns out that one of the issues Alice discovered was actually the application working as designed, so she needs to delete the isssue. We can do that with destroy:

db.issues.destroy(3).then(issues => {...});

The issue has been deleted, and the record returned -- in an array this time, since destroy can be used with a criteria object just like we used find to retrieve multiple issues.

Functions and Scripts

Bob wants to start testing the homepage, but doesn't want to go through the entire setup process. Fortunately, there's a copy_test function which will let him build on Alice's work, if he passes in the test id and his userid to assign the clone to himself:

db.copy_test(test.id, bob.id)
  .then(test => {...})

There's an important note here: this example assumes that Massive has been initialized with enhancedFunctions. With this flag enabled, Massive detects the shape of database functions' output, and will return a single record object -- or even a scalar value -- as appropriate. Since copy_test only makes one copy, it does return the record object. Without enhancedFunctions, this invocation would return an array containing the single record.

Shortly after Bob starts testing, the application is redeployed underneath him, invalidating the results he's gathered so far. He could delete issues with destroy either individually or in bulk, but it's faster to use the resetTest script. This works exactly as if it were a database function, except that enhancedFunctions does not perform any result type introspection, so the results will always be an array:

db.resetTest(test.id).then(tests => {...});

Views and Resultset Decomposition

After Alice has finished testing, she wants to see how her results compare to Bob's. We can query the test_stats view just like we did the issues and tests tables, with exactly the same API functions -- the only difference is that, since it's a view, we can't persist data to it.

db.test_stats.find({
  url: 'http://www.example.com'
}).then(stats => {...})

stats is an array of records from the view which match the criteria object.

Many views (or scripts!) combine related results from multiple tables. The user_tests view is one such. Rows might look like this:

user_id username test_id name
1 alice 1 first
1 alice 2 second
2 bob 3 third

Databases are limited to working with this kind of information in terms of flat tables and relationships, and when you have a situation where Alice has multiple tests, that means Alice appears twice in the output. In JavaScript, however, we're more accustomed to working with object graphs where, instead of parent entities (users) being duplicated, the descendant entities (tests) are nested. Something like this:

[{
  id: 1,
  username: 'alice',
  tests: [{
    id: 1,
    name: 'first'
  }, {
    id: 2,
    name: 'second'
  }]
}, {
  id: 2,
  username: 'bob',
  tests: [{
    id: 3,
    name: 'third'
  }]
}]

Massive can transform any view or script result into an object graph with the decompose option. The value of decompose is a schema which represents the desired output format. To generate the structure above:

db.user_tests.find({}, {
  decompose: {
    pk: 'user_id',
    columns: {
      user_id: 'id',
      username: 'username'
    },
    tests: {
      pk: 'test_id',
      columns: {
        test_id: 'id',
        name: 'name'
      },
      array: true
    }
  }
}).then(...)

Documents

The tests table represents a fairly limited picture of what exactly Alice and Bob are doing. An individual test may have a lot more data associated with it, and this data could be wildly different depending on what precisely is being evaluated, so simply adding more columns to tests isn't really an ideal solution. Postgres' JSONB functionality allows for a more free-form approach than relational databases otherwise support. Working with JSONB fields is certainly possible with the suite of standard table functions, but Massive also allows the dynamic creation and usage of dedicated document tables with a separate set of functions based on the relational data persistence and retrieval functionality.

We can create a document table dynamically by calling saveDoc:

db.saveDoc('test_attributes', {
  productVersion: '1.0.5',
  testEnvironment: 'production',
  web: true,
  accessibilityStandards: ['wcag2a', 'wcag2aa']
}).then(attributes => {...});

The attributes document is exactly what we passed in, with the addition of an autogenerated primary key. The key is never stored in the document body itself, but is automatically unwrapped when you persist the document.

Once the document table has been created, it's available just like any other table. You can retrieve the document again with the primary key, or query for an array of documents matching criteria:

db.test_attributes.findDoc(1)
  .then(attributes => {...});

db.test_attributes.findDoc({web: true})
  .then(matchingDocuments => {...});

Count documents with criteria:

db.test_attributes.count({web: true})
  .then(total => {...});

Perform a full-text search against the values in the document:

db.test_attributes.searchDoc({
  fields : ["testEnvironment", "environment", "applicationStatus"],
  term : "production"
}.then(matchingDocuments => {...});

Persistence functions are also adapted for document tables. You can update/insert a document with saveDoc; if the argument contains an id field, it will update the existing document in the database. If the argument contains no id field then it will insert a new document into the database. Either way, it returns the current state of the document.

This is not a true upsert! saveDoc, like save, determines whether to emit an INSERT or an UPDATE based on whether the data you pass it contains a primary key. If you are generating primary keys manually, use insert instead -- if you specify a value for the primary key, it will execute an UPDATE whether or not the row actually exists in the database, and if it does not the result will be null.

attributes.requiresAuthentication = true;

db.test_attributes.saveDoc(attributes)
  .then(attributes => {...});

Note that saveDoc replaces the entire document. To change fields without having to retrieve the document, use updateDoc:

db.test_attributes.updateDoc(1, {
  requiresAuthentication: false
}).then(attributes => {...});

updateDoc, like saveDoc, returns the current version of the entire document. updateDoc can also perform bulk operations with a criteria object and a changes object, just like the relational update:

db.test_attributes.updateDoc({
  web: true
}, {
  browser: 'Chrome'
}).then(changedAttributesDocs => {...});

When used with a criteria object, updateDoc returns an array containing all updated documents.

Arbitrary Queries

Last but not least: sometimes you just need to write some SQL. Alice's and Bob's passwords are both stored as plain text, because we were originally more focused on getting up and running than we were on doing things right. Now it's time to rectify this, especially since we've started adding new users through a system that hashes and salts passwords with a hash database function and our application login expects passwords to be hashed. So we need to ensure that all our users have hashed passwords, which we can do with an ad-hoc query in the REPL:

db.query(
  'update users set password = hash(password) where id < $1 returning *',
  [3]
).then(users => {...});

The value returned is an array of rows, assuming the query returns anything. query is most useful for one-offs like this, or for testing when you don't want to have to reload the database API to get changes to a script file. Once the query is ready for regular use, though, it's best to put it in a file in your scripts directory so you have all your scripts in a central location.

REPL

Massive.js ships with a REPL (read-evaluate-print loop), an interactive console that lets you connect to a database and execute JavaScript code. The easiest way to run it is to install globally:

npm i -g massive

You can then fire up a connection and start writing JavaScript:

massive -d appdb

db > db.listTables();
[ 'tests',
  'users' ]

db > db.tests.find({user_id: 1}).then(tests => {...});

In addition to the tables collection, the views and functions collections are also exposed on the database object.

When invoking functions, you may omit the then if you just want to see output -- Massive provides a resolver which logs the results to make it easy to query with the REPL.

Exit the REPL by pressing Ctrl-C twice.

Older Versions

Install Massive.js v2: npm install massive@2

Documentation for Massive.js 2.x is at readthedocs.

Release versions are tagged and available here.