This is a library that aims to work with DynamoDB as if it were a graph. The idea came from the "Advanced Design Patterns for Amazon DynamoDB (DAT403-R)" talk given by Rick Houlihan at the AWS re:Invent 2017 conference. Close to the end, he describes a way to use a DynamoDB table to represent a graph. I found that idea very interesting, so I wanted to create a library that would let me interact with that kind of table structure. I added only two things to the pattern presented in this talk:
- The concept of a `tenant`.
- A way to handle the number of GSI partitions to use.

So, each node can belong to a `tenant` by concatenating the random ID of the
node (a `cuid` in this case) with the `tenant` id (also a CUID). The GSI keys
are also prepended with the tenant ID, so that you can check for the data only
in the proper GSI partitions.
To control the number of GSI keys you can use the `maxGSIK` option.
This library is built on top of another library, which is in charge of handling the communication with DynamoDB. When instantiating a new Model, you can provide a DynamoDB DocumentClient driver or just let it instantiate one for you. This will work, provided you have configured the AWS SDK correctly. Being able to pass your own driver also makes the library easier to test.
As mentioned before, the `dynamodb-graph` library is used to interact with the
DynamoDB table, which must have a schema similar to this.
## DynamoDB table.
The schema for the DynamoDB table, written as a CloudFormation template, is the
following:
```yaml
AWSTemplateFormatVersion: "2010-09-09"
Resources:
  Graph:
    Type: "AWS::DynamoDB::Table"
    Properties:
      AttributeDefinitions:
        -
          AttributeName: "Node"
          AttributeType: "S"
        -
          AttributeName: "Type"
          AttributeType: "S"
        -
          AttributeName: "Data"
          AttributeType: "S"
        -
          AttributeName: "GSIK"
          AttributeType: "S"
      KeySchema:
        -
          AttributeName: "Node"
          KeyType: "HASH"
        -
          AttributeName: "Type"
          KeyType: "RANGE"
      ProvisionedThroughput:
        ReadCapacityUnits: "1"
        WriteCapacityUnits: "1"
      TableName: "GraphExample"
      GlobalSecondaryIndexes:
        -
          IndexName: "ByType"
          KeySchema:
            -
              AttributeName: "GSIK"
              KeyType: "HASH"
            -
              AttributeName: "Type"
              KeyType: "RANGE"
          Projection:
            NonKeyAttributes:
              - "Data"
              - "Target"
              - "MetaData"
            ProjectionType: "INCLUDE"
          ProvisionedThroughput:
            ReadCapacityUnits: "2"
            WriteCapacityUnits: "2"
        -
          IndexName: "ByData"
          KeySchema:
            -
              AttributeName: "GSIK"
              KeyType: "HASH"
            -
              AttributeName: "Data"
              KeyType: "RANGE"
          Projection:
            NonKeyAttributes:
              - "Type"
              - "Target"
              - "MetaData"
            ProjectionType: "INCLUDE"
          ProvisionedThroughput:
            ReadCapacityUnits: "2"
            WriteCapacityUnits: "2"
```

As you can see from the CloudFormation template, the table needs to use `Node`
as the hash key and `Type` as the sort key. The table must also include two
GSIs, one indexed by `GSIK` and `Type`, and the other by `GSIK` and `Data`.
They should be named `ByType` and `ByData` respectively.
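
To make the role of the two indexes more concrete, here is a minimal sketch of the parameters one could hand to the AWS SDK `DocumentClient#query` method to read from them. The helper names (`byTypeQueryParams`, `byDataQueryParams`) and the sample `GSIK` value are made up for illustration; this is not necessarily how the library builds its queries.

```javascript
// Hypothetical helper: parameters to list every item of a given type
// stored in one GSIK partition, using the `ByType` index.
function byTypeQueryParams(table, gsik, type) {
  return {
    TableName: table,
    IndexName: 'ByType',
    // Attribute names are aliased so they cannot collide with DynamoDB
    // reserved words.
    KeyConditionExpression: '#GSIK = :GSIK AND #Type = :Type',
    ExpressionAttributeNames: { '#GSIK': 'GSIK', '#Type': 'Type' },
    ExpressionAttributeValues: { ':GSIK': gsik, ':Type': type }
  };
}

// Hypothetical helper: parameters to find items whose `Data` starts with
// a given prefix inside one GSIK partition, using the `ByData` index.
function byDataQueryParams(table, gsik, prefix) {
  return {
    TableName: table,
    IndexName: 'ByData',
    KeyConditionExpression: '#GSIK = :GSIK AND begins_with(#Data, :Data)',
    ExpressionAttributeNames: { '#GSIK': 'GSIK', '#Data': 'Data' },
    ExpressionAttributeValues: { ':GSIK': gsik, ':Data': prefix }
  };
}

// The GSIK value below is only a placeholder.
var params = byTypeQueryParams('GraphExample', 'tenant#0', 'Book');
console.log(params.KeyConditionExpression);
```

With a configured client, the resulting object could be passed along as `documentClient.query(params).promise()`.
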
The GSI keys are created by `dynamodb-graph`, combining the node id, tenant id,
and the `maxGSIK` value. This way you can provision a few GSIK partitions at
the beginning and grow them further along. Take into account that `maxGSIK`
should never be decreased. Doing so will make some nodes unavailable.
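
The following is a rough sketch of how such keys could be derived, and of why shrinking `maxGSIK` hides nodes. Everything in it is an assumption made for illustration: the `#` separator, the order of the concatenation, the hash function, and the function names are not taken from the library.

```javascript
// Assumed separator between the tenant id and the rest of the key.
var SEPARATOR = '#';

// Number of partitions in use. Values smaller than 2 collapse to a
// single partition.
function partitionCount(maxGSIK) {
  return Math.max(1, Math.floor(Number(maxGSIK) || 0));
}

// Simple deterministic string hash (djb2 style). Only an illustration.
function hash(str) {
  var h = 5381;
  for (var i = 0; i < str.length; i++) {
    h = (h * 33 + str.charCodeAt(i)) >>> 0; // keep an unsigned 32-bit value
  }
  return h;
}

// Node key scoped to a tenant (assumed layout: tenant first).
function nodeKey(tenant, id) {
  return tenant + SEPARATOR + id;
}

// GSI key for a node: tenant id plus a partition number derived from the
// node id and bounded by maxGSIK.
function gsik(tenant, id, maxGSIK) {
  return tenant + SEPARATOR + (hash(id) % partitionCount(maxGSIK));
}

// Every GSI key a reader has to query to cover a whole tenant.
function allGSIKs(tenant, maxGSIK) {
  var keys = [];
  for (var i = 0; i < partitionCount(maxGSIK); i++) {
    keys.push(tenant + SEPARATOR + i);
  }
  return keys;
}

// A node written with maxGSIK = 4 may live in a partition that a reader
// configured with maxGSIK = 2 never visits.
var written = gsik('tenant', 'a', 4);
console.log(allGSIKs('tenant', 4).indexOf(written) !== -1);
console.log(allGSIKs('tenant', 2).indexOf(written) !== -1);
```

Under this scheme, growing `maxGSIK` only adds partitions to the list a reader scans, while lowering it drops partitions that may already hold items.
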
Install the library in your project using `npm` or `yarn`.

```sh
npm install --save dynamodb-graph-model
yarn add dynamodb-graph-model
```

Then you can import it into your project using `require`. To interact with a
model you must first configure it by passing some required and optional
values. Here is the JSDoc that defines the types of all the options.

```javascript
/**
 * Factory function that returns a model that can talk to a DynamoDB table
 * that is used to represent a directed graph.
 * @param {object} options
 * @property {any} [data] - Node main data.
 * @property {DynamoDBGraph} [db] - DynamoDB Graph object. Useful for testing.
 * @property {DocumentClientDriver} [documentClient] - DynamoDB DocumentClient
 *                                                     driver.
 * @property {EdgesMap} [edges=[]] - Map of node edges.
 * @property {object[]} [history=[]] - History of the model.
 * @property {number} [maxGSIK] - Maximum number of GSIK.
 * @property {string} [node] - Node of the current model.
 * @property {boolean} [log] - If set, all updates will include a CreatedAt or
 *                             UpdatedAt property generated along them.
 * @property {PropertyMap} [properties=[]] - Map of node properties.
 * @property {string} [table] - Table name. If not provided, it will try to pull
 *                              it from an environment variable called
 *                              TABLE_NAME.
 * @property {string} [tenant=''] - Tenant identifier.
 * @property {string} type - Node type.
 */
```

The table name can also be taken from an environment variable called `TABLE_NAME`.
Here is an example of how to instantiate a new `Model`.

```javascript
var AWS = require('aws-sdk');
var Model = require('dynamodb-graph-model');

// Configure the AWS SDK however you like.
AWS.config.update({ region: 'us-east-1' });

// In this example, we will provide our own DynamoDB Document Client driver.
var documentClient = new AWS.DynamoDB.DocumentClient();

// The table can be passed in as a parameter or stored inside the TABLE_NAME
// environment variable.
var table = process.env.TABLE_NAME;

// If we provide a GSIK value smaller than 2, then only 1 GSIK partition will
// be created. Since this value is important, it must be provided explicitly.
var Book = Model({ type: 'Book', table, documentClient, maxGSIK: 0 });
var Author = Model({ type: 'Author', table, documentClient, maxGSIK: 0 });

Promise.all([
  Book.create({
    data: 'Elantris',
    properties: [
      {
        Type: 'Published',
        Data: '21/04/2005'
      },
      {
        Type: 'PublishedBy',
        Data: 'Tor Books'
      }
    ]
  }),
  Author.create({
    data: 'Brandon Sanderson',
    properties: [
      {
        Type: 'Gender',
        Data: 'Male'
      },
      {
        Type: 'Born',
        Data: '19/12/1975'
      }
    ]
  })
])
  .then(([book, author]) => {
    return book.connect({ type: 'Author', target: author });
  })
  .then(book => {
    console.log(book.edges);
    // [{Node: ..., Data: 'Brandon Sanderson', Target: ..., Type: 'Author'}]
  })
  .catch(error => {
    console.log(error);
  });

// To get a node we use the `get` method, providing the Node.
Book.get('cjbfbo53x0000v3vm5egmtkr7')
  .then(book => {
    console.log(book.type);
    // Book
    console.log(book.data);
    // Mistborn
    console.log(book.properties);
    // [{Type: 'Published', Data: '21/04/2005'}, ...]
    console.log(book.edges);
    // [{Type: 'Author', Data: '21/04/2005', Target: 'cjbfbo53x00v3vm5egmtkr7'}]
  })
  .catch(error => {
    console.log(error);
  });

// To get a list of models we use the `collection` method.
Book.collection()
  .then(books => {
    books.forEach(book => {
      console.log(book.data);
    });
    // Elantris
    // Mistborn
  })
  .catch(error => {
    console.log(error);
  });
```

All the functions return a promise that resolves to a new model or to a list of models. I considered mutating the initial model instead, but ended up opting not to. This might change in the future, or I might add that behavior behind a new flag.
TODO
I tried to include information on each function as a JSDoc comment. In the
future I plan to turn it into a proper documentation page. I wish there was
something like Sphinx for JavaScript.
I am using `jest` to test the library. So, just clone the repo, install the
dependencies, and run `yarn test` or `npm run test` to run the tests.

```sh
git clone git@github.com:guzmonne/dynamodb-graph.git
cd dynamodb-graph
yarn install
yarn test
```

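
Since a model accepts its own `documentClient`, code that uses the library can be tested without touching AWS. Below is a minimal sketch of a hand-rolled stand-in that mimics the shape of the AWS SDK v2 DocumentClient, where each method returns an object exposing `promise()`. The name `createDocumentClientStub` and the list of stubbed methods are assumptions; which methods the library actually calls is not covered here.

```javascript
// Hypothetical recording stub for a DocumentClient-like driver.
// `responses` maps a method name to the value its promise resolves with.
function createDocumentClientStub(responses) {
  responses = responses || {};
  var stub = { calls: [] };
  var methods = ['get', 'put', 'query', 'update', 'delete', 'batchWrite'];

  methods.forEach(function (method) {
    stub[method] = function (params) {
      // Record the call so a test can inspect it later.
      stub.calls.push({ method: method, params: params });
      return {
        promise: function () {
          return Promise.resolve(responses[method] || {});
        }
      };
    };
  });

  return stub;
}

// Example: a stub whose `query` resolves with an empty result set.
var stubClient = createDocumentClientStub({ query: { Items: [] } });
stubClient.query({ TableName: 'GraphExample' });
console.log(stubClient.calls.length);
```

A stub like this could then be passed as the `documentClient` option when creating a model, and the recorded `calls` inspected in the test's assertions.
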
MIT