/ttd-serverless

Things Todo serverless application

Primary LanguageJavaScriptMIT LicenseMIT

Things Todo serverless

Backend side of the things todo app which lets you find events and activities for kids. It does work, but it's not a production ready solution - it lacks, for instance, a proper error handling and DLQ setup.

Getting Started

For building and deploying you need a Serverless framework. This project also depends on AWS infrastructure set up by ttd-infra.

Prerequisites

  • serverless framework
  • AWS infrastructure (ttd-infra)

Deploying

The project is organized by services and monorepo approach has been used. However, the GraphQL API cannot be split and be part of the services, so it's deployable as one unit (in fact there are 4 different APIs for different actors, namely organisers, authenticated users, guests and audit).

Each module is deployable by the following command

sls --stage test deploy

where stage is the name of the environment you want to deploy into. You can also deploy only individual functions

sls --stage test deploy -f <function_name_here>

or update lambda function configuration

sls --stage test deploy -f <function_name_here> --update-config

Destroying environment

To remove all services execute the following command

sls --stage test remove

Elasticsearch setup

Elasticsearch requires an additional step after setting up a domain (done by ttd-infra).

Elasticsearch analyzers and mappings need to be setup. You can do that by invoking esThingsToDoSetup function.

services/search>sls --stage test invoke -f esThingsToDoSetup -l

A convenience lambda function for deleting indices is also provided - esDeleteIndices

Tests

Most of the tests are integration tests and require AWS environment (The tests assume the targeting services/servers are available during the test - they don't start these services themselves). You can use a real AWS cloud (but be aware of the costs) or you can start up a mock/local server to mimic AWS environment. It has been tested against Localstack and DynamoDB Local. However a local Elasticsearch was also needed because Localstack didn't support Elasticsearch 7+ format yet. The services used by tests are:

  • DynamoDB
  • Elasticsearch
  • SQS
  • SNS
  • SES

Setting up test environment

The easiest way to setup a testing environment is by using docker, e.g. to setup Localstack with DynamoDB, SQS, SNS and SES ports exposed call the following command

docker run -d --name localstack -p 4575:4575 -p 4576:4576 -p 4579:4579 -p 4569:4569 localstack/localstack

To start a local Elasticsearch instance

docker run -d --name elasticsearch -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.5.0

Elasticsearch setup is automatically done by the tests. However, DynamoDB requires a manual step to set up tables (it could be automated, but I store DynamoDB schema/index information in terraform). The following command sets up DynamoDB tables and indices based on a terraform configuration.

ttd-infra/non-prod/local/dynamoDb>terragrunt apply

Running the tests

To run tests type

npm test -- --runInBand

Due to non transactional nature of the DynamoDB it's important to execute tests sequentially (runInBand flag), so the tests don't interfere with each other.

Architecture

The project (this and ttd-terraform/ttd-infra duo) consists of several AWS lambda functions and AWS services.

  • Cognito - authentication and authorization
  • AppSync - managed GraphQL service
  • DynamoDB - schemaless database
  • Elasticsearch service - managed Elasticsearch
  • Route 53 - DNS
  • S3 - serving static assets and hosting web app
  • SES - sending emails
  • CloudWatch - logs, alarms and triggers
  • SNS - pub/sub messaging
  • SQS - queue service
  • Lambda - functions
  • CloudFront - CDN
  • Pinpoint - analytics

All services but Elasticsearch follow serverless paradigm. The high-level architecture of the project is presented below.

alt text

Single table design

My first iteration of the DynamoDB tables design closely resembled a familiar relational data model, i.e. each entity type had its own table. It did work, but it 'wasted' precious read/write capacities. Then I came across this fantastic Advanced Design Patterns for DynamoDB talk and read, in the official AWS Best practices guidelines, the following note

In general, you should maintain as few tables as possible in a DynamoDB application. Most well-designed applications require only one table. (...) A single table with inverted indexes can usually enable simple queries to create and retrieve the complex hierarchical data structures required by your application.

I found the single table design with overloaded indices interesting and gave it a try (However, I do think it focuses mostly on one dimension only, i.e. cost effectiveness, and makes other things harder, e.g. code maintenance, data extraction. Additionally, the new, on-demand, payment scheme alleviates that need - you don't "see" read/write capacities anymore).

I ended up with the following design.

id sk (gsi1pk) gsi1sk ... other attributes
usrId USR
ttdId TTD STATUS#orgId
chlId CHL#usrId STATUS
ttdId FAV#usrId STATUS
bkgId BKG STATUS#usrId#date#ttdId

The identified access patterns are as follows

access pattern index conditions dynamoDb operation
get user profile main table id = :usrId and sk = "USR" GetItem
get thing todo by id main table id = :usrId and sk = "TTD" GetItem
get all my drafts gsi-1 sk = 'TTD' and gsi1sk = "DRAFT#orgId" Query
get all my published gsi-1 sk = 'TTD' and gsi1sk = "PUBLISHED#orgId" Query
get all my rejected gsi-1 sk = 'TTD' and gsi1sk = "REJECTED#orgId" Query
get pending things todo gsi-1 sk = 'TTD' and begins_with(gsi1sk, "PENDING") Query
get child by id main table id = :chlId and sk = "CHL#usrId" GetItem
get all my children gsi-1 sk = "CHL#usrId" and gsi1sk = "ACTIVE" Query
get my favorite by id main table id = :ttdId and sk = "FAV#usrId" GetItem
get all my favorites gsi-1 sk = "FAV#usrId" and gsi1sk = "ACTIVE" Query
get all things todo by ids (avoid N+1 select problem) main table id = :ttdId and sk = "TTD" GetBatchItem
get all organisers by ids (avoid N+1 select problem) main table id = :usrId and sk = "USR" GetBatchItem
get booking by id main table id = :bkgId and sk = "BKG" GetItem
get all my bookings (user) gsi-1 sk = "BKG" and begins_with(gsi1sk, "ACTIVE#usrId") Query

Note: to accomodate queries needed by organisers to handle bookings (not included in the current design) an additional global index would have to be introduced.

Troubleshooting

Serverless framework is de-facto AWS CloudFormation wrapper. In case of errors it is often easier to find a root cause of the problem in the AWS CloudFormation console.

Debug

Running serverless commands in a debug and verbose mode

SLS_DEBUG=* sls --stage test deploy --verbose

Acknowledgments

  • Production-Ready Serverless online course on Safari by Yan Cui
  • fanout blockify function based on aws-lambda-fanout

License

This project is licensed under the MIT License - see the LICENSE.MD file for details