OpenAPI 3.0

Albert Notebook

Instructions

Abstract

Albert Notebook enables users to create free-flow content using various notebook components, including text, images, chemical drawings, mentions, and files, with plans to support Albert entities in the future. The notebook also supports content reordering and maintains a version history.

alt

Exercise (45 Min)

Design a backend for the Notebook (API and database schema) to implement the specified functionality. Utilize OpenAPI 3.0 to define all REST endpoints. Use DynamoDB with a single table design to store and retrieve transactional data. The API should be optimized for patching the notebook, retrieving version history, and reordering notebook content.

Design

Target Stack

The backend system is designed assuming the following technologies:

  • Cloud Provider: AWS
  • IaC: CloudFormation
  • Database: DynamoDB
  • API Design: RESTful
  • Backend Service(s):
    • AWS API Gateway
    • AWS Lambda
    • Node.js
    • TypeScript
    • OneTable (DynamoDB "ORM")

Access Patterns

The primary access patterns as defined by the requirements:

Access Pattern Operation Key Attributes API Endpoint
Patching the notebook
For a given notebook, create a new revision based on the previous with the specified partial changes.
PutItem notebookId PATCH /notebook/:id
POST /notebook/:id/revisions
Retrieving version history
For a given notebook, fetch a list of revisions.
Query notebookId, revision GET /notebook/:id/revisions
Reorder notebook content
For a given notebook, create a new revision based on the previous, with its content reordered.
PutItem notebookId POST /notebook/:id/revisions

Future Access Patterns

Additional access patterns that may be required in the future:

Access Pattern Operation Key Attributes API Endpoint
Creating a new notebook
Create a new notebook with an initial revision.
PutItem notebookId POST /notebooks
Retrieving the latest version
For a given notebook, fetch the latest revision.
Query notebookId GET /notebook/:id
Retrieving a specific version
For a given notebook, fetch a specific revision.
Query notebookId GET /notebook/:id/revisions/:revisionId
Reverting to a specific version
For a given notebook, revert to a specific revision.
PutItem notebookId POST /notebook/:id/revisions/:revisionId/revert
Deleting a notebook
Delete a notebook and all its revisions.
Delete notebookId DELETE /notebook/:id

Entities

The core entities involved in the previously identified access patterns:

Entity RelationshipSchemas
%%{
  init: {
    "theme": "default",
    "fontFamily": "monospace"
  }
}%%
erDiagram
    Notebook {
        %% UUID id
        %% string title
        %% DateTime createdAt
        %% string createdBy
        %% DateTime updatedAt
        %% string updatedBy
        %% Revision[] revisions
    }

    Revision {
        %% string id
        %% string notebookId
        %% string createdAt
        %% string createdBy
        %% Element[] elements
    }

    Element {
        %% UUID id
        %% string type
        %% number order
        %% string value
    }

    User {
        %% UUID id
        %% string name
    }

    Notebook ||--o{ Revision : "has"
    Notebook ||--o{ User : "createdBy"
    Notebook ||--o{ User : "updatedBy"
    Revision ||--o{ Element : "contains"
    Revision ||--o{ User : "createdBy"
Loading
type Notebook = {
  id: UUID;
  title: string;
  revisions: Revision[];
  createdAt: DateTime;
  createdBy: User;
  updatedAt: DateTime;
  updatedBy: User;
};

type Revision = {
  id: UUID;
  elements: Element[];
  createdAt: DateTime;
  createdBy: User;
};

type Element = {
  id: UUID;
  type: string;
  order: number;
  value: string | Record<string, unknown>;
};

type User = {
  id: UUID;
  name: string;
};

API

See the following external resources for API documentation:

Persistence

Notebooks can be persisted using a single-table design in DynamoDB. Any given notebook would have a minimum of three items (one for each facet noted below):

  • one capturing the unversioned notebook metadata
  • one for each revision, capturing revision specific metadata
  • one for each element within a revision

Single-Table Design

Primary Key Attributes
PK: notebookId SK: selection type title latest revision element
49f7335c-d7a8-4f1c-b2cc-... METADATA notebook My Notebook 1
v000001#METADATA revision 1
v000001#000001 element

{...}

49f7335c-d7a8-4f1c-b2cc-... METADATA notebook Another Notebook 8
v000001#METADATA revision 1
v000001#000001 element

{...}

...
v000008#000032 element

{...}

Composite Primary Key (Partition Key + Sort Key)

Partition Key: notebookId

The notebookId can serve as the partition key with a reasonably high cardinality while still providing a logical collection of the notebook's contents.

Sort Key: selection

For the sort key, a combination of the revision number and the element order number can be used.

This will allow for efficient querying of the notebook's revisions and elements. It will also allow us to store multiple facets or item types within the same table using special keywords such as METADATA when outside the context of a specific revision or element.

Note: Because the sort key will be a string containing the composite of two numbers, sorting will be lexicographical rather than numerical. To restore numerical sorting, the numbers can be zero-padded to a fixed length.

Facets

While a typical single table design would lead to a fully denormalized structure, it is possible to normalize the data to some extent. The lack of a strict schema outside the primary key attributes means we can store multiple item types within the same table.

The table will consist of three different facets or item types:

Notebook Metadata

Any unversioned metadata that applies to the entire notebook, such as the title and modification fields.

SchemaExample
type NotebookItem = {
  notebookId: UUID;
  selection: string;
  type: "notebook";
  title: string;
  latest: number;
  createdAt: DateTime;
  createdBy: {
    id: UUID;
    name: string;
  };
  updatedAt: DateTime;
  updatedBy: {
    id: UUID;
    name: string;
  };
};
{
  "notebookId": "3fb3cad6-5d1c-4ec8-9bfc-e8dcd6017741",
  "selection": "METADATA",
  "type": "notebook",
  "title": "My Notebook",
  "latest": 1,
  "createdAt": "2021-01-01T00:00:00Z",
  "createdBy": {
    "id": "c17214445723N8knra0ni9qxlittxjxzf",
    "name": "Alice Smith"
  },
  "updatedAt": "2024-07-20 17:06:25-07:00",
  "updatedBy": {
    "id": "fe9c478a-f7b7-4b71-91b8-4fe2a3810210",
    "name": "Bob Johnson"
  }
}
Revision Metadata

Each revision will have its own metadata record, identified by the revisionId the METADATA keyword in place of an element order number.

SchemaExample
type RevisionItem = {
  notebookId: UUID;
  selection: string;
  type: "revision";
  revision: number;
  createdAt: DateTime;
  createdBy: {
    id: UUID;
    name: string;
  };
};
{
  "notebookId": "bd995cf2-6c8c-4c2f-99ab-4de5ba4e312e",
  "selection": "v000001#METADATA",
  "type": "revision",
  "revision": 1,
  "createdAt": "2024-07-20 17:06:25-07:00",
  "createdBy": {
    "id": "fe9c478a-f7b7-4b71-91b8-4fe2a3810210",
    "name": "Bob Johnson"
  }
}
Element

Each element within a revision will have its own record, identified by the revision number and the element order number.

SchemaExample
type ElementItem = {
  notebookId: UUID;
  selection: string;
  type: "element";
  element: {
    order: number;
    type: string;
    value: string | Record<string, unknown>;
  };
};
{
  "notebookId": "3fb3cad6-5d1c-4ec8-9bfc-e8dcd6017741",
  "selection": "v000001#000001",
  "type": "element",
  "element": {
    "order": 1,
    "type": "text",
    "value": "Hello, World!"
  }
}

Indexes

The primary key is sufficient to cover all the primary access patterns, so no secondary indexes are required.

LSIs

For future access patterns, LSIs are to be considered as a last resort. While they can be used to optimize certain queries, they come with additional constraints to the indexed table and are less flexible than GSIs.

GSIs

Unlike LSIs, GSIs can be added to existing tables as new access patterns emerge.

Optimization Considerations

  • GSI's add storage costs and increase write overhead/latency. For infrequent access patterns, consider just scanning the table.
  • Use DynamoDB's built-in TTL feature to automatically delete old revisions can help manage storage costs.
  • Optimize large attributes by grouping them into separapte rows/items so that small attribute updates do not require rewriting the entire item.

References