Albert Notebook enables users to create free-flow content using various notebook components, including text, images, chemical drawings, mentions, and files, with plans to support Albert entities in the future. The notebook also supports content reordering and maintains a version history.
Design a backend for the Notebook (API and database schema) to implement the specified functionality. Utilize OpenAPI 3.0 to define all REST endpoints. Use DynamoDB with a single table design to store and retrieve transactional data. The API should be optimized for patching the notebook, retrieving version history, and reordering notebook content.
The backend system is designed assuming the following technologies:
- Cloud Provider: AWS
- IaC: CloudFormation
- Database: DynamoDB
- API Design: RESTful
- Backend Service(s):
  - AWS API Gateway
  - AWS Lambda
  - Node.js
  - TypeScript
  - OneTable (DynamoDB "ORM")
The primary access patterns as defined by the requirements:
Access Pattern | Operation | Key Attributes | API Endpoint |
---|---|---|---|
Patching the notebook: for a given notebook, create a new revision based on the previous with the specified partial changes. | PutItem | `notebookId` | `PATCH /notebook/:id`, `POST /notebook/:id/revisions` |
Retrieving version history: for a given notebook, fetch a list of revisions. | Query | `notebookId`, `revision` | `GET /notebook/:id/revisions` |
Reorder notebook content: for a given notebook, create a new revision based on the previous, with its content reordered. | PutItem | `notebookId` | `POST /notebook/:id/revisions` |
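
Patching and reordering share one pattern: read the latest revision, derive a new element list, and write it as the next revision. The reordering step can be sketched as a pure function. This is a minimal sketch, assuming the client sends the desired order as a list of element ids; the `reorderElements` name, the `NotebookElement` type name, and that request shape are illustrative assumptions rather than part of the specification.

```typescript
// Element shape from this document's schema (UUID simplified to string).
type NotebookElement = {
  id: string;
  type: string;
  order: number;
  value: string | Record<string, unknown>;
};

// Hypothetical helper: returns the element list for the next revision,
// renumbered 1..n in the order given by `orderedIds`. The input is not mutated.
function reorderElements(
  previous: NotebookElement[],
  orderedIds: string[],
): NotebookElement[] {
  const byId = new Map(previous.map((el) => [el.id, el] as const));
  const hasDuplicates = new Set(orderedIds).size !== orderedIds.length;
  if (orderedIds.length !== byId.size || hasDuplicates) {
    throw new Error("orderedIds must list every element exactly once");
  }
  return orderedIds.map((id, index) => {
    const el = byId.get(id);
    if (!el) throw new Error(`Unknown element id: ${id}`);
    return { ...el, order: index + 1 };
  });
}
```

Because the previous revision is left untouched, the version history stays intact and a reorder is simply one more revision.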
Additional access patterns that may be required in the future:
Access Pattern | Operation | Key Attributes | API Endpoint |
---|---|---|---|
Creating a new notebook: create a new notebook with an initial revision. | PutItem | `notebookId` | `POST /notebooks` |
Retrieving the latest version: for a given notebook, fetch the latest revision. | Query | `notebookId` | `GET /notebook/:id` |
Retrieving a specific version: for a given notebook, fetch a specific revision. | Query | `notebookId` | `GET /notebook/:id/revisions/:revisionId` |
Reverting to a specific version: for a given notebook, revert to a specific revision. | PutItem | `notebookId` | `POST /notebook/:id/revisions/:revisionId/revert` |
Deleting a notebook: delete a notebook and all its revisions. | Delete | `notebookId` | `DELETE /notebook/:id` |
The core entities involved in the previously identified access patterns:
Entity Relationship:

```mermaid
%%{
  init: {
    "theme": "default",
    "fontFamily": "monospace"
  }
}%%
erDiagram
  Notebook {
    %% UUID id
    %% string title
    %% DateTime createdAt
    %% string createdBy
    %% DateTime updatedAt
    %% string updatedBy
    %% Revision[] revisions
  }
  Revision {
    %% string id
    %% string notebookId
    %% string createdAt
    %% string createdBy
    %% Element[] elements
  }
  Element {
    %% UUID id
    %% string type
    %% number order
    %% string value
  }
  User {
    %% UUID id
    %% string name
  }
  Notebook ||--o{ Revision : "has"
  Notebook ||--o{ User : "createdBy"
  Notebook ||--o{ User : "updatedBy"
  Revision ||--o{ Element : "contains"
  Revision ||--o{ User : "createdBy"
```

Schemas:

```typescript
type Notebook = {
  id: UUID;
  title: string;
  revisions: Revision[];
  createdAt: DateTime;
  createdBy: User;
  updatedAt: DateTime;
  updatedBy: User;
};
type Revision = {
  id: UUID;
  elements: Element[];
  createdAt: DateTime;
  createdBy: User;
};
type Element = {
  id: UUID;
  type: string;
  order: number;
  value: string | Record<string, unknown>;
};
type User = {
  id: UUID;
  name: string;
};
```
See the following external resources for API documentation:
Notebooks can be persisted using a single-table design in DynamoDB. Any given notebook would have a minimum of three items (one for each facet noted below):
- one capturing the unversioned notebook metadata
- one for each revision, capturing revision-specific metadata
- one for each element within a revision
PK: `notebookId` | SK: `selection` | type | title | latest | revision | element |
---|---|---|---|---|---|---|
49f7335c-d7a8-4f1c-b2cc-... | METADATA | notebook | My Notebook | 1 | | |
49f7335c-d7a8-4f1c-b2cc-... | v000001#METADATA | revision | | | 1 | |
49f7335c-d7a8-4f1c-b2cc-... | v000001#000001 | element | | | | |
49f7335c-d7a8-4f1c-b2cc-... | METADATA | notebook | Another Notebook | 8 | | |
49f7335c-d7a8-4f1c-b2cc-... | v000001#METADATA | revision | | | 1 | |
49f7335c-d7a8-4f1c-b2cc-... | v000001#000001 | element | | | | |
49f7335c-d7a8-4f1c-b2cc-... | ... | | | | | |
49f7335c-d7a8-4f1c-b2cc-... | v000008#000032 | element | | | | |
The `notebookId` can serve as the partition key with a reasonably high cardinality while still providing a logical collection of the notebook's contents.
For the sort key, a combination of the `revision` number and the element `order` number can be used. This will allow for efficient querying of the notebook's revisions and elements. It will also allow us to store multiple facets or item types within the same table using special keywords such as `METADATA` when outside the context of a specific revision or element.
Note: Because the sort key will be a string containing the composite of two numbers, sorting will be lexicographical rather than numerical. To restore numerical sorting, the numbers can be zero-padded to a fixed length.
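
The zero-padding described in the note can be sketched as follows. The six-digit width is taken from the example keys in this document (`v000001#000001`); the helper names `pad`, `revisionMetadataKey`, and `elementKey` are illustrative assumptions.

```typescript
// Width matches the example keys in this document (v000001#000001).
const PAD_WIDTH = 6;

const pad = (n: number): string => String(n).padStart(PAD_WIDTH, "0");

// Sort key for a revision's metadata item, e.g. "v000001#METADATA".
function revisionMetadataKey(revision: number): string {
  return `v${pad(revision)}#METADATA`;
}

// Sort key for an element item, e.g. "v000008#000032".
function elementKey(revision: number, order: number): string {
  return `v${pad(revision)}#${pad(order)}`;
}

// Without padding, lexicographical order puts revision 10 before revision 2.
const unpadded = ["v2", "v10"].sort(); // ["v10", "v2"]

// With padding, lexicographical order matches numerical order.
const padded = [elementKey(10, 1), elementKey(2, 1)].sort();
// ["v000002#000001", "v000010#000001"]
```

A fixed width of six digits caps both revisions and elements per revision at 999,999; a wider pad would be needed beyond that.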
While a typical single table design would lead to a fully denormalized structure, it is possible to normalize the data to some extent. The lack of a strict schema outside the primary key attributes means we can store multiple item types within the same table.
The table will consist of three different facets or item types:
The notebook item holds any unversioned metadata that applies to the entire notebook, such as the title and modification fields.
Schema:

```typescript
type NotebookItem = {
  notebookId: UUID;
  selection: string;
  type: "notebook";
  title: string;
  latest: number;
  createdAt: DateTime;
  createdBy: {
    id: UUID;
    name: string;
  };
  updatedAt: DateTime;
  updatedBy: {
    id: UUID;
    name: string;
  };
};
```

Example:

```json
{
  "notebookId": "3fb3cad6-5d1c-4ec8-9bfc-e8dcd6017741",
  "selection": "METADATA",
  "type": "notebook",
  "title": "My Notebook",
  "latest": 1,
  "createdAt": "2021-01-01T00:00:00Z",
  "createdBy": {
    "id": "c17214445723N8knra0ni9qxlittxjxzf",
    "name": "Alice Smith"
  },
  "updatedAt": "2024-07-20T17:06:25-07:00",
  "updatedBy": {
    "id": "fe9c478a-f7b7-4b71-91b8-4fe2a3810210",
    "name": "Bob Johnson"
  }
}
```
Each revision will have its own metadata record, identified by the `revisionId` followed by the `METADATA` keyword in place of an element order number.
Schema:

```typescript
type RevisionItem = {
  notebookId: UUID;
  selection: string;
  type: "revision";
  revision: number;
  createdAt: DateTime;
  createdBy: {
    id: UUID;
    name: string;
  };
};
```

Example:

```json
{
  "notebookId": "bd995cf2-6c8c-4c2f-99ab-4de5ba4e312e",
  "selection": "v000001#METADATA",
  "type": "revision",
  "revision": 1,
  "createdAt": "2024-07-20T17:06:25-07:00",
  "createdBy": {
    "id": "fe9c478a-f7b7-4b71-91b8-4fe2a3810210",
    "name": "Bob Johnson"
  }
}
```
Each element within a revision will have its own record, identified by the `revision` number and the element `order` number.
Schema:

```typescript
type ElementItem = {
  notebookId: UUID;
  selection: string;
  type: "element";
  element: {
    order: number;
    type: string;
    value: string | Record<string, unknown>;
  };
};
```

Example:

```json
{
  "notebookId": "3fb3cad6-5d1c-4ec8-9bfc-e8dcd6017741",
  "selection": "v000001#000001",
  "type": "element",
  "element": {
    "order": 1,
    "type": "text",
    "value": "Hello, World!"
  }
}
```
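
To illustrate how the revision and element facets fit together, the following sketch flattens one revision into the items that would be written for it. The `toRevisionItems` helper, the `UserRef` and `ElementBody` type names, and the parameter names are illustrative assumptions; the item shapes follow the schemas above, with `UUID` and `DateTime` simplified to strings.

```typescript
type UserRef = { id: string; name: string };
type ElementBody = {
  order: number;
  type: string;
  value: string | Record<string, unknown>;
};
type RevisionItem = {
  notebookId: string;
  selection: string;
  type: "revision";
  revision: number;
  createdAt: string;
  createdBy: UserRef;
};
type ElementItem = {
  notebookId: string;
  selection: string;
  type: "element";
  element: ElementBody;
};

// Six-digit padding, matching the example sort keys in this document.
const pad6 = (n: number): string => String(n).padStart(6, "0");

// Hypothetical helper: one revision metadata item plus one item per element.
function toRevisionItems(
  notebookId: string,
  revision: number,
  createdAt: string,
  createdBy: UserRef,
  elements: ElementBody[],
): Array<RevisionItem | ElementItem> {
  const version = `v${pad6(revision)}`;
  const metadata: RevisionItem = {
    notebookId,
    selection: `${version}#METADATA`,
    type: "revision",
    revision,
    createdAt,
    createdBy,
  };
  const elementItems = elements.map(
    (element): ElementItem => ({
      notebookId,
      selection: `${version}#${pad6(element.order)}`,
      type: "element",
      element,
    }),
  );
  return [metadata, ...elementItems];
}
```

Since one revision spans several items, writing them atomically would need a transaction or a similar mechanism, which is outside the scope of this sketch.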
The primary key is sufficient to cover all the primary access patterns, so no secondary indexes are required.
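
As an illustration of how the primary key alone can serve these reads, the sketch below builds the parameters for a DynamoDB Query that fetches every item of one revision: the partition key matches the notebook and `begins_with` narrows the sort key to the revision prefix. The parameters are built as a plain object so the example stays self-contained; the table name `Notebooks` and the `revisionQuery` helper are assumptions, not taken from this document.

```typescript
// Plain-object shape mirroring the DynamoDB Query parameters used here.
type QueryParams = {
  TableName: string;
  KeyConditionExpression: string;
  ExpressionAttributeNames: Record<string, string>;
  ExpressionAttributeValues: Record<string, string>;
};

// Hypothetical table name; not specified in this document.
const TABLE_NAME = "Notebooks";

// Builds a query for every item (metadata and elements) of one revision.
function revisionQuery(notebookId: string, revision: number): QueryParams {
  // Six-digit padding, matching the example sort keys (v000001#...).
  const prefix = `v${String(revision).padStart(6, "0")}#`;
  return {
    TableName: TABLE_NAME,
    KeyConditionExpression: "#pk = :pk AND begins_with(#sk, :prefix)",
    ExpressionAttributeNames: { "#pk": "notebookId", "#sk": "selection" },
    ExpressionAttributeValues: { ":pk": notebookId, ":prefix": prefix },
  };
}
```

The resulting object uses plain JavaScript values, as a document-style DynamoDB client would accept; with OneTable, the equivalent condition would be expressed through its own model API instead.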
For future access patterns, local secondary indexes (LSIs) should be considered a last resort. While they can optimize certain queries, they impose additional constraints on the indexed table and are less flexible than global secondary indexes (GSIs).
Unlike LSIs, GSIs can be added to existing tables as new access patterns emerge.
- GSIs add storage costs and increase write overhead/latency. For infrequent access patterns, consider just scanning the table.
- Using DynamoDB's built-in TTL feature to automatically delete old revisions can help manage storage costs.
- Optimize large attributes by grouping them into separate rows/items so that small attribute updates do not require rewriting the entire item.
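
DynamoDB's TTL feature expects each expiring item to carry a Number attribute holding the expiry time as a Unix epoch timestamp in seconds. A minimal sketch of computing that value follows; the `expiresAt` attribute name, the `ttlEpochSeconds` helper, and the 90-day retention period are assumptions, none of which is specified in this document.

```typescript
const SECONDS_PER_DAY = 24 * 60 * 60;

// Returns the Unix epoch time, in seconds, at which an item should expire.
function ttlEpochSeconds(createdAt: Date, retentionDays: number): number {
  return Math.floor(createdAt.getTime() / 1000) + retentionDays * SECONDS_PER_DAY;
}

// A revision item extended with a hypothetical `expiresAt` TTL attribute.
const expiringRevisionItem = {
  notebookId: "bd995cf2-6c8c-4c2f-99ab-4de5ba4e312e",
  selection: "v000001#METADATA",
  type: "revision",
  revision: 1,
  expiresAt: ttlEpochSeconds(new Date("2024-07-20T17:06:25-07:00"), 90),
};
```

Every item of an expiring revision (its metadata and its elements) would need the attribute, while the latest revision should not carry it. TTL deletion is not immediate, so reads may still return items shortly after their expiry time.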