Implement CDG storage
Proposal:
Data Structures:
Metadata: Represents the metadata for a node. It holds its own metadata ID, the ID of the node it describes, and a list of child node IDs.
LogEntry: Represents an entry in the log, containing the node ID, metadata ID, vector clock, and metadata block location.
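The two records above could be sketched as plain Python dataclasses. This is only an illustration: the field names, the use of integers for IDs and block locations, and the representation of the vector clock as a mapping from replica name to counter are all assumptions, since the proposal does not fix any of them.

```python
from dataclasses import dataclass, field


@dataclass
class Metadata:
    """Metadata for one graph node (field names are hypothetical)."""
    metadata_id: int                                   # ID of this metadata record
    node_id: int                                       # node the metadata describes
    children: list[int] = field(default_factory=list)  # child node IDs


@dataclass
class LogEntry:
    """One log entry pointing at a metadata block (field names are hypothetical)."""
    node_id: int
    metadata_id: int
    vector_clock: dict[str, int]  # assumed shape: replica name -> counter
    metadata_block: int           # assumed shape: block number on disk
```

Using `default_factory` for `children` keeps each `Metadata` instance from sharing one list with the others.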
Graph Storage:
Each metadata entry will be stored in a separate block on disk.
The log will contain entries referencing the metadata IDs and their corresponding block locations.
Two indexes will be maintained:
metadataId to metadataBlock: Maps metadata IDs to their respective block locations for quick retrieval.
nodeId to logEntry: Maps node IDs to their corresponding log entries for efficient lookup.
Writing to the Log:
When a new node is inserted into the graph, create a metadata entry and store it in a new block on disk.
Create a log entry referencing the metadata ID and the block location of the metadata.
Update the metadataId to metadataBlock index and the nodeId to logEntry index.
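The write path above can be sketched as follows. This is a minimal in-memory model, not a proposed implementation: a dictionary stands in for the on-disk blocks, JSON is an assumed serialization format, and `Store`, `insert_node`, and the counter-based ID allocation are hypothetical names and choices made for the example.

```python
import json
from dataclasses import dataclass


@dataclass
class LogEntry:
    node_id: int
    metadata_id: int
    vector_clock: dict
    metadata_block: int


class Store:
    """Hypothetical in-memory stand-in for the block store, log, and indexes."""

    def __init__(self):
        self.blocks = {}          # block number -> serialized metadata (stands in for disk)
        self.log = []             # append-only list of LogEntry
        self.metadata_index = {}  # metadataId -> metadataBlock
        self.node_index = {}      # nodeId -> logEntry
        self._next_block = 0
        self._next_metadata_id = 0

    def insert_node(self, node_id, children, vector_clock):
        # 1. Create the metadata entry and store it in a new block.
        metadata_id = self._next_metadata_id
        self._next_metadata_id += 1
        block = self._next_block
        self._next_block += 1
        record = {"id": metadata_id, "node_id": node_id, "children": list(children)}
        self.blocks[block] = json.dumps(record).encode()

        # 2. Append a log entry referencing the metadata ID and its block.
        entry = LogEntry(node_id, metadata_id, dict(vector_clock), block)
        self.log.append(entry)

        # 3. Update both indexes.
        self.metadata_index[metadata_id] = block
        self.node_index[node_id] = entry
        return entry
```

For example, `Store().insert_node(7, [8, 9], {"a": 1})` writes one block, appends one log entry, and leaves both indexes pointing at them. A real implementation would also need to decide how the three steps stay consistent if a crash occurs between them, which this sketch does not address.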
Reading from the Log:
Given a node ID, retrieve the corresponding log entry from the nodeId to logEntry index.
Read the metadata ID and the recorded metadata block location from the log entry.
Look up the metadata ID in the metadataId to metadataBlock index to get the block's current location, then read the metadata block from disk.
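A minimal sketch of the read path, under a few assumptions: both indexes are plain dictionaries, log entries are dictionaries with `metadata_id` and `metadata_block` keys, blocks hold JSON-encoded metadata, and the function name `read_metadata` is hypothetical. The sketch treats the metadataId to metadataBlock index as the source of the current location and falls back to the location recorded in the log entry only when the index has no entry. That precedence is an interpretation, not something the proposal states.

```python
import json


def read_metadata(node_id, node_index, metadata_index, blocks):
    """Return the decoded metadata for node_id; raises KeyError if unknown."""
    # 1. nodeId -> logEntry
    entry = node_index[node_id]

    # 2. Resolve the block location: prefer the index, fall back to the
    #    location recorded in the log entry (assumed precedence).
    block = metadata_index.get(entry["metadata_id"], entry["metadata_block"])

    # 3. Read and decode the metadata block.
    return json.loads(blocks[block])
```

Preferring the index means a read still finds the metadata after its block has been relocated, even though the older log entry records the previous location.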
Modifying Metadata:
When a node's metadata is modified (e.g., adding or removing children), update the corresponding metadata block on disk.
Update the metadataId to metadataBlock index if the block location changes.
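The two cases above (update in place, or update with a changed block location) might look like this. As before, this is a sketch under assumptions: a dictionary stands in for disk blocks, metadata is JSON-encoded, and `update_children` and its `relocate` flag are hypothetical. When a block should actually move is left open by the proposal.

```python
import json


def update_children(metadata_id, children, metadata_index, blocks, relocate=False):
    """Rewrite the children list of a metadata record; return its block number."""
    old_block = metadata_index[metadata_id]
    record = json.loads(blocks[old_block])
    record["children"] = list(children)
    data = json.dumps(record).encode()

    if not relocate:
        # Location unchanged: overwrite the block, index stays as it is.
        blocks[old_block] = data
        return old_block

    # Location changes: write a new block, repoint the index, free the old one.
    new_block = max(blocks) + 1
    blocks[new_block] = data
    metadata_index[metadata_id] = new_block
    del blocks[old_block]
    return new_block
```

Note that relocation updates only the metadataId to metadataBlock index here. Any log entry that recorded the old block location would then be out of date, so readers would need to resolve locations through the index.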