zandaqo/structurae

Storing graph nodes

mrjjwright opened this issue · 2 comments

I am building a pretty big graph built up of sub smaller graphs with the nodes as ObjectViews. Since the graph relates the objects in myriads ways I assume that the storage solution I choose for the objects, i.e. how they are serialized to a structure out of memory, like JSON, can be purely pragmatic. The objects don't need to be stored in some sort of hierarchy. Would there be some benefit in storing them in CollectionViews or something like that? It would be nice if they were stored in a way that they could be deserialized trivially and safely back into a graph. Of course the graph itself is easy to serialize

The view structures like ObjectView are interfaces for underlying ArrayBuffers. When instantiated separately without providing an existing ArrayBuffer, e.g. ObjectView.from({}) they will create a new ArrayBuffer for each view. Creating an ArrayBuffer is a more involved and slower process than a normal object, so it's a good idea to have as few ArrayBuffers as possible.

In case of multiple ObjectViews, I'd consider storing them in an ArrayView if they are of the same type, or a CollectionView of multiple ArrayViews if there are multiple types. This will result in a single ArrayBuffer for all stored ObjectViews. The major downside is that the size of the buffer (hence, the maximum number of ObjectViews in it) will be fixed upon its creation. To increase it, you'll have to create a new ArrayBuffer and copy the data since ArrayBuffers are not re-sizable. In other words, you are taking over memory management.

The question of using binary structures is the question of how dynamic is the underlying data. In this case, if the shape and amount of nodes in the graph may vary greatly, going binary would mean either pre-allocating too much memory or writing complicated algorithms for memory management. However, if the shapes are similar (ideally the same) and the amount is predictable, going binary and storing it all in a singe ArrayBuffer would minimize memory usage, and allow exchanging this data with servers, workers, and WebAssembly without serialization penalties. The C guy in me would also mention memory locality benefits, but I cannot vouch for JS engines to consistently utilize it.

Ok, one thing I glean from this is that you seem to be saying it's actually better for the objects to have a root ArrayBuffer, and hence a root ObjectView which dictates their binary layout so that a new ArrayBuffer does not need to be allocated for each object. A naive approach of creating lots of different kind of objects via .from({}) and storing them all separately would be expensive. If such a root arrayBuffer, or at least a few root binary layouts, can't be described and maintained easily, it's probably time to pause and consider what I am doing.