Refactor Arango class
Opened this issue · 2 comments
A couple of issues with the current Arango class need work:
- The database handle itself should be passed around in the persistent spidergram context, rather than the wrapper.
- Database initialization should be explicit, accepting an array of classes and metadata rather than relying on the current `Vertice.types.set(...)` mechanism.
- Database initialization could/should be expanded to include graph, analysis, index, and route creation.
Worth considering whether to use interface tricks to add helper methods to the database class, move all the current Arango class methods to a separate helper class, etc.
Recent refactoring on the 3.0-dev branch (now merged into main) has addressed some of the first issue — the Arango class is now 'ArangoStore' and it's been reworked to follow the semantics of Crawlee's DataStore and KeyValueStore classes.
- `const storage = ArangoStore.open('database-name')` returns an ArangoStore instance with its `activeDb` property set to the chosen instance. If the database exists, it will open it. If not, it will create it and initialize it with the appropriate collections.
- `storage.push(entities[])` saves an arbitrary pile of Spidergram entities to Arango. Duplicates are ignored by default, but setting the second parameter to `true` will overwrite entities with the same key. This behavior might be reversed before we go live, or be moved to the more explicit `options = { overwrite: true }` style.
The intent of matching Crawlee's semantics with the static `open()` method and blind `push()` methods is to make moving between the different data storage tools easier; in theory Spidergram entities can be saved to any datastore that supports serialized JSON data.
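To make the `open()` / `push()` semantics described above concrete, here is a minimal sketch using an in-memory stand-in. `MemoryStore`, the `Entity` shape, and the `get()` helper are all hypothetical and are not part of Spidergram or Crawlee. The real ArangoStore talks to ArangoDB and may well be asynchronous. The sketch only shows "open or create" plus "ignore duplicates unless told to overwrite".

```typescript
// Hypothetical in-memory stand-in for illustration only;
// the real ArangoStore is backed by ArangoDB.
interface Entity {
  _key: string;
  _collection: string;
  [field: string]: unknown;
}

class MemoryStore {
  // One shared map per database name, so repeated open() calls see the same data.
  private static databases = new Map<string, Map<string, Entity>>();

  private constructor(
    readonly name: string,
    private readonly activeDb: Map<string, Entity>,
  ) {}

  // Opens the named database, creating it first if it does not exist yet.
  static open(name: string): MemoryStore {
    let db = MemoryStore.databases.get(name);
    if (!db) {
      db = new Map<string, Entity>();
      MemoryStore.databases.set(name, db);
    }
    return new MemoryStore(name, db);
  }

  // Saves entities. Duplicates are skipped unless overwrite is true.
  // Returns the number of entities actually written.
  push(entities: Entity[], overwrite = false): number {
    let written = 0;
    for (const entity of entities) {
      const id = `${entity._collection}/${entity._key}`;
      if (this.activeDb.has(id) && !overwrite) continue;
      this.activeDb.set(id, entity);
      written++;
    }
    return written;
  }

  // Convenience lookup (an assumption, added so the behavior can be inspected).
  get(collection: string, key: string): Entity | undefined {
    return this.activeDb.get(`${collection}/${key}`);
  }
}

const storage = MemoryStore.open('database-name');
storage.push([{ _key: 'a', _collection: 'resources', status: 200 }]);
// Same key, no overwrite flag: the stored entity is left alone.
storage.push([{ _key: 'a', _collection: 'resources', status: 404 }]);
// Same key with the second parameter set to true: the stored entity is replaced.
storage.push([{ _key: 'a', _collection: 'resources', status: 404 }], true);
```

If the second parameter moves to an options object, only the signature of `push()` changes; the duplicate check stays the same.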
Still undecided: how to initialize the Arango database with the desired list of classes. An 11ty-style config initialization process that sets up the initial Spidergram Context could include an explicit array of entity types, whose metadata/meta functions could supply the initialization information. We'll have to experiment.
Update — the Project class now takes options for model entities in its 'graph' section. Although they're not actually passed on to ArangoStore yet, they're one piece of more reliable setup and configuration of new graphs. We need to:
- Capture graph entity information (collection name, vertice or graph identity, indexes, Arango validation information, JSON transformer/constructor functions, etc) on the class metadata.
- Pass an array of classes into ArangoStore's initialization code rather than hard-coding our core entities. Iterate over those classes' metadata to determine what collections, indexes, etc should be created to set up a new DB.
- Use the incoming `_collection` property of an Arango JSON payload, and the class metadata information, to determine which constructor to call when rehydrating results.
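One possible shape for the three steps above, as a rough sketch rather than a proposal for the final API. Every name here (`EntityMeta`, `EntityClass`, `planCollections`, `rehydrate`, and the two sample classes) is hypothetical. Metadata is stored as a plain static property instead of decorators to keep the example self-contained, and no database calls are made.

```typescript
// All names are hypothetical; this is not Spidergram's actual API.
interface EntityMeta {
  collection: string;
  isEdge: boolean;
  indexes: string[][]; // each entry lists the fields covered by one index
}

type Payload = { _collection: string; [field: string]: unknown };

// Anything with static metadata and a static JSON constructor qualifies.
interface EntityClass<T = unknown> {
  meta: EntityMeta;
  fromJSON(data: Payload): T;
}

class Resource {
  static meta: EntityMeta = { collection: 'resources', isEdge: false, indexes: [['url']] };
  constructor(readonly url: string) {}
  static fromJSON(data: Payload): Resource {
    return new Resource(String(data['url']));
  }
}

class LinksTo {
  static meta: EntityMeta = { collection: 'links_to', isEdge: true, indexes: [] };
  constructor(readonly from: string, readonly to: string) {}
  static fromJSON(data: Payload): LinksTo {
    return new LinksTo(String(data['_from']), String(data['_to']));
  }
}

// Step 2: derive the setup plan from class metadata instead of hard-coding entities.
// A real initializer would create a collection and its indexes for each entry.
function planCollections(classes: EntityClass[]): EntityMeta[] {
  return classes.map((cls) => cls.meta);
}

// Step 3: choose the constructor by matching the payload's _collection.
function rehydrate(classes: EntityClass[], data: Payload): unknown {
  const match = classes.find((cls) => cls.meta.collection === data._collection);
  if (!match) {
    throw new Error(`No entity class registered for collection "${data._collection}"`);
  }
  return match.fromJSON(data);
}

const models: EntityClass[] = [Resource, LinksTo];
const plan = planCollections(models);
const page = rehydrate(models, { _collection: 'resources', url: 'https://example.com' });
```

Passing the class array explicitly means a project can add its own entity types without touching the store's setup code, which is the point of the second bullet.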