brikteknologier/seraph

Using internal unique ID?

jacktho opened this issue · 2 comments

I've been investigating seraph tonight and it looks great but I have some confusion. I'm likely missing something.

It seems seraph is using Neo4j's internal ids? So if I call db.save and pass it an object that has an id in it, and that id is the same as Neo4j's internal id for a node, that node will be updated, correct?

The Neo4j docs say stuff like "Returns the unique id of this node. Ids are garbage collected over time so they are only guaranteed to be unique during a specific time span: if the node is deleted, it's likely that a new node at some point will get the old id. Note: this makes node ids brittle as public APIs."
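To make the quoted warning concrete, here is a minimal sketch of why recycled ids are brittle. It is a toy in-memory store, not Neo4j: the `ToyStore` class and its free-id pool are assumptions made purely for illustration, and it says nothing about when Neo4j actually reuses an id.

```javascript
// Toy in-memory store, NOT Neo4j. Freed ids go back into a pool and are
// handed out again, which is the situation the quoted docs warn about.
class ToyStore {
  constructor() {
    this.nodes = new Map(); // id -> properties
    this.freeIds = [];      // ids released by deleted nodes
    this.nextId = 0;
  }

  create(props) {
    // Prefer a recycled id over a fresh one (a deliberate simplification).
    const id = this.freeIds.length > 0 ? this.freeIds.pop() : this.nextId++;
    this.nodes.set(id, Object.assign({}, props));
    return id;
  }

  remove(id) {
    if (this.nodes.delete(id)) this.freeIds.push(id);
  }

  read(id) {
    return this.nodes.get(id);
  }
}

const store = new ToyStore();
const aliceId = store.create({ name: "alice" }); // a client remembers this id
store.remove(aliceId);
const bobId = store.create({ name: "bob" });     // the freed id is handed out again
const stale = store.read(aliceId);               // the remembered id now resolves to bob
```

A client that stored `aliceId` externally would silently read or update a different node, which is why the docs call such ids brittle as public APIs.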

I thought setting the id option during seraph initialization would do what I want, which is to use a normal node property for the id. However, if I change that option to, say, differentID and then pass { differentID: 123, something: 'else' } into db.save, the node with a Neo4j internal id of 123 is updated. I expected either the node with a normal property of differentID: 123 to be updated, or a new node to be created with that property. Am I misunderstanding the docs?

"id (default = "id"): the name of the attribute seraph will add to new nodes when they are created and that it will use to find nodes when performing updates with node.save and the like."

Thanks!

First off, the id option just changes which property on an object the id is saved on, as you found.
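The behaviour described above can be modelled in a few lines. This is a hypothetical stand-in, not seraph's source: `makeToyDb`, `seed`, and the in-memory map are invented for illustration, there is no error handling, and no database is involved. It only shows that the `id` option renames the attribute carrying the internal id, rather than switching lookups to a normal property.

```javascript
// Hypothetical model of the behaviour described in this thread.
// Not seraph's implementation; nothing here talks to Neo4j.
function makeToyDb(options) {
  const idKey = (options && options.id) || "id";
  const nodes = new Map(); // internal id -> stored properties
  let nextInternalId = 0;

  return {
    // Place a node at a given internal id, for demonstration only.
    seed(internalId, props) {
      nodes.set(internalId, Object.assign({}, props));
      nextInternalId = Math.max(nextInternalId, internalId + 1);
    },

    save(obj) {
      const props = Object.assign({}, obj);
      let internalId = props[idKey];
      // In this toy, the id attribute addresses the node and is not stored.
      delete props[idKey];
      if (internalId === undefined) {
        internalId = nextInternalId++; // no id given: treat as a new node
      }
      nodes.set(internalId, props);
      return Object.assign({}, props, { [idKey]: internalId });
    },

    read(internalId) {
      return nodes.get(internalId);
    }
  };
}

const db = makeToyDb({ id: "differentID" });
db.seed(123, { something: "original" });

// The value under differentID is read as the internal id, so node 123 changes.
const saved = db.save({ differentID: 123, something: "else" });

// Without the attribute, a new node is created and the attribute is added.
const created = db.save({ something: "new" });
```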

I would love for it to work the way you suggest, but Neo4j is rather self-contradictory on this matter. They say you shouldn't use the internal IDs in an API, but their very own API uses internal IDs and provides no good way to use a unique property or something else to fetch an object. Until Neo4j changes their REST API to support unique properties as IDs, our hands are tied.

If you really want to do it the "right" way, it is possible: you can set a uniqueness constraint on a property for a specific label, and maintain your objects with the correct labels. seraph-model can help you with that. But it's a bit of a pain in the ass to do it this way, because you need to use a Cypher query for every fetch and update. You can never just .read and .save. This is a direct reflection of how the Neo4j API itself works. A little ridiculous, right?
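As a rough sketch of that approach, the helpers below build the Cypher text and parameters for a constraint, a fetch, and an update keyed on a unique property. The function names, the `User` label, and the `differentID` property are assumptions for illustration. The Cypher uses Neo4j 2.x-era syntax (`{value}` parameters, `CREATE CONSTRAINT ON ... ASSERT`); newer Neo4j versions use `$value` and a different constraint form, so adjust for your server.

```javascript
// Label and property names are interpolated into the query text, so they
// are validated first. Values always travel as query parameters.
const IDENT = /^[A-Za-z_][A-Za-z0-9_]*$/;

function checkIdent(name) {
  if (!IDENT.test(name)) throw new Error("unsafe identifier: " + name);
  return name;
}

// One-time setup: enforce uniqueness of `key` on nodes with `label`.
function constraintQuery(label, key) {
  checkIdent(label);
  checkIdent(key);
  return "CREATE CONSTRAINT ON (n:" + label + ") ASSERT n." + key + " IS UNIQUE";
}

// Replacement for .read: look the node up by its unique property.
function fetchQuery(label, key, value) {
  checkIdent(label);
  checkIdent(key);
  return {
    query: "MATCH (n:" + label + ") WHERE n." + key + " = {value} RETURN n",
    params: { value: value }
  };
}

// Replacement for .save: create the node if missing, then merge in props.
function upsertQuery(label, key, value, props) {
  checkIdent(label);
  checkIdent(key);
  return {
    query: "MERGE (n:" + label + " {" + key + ": {value}}) SET n += {props} RETURN n",
    params: { value: value, props: props }
  };
}

const setup = constraintQuery("User", "differentID");
const fetch = fetchQuery("User", "differentID", 123);
const upsert = upsertQuery("User", "differentID", 123, { something: "else" });
```

Each `{ query, params }` pair would then be passed to whatever Cypher entry point you use, for example seraph's `db.query`; check its exact signature against the version you have installed.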

If Neo4j really wanted their ids to be internal, they wouldn't have created a whole REST API that is based upon referencing nodes and relationships by that very id.

Oh bummer. After a look at their REST API, I see what you mean. Up to this point I have only used the Cypher endpoint so I guess the REST API weirdness is the part I was missing.

Following your response, I tried to get a feel for how often Neo4j reclaims the ids of deleted nodes. I gave it the smallest heap size that Neo4j would let me run with, so that GC would run a lot. I recursively created and deleted more than half a million nodes in various ways and was never able to get Neo4j to start reusing the ids of deleted nodes while it was running. It did reclaim old ids a few times after restarting the DB, but at least it does not seem that Neo4j is quick to reuse the ids of recently deleted nodes.

Thanks again.