Changing Parent / Ancestor
Closed this issue · 5 comments
GoogleCodeExporter commented
I know that once you set the parent/ancestor you can't change it because it
implies changes to all the objects key's all down the tree.
Your design of consistency based on the parent makes a lot of sense, it
simplifies a lot what has to be made consistent, and allows a great deal of
optimizations/simplifications that I'm guessing are the reason behind it.
The problem rises when for some reason the parent/ancestor has to changed, that
implies the recreation of all the objects and that cost might pay off
considering the benefits.
Of course, this only makes sense when you aren't expected to change the parent.
but that might have to happen. By design we are supposed to use another parent
and place the real parent in some property, but that breaks all the benefits.
When its a simple tree, the recreation of objects is trivial, but sometimes its
not.
So it would be nice if that feature could be supported at the database level,
as it could optimize that recreation a lot and depending on the implementation
it could handle even more difficult things like references.
For instance, if at the database level instead of serializing the full key you
use a binary id, then all you have to change is the label of that binary id.
I opened the issue originally at
http://code.google.com/p/googleappengine/issues/detail?id=8045 but I was told
to open it here.
I'm not sure it will hold, but any improvement beyond "use another parent" or
"do it yourself" is a must.
The problem is how much overhead you can avoid at the driver level, if you
can't, I'm guessing that most serious use cases will become too heavy, over the
transaction's limits.
Even if you can't use it because the graph is too big, it would always be very
helpful to update smaller graphs.
Original issue reported on code.google.com by spi...@gmail.com
on 6 Sep 2012 at 1:26
GoogleCodeExporter commented
Hm... Supporting this in the underlying Datastore is out of the question so
we're now debating whether we can do this in userland in NDB.
I think if you just want to move an entity from parent A to parent B, it's easy
to do it yourself:
def reparent(oldkey, newparent):
newkey = ndb.Key(oldkey.kind(), oldkey.id(), parent=newparent)
ent = oldkey.get()
ent._key = newkey
ent.put()
oldkey.delete()
However there are still several problems with this.
1. If there already is an entity with key == newkey, it will be overwritten.
2. If oldkey.id() is numeric, we ought to add a reservation for the same id in
the new parent using alloc_id().
3. We ought to use a transaction; this has to be an XP transaction because two
entity groups are affected.
4. Validation: oldkey.parent() != newparent.
5. Check that ent is not None.
I'm not sure what other functionality you are asking for -- Do you want to move
multiple entities? Move intermediate parents along as well? Please help design
a possible API by specifying your requirements.
Original comment by guido@google.com
on 11 Sep 2012 at 2:53
GoogleCodeExporter commented
The problem wasn't in changing the entity's parent, as you could also recreate
them, the question is more about the children and if there is a more efficient
way that you can do it, under the hood.
Regarding my kind of functionality, lets consider the example:
global -> user -> blogs -> posts -> comments » user
Say that for some reason you had two users that now you want to merge in a new
one.
That implies two kinds of changes:
1. update all the child blogs to have as new parent the other user
2. update all the references in all comments made by that user
AFAIK if you update the parent of an entity you have to update also the keys of
all the children recursively.
CASE 1. That would imply traveling down the graph.
I don't think it could/should be done in a single transaction, perhaps for some
simple cases you can do it all at once in a single transaction.
I would use some kind of iteration, perhaps a task that gets two keys and moves
all the children from one to the other, in a depth first way. Like a search
algorithm.
CASE 2. I don't know if there is a way to get all the references to a key and
to update them also. But I would also do it iteratively in terms of
transactions, getting one by one and updating an entity at a time.
Regarding the points you ask:
1. 2.
Well, I was thinking more in recreating the entire key in the moving entity, as
there could be some parent specifics information.
And then just re-parenting all its children, as there, local key's part should
remain valid, as the local context is basically the same.
3. Well, this kind of changes without transactions are always very dangerous.
Regarding moving multiple entities, I don't mind moving one at a time, the only
question is the children.
Your last question, moving intermediate parents, I'm not sure what you mean,
perhaps, you are talking about the children. If that's the case, i see it as a
recursion, like I said above.
I hope I understood all your questions and that I was useful.
Count me in to give you all the help I can.
Original comment by spi...@gmail.com
on 11 Sep 2012 at 5:20
GoogleCodeExporter commented
Alas, there is no way to change those keys without copying all the entities.
There is also no way to find all the references to a key, unless you know that
all those references occur in a specific kind at a specific property -- then
you can just query for those.
So I do not think there is a shortcut to do what you are asking -- one way or
another all the entities whose key changes must be copied (and the old ones
deleted).
I propose to close this issue as won't-fix, since this is really not something
we want to encourage, and there is nothing we can do that you can't already do
just as well yourself.
Original comment by guido@google.com
on 11 Sep 2012 at 5:42
GoogleCodeExporter commented
I agree, I just hopped that under the hood you had less overhead.
As this being something bad to encourage, I believe that having a global entity
group and a parent property that you can change easily is a design that will
imply a big performance penalty and the loss of the local consistency concept.
I believe that with this design all the nice things of the local consistency
concept are lost.
Sometimes the requirements say "may" and you have to support it, no matter how
unlikely it is to happen.
Feel free to close it, for me its something really important missing in that
beautiful consistency design, because it would allow you to update the graph in
a very elegant and consistent way, being always inside that local consistency.
But I understand, the database wasn't designed for that and its not that simple.
Too bad that under the hood its not just like SQL where you can update directly
all the references with a query for each table and do it in a single
transaction, but yes, in SQL you only had to change the parent.
Original comment by spi...@gmail.com
on 11 Sep 2012 at 6:22
GoogleCodeExporter commented
If you want to argue about the Datastore's underlying design, the NDB tracker
is *definitely* not the right forum.
Original comment by guido@google.com
on 11 Sep 2012 at 6:26
- Changed state: WontFix