solid/data-interoperability-panel

Data registration vs. container

woutermont opened this issue · 4 comments

From the definition of the Data Registration, it would seem that the registration itself is the container in which its shapetree-adhering instances are stored. After all, it has a number of metadata predicates (relevant agents, timestamps etc.), but none that points to a container.

Am I right in this assessment, and (if so) why is that the case?

It seems to preclude a division of responsibility, e.g., where I keep my registration metadata on one host, and have it point to my data on another. At first I thought maybe the idea is to have the RS use it for knowing which shapetree a container has to adhere to, but then this would seem to conflict with the idea of "planting trees" from the ShapeTrees spec.

After discussing this during our meeting and thinking about it further: adding an explicit reference to an LDP Container, rather than having the Data Registration itself be that container, adds some flexibility, but it also adds some complexity. With this change, whenever we create a new Data Registration we MUST provide a link to an LDP Container. In most cases, the user wouldn't want to be bothered to specify or create a new container for each new Data Registration. As we weigh this change, we should consider the following:

Each Data Registry would require an interop:defaultRootContainer: an LDP Container in which, by default, a new LDP Container is automatically created for each new Data Registration in that Data Registry.

Creating a new Data Registry happens only on a few occasions (per user), so adding some extra burden there seems fine. Creating new Data Registrations will happen much more often, so we should keep that as frictionless as we can.

I think this approach can give us a nice balance between flexibility and the burden we put on users.
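The defaulting behaviour proposed above can be sketched roughly as follows. This is only an illustration under assumptions, not spec text: the class names, the `container` field on the registration, and all example URLs are hypothetical, and `default_root_container` merely models the proposed interop:defaultRootContainer. The sketch only mints IRIs; it does not create anything on a server.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataRegistration:
    iri: str
    shape_tree: str
    # Hypothetical explicit link to the LDP Container holding the instances.
    container: str


@dataclass
class DataRegistry:
    iri: str
    # Models the proposed interop:defaultRootContainer (an assumption, not spec).
    default_root_container: str
    registrations: list = field(default_factory=list)

    def register(self, shape_tree: str, container: Optional[str] = None) -> DataRegistration:
        """Create a registration; mint a container IRI unless one is given."""
        if container is None:
            # Frictionless default: a semi-random child of the root container.
            root = self.default_root_container.rstrip("/")
            container = f"{root}/{uuid.uuid4()}/"
        registration = DataRegistration(
            iri=f"{self.iri.rstrip('/')}/{uuid.uuid4()}",
            shape_tree=shape_tree,
            container=container,
        )
        self.registrations.append(registration)
        return registration
```

With this shape, the registry metadata and the data can live on different hosts (the registry IRI and the root container need not share an origin), while the common case still requires no container choice from the user.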

Thanks for the elaboration, @elf-pavlik. I do not immediately follow why the user would need to select which LDP Container should be used, however.

With the current state of the spec (Data Registrations being LDP Containers), they also do not need to do so: the Data Registration is simply created as a semi-random LDP Container. I think we could simply keep this way of provisioning, by which the AA creates a semi-random LDP Container for the user, but then have the Data Registration point there instead.

Am I missing something?

We actually had a similar conversation a while ago in #72.

> I think we could simply keep this way of provisioning, by which the AA creates a semi-random LDP Container for the user, but then have the Data Registration point there instead.

We should always assume that the user has multiple storages. How would the AA know in which storage the container should be created?

> We should always assume that the user has multiple storages. How would the AA know in which storage the container should be created?

Indeed. You suggested an interop:defaultRootContainer triple for each Data Registry, but I assume this would still need to be set by the user. Am I correct?

I recall that somewhere (though I am unable to find it at the moment) we talked about the AA always being in the loop about new pod provisioning: either the user adds a pod manually, or the RS notifies the AA automatically. This way, the AA would always know about new pods, which are exactly the interop:defaultRootContainer values we need. Only in the manual case would the user indeed need to set (probably copy) the root URL in the AA.
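A minimal sketch of that bookkeeping, assuming the AA simply keeps a list of known storage roots. The class, its method names, and the example URLs are all hypothetical; no notification mechanism between RS and AA is defined here, so the automatic path is just a method call.

```python
from dataclasses import dataclass, field


def normalize_root(url: str) -> str:
    """Trim whitespace and make sure the root URL ends with a slash."""
    url = url.strip()
    return url if url.endswith("/") else url + "/"


@dataclass
class AuthorizationAgent:
    known_storages: list = field(default_factory=list)

    def on_storage_provisioned(self, storage_root: str) -> None:
        """Automatic path: the RS tells the AA about a newly provisioned pod."""
        self._remember(storage_root)

    def add_storage_manually(self, pasted_url: str) -> None:
        """Manual path: the user copies the pod's root URL into the AA."""
        self._remember(pasted_url)

    def _remember(self, storage_root: str) -> None:
        root = normalize_root(storage_root)
        if root not in self.known_storages:
            self.known_storages.append(root)

    def default_root_for_new_registry(self, storage_root: str) -> str:
        """Return the value to use as defaultRootContainer for a registry
        created in the given storage; fail if the AA does not know it."""
        root = normalize_root(storage_root)
        if root not in self.known_storages:
            raise ValueError(f"unknown storage: {root}")
        return root
```

The choice of *which* known storage a new Data Registry belongs to is still left to the caller here; the sketch only shows that the user need not retype a root URL the AA already knows.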