RubenVerborgh/solid-server-architecture

Is this architecture compatible with Trellis?

Closed this issue · 8 comments

If we compare this architecture with https://github.com/trellis-ldp/trellis/wiki/Trellis-Architecture, are they the same?

CC @acoburn @pmcb55

Trellis can be a concrete implementation of the ResourceStore interface.

Yep, absolutely (re. Ruben's comment above). Further to that, I think I'd also simplify some of the interface method names, but I'll raise a separate issue for that.

OK, so in that vision, the answer to #4 would be yes...

So maybe the word 'ResourceStore' can mean different things, depending on the context?

There are a lot of similarities between this architecture and the Trellis architecture; in fact, I'd say they are almost identical. They both separate the LDP layer from the resource persistence (storage) layer. In Trellis, the ResourceStore interface is called ResourceService. The LdpHandler is called TrellisHttpResource. Similarly, the HTTP runtime is separate as are Authorization and Authentication (AuthN is external, AuthZ is a pluggable filter that is handled by the HTTP runtime.) Another similarity is that the ResourceStore / ResourceService is an interface that supports multiple, pluggable implementations.

From my reading of the proposed architecture, the differences are as follows:

  1. The Storage layer in Trellis does not deal with representations (only "resources") -- that layer in Trellis returns streams of Triples (in the case of NonRDFSource, the relevant service returns an InputStream); converting those triples into representations happens at the HTTP layer and is done dynamically.

  2. The Storage layer does not support a modifyResource method (e.g. sparql-update). All resource modifications are decomposed into ::replace() calls at the storage layer. This means that the storage layer does not need to know anything about Sparql-Update (or LD-Patch, etc). Architecturally, this is similar to the item above, as all the RDF-based manipulation happens in the HTTP layer: the storage layer just deals with Triples and Graphs (not representations). It also means that the storage layer can be implemented easily on just about any key-value store.
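To make the decomposition in point 2 concrete, here is a minimal sketch of how a patch might be reduced to a single `replace()` call. All names (`Triple`, `TripleStore`, `MapTripleStore`, `PatchHandler`) are hypothetical and much simpler than the actual Trellis interfaces; the patch is assumed to be already parsed into sets of deletions and insertions by the HTTP layer.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified triple; a real implementation would use an RDF library.
record Triple(String subject, String predicate, String object) {}

// Hypothetical storage interface: only whole-resource reads and replaces.
interface TripleStore {
    Set<Triple> get(String iri);
    void replace(String iri, Set<Triple> triples);
}

// Key-value backed store: the resource IRI is the key, the triple set is the value.
class MapTripleStore implements TripleStore {
    private final Map<String, Set<Triple>> data = new ConcurrentHashMap<>();

    @Override
    public Set<Triple> get(String iri) {
        return Set.copyOf(data.getOrDefault(iri, Set.of()));
    }

    @Override
    public void replace(String iri, Set<Triple> triples) {
        data.put(iri, Set.copyOf(triples));
    }
}

// HTTP-layer helper: the patch is applied in memory, and the result is
// written back with a single replace(). The store never sees the patch format.
class PatchHandler {
    static void applyPatch(TripleStore store, String iri,
                           Set<Triple> deletions, Set<Triple> insertions) {
        Set<Triple> updated = new HashSet<>(store.get(iri));
        updated.removeAll(deletions);
        updated.addAll(insertions);
        store.replace(iri, updated);
    }
}
```

Because the store only needs `get` and `replace`, any backend that can map a key to a value could implement it; the cost is that every patch rewrites the whole resource.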

I'd say that the big advantage of putting this type of RDF processing in the HTTP layer rather than the storage layer becomes especially useful when LDP structures such as direct containers are supported (I realize that's not required in Solid, but it's a nice feature).

converting those triples into representations happens at the HTTP layer and is done dynamically

Makes sense to me too, @RubenVerborgh is that a change you would be willing to make?

A side note, when I write:

the storage layer just deals with Triples and Graphs (not representations)

I mean that the storage interface deals with Triples and Graphs. Clearly, the backend requires some sort of representation, since the bits need to make it to storage somehow, but those representations don't necessarily need to be concrete RDF serializations. This way, the storage layer has flexibility to do whatever it needs to do with the bits, so long as those bits can later be retrieved as Triples.

So maybe the word 'ResourceStore' can mean different things, depending on the context?

Not for me at least. The ResourceStore in my architecture is an interface that gives access to resources and their representations (those latter two terms used in their HTTP/REST definition).

There are a lot of similarities between this architecture and the Trellis architecture; in fact, I'd say they are almost identical.

I think they are two different abstraction levels actually. Trellis seems to be closer to what I would consider one (type of) implementation of the ResourceStore interface; it makes assumptions on how the underlying system works. For good reason, but it does. ResourceStore for me is a broader interface.

1. The Storage layer in Trellis does not deal with representations (only "resources")

I've been very purist about the REST terminology for resource and representation. I think what you call a "resource" is actually a "representation" in my terminology.

Does the difference matter? Maybe not that much. In my architecture, it is simply reflected by the fact that there is ResourceIdentifier and Representation (but no Resource); in yours, that might be Iri and Resource. So we're saying the same things.

Where the distinction is going to matter is when there are different representations of a resource; then my terminology is more accurate (in a REST sense). You actually see this in the Trellis architecture:

/* Try to retrieve a resource at a particular moment in time (Memento) */
CompletableFuture<Resource> get(IRI, Instant);

A resource at a particular moment (/ with a particular content type / in a particular shape) is nothing but a representation :-) So you have this concept, but just not named as such.

2\. The Storage layer does not support a `modifyResource` method (e.g. sparql-update). All resource modifications are decomposed into `::replace()` calls at the storage layer. [...] It also means that the storage layer can be implemented easily on just about any key-value store.

Agreed, but as you know, at the cost of small patches possibly being much more expensive than they should be. Which is an acceptable trade-off for many cases; but my architecture does not want that trade-off. It is very possible that implementers will internally implement modifyResource through a replace; but I don't aim to mandate that. This leaves room for optimization of very common cases (at the cost of that complexity). Still easy to implement with key/value; you just replace instead.
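As a rough illustration of that trade-off, the sketch below gives a store interface a replace-based default for modification, which a backend may override when it can patch in place. The names (`PatchableStore`, `KeyValueStore`, `InPlaceStore`, `modify`) are hypothetical stand-ins, not the actual interface from this repository, and resource state is modelled as a plain set of strings for brevity.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified store: a resource's state is a set of strings.
interface PatchableStore {
    Set<String> read(String id);
    void replace(String id, Set<String> state);

    // Default: read-modify-replace, which any key-value backend can support.
    default void modify(String id, Set<String> deletions, Set<String> insertions) {
        Set<String> state = new HashSet<>(read(id));
        state.removeAll(deletions);
        state.addAll(insertions);
        replace(id, state);
    }
}

// Plain backend: inherits the replace-based default for modify().
class KeyValueStore implements PatchableStore {
    protected final Map<String, Set<String>> data = new HashMap<>();
    int replaceCalls = 0; // counts full rewrites, for illustration only

    @Override
    public Set<String> read(String id) {
        return Set.copyOf(data.getOrDefault(id, Set.of()));
    }

    @Override
    public void replace(String id, Set<String> state) {
        replaceCalls++;
        data.put(id, new HashSet<>(state));
    }
}

// Backend that can patch in place: overrides modify() to skip the full rewrite.
class InPlaceStore extends KeyValueStore {
    @Override
    public void modify(String id, Set<String> deletions, Set<String> insertions) {
        Set<String> state = data.computeIfAbsent(id, k -> new HashSet<>());
        state.removeAll(deletions);
        state.addAll(insertions);
    }
}
```

Callers see the same `modify` contract either way; only the cost differs, which is the room for optimization the interface is meant to leave open.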

I'd say that the big advantage of putting this type of RDF processing in the HTTP layer rather than the storage layer becomes especially useful when LDP structures such as direct containers are supported

One doesn't exclude the other.

Makes sense to me too, @RubenVerborgh is that a change you would be willing to make?

It doesn't even need to be a change; it can be supported as-is, cf. #4 (comment).

A side note, when I write:

the storage layer just deals with Triples and Graphs (not representations)

I mean that the storage interface deals with Triples and Graphs. Clearly, the backend requires some sort of representation, since the bits need to make it to storage somehow, but those representations don't necessarily need to be concrete RDF serializations.

Copy that; I currently leave the storage interface the freedom to do either. (Concrete implementations of Representation could be documents or triple streams internally.)
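A minimal sketch of what that freedom could look like, assuming a hypothetical `Representation` interface that exposes only a media type and bytes. The class names and the IRI-only triple model are illustrative assumptions, not the actual types from this repository or from Trellis.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical, simplified Representation: a media type plus the bytes to send.
interface Representation {
    String contentType();
    byte[] data();
}

// Backed by a stored document: the bytes are kept as-is.
class DocumentRepresentation implements Representation {
    private final String contentType;
    private final byte[] bytes;

    DocumentRepresentation(String contentType, byte[] bytes) {
        this.contentType = contentType;
        this.bytes = bytes.clone();
    }

    public String contentType() { return contentType; }
    public byte[] data() { return bytes.clone(); }
}

// Backed by triples (IRIs only, for brevity): serialized to N-Triples on demand.
class TripleRepresentation implements Representation {
    private final List<String[]> triples; // each entry: {subject, predicate, object}

    TripleRepresentation(List<String[]> triples) {
        this.triples = List.copyOf(triples);
    }

    public String contentType() { return "application/n-triples"; }

    public byte[] data() {
        return triples.stream()
            .map(t -> "<" + t[0] + "> <" + t[1] + "> <" + t[2] + "> .\n")
            .collect(Collectors.joining())
            .getBytes(StandardCharsets.UTF_8);
    }
}
```

Code that consumes a `Representation` does not need to know which of the two it received, so a store can keep documents, triples, or both internally.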

Comparing in 3c3cd23.