Another take on non-RDF payloads (aka file upload)
Opened this issue · 7 comments
Describe the requirement
We've approached the problem of non-RDF media types a few times already. Unfortunately it seems that each time it was not focused enough. Either mixed with collections (#187) or lacking broader context (#186).
Looking back at both, I think they are on track, but need a little more refinement.
For this issue, I'd like to focus on expects only and not on returned representations.
Hydra-agnostic example
I would distinguish 3 kinds of requests coming to a Hydra API:
- RDF payloads, such as those currently described by expects and Class.

- Non-RDF payloads
  Directly uploading an image instead of RDF:

    POST /movie/123/poster-image HTTP/2
    Content-Type: image/png

    ...File bytes...

- multipart/form-data
  Submitting multiple images and RDF data:

    PUT /movie/123/image-gallery HTTP/2
    Content-Type: multipart/form-data; boundary=----hydra-content

    ------hydra-content
    Content-Disposition: form-data; filename="poster.png"
    Content-Type: image/png

    ...Poster image...
    ------hydra-content
    Content-Disposition: form-data; filename="cast.jpeg"
    Content-Type: image/jpeg

    ...Image of actors...
    ------hydra-content
    Content-Type: application/ld+json

    {
      "@type": "mov:Gallery",
      "description": { "@value": "Pictures for movie /movie/123" }
    }
    ------hydra-content--
Hydra should allow describing operations which expect both kinds of file uploads.
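To make the multipart case concrete, here is a minimal Python sketch of how such a body could be assembled by hand. The helper name build_multipart and the name="..." parameters on Content-Disposition are assumptions made for illustration; they are not part of the proposal. The only rule relied on is that each delimiter line is the boundary value prefixed with "--", and the closing delimiter additionally ends with "--".

```python
import json


def build_multipart(boundary: str, parts: list) -> bytes:
    """Assemble a multipart body from (headers, payload) pairs.

    Hypothetical helper: `parts` is a list of (dict of headers, bytes payload).
    """
    crlf = b"\r\n"
    delimiter = b"--" + boundary.encode("ascii")  # "--" prefix before the boundary value
    chunks = []
    for headers, payload in parts:
        chunks.append(delimiter + crlf)
        for name, value in headers.items():
            chunks.append(f"{name}: {value}".encode("utf-8") + crlf)
        chunks.append(crlf)  # blank line separates part headers from the payload
        chunks.append(payload + crlf)
    chunks.append(delimiter + b"--" + crlf)  # closing delimiter
    return b"".join(chunks)


gallery = {
    "@type": "mov:Gallery",
    "description": {"@value": "Pictures for movie /movie/123"},
}

body = build_multipart(
    "----hydra-content",
    [
        (
            # name="poster" is an assumed field name, not taken from the issue
            {"Content-Disposition": 'form-data; name="poster"; filename="poster.png"',
             "Content-Type": "image/png"},
            b"<png bytes>",
        ),
        (
            # name="gallery" is likewise assumed
            {"Content-Disposition": 'form-data; name="gallery"',
             "Content-Type": "application/ld+json"},
            json.dumps(gallery).encode("utf-8"),
        ),
    ],
)
```

In practice an HTTP client library would normally build this body; the sketch only shows what a Hydra client would ultimately have to produce.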
Proposed solutions
It is important to keep support for the current expects semantics.
I propose that we extend the existing structure with a media-type description. Unfortunately it is not possible to have it both ways without revolutionising the structure, so the vocab will have to remove rdfs:range from expects and use schema:rangeIncludes instead.
  {
    "@id": "hydra:expects",
-   "range": "hydra:Class",
+   "schema:rangeIncludes": [
+     "hydra:Class",
+     "hydra:RequestSpecification"
+   ]
  },
Example of hydra:Class usage
{
  "@type": "Operation",
  "expects": {
    "@type": "RequestSpecification",
    "content": {
      "@type": "SupportedClassContent",
      "class": "mov:Movie"
    }
  }
}
This would be equivalent to "expects": "mov:Movie" and both should be supported at least for a while.
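One way a client could support both forms during such a transition is to expand the shorthand into the longer structure before processing. The following Python sketch assumes the operation has already been parsed into plain dictionaries with compacted keys; normalize_expects is a hypothetical helper name.

```python
def normalize_expects(expects):
    """Expand the shorthand "expects": "<class IRI>" into a RequestSpecification.

    Hypothetical helper; values that are already structured are returned as-is.
    """
    if isinstance(expects, str):
        return {
            "@type": "RequestSpecification",
            "content": {"@type": "SupportedClassContent", "class": expects},
        }
    return expects


long_form = {
    "@type": "RequestSpecification",
    "content": {"@type": "SupportedClassContent", "class": "mov:Movie"},
}

# Both spellings end up as the same structure.
same = normalize_expects("mov:Movie") == normalize_expects(long_form)
```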
Example of non-RDF payload
{
  "@type": "Operation",
  "expects": {
    "@type": "RequestSpecification",
    "content": {
      "@type": "RawContent",
      "supportedContentType": [ "image/png", "image/jpeg" ]
    }
  }
}
supportedContentType could also use a more elaborate structure, though I'm not convinced it's necessary.
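With the simple list form, checking a request against the description stays trivial. A minimal Python sketch, assuming exact media-type matching (no wildcards such as image/*) and ignoring parameters like charset; accepts_content_type is a hypothetical name:

```python
def accepts_content_type(content: dict, content_type_header: str) -> bool:
    """Return True if the request's media type is listed in supportedContentType."""
    # Drop parameters such as "; charset=utf-8" and compare case-insensitively.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    supported = content.get("supportedContentType", [])
    if isinstance(supported, str):  # tolerate a single value instead of a list
        supported = [supported]
    return media_type in [s.lower() for s in supported]


raw_content = {
    "@type": "RawContent",
    "supportedContentType": ["image/png", "image/jpeg"],
}
```

A more elaborate structure would only pay off if the description needed to carry more than the media type itself.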
Example of multipart
{
  "@type": "Operation",
  "expects": {
    "@type": "RequestSpecification",
    "content": {
      "@type": "MultipartContent",
      "allowedParts": [
        {
          "supportedContentType": [ "image/png", "image/jpeg" ],
          "maxCount": 2
        },
        {
          "@type": "SupportedClassContent",
          "class": "mov:Movie",
          "minCount": 1,
          "maxCount": 1
        }
      ]
    }
  }
}
The above is interpreted as:
- allowing 0-2 image parts
- requiring one RDF part with mov:Movie

allowedParts has the same domain as content, extended with multipart-specific bits such as the min/max count.
MultipartContent would have to become part of the core vocabulary.
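The interpretation above can be sketched as a small validator. This is an illustration under stated assumptions, not a proposed algorithm: a missing minCount is treated as 0, a missing maxCount as unbounded, and each incoming part is assumed to be pre-parsed into a dict with a content_type and, for RDF parts, an rdf_type. The names part_matches and validate_parts are hypothetical.

```python
def part_matches(spec: dict, part: dict) -> bool:
    """Decide whether a request part satisfies one allowedParts entry."""
    if spec.get("@type") == "SupportedClassContent":
        # Assumes the RDF part was parsed already and its type extracted.
        return part.get("rdf_type") == spec["class"]
    return part["content_type"] in spec.get("supportedContentType", [])


def validate_parts(allowed_parts: list, parts: list) -> list:
    """Return a list of error strings; an empty list means the request is valid."""
    errors = []
    counts = [0] * len(allowed_parts)
    for part in parts:
        for i, spec in enumerate(allowed_parts):
            if part_matches(spec, part):
                counts[i] += 1
                break
        else:  # no entry matched this part
            errors.append(f"unexpected part: {part['content_type']}")
    for i, (spec, count) in enumerate(zip(allowed_parts, counts)):
        low = spec.get("minCount", 0)    # assumed default
        high = spec.get("maxCount")      # None is assumed to mean unbounded
        if count < low or (high is not None and count > high):
            errors.append(f"entry {i}: got {count} part(s), expected {low}..{high}")
    return errors


allowed = [
    {"supportedContentType": ["image/png", "image/jpeg"], "maxCount": 2},
    {"@type": "SupportedClassContent", "class": "mov:Movie", "minCount": 1, "maxCount": 1},
]
image = {"content_type": "image/png"}
movie = {"content_type": "application/ld+json", "rdf_type": "mov:Movie"}
```

For example, validate_parts(allowed, [image, image, movie]) reports no errors, while a request with three images or without the mov:Movie part reports one each.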
Implications
The consequences of such a design are far-reaching:
- By introducing RequestSpecification we can directly describe HTTP requests (such as by using expectHeader).
- The content predicate can be the extension point we've talked about, allowing 3rd party vocabs to describe bodies using SHACL. Something like ShaclContentSpecification subclassOf ContentSpecification.
- It will even be possible to define operations which expect markdown, plain text or any other textual format.
Alternative solutions
Here's how OpenAPI does that for file uploads and multipart requests. For example:
requestBody:
  content:
    multipart/form-data: # Media type
      schema:            # Request payload
        type: object
        properties:      # Request parts
          id:            # Part 1 (string value)
            type: string
            format: uuid
          address:       # Part2 (object)
            type: object
            properties:
              street:
                type: string
              city:
                type: string
          profileImage:  # Part 3 (an image)
            type: string
            format: binary
Note that id, address and profileImage will be separate request parts.
I think this looks like a great proposal. Flexible without being too complex. 👍
Just as a quick note - I think the approach taken in #186 is less revolutionary, but I can see both approaches have some similarities (RequestSpecification vs MediaTypedResource used to provide custom description of the payload).
As for multiple file uploads - bear in mind that there are several possibilities:
- server may provide multiple expected classes - it may be understood that one of those classes can be provided, that the provided resource is of all of those classes, or that there can be several resources matching those criteria
- it is both possible and doable to provide all resources in RDF (i.e. Base-64 payloads being values of some kind of predicate) - it would be an alternative to the multipart/form-data approach; in both cases the client needs to do some serious processing in order to rig up a payload
- cardinality should be provided in some more unified way - indeed, hydra does not tackle this matter in any way.
I'll provide more feedback later
I think the approach taken in #186 is less revolutionary
Indeed, but I think we need revolutionary
but I can see both approaches have some similarities
Definitely inspired by the former proposals, but I intend a flexible solution
server may provide multiple expected classes
would the min/max cardinalities cover that? Having multiple 0-1 parts, each for a different class...
it is both possible and doable to provide all resources in RDF
Short answer: 🤮
Long answer: you'd need to invent/reuse even more terms to describe objects of those properties. With multipart/form-data we're using the same approach everyone else on the web uses.
And cannot agree with the serious processing.
cardinality should be provided in some more unified way
We could use those terms for property cardinalities.
On the other hand the multipart support could be its own auxiliary spec, with its own specific terms. Much like SHACL will definitely be an independent extension and shapes have their own cardinality lingo.
Haven't thought this through yet, but looks good at first impression 👍
Indeed, but I think we need revolutionary
Not really - there are a couple of other specs built on top of hydra that have some more implementations. We shall keep as much of the backward compatibility as possible. I'd still like to downgrade the mentioned rdfs:range from hydra:Class to hydra:Resource so either RequestSpecification or MediaTypedResource from #186 (or whatever name would it be) fits by being a hydra:Resource
Definitely inspired by the former proposals, but I intend a flexible solution
Well - those approaches also claimed to be flexible.
would the min/max cardinalities cover that? Having multiple 0-1 parts, each for a different class...
I meant we need to think it over carefully. There are several places that would benefit from cardinality specifications. There are also other vocabs that already provide these semantics.
Long answer: you'd need to invent/reuse even more terms to describe objects of those properties.
With multipart/form-data we're using the same approach everyone else on the web uses.
Quite the opposite - hydra:property already exists. I'm not claiming that pushing base-64 files through RDF payloads is a nice and clean approach. I'm just saying that handcrafting a sculpture with multipart content, which is also not that common (older web API frameworks may not provide support out of the box), is not a pretty one either.
And cannot agree with the serious processing.
I remember I tried to send a multipart request in a browser and ended up with not so nice code. Maybe something has changed since then, but it is not something a browser can provide out of the box. Maybe there are already some JS libraries to make it easier, but it still requires writing some heavy stuff that uses the File API, buffers and other quite fresh JS elements available in modern browsers.
In general - it feels like 'RequestSpecification'/'supportedContentType' related part is somehow similar to terms presented in #186 and both should meet same criticism and alternate ideas, i.e. @angelo-v 's approach with more generic constraint-like specifications (experiment provided with #187).
As for the multipart content - it looks like it was created solely to meet some particular requirement and feels it was not well considered. It seems to be heavily coupled with HTTP and it does not tackle various scenarios (i.e. pre-uploading like in web mail clients where attachments can be uploaded before sending an email).
I'd still like to downgrade the mentioned rdfs:range from hydra:Class to hydra:Resource so either RequestSpecification or MediaTypedResource from #186 (or whatever name would it be) fits by being a hydra:Resource
I concur. The only issue I have with just the "downgrade" is that we'd completely lose any semantics. Replacing that with rangeIncludes gives back some of that hint of what kinds of descriptions are expected.
We shall keep as much of the backward compatibility as possible.
Yes, I definitely wish to keep [] hydra:expects some:Class a valid construct.
Well - those approaches also claimed to be flexible.
Like I said, #187 is confusing in how it brings collections into the mix. And #186 is just a tad too narrow in scope. I opened this to offer a more open solution which can potentially include SHACL and possibly unexpected extensions.
Let's find middle ground.
I meant we need to think it over carefully. There are several places that would benefit from cardinality specifications. There are also other vocabs that already provide these semantics.
Maybe let's ignore multipart for now. If we can get the basic structure extensible "enough", then such an extension can be developed on the side without invading the core.
In general - it feels like 'RequestSpecification'/'supportedContentType' related part is somehow similar to terms presented in #186 and both should meet same criticism and alternate ideas, i.e. @angelo-v 's approach with more generic constraint-like specifications (experiment provided with #187).
It is similar. I hoped to gather the best of both ideas.
I concur. The only issue I have with just the "downgrade" is that we'd completely lose any semantics.
Replacing that with rangeIncludes gives back some of that hint of what kinds of descriptions are expected.
But it breaks existing clients and specs.
And #186 is just a tad too narrow in scope.
Well, baby steps. I feel this issue is too wide.
Maybe let's ignore multipart for now. If we can get the basic structure extensible "enough",
then such an extension can be developed on the side without invading the core.
Yep - sounds reasonable.
It is similar. I hoped to gather the best of both ideas.
I'm open to that. I'll invite the community on the mailing list to the discussion tomorrow (I've reached my limit today).