How is a package (the file) identified in the Catalogue?
jbonnet opened this issue · 15 comments
By the trio name+version+vendor? If so, is the filename in the metadata?
Hi @jbonnet,
Package files (son-packages) are currently identified by a uuid (the same way descriptors are). This uuid is now included in the package descriptor meta-data when it is stored in the Catalogue, in a new field called "son_package_uuid". Thus, every stored package descriptor is now linked to its stored package file.
Regarding son-packages metadata, it includes the filename, but not necessarily built from the trio name+vendor+version. It is the sender (in this case the GK) who defines the filename when filling in the HTTP headers of the submission, as shown below:
headers[HTTP_CONTENT_DISPOSITION] = "attachment; filename=<filename>"
So, if the file is named 'tc.sonata.1.0.son', then its metadata filename will be exactly that. The Catalogue stores the filename internally in a field called "grid_fs_name" (due to the GridFS lib).
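As a rough illustration of the header handling described above, the following sketch builds the Content-Disposition value on the sender side and extracts the filename on the receiving side. The helper names are hypothetical and are not taken from the Catalogue or GK code; the parsing is deliberately minimal and only handles a plain or quoted `filename=` parameter.

```ruby
# Hypothetical helpers for illustration only (not the Catalogue's actual code).

# Build the header value the sender (e.g. the GK) attaches to the upload.
def content_disposition_for(filename)
  "attachment; filename=#{filename}"
end

# Extract the filename from a Content-Disposition value, tolerating optional
# double quotes. Returns nil when no filename parameter is present.
def filename_from_content_disposition(value)
  match = value.to_s.match(/filename\s*=\s*"?([^";]+)"?/i)
  match && match[1].strip
end

header = content_disposition_for("tc.sonata.1.0.son")
puts filename_from_content_disposition(header)
```

Whatever the sender puts after `filename=` is what would end up in "grid_fs_name", which is why the name is not guaranteed to follow the vendor/name/version trio.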
Here is an example of the metadata for the sonata-demo.son file (used in integration tests):
[{"created_at":"2017-02-16T13:04:46.048+00:00","grid_fs_id":"58a5a36eaf8bef220b000000","grid_fs_name":"sonata-demo.son","md5":"9c8a66d5caa756bbb5d2ed4c56e27214","updated_at":"2017-02-16T13:04:46.048+00:00","uuid":"6dc04296-ea9b-4113-bf4b-8e9d2d42c9b3"}]
The son-packages API does not support yet queries based on filename (it does for "grid_fs_name"). This will be addressed soon.
Best regards
Hello again, @dang03
The son-packages API does not support yet queries based on filename (it does for "grid_fs_name").
You mean that it does support searches on grid_fs_id?
Also: who is generating the value for the md5 field shown in the above example? The Catalogue?
Hello @jbonnet,
Exactly, right now the API supports queries based on the current fields, such as grid_fs_id for the file uuid or grid_fs_name for the filename.
However, the file itself is currently only available through /son-packages/:id/? using grid_fs_id (named son_package_uuid in the package descriptor).
If we need to retrieve a file by name then some changes are required.
The 'md5' is generated by the Catalogue, using GridFS lib.
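Since the md5 value is part of the stored metadata, a client could use it to check a downloaded file. The sketch below assumes the md5 field is the hex digest of the raw file bytes (which is what GridFS normally records, but this is an assumption here); the helper name is hypothetical.

```ruby
require "digest/md5"

# Hypothetical helper: compare a locally computed MD5 with the "md5" field
# of the package file metadata returned by the Catalogue.
# Assumes the field holds the hex digest of the raw file contents.
def md5_matches?(file_bytes, metadata)
  Digest::MD5.hexdigest(file_bytes) == metadata["md5"]
end

# Usage with an empty payload, whose MD5 is a well-known constant.
metadata = { "md5" => "d41d8cd98f00b204e9800998ecf8427e" }
puts md5_matches?("", metadata)
```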
@dang03
Fetching by name might return the wrong package file if the wrong file name is used, no? What happens when a file is uploaded with the same name as an existing file (but from another package)?
It would be better to support fetching package files through the trio vendor/name/version...
@jbonnet
Package file meta-data does not have these fields yet (vendor, name, version). That is not the intended final state, though. The problem we found is that when you POST a file, the filename must be defined in the attachment; filename HTTP header when sending binary data through an HTTP call.
Thus, if you need the vendor, name, version fields in the file meta-data, there's no other option than passing them as query parameters, e.g.:
POST /api/v2/son-packages?vendor=x&name=y&version=z
Header HTTP_content_disposition: attachment; filename=<filename.son>
To enable this, some work is required on the Catalogue side.
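For reference, the query-parameter variant proposed above could be assembled on the sender side roughly as follows. This is only a sketch: the helper name is hypothetical, and the Catalogue did not accept these parameters at the time of this comment.

```ruby
require "uri"

# Hypothetical helper: build the upload path with the trio as query parameters.
# URI.encode_www_form takes care of escaping the values.
def son_packages_upload_path(vendor:, name:, version:)
  query = URI.encode_www_form(vendor: vendor, name: name, version: version)
  "/api/v2/son-packages?#{query}"
end

puts son_packages_upload_path(vendor: "x", name: "y", version: "z")
```

The Content-Disposition header carrying the filename would still be sent alongside this path, as in the example above.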
@dang03
Hmm...
What about passing those query parameters (vendor, name and version) as HTTP headers as well? Not the most beautiful thing in the world, but a POST with query parameters looks... strange (it's not really a query, is it?).
From the outside (GK's API), I was thinking of providing an interface like:
- POST /api/v2/packages with the package file (already in place), returning the package meta-data, including its UUID;
- GET /api/v2/packages/:uuid gets the meta-data;
- GET /api/v2/packages/:uuid/file (or .../download) gets the package file.
What do you think? By the way, is it son-access who needs to use this API?
Hi @jbonnet,
Yes, using custom HTTP headers for these fields (vendor, name and version) is feasible. Let's do it that way then.
Having everything package-related under the same API makes sense (instead of using a separate one for /api/v2/son-packages).
Indeed, son-access is the component that requires this API. To stay in line, son-access will have to be updated to work with the packages GK API.
@dang03
Yet another option would be to include those parameters in the URL, as in:
POST /api/v2/packages/vendor/:vendor_name/name/:name/version/:version
@dang03
I'm back to this, sorry...
Should we have the package file name in the package meta-data? Or even in the data?
@jbonnet
No problem. I'll be back to this too, as the vendor, name, version meta-data fields are not yet implemented.
I need to know which of the ways we discussed you prefer for sending vendor, name, version with the package file:
a) POST /api/v2/son-packages/vendor/:vendor_name/name/:name/version/:version
b) Included in form of custom headers:
att = request.env['HTTP_CONTENT_DISPOSITION']
vendor = request.env['HTTP_VENDOR']
name = request.env['HTTP_NAME']
version = request.env['HTTP_VERSION']
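Option b) could be wrapped in a small helper that reads the three custom headers from the Rack environment and fails early when one is missing. This is a sketch under assumptions: the helper and constant names are hypothetical, and it is worth checking whether the web server in use already sets HTTP_VERSION to the protocol version, in which case a differently named header would be safer.

```ruby
# Hypothetical mapping from meta-data field to Rack env key, following option b).
TRIO_HEADERS = {
  vendor: "HTTP_VENDOR",
  name: "HTTP_NAME",
  version: "HTTP_VERSION"
}.freeze

# Read the trio from a Rack-style env hash.
# Raises ArgumentError listing the fields whose header is missing or blank.
def trio_from_env(env)
  trio = TRIO_HEADERS.transform_values { |key| env[key] }
  missing = trio.select { |_, value| value.nil? || value.strip.empty? }.keys
  raise ArgumentError, "missing headers for: #{missing.join(', ')}" unless missing.empty?
  trio
end

# Usage with placeholder values.
env = { "HTTP_VENDOR" => "x", "HTTP_NAME" => "y", "HTTP_VERSION" => "z" }
p trio_from_env(env)
```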
@dang03
Sorry, I've missed this one...
Now that I know a bit more of the Catalogue's API, I don't think we should have POST with the trio... because it is already within the package, right? What should happen if we POST a package with one vendor+name+version into a different trio?
Hi @jbonnet,
It makes sense to avoid POSTing with the trio, because saving a package under a different trio would create an inconsistent package.
However, this was meant more as an option: not to store the package file itself under the name trio, but to add some more info to its Catalogue meta-data.
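If the trio were ever sent alongside the file, the inconsistency discussed here could be guarded against with a check along these lines. It is a minimal sketch: the helper name is hypothetical, and it assumes the package descriptor exposes vendor, name and version as top-level string keys.

```ruby
TRIO_KEYS = %w[vendor name version].freeze

# Hypothetical check: return the trio keys whose submitted value differs
# from the value found in the package descriptor. An empty array means
# the submitted trio is consistent with the package contents.
def trio_mismatches(submitted, descriptor)
  TRIO_KEYS.reject { |key| submitted[key] == descriptor[key] }
end

# Usage with placeholder values: only the version differs.
descriptor = { "vendor" => "x", "name" => "y", "version" => "1.0" }
submitted  = { "vendor" => "x", "name" => "y", "version" => "2.0" }
p trio_mismatches(submitted, descriptor)
```

A receiver could reject the upload whenever the returned list is non-empty, rather than storing meta-data that disagrees with the package itself.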