w3c-ccg/vc-examples

A Standard HTTP API for converting a CSV to a credential + options for the HTTP API

OR13 opened this issue · 13 comments

OR13 commented

We need a way to take a CSV, and produce an HTTP Issue request.

It should support both individual and batch issuance.

OR13 commented

See also https://github.com/json-schema-form for a mapping to an HTML form... what if we did both? :)

OR13 commented

We need a solution for CSV (with extraneous fields) => VC + Options (with Required fields).

Vendors would handle side effects associated with the input CSV, construct the payload to the issuer, get the response, and handle any resulting side effects.

The handling of side effects does not need to be standardized, but the CSV translation and header representation do need to be defined.
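A minimal sketch of the "CSV (with extraneous fields) => VC + Options (with required fields)" step might look like the following. The column names, the column-to-claim mapping, and the required-column set are all hypothetical; the thread does not define them. A real credential would also need a context that defines the custom claims, plus fields such as an issuance date, which are omitted here.

```python
import csv
import io

# Hypothetical mapping from CSV column names to credentialSubject claim names.
COLUMN_TO_CLAIM = {"full_name": "name", "degree": "degreeName"}
# Hypothetical set of columns that must be present and non-empty.
REQUIRED_COLUMNS = {"subject_id", "full_name"}


def row_to_issue_request(row, issuer):
    """Turn one CSV row into a {credential, options} pair, ignoring unmapped columns."""
    missing = sorted(c for c in REQUIRED_COLUMNS if not row.get(c))
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    subject = {"id": row["subject_id"]}
    for column, claim in COLUMN_TO_CLAIM.items():
        if row.get(column):
            subject[claim] = row[column]
    credential = {
        # Only the base context is listed; custom claims would need their own context.
        "@context": ["https://www.w3.org/2018/credentials/v1"],
        "type": ["VerifiableCredential"],
        "issuer": issuer,
        "credentialSubject": subject,
    }
    # Options are left empty in this sketch; a vendor would fill them in.
    return {"credential": credential, "options": {}}


def csv_to_issue_requests(csv_text, issuer):
    """Parse CSV text and build one issue request per data row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row_to_issue_request(row, issuer) for row in reader]
```

Columns that are not in the mapping (for example an internal notes column) are simply dropped, which is one possible reading of "extraneous fields".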

OR13 commented

@peacekeeper @dlongley @jandrieu @msporny we have wanted something like this, but it seems like not exactly an http thing, and not exactly a VC Data Model thing...

I imagine this as an endpoint:

POST /csv-to-vc { csv } => { vc + options }

It's a utility API that implements a pure function which transforms a CSV to JSON-LD.

Most vendors will probably have their own way of doing this, and that is where the vendor lock-in risk exists; it's the main reason for Markus's original API design. I imagine the path to the desired state would be:

  1. CSV => VC + Options
  2. VC + Options => VC + Proof

There are two cases: one where a single VC is issued, and one where a batch is issued.

POST /csv-batch-issuer/credentials {csv} => { array of vcs } (an endpoint which transparently applies the above)
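The two-step path above, and the batch endpoint that applies it transparently, could be sketched roughly as follows. This is only an illustration: `csv_to_requests`, `batch_issue`, and `fake_issue` are hypothetical names, the row-to-credential mapping is deliberately naive, and the placeholder proof is not a real signature.

```python
import csv
import io
from typing import Callable, List


def csv_to_requests(csv_text: str) -> List[dict]:
    """Step 1 (pure function): one {credential, options} pair per CSV row."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        {"credential": {"credentialSubject": dict(row)}, "options": {}}
        for row in rows
    ]


def batch_issue(csv_text: str, issue: Callable[[dict, dict], dict]) -> List[dict]:
    """Step 2: hand each pair to an issuer and collect the resulting credentials."""
    return [
        issue(request["credential"], request["options"])
        for request in csv_to_requests(csv_text)
    ]


def fake_issue(credential: dict, options: dict) -> dict:
    # Stand-in for a real issuer call; the proof below is a placeholder only.
    return {**credential, "proof": {"type": "PlaceholderProof"}}
```

With this shape, the single-VC case is just a CSV with one data row, and the side-effecting issuer call stays separate from the pure conversion step.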

OR13 commented

I'm assuming we will solve this by considering it as an alternative to the compose and issue API, and I'm assigning myself and Markus.

Most vendors will probably have their own way of doing this, and that is where the vendor lock-in risk exists; it's the main reason for Markus's original API design

Yes, the idea of the original API design was exactly that API callers probably DO NOT want to compose a JSON-LD credential themselves, but rather start with a simpler data structure that they understand, i.e. plain JSON in the form of {"claim-name": "claim-value", ...}.

In the simplest implementation of that API (which is now called composeAndIssueCredential in the latest version), that's all that would be required. The assumption is that clients can convert their own internal data (whether it's in CSV, XML, SQL records, etc.) into that simple JSON structure and then use the API.

The proposal in this issue, on the other hand, seems to be that vendors would provide a custom /csv-to-vc endpoint that can convert CSV to JSON-LD. I think that definitely works as a short-term solution, but then wouldn't we have to keep adding new endpoints for every possible data format a client may want to use?

Jumping in quickly to point out that I think these "compose and issue" APIs are most likely a huge problem. We have already provided an API that allows an issuer to convert a CSV file into a known format -- that is, a Verifiable Credential.

The issue with composeAndIssueCredential is that it assumes that organizations can meet their use cases with a "simpler" API. Most of the customers we've interacted with are either 1) willing to just use the VC format, or 2) want to just output CSV and have some conversion process that takes their data and magically transforms it to a VC. I say "magically" because the input CSV files are so varied that it's difficult to envision a composeAndIssueCredential that would meet all of the use cases we've seen.

There are approaches, such as the work done in the CSV on the Web -- https://www.w3.org/2013/csvw/wiki/CSV-LD -- that have looked into this before. This is a non-trivial problem space, and I don't see the current approach taken in composeAndIssueCredential as meeting the use cases we've seen in the field... in fact, we've seen nothing short of a custom conversion program for the use case that is capable of meeting the needs of real customers at this point in time.

That's at least my major pushback: the proposed "simple" mechanism is not aligned with customer use cases; it's academic. Integrating with customer data is always more complicated and has other inputs that are important to take into account (like receiptIds, customer PII to use when contacting an individual, transforms to date formats, transforms to field values, etc.).

I think we should keep that in mind... this isn't a simple problem space, even though it may seem like that at first. We're talking about an arbitrarily programmable business logic engine here... not a simple key-value mapping system.

OR13 commented

Assuming a key-value mapping of flat JSON: https://www.npmjs.com/package/flat

What prevents us from creating JSON-LD credentials from CSV?

assuming a key-value mapping of flat JSON

That's the crux of the issue: every time we've made that assumption, our customers have surprised us with data transforms, crazy date/time formats that they're unwilling to change, hierarchical data formats, data transforms on the enums they're using, etc. It typically requires custom code that either they need to write or the vendor needs to write. If they're writing the custom code, they've been happy to just use a VC template format we give them. If we're writing the custom code, it's a few lines of code at best, and at worst there is a really complex business rules engine that comes into play: "Oh, 01AB means the number is in units or weight depending on whether the entry in field Custom_1 is 'Q' or 'C', and if it's 'C' the unit of measure can be found in field Custom_2 and can be any of 12 different values." While I've changed the values in the preceding sentence, this was a real use case from a real customer.

A simple flat JSON key-value mapping is not going to cut it, so I hesitate to standardize something without putting a lot more thought into it.
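To make the point concrete, here is a rough sketch of what even the single conditional rule quoted above turns into once written as code. Several details are assumptions made purely for illustration: the `Amount` column name, the reading that 'Q' means a plain count while 'C' means a weight, and the three unit codes (the comment only says there are 12 allowed values and that the real values were changed).

```python
# Hypothetical subset of allowed unit codes; the thread only says there are 12.
ALLOWED_UNITS = {"KG", "LB", "OZ"}


def interpret_amount(row: dict) -> dict:
    """Interpret a numeric column whose meaning depends on two other columns."""
    kind = row["Custom_1"]
    value = float(row["Amount"])  # "Amount" is an assumed column name
    if kind == "Q":
        # Assumed reading: 'Q' means the number is a plain count of units.
        return {"value": value, "unit": "count"}
    if kind == "C":
        # Assumed reading: 'C' means a weight, with the unit named in Custom_2.
        unit = row["Custom_2"]
        if unit not in ALLOWED_UNITS:
            raise ValueError(f"unknown unit of measure: {unit!r}")
        return {"value": value, "unit": unit}
    raise ValueError(f"unexpected Custom_1 value: {kind!r}")
```

No flat column-to-claim mapping can express this, because the meaning of one cell depends on the contents of two others.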

I'd also like to point out that this repo is definitely not the right place for this conversation. It should probably be moved to transmute/vc-http-api.

I would also point out that I don't see how you can properly create the necessary context for the custom fields of the claim, as you have no format info (i.e., integer vs. boolean vs. string).

OR13 commented

@lrosenthol ... but you have that type information, if your required @context is crafted correctly....

You could also use credentialSchema: https://w3c-ccg.github.io/vc-json-schemas/

... to enforce the types... if you don't like the "type system" the VC Data Model supports by itself.
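As a sketch of how schema-supplied type information could be applied to CSV cells (which always arrive as strings), assume a JSON-Schema-like `properties` map is available for the claims. The `coerce` and `coerce_row` helpers below are hypothetical and do not use any credentialSchema library; they only show the idea.

```python
def coerce(value: str, json_type: str):
    """Convert a CSV string cell to the Python value for a JSON Schema type name."""
    if json_type == "integer":
        return int(value)
    if json_type == "number":
        return float(value)
    if json_type == "boolean":
        lowered = value.strip().lower()
        if lowered in ("true", "false"):
            return lowered == "true"
        raise ValueError(f"not a boolean: {value!r}")
    if json_type == "string":
        return value
    raise ValueError(f"unsupported type: {json_type!r}")


def coerce_row(row: dict, schema: dict) -> dict:
    """Coerce every column the schema knows about; drop columns it does not."""
    properties = schema["properties"]
    return {
        name: coerce(cell, properties[name]["type"])
        for name, cell in row.items()
        if name in properties
    }
```

For example, with a schema declaring `age` as an integer and `active` as a boolean, the cells `"42"` and `"TRUE"` would become `42` and `True`.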

OR13 commented

I'm pretty sure that a flat map of the POST body JSON object is sufficient for a number of reasons.

  1. If it isn't then the HTTP API is not sufficient.
  2. If there are formatting issues handling credential / options... they are all also present in the HTTP API.

Now, the part that would certainly be a problem is that the column headers look terrible... and editing nested tables in Excel would make your eyes bleed.

But from a pure data modeling perspective, you can convert an array of POST bodies... into a CSV and back.
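A rough sketch of that round trip, using dotted column headers in the spirit of the `flat` package mentioned earlier, is shown below. The function names are hypothetical, and the sketch has known limits that are assumptions of this example rather than anything agreed in the thread: every value comes back as a string, empty cells are treated as absent, keys containing "." are not supported, and an object whose keys are all numeric is read back as a list.

```python
import csv
import io


def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into {"a.b.0": value} using dotted paths."""
    items = {}
    pairs = obj.items() if isinstance(obj, dict) else enumerate(obj)
    for key, value in pairs:
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, (dict, list)):
            items.update(flatten(value, path))
        else:
            items[path] = value
    return items


def _listify(node):
    """Turn dicts whose keys are all decimal indices back into lists."""
    if not isinstance(node, dict):
        return node
    converted = {k: _listify(v) for k, v in node.items()}
    if converted and all(k.isdecimal() for k in converted):
        return [converted[k] for k in sorted(converted, key=int)]
    return converted


def unflatten(flat):
    """Rebuild the nested structure from dotted paths."""
    root = {}
    for path, value in flat.items():
        node = root
        parts = path.split(".")
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return _listify(root)


def bodies_to_csv(bodies):
    """Write a list of POST bodies as CSV, one row per body."""
    rows = [flatten(body) for body in bodies]
    headers = sorted({header for row in rows for header in row})
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=headers)
    writer.writeheader()
    writer.writerows(rows)  # columns missing from a row are written as empty cells
    return out.getvalue()


def csv_to_bodies(text):
    """Read the CSV back, treating empty cells as absent fields."""
    reader = csv.DictReader(io.StringIO(text))
    return [unflatten({k: v for k, v in row.items() if v != ""}) for row in reader]
```

The resulting headers (for example `credential.credentialSubject.id`) illustrate the readability concern raised above, even though the conversion itself is mechanical.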

OR13 commented

Please move any further discussion here if needed: https://github.com/w3c-ccg/vc-http-api