sapio-lang/sapio

Consider using Dhall as an internal representation instead of json-schema

Closed this issue · 3 comments

I'm not sure of the relative benefits and costs of this, but it occurred to me while investigating the artifacts. JSON Schema is well specified but extremely verbose. While the overall structure (JSON) is familiar, the ceremony around the document produced by sapio-cli contract api --file=X makes it difficult to see the high-level details of a contract API.

My understanding is that the json-schema based API is primarily a document to be ingested by software, and is only secondarily intended for human scrutiny.

Dhall is a configuration language that is nearly isomorphic to JSON but adds types and functions that are guaranteed to terminate, both of which help avoid repetition. This may make it easier for humans to extract meaning from the document.

Dhall has Rust bindings as well.

Reasons not to do this:

  • Dhall is (most likely) less well known than json-schema
  • humans scrutinize this document so infrequently that the added benefit is small, which moves this far down the priority list
  • some constraints that need to be specified cannot be expressed in the Dhall type language
  • Dhall types don't typically have descriptions. If the description field is necessary, Dhall may be unusable for this use case, though this requires further investigation.

Reasons to do this:

  • more concise schema documents
  • the ability to specify higher order schemas
  • the Dhall <> Rust bridge is well used, and therefore unlikely to have implementation gotchas.

The main reasons to use JSONSchema, afaik, are:

  1. Compatibility with packages like RJSF https://rjsf-team.github.io/react-jsonschema-form/
  2. schemars lets us derive JsonSchema for a given Rust struct: https://docs.rs/schemars/latest/schemars/trait.JsonSchema.html

It might make sense to document the requirements here.

It sounds like your first point is better stated as "whatever the representation is, it is important that we can easily get it into a forms library or similar. It must be sufficiently descriptive to generate UIs solely from the schema". Separately, we also have the statement "RJSF is a huge help in outsourcing that functionality, so for the time being this seemed like the most sensible choice".

For the second one, I'm not completely sure that there is a fundamental requirement here. Making the standardization of the format depend on which traits are easily derivable in Rust at the time of writing, while a sensible choice, doesn't seem like a requirement.

Does this look like a fair analysis on what you're saying?

I suppose another way to think of it is that we're just looking for an "obvious" engineering choice, and serializing to/from JSON is pretty straightforward for that as a base serialization layer. Anything else can be built on top of that. But it happens that JSON has really good support in a lot of places without any new dependencies.

json-schema is nice as an API description because it is not actually the underlying repr, it's just documentation. So we're really just trying to require that the modules are self-documenting. The actual representation is governed by the deserialization process, which can differ from the JSONSchema, usually by being more restrictive (e.g., "string must be a valid public key" might not be checkable with a regex).
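To make the gap between the schema and the deserialization step concrete, here is a minimal std-only sketch. RawKey and PubKeyHex are hypothetical names, not sapio types. The checks shown (length, hex digits, prefix) are ones a schema pattern could also express; the on-curve check mentioned in the comment is the kind of constraint it could not, and it is deliberately left out here.

```rust
use std::convert::TryFrom;

/// Wire-level type: all that a schema of `type: string` promises.
#[derive(Debug)]
struct RawKey(String);

/// Validated type (hypothetical): only constructible through `TryFrom`.
#[derive(Debug, PartialEq)]
struct PubKeyHex(String);

impl TryFrom<RawKey> for PubKeyHex {
    type Error = String;

    fn try_from(raw: RawKey) -> Result<Self, Self::Error> {
        let s = raw.0;
        // A compressed secp256k1 key is 33 bytes, i.e. 66 hex characters.
        if s.len() != 66 {
            return Err(format!("expected 66 hex chars, got {}", s.len()));
        }
        if !s.chars().all(|c| c.is_ascii_hexdigit()) {
            return Err("non-hex character".to_string());
        }
        // The first byte of a compressed key is 0x02 or 0x03.
        if !(s.starts_with("02") || s.starts_with("03")) {
            return Err("bad prefix".to_string());
        }
        // A real implementation would also verify that the point lies on
        // the curve, using a crypto library. That step is NOT done here.
        Ok(PubKeyHex(s))
    }
}
```

A value that satisfies the schema (any string) can still be rejected at this step, which is the sense in which the schema is documentation rather than the representation itself.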

If we want to migrate to Dhall, one nice way to experiment with that would be to do as follows:

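use schemars::JsonSchema;
use serde::{Deserialize, Serialize};

// Newtype carrying Dhall source text as a plain string.
#[derive(JsonSchema, Serialize, Deserialize)]
struct DHallString(String);
// Evaluating the Dhall text into the real argument type happens here.
impl TryFrom<DHallString> for MyType {
   // ...
}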

Register[[MyType, DHallString], "logo.png"]
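For anyone who wants to try the shape of this conversion without pulling in dependencies, here is a std-only sketch. It is not the sapio implementation: the derives are dropped, the amount field on MyType is made up for illustration, and eval_natural is a hypothetical stand-in that only accepts a string of decimal digits. A real version would call a Dhall evaluator at that point.

```rust
use std::convert::TryFrom;

/// Wrapper carrying Dhall source text; on the wire it is just a string.
struct DHallString(String);

/// Hypothetical contract argument type, for illustration only.
#[derive(Debug, PartialEq)]
struct MyType {
    amount: u64,
}

/// Stand-in for a real Dhall evaluator (assumption: not a Dhall parser).
/// It accepts only a string of decimal digits such as `42`.
fn eval_natural(src: &str) -> Result<u64, String> {
    let t = src.trim();
    if t.is_empty() || !t.chars().all(|c| c.is_ascii_digit()) {
        return Err(format!("not a digit string: {:?}", t));
    }
    t.parse::<u64>().map_err(|e| e.to_string())
}

impl TryFrom<DHallString> for MyType {
    type Error = String;

    fn try_from(s: DHallString) -> Result<Self, Self::Error> {
        // Evaluate the carried text, then build the real type from it.
        Ok(MyType { amount: eval_natural(&s.0)? })
    }
}
```

The point of the pattern is that the module's external interface stays a JSON string, so the existing schema and forms tooling keep working while the Dhall handling lives entirely inside try_from.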

We can even implement a custom json-schema type for DHallString (as is done in sapio studio for custom::filename), like so:

{
    DHallString: {
        format: 'custom::dhall_string',
        type: 'string',
        data: /* the dhall string */
    },
}

We can then generate a widget for it that connects to a Dhall widget in the frontend, giving it first-class support.