This repo contains a design draft and some exploratory code to discuss the schema language in use in the data-pipes project.
We want to offer consumers of data-nuggets the capacity to introspect their data. To achieve this goal, we propose to associate a type schema with each data-nugget construct.
Although schemas provide powerful introspection capabilities for code manipulation, they are often pretty gruesome for the user to handle. We want to avoid this as much as possible. Our schema system should therefore be expressive and easy for its users to manipulate.
To achieve this, we propose to develop a custom literal schema representation highly inspired by the GraphQL schema definition language.
We propose to extend the immutability principle of data-nuggets to the schema. In this first iteration, this means that the syntax tree associated with the schema property will be an Immutable.js Record tree.
Our schema system should provide, at the very least, the following functionalities:
- both a literal and a structural representation.
- a parsing mechanism which translates literal representations to structural representations.
- a generation mechanism to produce literal representations from structural representations.
- a simple schema inference mechanism (given a value, return a best-guess schema).
We should also keep in mind other potential uses (validation, etc.).
In this first code exploration, we provide a very basic type system.
It is incomplete and only supports concrete types.
In other words, we support the following primitive scalar types: integer, float, string (TODO: define numeric type constraints and complete primitive type support).
We support two composite types: Object and List.
Note that we assume the composite types to be Immutable.js data structures (Map and List respectively).
Note also that we do not, yet, support nullable types.
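To make the inference requirement more concrete, here is a minimal sketch of a best-guess inference over the types listed above. It is not the code in schemaInferal.js: the names `inferType` and `capitalize` are hypothetical, plain arrays and objects stand in for Immutable.js List and Map so that the snippet runs without dependencies, and naming nested object types after their field key is only an assumption.

```javascript
// Hypothetical helpers; plain frozen objects stand in for Immutable.js Records.
const scalar = (name) => Object.freeze({ name, kind: "SCALAR" });
const IntegerType = scalar("Integer");
const FloatType = scalar("Float");
const StringType = scalar("String");

const capitalize = (s) => s.charAt(0).toUpperCase() + s.slice(1);

// Returns a best-guess type node for `value`.
// `name` is only used when the value turns out to be an object.
function inferType(value, name = "Root") {
  if (typeof value === "string") return StringType;
  if (typeof value === "number") {
    if (!Number.isFinite(value)) throw new TypeError("unsupported number: " + value);
    // JavaScript cannot tell 1.0 from 1, so whole-valued floats are guessed as Integer.
    return Number.isInteger(value) ? IntegerType : FloatType;
  }
  if (Array.isArray(value)) {
    // Without nullable or unknown types, an empty list cannot be typed.
    if (value.length === 0) throw new TypeError("cannot infer the type of an empty list");
    // Best guess: assume the list is homogeneous and inspect the first element only.
    return Object.freeze({ kind: "LIST", ofType: inferType(value[0], name) });
  }
  if (value !== null && typeof value === "object") {
    const fields = {};
    for (const [key, fieldValue] of Object.entries(value)) {
      // Assumption: a nested object type is named after its field key.
      fields[key] = inferType(fieldValue, capitalize(key));
    }
    return Object.freeze({ name, kind: "OBJECT", fields: Object.freeze(fields) });
  }
  throw new TypeError("unsupported value: " + String(value));
}

console.log(JSON.stringify(inferType({ people: [{ name: "Ada", age: 36 }] }), null, 2));
```

The sketch also shows why the missing features matter: null values, empty lists and heterogeneous lists have no sensible answer in the current type system.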
As noted previously, this representation should be as expressive and human-friendly as possible. We are in this regard heavily inspired by the GraphQL schema definition language.
Before producing a formal definition, we propose a heuristic approach via some simple code exploration.
Here is what a simple example would look like:
type Root{
people:[Person]
}
type Person{
name:String,
age:Integer,
counts:[Integer]
}
Note that, by convention, each schema defines a Root object type. This is the entry point for the schema.
Finally, we should keep in mind that this representation could also serve as a serialization format.
The structural representation is a JavaScript object tree (based on Immutable.js Records). The following snippet illustrates how we could hand-build the syntax tree for the previously defined schema:
const PersonType = Type.ObjectType("Person", {
  name: Type.StringType,
  age: Type.IntegerType,
  counts: Type.ListType(Type.IntegerType)
});
const RootType = Type.ObjectType("Root", {people: Type.ListType(PersonType)});
If we inspect the root of this schema, RootType, we obtain the following structure.
// pretty printed RootType
{
"name": "Root",
"kind": "OBJECT",
"fields": {
"people": {
"kind": "LIST",
"ofType": {
"name": "Person",
"kind": "OBJECT",
"fields": {
"name": {"name": "String", "kind": "SCALAR"},
"age": {"name": "Integer", "kind": "SCALAR"},
"counts": {
"kind": "LIST",
"ofType": {"name": "Integer", "kind": "SCALAR"}
}
}
}
}
}
}
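As an illustration of the generation step (structural to literal), the sketch below walks a tree shaped like the pretty-printed structure above and emits one `type` block per object type. It is not the code in generator.js: `listOf`, `objectOf`, `typeRef`, `collectObjectTypes` and `generate` are hypothetical names, plain objects replace the Immutable.js Records, and the two-space indentation of fields is an arbitrary choice.

```javascript
// Plain-object stand-ins for the Records built with Type.ObjectType and friends.
const IntegerType = { name: "Integer", kind: "SCALAR" };
const StringType = { name: "String", kind: "SCALAR" };
const listOf = (ofType) => ({ kind: "LIST", ofType });
const objectOf = (name, fields) => ({ name, kind: "OBJECT", fields });

const PersonType = objectOf("Person", {
  name: StringType,
  age: IntegerType,
  counts: listOf(IntegerType),
});
const RootType = objectOf("Root", { people: listOf(PersonType) });

// Literal form of a type when it appears in a field position.
function typeRef(type) {
  return type.kind === "LIST" ? "[" + typeRef(type.ofType) + "]" : type.name;
}

// Collects every OBJECT type reachable from `type`, in first-visit order.
function collectObjectTypes(type, seen = new Map()) {
  if (type.kind === "LIST") {
    collectObjectTypes(type.ofType, seen);
  } else if (type.kind === "OBJECT" && !seen.has(type.name)) {
    seen.set(type.name, type);
    Object.values(type.fields).forEach((field) => collectObjectTypes(field, seen));
  }
  return seen;
}

// Emits one `type Name{...}` block per object type, starting with the root.
function generate(root) {
  return [...collectObjectTypes(root).values()]
    .map((type) => {
      const fields = Object.entries(type.fields)
        .map(([fieldName, fieldType]) => "  " + fieldName + ":" + typeRef(fieldType))
        .join(",\n");
      return "type " + type.name + "{\n" + fields + "\n}";
    })
    .join("\n");
}

console.log(generate(RootType));
```

Because object types are keyed by name while collecting, two distinct object types sharing a name would silently collapse into one block; a real generator would need to detect that case.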
This repo contains a first exploration of the proposal in the form of a set of Node.js files. This exploration serves only to provide a general shape and a concrete basis for discussion. It is in no way definitive or production-ready.
- type.js contains a set of Records to represent a type syntax tree
- schemaInferal.js contains a function to generate a best guess syntax tree from a value
- typeInferal.js is a dependency of schemaInferal.js
- generator.js contains a function which, given a syntax tree, generates a literal representation
- parser.js contains a parser returning a syntax tree for a given literal representation
- example.js contains some code illustrating the use of the different modules.
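For discussion purposes, the parsing direction (literal to structural) can be sketched in two passes: read every `type` block into a table of unresolved definitions, then resolve named references starting from Root. This is not the parser in parser.js. All function names below are hypothetical, plain objects replace the Immutable.js Records, the scalar set is the one listed earlier (Integer, Float, String), and treating the comma between fields as optional is an assumption.

```javascript
const SCALARS = ["Integer", "Float", "String"];

// Splits the literal into identifiers and punctuation; whitespace is ignored.
function tokenize(source) {
  const tokens = source.match(/[A-Za-z_]\w*|[{}\[\]:,]/g) || [];
  if (tokens.join("").length !== source.replace(/\s/g, "").length) {
    throw new SyntaxError("unexpected character in schema");
  }
  return tokens;
}

// First pass: Map of type name -> { fieldName: unresolved type reference }.
function parseDefinitions(tokens) {
  let pos = 0;
  const next = () => tokens[pos++];
  const expect = (token) => {
    const got = next();
    if (got !== token) throw new SyntaxError("expected " + token + " but got " + got);
  };
  const parseTypeRef = () => {
    const token = next();
    if (token === "[") {
      const ofType = parseTypeRef();
      expect("]");
      return { list: ofType };
    }
    if (!/^[A-Za-z_]/.test(token || "")) {
      throw new SyntaxError("expected a type name but got " + token);
    }
    return { ref: token };
  };
  const definitions = new Map();
  while (pos < tokens.length) {
    expect("type");
    const name = next();
    expect("{");
    const fields = {};
    while (tokens[pos] !== "}") {
      const fieldName = next();
      expect(":");
      fields[fieldName] = parseTypeRef();
      if (tokens[pos] === ",") pos++; // assumption: separators are optional
    }
    expect("}");
    definitions.set(name, fields);
  }
  return definitions;
}

// Second pass: replaces names with nested nodes; cycles are rejected.
function resolve(name, definitions, inProgress = new Set()) {
  if (SCALARS.includes(name)) return { name, kind: "SCALAR" };
  if (!definitions.has(name)) throw new ReferenceError("unknown type " + name);
  if (inProgress.has(name)) throw new Error("cyclical reference to " + name);
  inProgress.add(name);
  const fields = {};
  for (const [fieldName, typeRef] of Object.entries(definitions.get(name))) {
    fields[fieldName] = resolveRef(typeRef, definitions, inProgress);
  }
  inProgress.delete(name);
  return { name, kind: "OBJECT", fields };
}

function resolveRef(typeRef, definitions, inProgress) {
  return typeRef.list
    ? { kind: "LIST", ofType: resolveRef(typeRef.list, definitions, inProgress) }
    : resolve(typeRef.ref, definitions, inProgress);
}

function parse(source) {
  return resolve("Root", parseDefinitions(tokenize(source)));
}

const tree = parse(`
type Root{
  people:[Person]
}
type Person{
  name:String,
  age:Integer,
  counts:[Integer]
}`);
console.log(JSON.stringify(tree, null, 2));
```

Since the resolved tree embeds each referenced type, a type that refers back to itself cannot be built, so this sketch raises an error instead of recursing forever.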
Here are some points in need of discussion.
- type system features
- cyclical references
The current type system is limited to representing concrete unrelated types.
It also only supports a finite set of scalars.
What features do we need to support to make our type system useful?
We cannot represent cycles in our syntax trees, because each field embeds the full tree of its type. For example, the following type is a problem:
type Person{
name:String
friend:Person
}
To provide support for cycles, we have to modify the structure of the syntax trees.
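One possible direction, offered only as a starting point for the discussion, is to stop embedding object types inside fields and to refer to them by name instead. In the sketch below, the `REF` kind, the flat `schema` table and the `typeAt` helper are all hypothetical; plain frozen objects again stand in for Immutable.js Records.

```javascript
// Hypothetical "REF" node: points at an object type by name instead of embedding it.
const ref = (name) => Object.freeze({ kind: "REF", name });
const StringType = Object.freeze({ name: "String", kind: "SCALAR" });

// The schema becomes a flat table of named object types, so a type can
// mention itself without the tree becoming infinite.
const schema = Object.freeze({
  Person: Object.freeze({
    name: "Person",
    kind: "OBJECT",
    fields: Object.freeze({ name: StringType, friend: ref("Person") }),
  }),
});

// Follows a path of field names, dereferencing REF nodes lazily.
function typeAt(schema, typeName, path) {
  let type = schema[typeName];
  for (const fieldName of path) {
    if (type.kind === "REF") type = schema[type.name];
    type = type.fields[fieldName];
  }
  return type.kind === "REF" ? schema[type.name] : type;
}

console.log(typeAt(schema, "Person", ["friend", "friend", "name"]).name);
```

The trade-off is that a type node no longer carries everything needed to interpret it: consumers must always hold the schema table alongside the node they are inspecting.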