valentinpalkovic/prisma-json-schema-generator

Include "required" and "additionalProperties" keys to the object definitions

aflatoon2874 opened this issue · 10 comments

Feature Request

Include "required" and "additionalProperties" keys to the object definitions. The field list of "required" key should be as per the prisma model definition. The "required" is very much needed as the default behaviour of json schema specs is to allow empty objects to pass the validation.
For example:

{
  "definitions": {
    "address": {
      "type": "object",
      "properties": {
        "street_address": { "type": "string" },
        "city":           { "type": "string" },
        "state":          { "type": "string" }
      },
      "required": ["street_address", "city", "state"]
    }
  },
  "additionalProperties": false
}

@valentinpalkovic I saw your comment on the commit that removed the required functionality that you had implemented earlier. I have a suggested approach to address your concern.

I use id (autoincrement) as the primary key for my tables but do not expose it to the front-end app instead I expose cuid. Therefore, effectively unique key for my biz logic is cuid.

Organization Model
generator jsonSchema {
  provider = "node node_modules/prisma-json-schema-generator"
  primaryKey   = "cuid"     // Optional, if set use this field as the primary key otherwise use the prisma model primary key
  fieldsNotRequired = [ "createdAt", "updatedAt", "updatedBy" ]  // Optional
}

model Organization {
  id               Int                  @id @default(autoincrement())
  address          String
  name             String               @unique
  organizationType String               @map("org_type")
  status           String
  cuid             String               @unique @default(cuid())
  createdAt        DateTime             @default(now()) @map("created_at")
  updatedAt        DateTime             @updatedAt @map("updated_at")
  updatedBy        String?              @default("system") @map("updated_by")

  @@map("sys_t_organization")
}
Schema should look like this when primaryKey parameter is not set
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "definitions": {
    "Organization": {
      "type": "object",
      "properties": {
        "id": {
          "type": "integer"
        },
        "address": {
          "type": "string"
        },
        "name": {
          "type": "string"
        },
        "organizationType": {
          "type": "string"
        },
        "status": {
          "type": "string"
        },
        "cuid": {
          "type": "string"
        },
        "createdAt": {
          "type": "string",
          "format": "date-time"
        },
        "updatedAt": {
          "type": "string",
          "format": "date-time"
        },
        "updatedBy": {
          "type": [
            "string",
            "null"
          ]
        }
      },
      "if": {
        "properties": {
          "id": {
            "const": null
          }
        }
      },
      "then": {
        "required": [
          "address",
          "name",
          "organizationType",
          "status"
        ]
      },
      "else": {
        "anyOf": [
          {
            "required": [
              "address"
            ]
          },
          {
            "required": [
              "name"
            ]
          },
          {
            "required": [
              "organizationType"
            ]
          },
          {
            "required": [
              "status"
            ]
          }
        ]
      }
    },
  },
  "type": "object",
  "properties": {
    "organization": {
      "$ref": "#/definitions/Organization"
    }
  }
}
Schema should look like this when primaryKey parameter is set
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "definitions": {
    "Organization": {
      "type": "object",
      "properties": {
        "id": {
          "type": "integer"
        },
        "address": {
          "type": "string"
        },
        "name": {
          "type": "string"
        },
        "organizationType": {
          "type": "string"
        },
        "status": {
          "type": "string"
        },
        "cuid": {
          "type": "string"
        },
        "createdAt": {
          "type": "string",
          "format": "date-time"
        },
        "updatedAt": {
          "type": "string",
          "format": "date-time"
        },
        "updatedBy": {
          "type": [
            "string",
            "null"
          ]
        }
      },
      "if": {
        "properties": {
          "cuid": {
            "const": null
          }
        }
      },
      "then": {
        "required": [
          "address",
          "name",
          "organizationType",
          "status"
        ]
      },
      "else": {
        "anyOf": [
          {
            "required": [
              "address"
            ]
          },
          {
            "required": [
              "name"
            ]
          },
          {
            "required": [
              "organizationType"
            ]
          },
          {
            "required": [
              "status"
            ]
          }
        ]
      }
    },
  },
  "type": "object",
  "properties": {
    "organization": {
      "$ref": "#/definitions/Organization"
    }
  }
}
NOTE

Please note that when id or cuid is null then it will be treated as a new (create) request, therefore it will force all required fields to be present. And when id or cuid is not null then it will be treated as an update request, hence uses anyOf to enforce at least one of the fields to be present for updation.

Also note that this mitigates your concern of prisma select or include field selections in the result to be sent to the client. It will always pass as the result wil have either the id or the cuid, therefore else path will be followed.

fieldsNotRequired parameter

If this parameter is set then omit these fields from the required key values in both the if paths.

In principle a good idea. But you only treat the cases of creating or editing a new entity from the frontend. How should the schema validation looks like, if the user wants to fetch an entity with some specific fields (Graphql case). The frontend client can fetch any fields and does not have to fetch all of them at once. Therefore marking fields as required might only be true for (in REST language) PUT/POST requests. Therefore it is impossible at build/compile time to know, which fields are actually fetched by the client. Secondly, you cannot be sure, which fields have to be sent by the client and which can be delivered by other microservices or even be extracted from the user session information (information saved in JWT). The Prisma schema doesn't answer these types of questions and I have the feeling that your provided solution would cover one small use-case out of a bunch of possible use-cases.

Also, in the use-case of a backend-for-frontend solution, where most of the business logic lays in the backend, additional fields, which are not persisted in the database, might be sent to the frontend (persisted ui states, session related information,...). Marking additionalProperties: false would disallow this and would only allow sending database-related fields to the client and back.

I use graphql not rest api. In my opinion, all cases will be covered by setting additionalProperties: true with the suggested approach. There will be rare cases where front-end does not require the primary key along with other fields, in such a case of course the else path will not be followed and validation will throw errors. This case will be effective only when you want to validate the response of a controller/action before passing it to graphql that does not include the primary key.

The current generated schema is of no use as it will pass any object

  • empty objects
  • objects that do not have any of the properties in the schema

How do you propose to use the current state of generated schema for input validation in the resolver?

To be a bit more specific to my previous comment (because you haven't answered my concerns yet):

  1. Your proposed solution only works, if client-side data is validated. Any data, which is sent by the backend to the client cannot be validated, because the client could only request a subset of data. The validation would fail in the below example because all the other as required marked fields are not provided.
query getOrganization() {
  organization {
    id
    address
    name
  }
}
  1. Your proposed solution ignores computed fields, which derives out of database data.
query getOrganization() {
  organization {
    id
    address
    name
    someComputedValueWhichIsNotSavedIntoDatabase
  }
}

someComputedValueWhichIsNotSavedIntoDatabase is not saved in the prisma model and therefore setting additionalProperties: false the validation would fail.

  1. What happens, if the client wants to do a delete mutation? Then only the id or cuid is provided
mutation deleteOrganization($input: DeleteOrganizationInput) {
  id
}

Your proposed solution: "And when id or cuid is not null then it will be treated as an update request, hence uses anyOf to enforce at least one of the fields to be present for updation." wouldn't work.

But maybe the JSON schema validation is only used for client-data validation. In this case, my concerns mentioned in 1. and 2. don't matter. I just don't know currently, how the schema is used to validate data. Maybe you could provide some examples and use-cases which could show the usage of the generated JSON schema.

The current generated schema is of no use as it will pass any object

  • empty objects
  • objects that do not have any of the properties in the schema

Currently, that's true. Only known fields are validated and checked, whether the right type is passed or not. Unknown fields or empty objects are not validated.

Point 1. The solution can be enhanced to include all the fields in the else path of anyOf and setting the additionalProperties: true as default. Then the client will be able to validate the api response with additional computed fields as well. The only case where it will fail is in case the id or cuid is not inluded in the graphql query. It will become mandatory to include the id or cuid to make it work.

Point 2. Is addressed by setting additionalProperties: true as default.

Point 3. Why anyone would like to do an explicit schema validation for delete mutation. The id or cuid will be inherently validated by graphql automatically, just use the value as-is to delete the record.

Here's my solution to this problem, a simple utility function:

export const requireSchema = <T>(
    schema: { type: string; properties: T },
    required: (keyof T)[]
) => {
    return { ...schema, required }
}

It takes a schema definition with a list of required params. Here's an example using it with fastify:

server.route<{ Body: Lead }>({
    method: "POST",
    url: "/leads",
    schema: {
        body: requireSchema(schema.definitions.Lead, [
            "first_name",
            "last_name",
            "email",
        ]),
    },
    handler: async (req, reply) => {
        // ...
    },
})

This allows to enforce which fields are necessary on a route-by-route basis instead of just based on the database. Since it makes use of typescript generics, the array of required fields is validated against the schema, so its type-safe. With this method you could make the mistake of leaving out a field required by the database, but at that point Prisma will throw an error anyway. For me this is a decent middle-ground.

This library isn't usable without required

This library isn't usable without required

I think we have to talk about the abilities of the Prisma schema and its main purpose.

Per docs, the Prisma schema consists of three parts:

Data sources: Specify the details of the data sources Prisma should connect to (e.g. a PostgreSQL database)

Generators: Specifies what clients should be generated based on the data model (e.g. Prisma Client)

Data model definition: Specifies your application models (the shape of the data per data source) and their relations

The Prisma model definitions "represent the entities of your application domain, map to the tables (relational databases like PostgreSQL) or collections (MongoDB) in your database, form the foundation of the queries available in the generated Prisma Client API and when used with TypeScript, Prisma Client provides generated type definitions for your models and any variations of them to make database access entirely type-safe." - https://www.prisma.io/docs/concepts/components/prisma-schema/data-model

By only parsing the Prisma model, it is not possible to figure out, which endpoints will be created, whether it is a GraphQL endpoint or a plain REST endpoint. The actual code of your backend defines, which data the user has to send to create, read, update or delete data. Already mentioned techniques like computed fields makes things even more unpredictable. It would be necessary to parse your endpoints to create JSON schema definitions for each of them.

As @auderer already mentioned, you need validation on a route-by-route base, instead of purely relying on the type definition of the database models. The generated JSON schema can help you, but currently, it's still your responsibility to set the "required" flag on a route-by-route base.

Let's take a look, what would happen if all the fields would be marked as required:

Read

Depending on where the schema validation happens, it can cause huge problems, if all the fields are marked as required (except the fields which have the attributes @id, @default or @updatedAt). Talking about GraphQL, the client is able to fetch only a subset of a specific model, which is required for a specific purpose. Talking about REST, you might create REST endpoints, which will only deliver a subset of the defined model to the client.

Update

Updating a model can happen in a lot of different ways. The client might send only a subset of data (patch, a partial update) or it has to update all of the fields at once (put operation). The Prisma model doesn't define this behaviour, but your programming code does. It's impossible to guess, whether all the fields are required or only a subset without analyzing your endpoints.

Create

Creating a new entry in your database requires all the field information for a model. But it doesn't mean, that all the defined model fields have to be sent by the client. Maybe some information is aggregated or fetched from a different source. This means, that the client might only send a subset of the needed information and the rest is calculated/aggregated/pulled from another source.

As you can see, it is not possible to just mark all the fields as required, because a couple of cases are not covered.

@ShivamJoker Nevertheless, if you have some smart ideas to tackle the mentioned issues, I would be happy to continue the discussion and maybe find a way of supporting your needed use case. Maybe you can start by describing your issue in detail with some examples to continue the discussion upon it.

image

I have to add additionalProperties to the schema if i have extra properties, otherwise the prisma will give me a 5xx.

Mix prisma json schema with other keys is a not good idea.