frees-io/freestyle-cassandra

Spike - Schema Validator

Closed this issue · 3 comments

Research and discussion about the Schema Validator feature

Overview

The idea behind this issue is to have an operation like this:

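import cats.data.ValidatedNel

trait SchemaValidator {
  // SchemaError is a placeholder for whatever error type we settle on
  def validateStatement(st: Statement): ValidatedNel[SchemaError, Unit]
}

class DefaultSchemaValidator(sdp: SchemaDefinitionProvider) extends SchemaValidator {
  def validateStatement(st: Statement): ValidatedNel[SchemaError, Unit] = {
    // Check against sdp.schemaDefinition:
    // * Table name
    // * Field/Column names
    // * Field/Column types
    ???
  }
}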

trait SchemaDefinitionProvider {
  def schemaDefinition: SchemaDefinition
}

implicit def schemaValidator(implicit sdp: SchemaDefinitionProvider): SchemaValidator =
  new DefaultSchemaValidator(sdp)

This would allow us to abstract over both the "Schema Definition" and the "Schema Validator".
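To make the three checks concrete, here is a minimal sketch of what the validator body could look like. It is only an illustration under several assumptions: the `SchemaDefinition` and `Statement` shapes are hypothetical simplifications, and `Either[List[String], Unit]` stands in for cats' `ValidatedNel` so the snippet needs no dependencies.

```scala
object SchemaValidatorSketch {
  // Hypothetical, simplified model: table name -> (column name -> CQL type name)
  final case class SchemaDefinition(tables: Map[String, Map[String, String]])

  // Hypothetical statement shape: the target table and the typed columns it touches
  final case class Statement(table: String, columns: Map[String, String])

  trait SchemaDefinitionProvider {
    def schemaDefinition: SchemaDefinition
  }

  trait SchemaValidator {
    // Left carries every error found; a stand-in for cats' ValidatedNel
    def validateStatement(st: Statement): Either[List[String], Unit]
  }

  class DefaultSchemaValidator(sdp: SchemaDefinitionProvider) extends SchemaValidator {
    def validateStatement(st: Statement): Either[List[String], Unit] =
      sdp.schemaDefinition.tables.get(st.table) match {
        case None =>
          // Table name check
          Left(List(s"Unknown table: ${st.table}"))
        case Some(schemaColumns) =>
          // Column name and column type checks, in column-name order
          val errors = st.columns.toList.sortBy(_._1).flatMap { case (name, tpe) =>
            schemaColumns.get(name) match {
              case None =>
                List(s"Unknown column: ${st.table}.$name")
              case Some(expected) if expected != tpe =>
                List(s"Type mismatch for ${st.table}.$name: expected $expected, found $tpe")
              case Some(_) =>
                Nil
            }
          }
          if (errors.isEmpty) Right(()) else Left(errors)
      }
  }

  implicit def schemaValidator(implicit sdp: SchemaDefinitionProvider): SchemaValidator =
    new DefaultSchemaValidator(sdp)
}
```

Collecting all column errors rather than failing on the first one mirrors what `ValidatedNel` would give us in the real signature.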

With the above definition, we could have a query executor asking for an implicit SchemaValidator. For example:

def getAll[T](query: Query)(implicit sv: SchemaValidator): F[Either[Error, Seq[T]]] = ???

Notes:

  • Statement could be "SELECT" but also "INSERT", "UPDATE", ...
  • Query is a subtype of Statement
  • SchemaDefinitionProvider could be implemented programmatically as a fixed object, by some kind of parser that reads SQL schema files, or by any other approach.
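As a rough illustration of these notes, the sketch below wires a fixed-object provider into an executor through implicit resolution. All shapes here are assumptions made for the example: the `Statement` hierarchy, the `SchemaDefinition` model, and the `FixedSchema` object are hypothetical, the effect type `F` is left out, and `Either[List[String], _]` replaces the real error type.

```scala
object QueryExecutorSketch {
  // Hypothetical statement hierarchy: Query is one subtype of Statement
  sealed trait Statement { def table: String }
  final case class Query(table: String, columns: List[String]) extends Statement
  final case class Insert(table: String, columns: List[String]) extends Statement

  // Hypothetical schema model: table name -> known column names
  final case class SchemaDefinition(tables: Map[String, Set[String]])

  trait SchemaDefinitionProvider { def schemaDefinition: SchemaDefinition }

  trait SchemaValidator {
    def validateStatement(st: Statement): Either[List[String], Unit]
  }

  // Derives a validator from whichever provider is in implicit scope
  // (only the table name is checked here, to keep the example short)
  implicit def schemaValidator(implicit sdp: SchemaDefinitionProvider): SchemaValidator =
    new SchemaValidator {
      def validateStatement(st: Statement): Either[List[String], Unit] =
        if (sdp.schemaDefinition.tables.contains(st.table)) Right(())
        else Left(List(s"Unknown table: ${st.table}"))
    }

  // A provider defined programmatically as a fixed object
  object FixedSchema extends SchemaDefinitionProvider {
    val schemaDefinition: SchemaDefinition =
      SchemaDefinition(Map("users" -> Set("id", "name")))
  }

  // The executor validates first; fetching rows is out of scope,
  // so a valid query simply yields an empty result
  def getAll[T](query: Query)(implicit sv: SchemaValidator): Either[List[String], Seq[T]] =
    sv.validateStatement(query).map(_ => Seq.empty[T])
}
```

With `implicit val provider: SchemaDefinitionProvider = FixedSchema` in scope, a call such as `getAll[String](Query("users", List("name")))` picks up the derived validator without the caller naming it. Swapping in a file-based provider would not change the executor.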

Some comments

  1. Do we want to have support for more than one version of CQL?
  2. We definitely want a cql interpolator that reports errors at compile time.
    2.1. What's the best way to achieve that?
    2.2. Is this possible with the above approach?
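One possible starting point for question 2 is a plain `StringContext` extension that asks for the implicit SchemaValidator. The sketch below only validates at runtime, when the interpolated string is built; moving the check to compile time would need a macro with the schema available during compilation, which is not shown. The `Statement` wrapper and the `Either`-based result are assumptions made for this example.

```scala
object CqlInterpolatorSketch {
  // Hypothetical statement wrapper holding the raw CQL text
  final case class Statement(cql: String)

  trait SchemaValidator {
    def validateStatement(st: Statement): Either[List[String], Unit]
  }

  // Adds a cql"..." interpolator that requires a validator in implicit scope
  implicit class CqlStringContext(sc: StringContext) {
    def cql(args: Any*)(implicit sv: SchemaValidator): Either[List[String], Statement] = {
      // Build the final string with the standard s-interpolator, then validate it
      val st = Statement(sc.s(args: _*))
      sv.validateStatement(st).map(_ => st)
    }
  }
}
```

With a validator in scope, `cql"SELECT * FROM $table"` returns either the validated `Statement` or the list of errors, so the same SchemaValidator abstraction could later back a macro-based version.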

Existing Libraries

Troy

Type-safe & compile-time-checked wrapper around the Cassandra driver.

Troy contains some functionality that could be used for this feature:

The module cql-ast contains a complete definition of all possible values for schemas, tables, statements, ...
We could use this module for our schema and statement definition.

The module cql-parser contains a CQL parser for files. This could be used to create an implementation of the SchemaDefinitionProvider that reads CQL files from the file system.

The module troy-schema contains some schema-related classes and validators. This could be used to create an implementation of the SchemaValidator.

Some thoughts about using Troy:

  • We'll be closely coupled to the Troy project
  • We'll be tied to CQL v3.4.3
  • The schema validation is not exposed as a functionality
  • To my knowledge, the schema validation doesn't support type checking

I've prepared a proof of concept using Troy

Quill

To my knowledge, Quill provides neither schema validation nor a schema model, so it can be discarded for this topic.

Phantom

Phantom provides a model for keyspace and tables and a schema validator. It also adds support in the pro version for parsing cql schema files.

The problem is that Phantom doesn't expose these features as standalone functions. They are tightly coupled to the rest of the library.

Conclusions

We could use Troy to implement all desired functionality because it provides all the needed pieces.

On the other hand, it's likely Quill & Phantom will implement the free monad and DBIO. We could implement some integrations once this feature is ready.

@raulraja the spike is ready, could you please take a look? Thanks

Let's move forward with those troy modules for the time being and see what limitations they impose.