Improve specification validation errors
philippjfr opened this issue · 1 comments
When a dashboard or component specification is malformed, the errors are not always clear. This issue should track any unclear error messages to make sure we improve them. The goal will be to provide a `lumen validate` command that can validate a YAML spec to the best of its ability without pulling any data from the `Source`.
Goals
- Add a `lumen validate` command that can provide useful validation without explicitly instantiating a full dashboard
- Implement `validate` methods for all components
- Validation errors are highly precise and informative, including the ability to highlight the exact part of the specification where the error occurred.
Design
The overall idea is that each component implements a `validate` classmethod which can validate the contents of a specification without instantiating the component and without pulling in data, but is still able to attempt to resolve references and variables. Additionally, it should be able to highlight the specific part of the specification that caused the validation error.
The validate method therefore must accept the specification for the component itself but also be given a validation context (to resolve references and variables).
Let us take a simple dashboard spec as an example:
config:
  title: Palmer Penguins
  theme: dark
  layout: tabs
variables:
  data:
    type: constant
    default: https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-07-28/penguins.csv
sources:
  penguins:
    type: file
    cache_dir: ./cache
    tables:
      penguins: $variables.data
pipelines:
  penguins:
    source: penguins
    table: penguins
    filters:
      species:
        type: widget
        field: species
      island:
        type: widget
        field: island
      sex:
        type: widget
        field: sex
      expr:
        type: param
        parameter: scatter.selection_expr
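To make the reference handling concrete, here is a minimal sketch of how a `$variables.data` reference such as the one in `sources.penguins.tables` might be resolved against the validation context without loading any data. The helper name `resolve_reference`, the use of `ValueError`, and the assumption that the context stores each validated variable as a dict whose value lives under `default` are all hypothetical and not part of Lumen.

```python
def resolve_reference(value, context):
    """Resolve a '$variables.<name>' reference against the validation context.

    Hypothetical helper: values that are not references are returned unchanged.
    """
    prefix = '$variables.'
    if not (isinstance(value, str) and value.startswith(prefix)):
        return value
    name = value[len(prefix):]
    variables = context.get('variables', {})
    if name not in variables:
        raise ValueError(
            f"Reference {value!r} could not be resolved; "
            f"known variables: {sorted(variables)}"
        )
    # Assumption: the context holds each validated variable spec as a dict
    # and a constant variable's value is stored under 'default'.
    return variables[name].get('default')


# Context as it might look after the variables section has been validated
context = {
    'variables': {
        'data': {'type': 'constant', 'default': 'penguins.csv', 'name': 'data'}
    }
}
tables = {'penguins': '$variables.data'}
resolved = {name: resolve_reference(ref, context) for name, ref in tables.items()}
```

Only the string is resolved here; the file it points to is never opened, which matches the goal of validating without pulling data.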
Validation must happen in the following order:
1. variables
2. config
3. sources
4. pipelines
   a. filters
   b. transforms
5. targets
   a. pipeline
   b. facet
   c. views
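The order matters because later sections refer to earlier ones by name. As a rough illustration, a pipeline's `source` key can only be checked once the sources have been validated and stored in the context. The function name `check_pipeline_source` and the use of `ValueError` are assumptions made for this sketch, not existing Lumen APIs.

```python
def check_pipeline_source(pipeline_spec, context):
    """Check that a pipeline's 'source' names an already validated source."""
    source = pipeline_spec.get('source')
    known = context.get('sources', {})
    if source not in known:
        raise ValueError(
            f"Pipeline references source {source!r}, "
            f"but the validated sources are {sorted(known)}."
        )
    return pipeline_spec


# Sources were validated first, so the pipeline's reference can be checked.
context = {'variables': {}, 'sources': {'penguins': {'type': 'file'}}}
checked = check_pipeline_source({'source': 'penguins', 'table': 'penguins'}, context)
```

If pipelines were validated before sources, the same check would fail for every pipeline, since the context would not yet contain any source names.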
Now let us step through the process of validation:
def validate(dashboard_spec):
    # Collects validated specs so that later components can resolve references to earlier ones
    context = {'variables': {}, 'sources': {}, 'pipelines': {}, 'targets': []}
    for var_name, var_spec in dashboard_spec.get('variables', {}).items():
        spec_path = f'variables.{var_name}'
        context['variables'][var_name] = Variable.validate(
            dict(var_spec, name=var_name), context, dashboard_spec, spec_path
        )
    context['config'] = Config.validate(dashboard_spec.get('config', {}), context, dashboard_spec, 'config')
    for source_name, source_spec in dashboard_spec.get('sources', {}).items():
        spec_path = f'sources.{source_name}'
        context['sources'][source_name] = Source.validate(
            dict(source_spec, name=source_name), context, dashboard_spec, spec_path
        )
    ...  # pipelines and targets follow in the validation order given above
and the validate signature is always:
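@classmethod
def validate(cls, spec, context, full_spec=None, spec_path=None):
    """
    Validates the component specification given the validation context.

    Arguments
    -----------
    spec: dict
      The specification for the component being validated.
    context: dict
      Validation context contains the specification of all previously validated components, e.g. to allow resolving of references.
    full_spec: dict (optional)
      The full dashboard specification, used to report errors in context.
    spec_path: str (optional)
      Dotted path to the component within the full specification, e.g. 'sources.penguins'.

    Returns
    --------
    Validated specification.
    """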
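As a sketch of what a component following this signature could look like, the example below validates a spec against a list of required keys without instantiating anything. `ExampleComponent` and its `_required_keys` attribute are hypothetical and not part of Lumen, and `ValueError` is only a stand-in for the `ValidationError` proposed in this issue.

```python
class ExampleComponent:
    """Hypothetical component used only to illustrate the signature."""

    # Assumption: each component declares the keys its spec must contain.
    _required_keys = ('type',)

    @classmethod
    def validate(cls, spec, context, full_spec=None, spec_path=None):
        where = spec_path or cls.__name__
        if not isinstance(spec, dict):
            raise ValueError(
                f"{where}: specification must be a mapping, "
                f"got {type(spec).__name__}."
            )
        missing = [key for key in cls._required_keys if key not in spec]
        if missing:
            raise ValueError(
                f"{where}: specification is missing required key(s) {missing}."
            )
        # Return a copy so callers can store it in the context safely.
        return dict(spec)


validated = ExampleComponent.validate(
    {'type': 'file', 'name': 'penguins'}, {}, spec_path='sources.penguins'
)
```

Because `validate` is a classmethod, it can run on the raw spec before any component instance exists, which is what allows validation without building the dashboard.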
ValidationError
We should implement a `ValidationError` which can be given an error message, the validation context, the full spec and the path, and then generates a helpful error message pointing to the issue in the context of the full specification.
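One possible shape for such an exception is sketched below. It assumes that `spec_path` is a dot-separated path of dictionary keys (so keys containing dots are not handled) and that printing the offending part of the spec with `pprint` is an acceptable way to highlight it. The constructor arguments follow the description above, while the message layout is purely illustrative.

```python
import pprint

_MISSING = object()  # sentinel: the path could not be followed in the spec


class ValidationError(ValueError):
    """Sketch of an error that points at the offending part of a spec."""

    def __init__(self, msg, context=None, full_spec=None, spec_path=None):
        self.msg = msg
        self.context = context
        self.full_spec = full_spec
        self.spec_path = spec_path
        super().__init__(self._render())

    def _lookup(self):
        """Follow the dotted spec_path through the full spec."""
        node = self.full_spec
        for key in self.spec_path.split('.'):
            if not isinstance(node, dict) or key not in node:
                return _MISSING
            node = node[key]
        return node

    def _render(self):
        if self.full_spec is None or not self.spec_path:
            return self.msg
        lines = [self.msg, f"  at: {self.spec_path}"]
        snippet = self._lookup()
        if snippet is not _MISSING:
            lines.append("  offending specification:")
            lines.extend(
                '    ' + line for line in pprint.pformat(snippet).splitlines()
            )
        return '\n'.join(lines)


full_spec = {'sources': {'penguins': {'type': 'fil'}}}
err = ValidationError(
    "Source type 'fil' is not recognized.",
    full_spec=full_spec, spec_path='sources.penguins'
)
message = str(err)
```

Subclassing `ValueError` is a design choice made for the sketch so that existing `except ValueError` handlers would still catch it; the real implementation may choose a different base class.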
Task List
- ValidationError
- Implement validate for Dashboard
- Implement validate for Config
- Implement validate for Variable
- Implement validate for Source
- Implement validate for Filter
- Implement validate for Transform
- Implement validate for SQLTransform
- Implement validate for View
- Implement validate for Target