
CSV Schema Definition and Validation

Install

pip install csv-schema

Usage

usage: csv-schema [-h] [--version] {validate-config,validate-csv,generate-config} ...

CSV Schema

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

Commands:
  {validate-config,validate-csv,generate-config}
    validate-config     Validates the CSV schema JSON configuration file.
    validate-csv        Validates a CSV file against a schema.
    generate-config     Generate a CSV schema JSON configuration file.

Definition

The CSV file schema is defined in a JSON file with the following format.

File Definition:

{
  "name": null,
  "description": null,
  "filename": {
    "regex": null
  },
  "columns": []
}

| Property | Description |
| --- | --- |
| name | The name of the schema. |
| description | The description of the schema. |
| filename | Properties for validating the name of the CSV file. |
| columns | List of column definitions. |

filename

{
  "regex": null
}

| Property | Description |
| --- | --- |
| regex | Regular expression to validate the name of the CSV file being validated. |
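
As a rough illustration of what a `filename.regex` check involves, the sketch below tests a file's base name against a pattern using Python's `re` module. The helper name `filename_matches`, the sample pattern, and the choice of `re.fullmatch` (rather than a partial match) are assumptions made for this example; csv-schema's own matching semantics may differ.

```python
import os
import re


def filename_matches(path, regex):
    """Return True if the base name of `path` matches `regex`.

    Hypothetical helper, not part of csv-schema. A null (None) regex is
    treated here as "no restriction", which is an assumption.
    """
    if regex is None:
        return True
    return re.fullmatch(regex, os.path.basename(path)) is not None


# Hypothetical pattern: names such as "orders_2024-01-31.csv".
pattern = r"orders_\d{4}-\d{2}-\d{2}\.csv"
print(filename_matches("/data/orders_2024-01-31.csv", pattern))  # True
print(filename_matches("/data/orders.csv", pattern))             # False
```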

Column Definitions:

string

{
  "type": "string",
  "name": null,
  "required": true,
  "null_or_empty": false,
  "regex": null,
  "min": null,
  "max": null
}

| Property | Description |
| --- | --- |
| type | The column type. |
| name | The name of the column. |
| required | Whether or not the column is required in the file. |
| null_or_empty | Whether or not the value can be null (missing) or an empty string. |
| regex | Regular expression to validate the column value. |
| min | The minimum length of the string. null for no limit. |
| max | The maximum length of the string. null for no limit. |
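
To make the string properties concrete, here is a minimal sketch of how a single value could be checked against a string column definition. The function `check_string` and its error messages are hypothetical, and treating `min`/`max` as inclusive length bounds and `regex` as a full match are assumptions; this is not csv-schema's actual implementation.

```python
import re


def check_string(value, column):
    """Return a list of error messages for `value` under a string column.

    Hypothetical helper for illustration only.
    """
    errors = []
    # Null or empty values are only acceptable when null_or_empty is true.
    if value is None or value == "":
        if not column.get("null_or_empty", False):
            errors.append("value is null or empty")
        return errors
    # min and max are string lengths; None (JSON null) means no limit.
    if column.get("min") is not None and len(value) < column["min"]:
        errors.append("shorter than min length")
    if column.get("max") is not None and len(value) > column["max"]:
        errors.append("longer than max length")
    if column.get("regex") is not None and not re.fullmatch(column["regex"], value):
        errors.append("does not match regex")
    return errors


# Hypothetical column definition using the documented properties.
column = {"type": "string", "name": "code", "required": True,
          "null_or_empty": False, "regex": "[A-Z]+", "min": 2, "max": 4}
print(check_string("ABC", column))    # []
print(check_string("abcde", column))  # too long and fails the regex
print(check_string("", column))       # null or empty
```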

integer

{
  "type": "integer",
  "name": null,
  "required": true,
  "null_or_empty": false,
  "regex": null,
  "min": null,
  "max": null
}

| Property | Description |
| --- | --- |
| type | The column type. |
| name | The name of the column. |
| required | Whether or not the column is required in the file. |
| null_or_empty | Whether or not the value can be null (missing) or an empty string. |
| regex | Regular expression to validate the column value. |
| min | The minimum value. null for no limit. |
| max | The maximum value. null for no limit. |
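
For integer columns, `min` and `max` bound the parsed value rather than the text length. The sketch below shows one way such a check might look; `check_integer` is a hypothetical helper, the bounds are assumed to be inclusive, and parsing with Python's `int()` is an assumption about what counts as a valid integer.

```python
def check_integer(value, column):
    """Return a list of error messages for `value` under an integer column.

    Hypothetical helper for illustration only; `value` is the raw CSV text.
    """
    if value is None or value == "":
        if column.get("null_or_empty", False):
            return []
        return ["value is null or empty"]
    try:
        number = int(value)
    except ValueError:
        return ["not an integer"]
    errors = []
    # None (JSON null) means no limit on that side.
    if column.get("min") is not None and number < column["min"]:
        errors.append("below min")
    if column.get("max") is not None and number > column["max"]:
        errors.append("above max")
    return errors


# Hypothetical column definition using the documented properties.
column = {"type": "integer", "name": "quantity", "required": True,
          "null_or_empty": False, "regex": None, "min": 0, "max": 100}
print(check_integer("42", column))   # []
print(check_integer("-1", column))   # ['below min']
print(check_integer("4.2", column))  # ['not an integer']
```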

decimal

{
  "type": "decimal",
  "name": null,
  "required": true,
  "null_or_empty": false,
  "regex": null,
  "min": null,
  "max": null,
  "precision": 2
}

| Property | Description |
| --- | --- |
| type | The column type. |
| name | The name of the column. |
| required | Whether or not the column is required in the file. |
| null_or_empty | Whether or not the value can be null (missing) or an empty string. |
| regex | Regular expression to validate the column value. |
| min | The minimum value. null for no limit. |
| max | The maximum value. null for no limit. |
| precision | The decimal point precision. |
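
The sketch below illustrates one possible reading of `precision`: the maximum number of digits allowed after the decimal point. That interpretation, the helper name `check_decimal`, and the inclusive `min`/`max` bounds are all assumptions for this example and may not match how csv-schema evaluates decimal columns. The value is assumed to be the raw, non-empty CSV text.

```python
from decimal import Decimal, InvalidOperation


def check_decimal(value, column):
    """Return a list of error messages for `value` under a decimal column.

    Hypothetical helper for illustration only.
    """
    try:
        number = Decimal(value)
    except InvalidOperation:
        return ["not a decimal"]
    if not number.is_finite():  # reject NaN and Infinity
        return ["not a decimal"]
    errors = []
    # A negative exponent counts the digits after the decimal point.
    places = max(0, -number.as_tuple().exponent)
    precision = column.get("precision")
    if precision is not None and places > precision:
        errors.append("too many decimal places")
    # Convert bounds via str() so float bounds compare exactly as written.
    if column.get("min") is not None and number < Decimal(str(column["min"])):
        errors.append("below min")
    if column.get("max") is not None and number > Decimal(str(column["max"])):
        errors.append("above max")
    return errors


# Hypothetical column definition using the documented properties.
column = {"type": "decimal", "name": "unit_price", "required": True,
          "null_or_empty": False, "regex": None, "min": 0, "max": None,
          "precision": 2}
print(check_decimal("9.99", column))   # []
print(check_decimal("9.999", column))  # ['too many decimal places']
print(check_decimal("-1.00", column))  # ['below min']
```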

enum

{
  "type": "enum",
  "name": null,
  "required": true,
  "null_or_empty": false,
  "values": []
}

| Property | Description |
| --- | --- |
| type | The column type. |
| name | The name of the column. |
| required | Whether or not the column is required in the file. |
| null_or_empty | Whether or not the value can be null (missing) or an empty string. |
| values | Fixed set of constants. |
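
Putting the file and column definitions together, the following sketch builds a complete schema configuration in Python and serializes it to JSON. Only the property names come from the definitions above; the schema name, column names, regex, and limits are hypothetical sample values.

```python
import json

# All names and values below are hypothetical sample data; only the
# property keys follow the documented schema format.
schema = {
    "name": "orders",
    "description": "Daily order export.",
    "filename": {"regex": r"orders_\d{8}\.csv"},
    "columns": [
        {"type": "string", "name": "order_id", "required": True,
         "null_or_empty": False, "regex": None, "min": 1, "max": 20},
        {"type": "integer", "name": "quantity", "required": True,
         "null_or_empty": False, "regex": None, "min": 1, "max": None},
        {"type": "decimal", "name": "unit_price", "required": True,
         "null_or_empty": False, "regex": None, "min": 0, "max": None,
         "precision": 2},
        {"type": "enum", "name": "status", "required": False,
         "null_or_empty": True, "values": ["open", "shipped", "cancelled"]},
    ],
}

# Python None becomes JSON null, and True/False become true/false.
config_text = json.dumps(schema, indent=2)
print(config_text)
```

The resulting text can be saved to a file and checked with the `validate-config` command.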

Development

Setup

pipenv --three
pipenv shell
make pip_install
make build
make install_local

See Makefile for all commands.

Update README Definitions

The schema definitions are auto-generated by running scripts/generate_md_help.py. Copy and paste the output into the README.md file.

Testing

  • Create and activate a virtual environment (see Setup).
  • Run the tests: make test