Need to clarify if `required` means a required column OR also required valid data within column
Opened this issue · 3 comments
As an implementer of GMNS, I'd like to understand whether the `required` constraint applies to values, to columns, or to both.
The Frictionless spec really just checks for column presence and allows missing values.
If we don't want missing values, then we need to assert `pattern`, `enum`, or other constraints.
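To make the two readings concrete, here is a minimal sketch in plain Python. It does not use the `frictionless` package; the sample data and the helper names `column_present` and `rows_missing_value` are hypothetical and only illustrate the distinction being asked about.

```python
import csv
import io

# Hypothetical extract of a node table; the second data row has an empty node_id.
SAMPLE = "node_id,x_coord\n1,-71.10\n,-71.11\n"


def column_present(text, name):
    """Reading 1: the column only has to exist in the header."""
    reader = csv.DictReader(io.StringIO(text))
    return name in (reader.fieldnames or [])


def rows_missing_value(text, name, missing_values=("",)):
    """Reading 2: return 1-based data-row numbers whose cell counts as missing."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        i for i, row in enumerate(reader, start=1)
        if row.get(name) in missing_values
    ]


print(column_present(SAMPLE, "node_id"))      # True: the column is there
print(rows_missing_value(SAMPLE, "node_id"))  # [2]: but row 2 has no value
```

Under the first reading this file would pass; under the second it would fail on row 2.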
Our intent was to use the term `required` to require column presence and prohibit missing values. I think this lines up with the definition used by Frictionless Table Schema's constraints:
| Property | Type | Applies to | Description |
|---|---|---|---|
| `required` | boolean | All | Indicates whether this field cannot be `null`. If `required` is `false` (the default), then `null` is allowed. See the section on `missingValues` for how, in the physical representation of the data, strings can represent `null` values. |
Is the issue actually in the `frictionless` Python package's implementation of the term?
@dtemkin-volpe flagging for your work with `frictionless`. I know the Python package has been updated since I made this comment last year. Has it changed what it means by `required`?
From what I can tell, `required` just means that the field can't be null, and `missingValues` is an array of values that, when processed by the `frictionless` Python package, equate to null. For example, we list an empty string in `missingValues` for the `node` table, and you can see in the `cambridge_intersection.sqlite` file that these are represented as null values and not as empty strings. So, if `required` is true and `""` is in `missingValues`, then `""` can't be a possible value in that entry (or at the very least, it'll raise an error when we try to create an SQLite database).
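A rough sketch of that behavior, using only Python's standard `sqlite3` module rather than the `frictionless` package. It assumes that a `required` field ends up as a `NOT NULL` column and that strings in `missingValues` are converted to null before insertion; the two-column table and the helpers `to_null` and `load_nodes` are hypothetical.

```python
import sqlite3

MISSING_VALUES = [""]  # assumed to mirror the schema's missingValues


def to_null(cell, missing_values=MISSING_VALUES):
    """Map any string listed in missingValues to None (SQL NULL)."""
    return None if cell in missing_values else cell


def load_nodes(rows):
    """Insert rows into an in-memory table whose node_id column is NOT NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE node (node_id TEXT NOT NULL, name TEXT)")
    conn.executemany(
        "INSERT INTO node VALUES (?, ?)",
        [(to_null(a), to_null(b)) for a, b in rows],
    )
    return conn


# Optional field left empty: stored as NULL, not as an empty string.
conn = load_nodes([("1", "")])
print(conn.execute("SELECT name IS NULL FROM node").fetchone()[0])  # 1

# Required field left empty: the NOT NULL constraint rejects the row.
try:
    load_nodes([("", "Main St")])
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)
```

If the real pipeline behaves this way, an empty string in a `required` field fails at database creation time even though the column itself is present.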