Unexpected "missing-label" error with option `header_case = False`
amelie-rondot opened this issue · 0 comments
amelie-rondot commented
Overview
During the migration from v4 to v5 of frictionless-py in validata.fr, we encountered an unexpected missing-label error when validating tabular data with the header_case=False dialect option, where a column label differs only in case from the corresponding schema field name (for example, lower case in the data and upper case in the schema).
For example:
data = [["aa", "BB"], ["a", "b"]]
schema = {
    "$schema": "https://frictionlessdata.io/schemas/table-schema.json",
    "fields": [
        {"name": "AA", "constraints": {"required": True}},
        {"name": "bb", "constraints": {"required": True}}
    ]
}
Using Python, the validation report is invalid and contains two missing-label errors:
import frictionless

if __name__ == "__main__":
    # `data` and `schema` are the objects defined in the example above
    report = frictionless.validate(frictionless.Resource(
        source=data,
        schema=frictionless.Schema.from_descriptor(schema),
        dialect=frictionless.Dialect(header_case=False),
        detector=frictionless.Detector(schema_sync=True)
    ))
    # Expect valid report
    print(report)
Output:
{'valid': False,
 'stats': {'tasks': 1, 'errors': 2, 'warnings': 0, 'seconds': 0.004},
 'warnings': [],
 'errors': [],
 'tasks': [{'name': 'memory',
            'type': 'table',
            'valid': False,
            'place': '<memory>',
            'labels': ['aa', 'BB'],
            'stats': {'errors': 2,
                      'warnings': 0,
                      'seconds': 0.004,
                      'fields': 4,
                      'rows': 1},
            'warnings': [],
            'errors': [{'type': 'missing-label',
                        'title': 'Missing Label',
                        'description': 'Based on the schema there should be a '
                                       "label that is missing in the data's "
                                       'header.',
                        'message': "There is a missing label in the header's "
                                   'field "AA" at position "3"',
                        'tags': ['#table', '#header', '#label'],
                        'note': '',
                        'labels': ['aa', 'BB'],
                        'rowNumbers': [1],
                        'label': '',
                        'fieldName': 'AA',
                        'fieldNumber': 3},
                       {'type': 'missing-label',
                        'title': 'Missing Label',
                        'description': 'Based on the schema there should be a '
                                       "label that is missing in the data's "
                                       'header.',
                        'message': "There is a missing label in the header's "
                                   'field "bb" at position "4"',
                        'tags': ['#table', '#header', '#label'],
                        'note': '',
                        'labels': ['aa', 'BB'],
                        'rowNumbers': [1],
                        'label': '',
                        'fieldName': 'bb',
                        'fieldNumber': 4}]}]}
Expected behaviour
According to the documentation of the HeaderCase Dialect parameter, I expected a valid report.
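To make the expectation concrete, here is a minimal sketch of the matching I had in mind. It is plain Python and does not use frictionless at all: `missing_fields` is a hypothetical helper written only for this issue, and it assumes that header_case=False means labels and field names are compared case-insensitively. It is not meant to describe how frictionless implements this internally.

```python
from typing import List


def missing_fields(labels: List[str], field_names: List[str],
                   header_case: bool = True) -> List[str]:
    """Return schema field names that have no matching label in the header.

    Hypothetical helper for illustration only, not frictionless internals.
    Assumption: header_case=False means a case-insensitive comparison.
    """
    def normalize(name: str) -> str:
        # Keep the name as-is when case matters, otherwise lower-case it
        return name if header_case else name.lower()

    present = {normalize(label) for label in labels}
    return [name for name in field_names if normalize(name) not in present]


labels = ["aa", "BB"]        # header of the data above
field_names = ["AA", "bb"]   # field names of the schema above

# Case-sensitive comparison: neither field matches a label
print(missing_fields(labels, field_names, header_case=True))   # ['AA', 'bb']
# Case-insensitive comparison: every field matches, nothing is missing
print(missing_fields(labels, field_names, header_case=False))  # []
```

With header_case=False I would therefore expect no missing-label error. The actual report instead lists 4 fields and places "AA" and "bb" at positions 3 and 4, which looks as if the schema fields were added as extra missing fields rather than matched against the existing labels "aa" and "BB".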
Other details and experimentations
Used Frictionless version 5.16.1 (latest commit on the main branch).
Same result with command-line validation.
I set "schema-sync" to reproduce our use case more closely, but it does not seem to be related to the actual issue.