frictionlessdata/frictionless-py

Unexpected "missing-label" error with option `header_case = False`

amelie-rondot opened this issue · 0 comments

Overview

During the migration from v4 to v5 of frictionless-py in validata.fr, we encountered an unexpected missing-label error when validating tabular data with the header_case=False dialect option, where a column label in the data differs only in case from the corresponding field name in the schema (for example, lower case in the data and upper case in the schema).

For example:

data = [["aa", "BB"], ["a", "b"]]
schema = {
    "$schema": "https://frictionlessdata.io/schemas/table-schema.json",
    "fields": [
        {"name": "AA", "constraints": {"required": True}},
        {"name": "bb", "constraints": {"required": True}},
    ],
}

Using Python, the validation report is invalid and contains two missing-label errors:

import frictionless

if __name__ == "__main__":
    # Build the resource from the in-memory rows and schema defined above
    report = frictionless.validate(frictionless.Resource(
        source=data,
        schema=frictionless.Schema.from_descriptor(schema),
        dialect=frictionless.Dialect(header_case=False),
        detector=frictionless.Detector(schema_sync=True)
    ))

    # Expect valid report
    print(report)

Output:

{'valid': False,
 'stats': {'tasks': 1, 'errors': 2, 'warnings': 0, 'seconds': 0.004},
 'warnings': [],
 'errors': [],
 'tasks': [{'name': 'memory',
            'type': 'table',
            'valid': False,
            'place': '<memory>',
            'labels': ['aa', 'BB'],
            'stats': {'errors': 2,
                      'warnings': 0,
                      'seconds': 0.004,
                      'fields': 4,
                      'rows': 1},
            'warnings': [],
            'errors': [{'type': 'missing-label',
                        'title': 'Missing Label',
                        'description': 'Based on the schema there should be a '
                                       "label that is missing in the data's "
                                       'header.',
                        'message': "There is a missing label in the header's "
                                   'field "AA" at position "3"',
                        'tags': ['#table', '#header', '#label'],
                        'note': '',
                        'labels': ['aa', 'BB'],
                        'rowNumbers': [1],
                        'label': '',
                        'fieldName': 'AA',
                        'fieldNumber': 3},
                       {'type': 'missing-label',
                        'title': 'Missing Label',
                        'description': 'Based on the schema there should be a '
                                       "label that is missing in the data's "
                                       'header.',
                        'message': "There is a missing label in the header's "
                                   'field "bb" at position "4"',
                        'tags': ['#table', '#header', '#label'],
                        'note': '',
                        'labels': ['aa', 'BB'],
                        'rowNumbers': [1],
                        'label': '',
                        'fieldName': 'bb',
                        'fieldNumber': 4}]}]}
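
To make the report easier to scan, the errors can be reduced to a few columns. This is a minimal sketch in plain Python that works on a trimmed copy of the dictionary printed above; the helper name `summarize_errors` is hypothetical and not part of frictionless.

```python
# Trimmed copy of the report printed above (only the keys used here).
report_dict = {
    "valid": False,
    "tasks": [
        {
            "labels": ["aa", "BB"],
            "errors": [
                {"type": "missing-label", "fieldName": "AA", "fieldNumber": 3},
                {"type": "missing-label", "fieldName": "bb", "fieldNumber": 4},
            ],
        }
    ],
}


def summarize_errors(report):
    """Return (type, fieldName, fieldNumber) for every task-level error.

    Hypothetical helper for illustration; not a frictionless API.
    """
    return [
        (error["type"], error["fieldName"], error["fieldNumber"])
        for task in report["tasks"]
        for error in task["errors"]
    ]


for row in summarize_errors(report_dict):
    print(row)
```

The data has only two labels, yet the missing fields are reported at positions 3 and 4 and the stats show 4 fields. This suggests that the schema fields "AA" and "bb" were not matched to the labels "aa" and "BB" and were added as extra fields instead.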

Expected behaviour

According to the documentation of the HeaderCase Dialect parameter, I expected a valid report.
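
The expectation can be illustrated with a small sketch of case-insensitive matching between header labels and schema field names. This is not the frictionless implementation; the function `missing_labels` is a hypothetical helper, and matching by name (rather than by position) is an assumption made to mirror the schema_sync use case.

```python
def missing_labels(labels, field_names, header_case=True):
    """Return the schema field names that have no matching header label.

    Hypothetical helper for illustration; not a frictionless API.
    When header_case is False, names are compared case-insensitively.
    """
    normalize = (lambda text: text) if header_case else str.lower
    present = {normalize(label) for label in labels}
    return [name for name in field_names if normalize(name) not in present]


labels = ["aa", "BB"]        # header row of the data
field_names = ["AA", "bb"]   # field names declared in the schema

# Case-sensitive comparison: both fields are reported as missing.
print(missing_labels(labels, field_names, header_case=True))   # ['AA', 'bb']

# Case-insensitive comparison: nothing is missing.
print(missing_labels(labels, field_names, header_case=False))  # []
```

With the case-sensitive comparison, the sketch reports the same two fields as the actual output above, which is what I would expect only when header_case is left at its default.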

Other details and experimentations

Used Frictionless version 5.16.1, last commit on main branch

Same result with command line validation.
I enabled "schema-sync" to reproduce our use case more closely, but it does not seem to be related to the actual issue.