frictionlessdata/frictionless-py

Adding --parallel CLI flag breaks validation

diego-oncoramedical opened this issue · 0 comments

Overview

Adding --parallel to frictionless validate causes validation to fail immediately with an error. I could really use this flag because multiple files in this dataset are at least 2 GB, some approaching 7 GB.

Details

My package passes validation when I run this command:

frictionless validate --trusted package.json

However, if I add --parallel, it fails immediately:

/tmp # frictionless validate --trusted --parallel package.json
╭─ Error ───────────────────────────────────────────────────────╮
│ name 'Resource' is not defined                                │
╰───────────────────────────────────────────────────────────────╯

I have the following package definition file, here formatted as YAML instead of JSON for readability:

resources:
  - encoding: utf-8
    format: csv
    mediatype: text/csv
    name: medical_observations
    path: /var/dataset/output/Observations_20230917102633.csv
    schema: /app/schemas/medical/observations.yaml
    type: table
  - encoding: utf-8
    format: csv
    mediatype: text/csv
    name: medical_medications
    path: /var/dataset/output/Medications_20230917102633.csv
    schema: /app/schemas/medical/medications.yaml
    type: table
  - encoding: utf-8
    format: csv
    mediatype: text/csv
    name: medical_vitals
    path: /var/dataset/output/Vitals_20230917102633.csv
    schema: /app/schemas/medical/vitals.yaml
    type: table
  - encoding: utf-8
    format: csv
    mediatype: text/csv
    name: medical_patient
    path: /var/dataset/output/Patient_20230917102632.csv
    schema: /app/schemas/medical/patient.yaml
    type: table
  - encoding: utf-8
    format: csv
    mediatype: text/csv
    name: medical_toxicity
    path: /var/dataset/output/Toxicity_20230917102635.csv
    schema: /app/schemas/medical/toxicity.yaml
    type: table
  - encoding: utf-8
    format: csv
    mediatype: text/csv
    name: medical_encounter
    path: /var/dataset/output/Encounter_20230917102627.csv
    schema: /app/schemas/medical/encounter.yaml
    type: table
  - encoding: utf-8
    format: csv
    mediatype: text/csv
    name: medical_problem
    path: /var/dataset/output/Problem_20230917102634.csv
    schema: /app/schemas/medical/problem.yaml
    type: table
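
As a possible stopgap while --parallel is broken, here is a minimal sketch that splits the package into one single-resource descriptor per resource and runs the plain `frictionless validate --trusted` command from above on each of them concurrently. The helper names (`split_package`, `validate_command`, `validate_each`) are hypothetical and not part of frictionless. The sketch assumes the descriptor is JSON, that the CLI exits non-zero when validation fails, and that the resources do not reference each other (for example through foreign keys), since each one is validated in isolation.

```python
import json
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def split_package(package_path):
    """Write one single-resource descriptor per resource, next to the original.

    Returns a dict mapping resource name to the path of its new descriptor.
    """
    package_path = Path(package_path)
    package = json.loads(package_path.read_text(encoding="utf-8"))
    paths = {}
    for resource in package["resources"]:
        # Copy the package-level keys but keep only this one resource.
        single = dict(package, resources=[resource])
        out = package_path.with_name(f"{package_path.stem}.{resource['name']}.json")
        out.write_text(json.dumps(single, indent=2), encoding="utf-8")
        paths[resource["name"]] = out
    return paths


def validate_command(descriptor_path):
    # Same invocation as in this report, without --parallel.
    return ["frictionless", "validate", "--trusted", str(descriptor_path)]


def validate_each(paths, build_command=validate_command, workers=4):
    """Run one CLI subprocess per descriptor concurrently.

    Returns a dict mapping resource name to the subprocess exit code.
    ASSUMPTION: a non-zero exit code means validation failed.
    """
    def run(item):
        name, path = item
        proc = subprocess.run(build_command(path), capture_output=True)
        return name, proc.returncode

    # Threads only wait on the child processes; the actual validation work
    # happens in the separate frictionless processes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(run, paths.items()))
```

Calling `validate_each(split_package("package.json"))` would return something like a name-to-exit-code mapping for the seven resources above. With files this large, `workers` probably should not exceed the number of CPU cores or what the disk can sustain.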

Environment

This is running on the official Docker image for Python on Alpine 3.19. I have tried Python 3.12.2, 3.11.7, and 3.10.13.

uname -a:

Linux 3ff4bcba25f0 6.6.12-linuxkit #1 SMP Fri Jan 19 08:53:17 UTC 2024 aarch64 Linux

frictionless --version:

5.16.1

pip3 freeze:

annotated-types==0.6.0
attrs==23.2.0
certifi==2024.2.2
chardet==5.2.0
charset-normalizer==3.3.2
click==8.1.7
colorama==0.4.6
frictionless==5.16.1
humanize==4.9.0
idna==3.6
isodate==0.6.1
Jinja2==3.1.3
jsonschema==4.17.3
markdown-it-py==3.0.0
marko==2.0.2
MarkupSafe==2.1.5
mdurl==0.1.2
petl==1.7.14
pydantic==2.6.1
pydantic_core==2.16.2
Pygments==2.17.2
pyrsistent==0.20.0
python-dateutil==2.8.2
python-slugify==8.0.4
PyYAML==6.0.1
requests==2.31.0
rfc3986==2.0.0
rich==13.7.0
setuptools==69.1.0
shellingham==1.5.4
simpleeval==0.9.13
six==1.16.0
stringcase==1.2.0
tabulate==0.9.0
text-unidecode==1.3
typer==0.9.0
typing_extensions==4.9.0
urllib3==2.2.1
validators==0.22.0
wheel==0.42.0