sayari-analytics/pyisic

fix: invalid codes in concordances


Description

Briefly describe the issue you are experiencing or the feature you want to see added.

  • there are codes in these concordances that are not valid according to their respective standards
    • NAICS2017_to_ISIC4
    • TSIC2552_to_ISIC3
    • KSIC10_to_ISIC4
    • SKD2002_to_SKD2008
    • SKD2008_to_SKD2002
    • CNAE2_to_ISIC4
    • NACEBEL2003_to_NACEBEL2008
  • for example, ISIC4 code "111" does not exist but is listed in the NAICS2017_to_ISIC4 concordance.

Todo

  • add a test case that checks every code listed in every concordance against its standard and reports any invalid codes
  • fix the above concordances
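
A rough sketch of what the check in the first todo item could look like. It deliberately does not use pyisic internals: `find_invalid_codes`, the pair-list layout of the concordance, and the toy data are all assumptions for illustration, not the library's actual API or tables.

```python
from typing import Dict, Iterable, List, Set, Tuple


def find_invalid_codes(
    concordance: Iterable[Tuple[str, str]],
    valid_source: Set[str],
    valid_target: Set[str],
) -> Dict[str, List[str]]:
    """Return the source and target codes that are missing from their standards.

    Hypothetical helper: the concordance is assumed to be an iterable of
    (source_code, target_code) pairs, and each standard a set of valid codes.
    """
    bad_source = set()
    bad_target = set()
    for source_code, target_code in concordance:
        if source_code not in valid_source:
            bad_source.add(source_code)
        if target_code not in valid_target:
            bad_target.add(target_code)
    return {"source": sorted(bad_source), "target": sorted(bad_target)}


# Toy data modelled on the example in this issue, not the real tables.
naics2017_to_isic4 = [("111110", "111"), ("111120", "0111")]
valid_naics2017 = {"111110", "111120"}
valid_isic4 = {"0111"}

report = find_invalid_codes(naics2017_to_isic4, valid_naics2017, valid_isic4)
print(report)
```

In a real test this would run once per concordance and assert that both lists are empty, so a single failing concordance shows exactly which codes are wrong.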

Environment

  • OS - macOS Monterey
  • Python version - 3.7
  • Package version - 0.1.8

Details

If necessary, describe the problem you have been experiencing in more detail.

Code To Reproduce

If reporting a defect, please include a code sample to reproduce the defect and outline the expected behavior.

import pyisic

standard, code = next(iter(pyisic.ToISIC4("111110", pyisic.Standards.NAICS2017)))
print(standard, code)
print(pyisic.ISIC4[code])
# currently outputs
# Standards.ISIC4 111
# Traceback (most recent call last):
#   File "/path/to/file.py", line 110, in <module>
#     print(pyisic.ISIC4[code])
# KeyError: '111'

# should output
# Standards.ISIC4 0111
# {'code': '0111', 'description': 'Growing of cereals (except rice),, leguminous crops and oil seeds', 'category': <Category.CLASS: 4>}

Here are all of the invalid codes:

{
    "NAICS2017 => ISIC4": {
        "ISIC4": [
            "899",
            "312",
            "127",
            "610",
            "113",
            "810",
            "520",
            "114",
            "112",
            "145",
            "144",
            "163",
            "149",
            "146",
            "510",
            "121",
            "123",
            "122",
            "129",
            "119",
            "240",
            "893",
            "710",
            "125",
            "116",
            "891",
            "126",
            "220",
            "892",
            "729",
            "124",
            "164",
            "150",
            "311",
            "128",
            "111",
            "115",
            "130",
            "230"
        ],
        "NAICS2017": [
            "0"
        ]
    },
    "TSIC2552 => ISIC3": {
        "ISIC3": [
            "3666",
            "9306",
            "2729",
            "2526"
        ],
        "TSIC2552": [
            "01210",
            "03113",
            "01259",
            "01450",
            "01301",
            "08931",
            "03224",
            "01169",
            "03122",
            "03222",
            "01239",
            "03213",
            "27502",
            "01115",
            "01500",
            "01251",
            "01223",
            "01111",
            "01420",
            "01630",
            "01299",
            "01269",
            "01495",
            "03225",
            "01463",
            "08104",
            "01462",
            "05200",
            "01272",
            "01249",
            "01252",
            "01619",
            "01140",
            "03119",
            "01282",
            "01135",
            "06100",
            "01229",
            "01224",
            "09100",
            "07299",
            "02100",
            "03129",
            "02300",
            "01491",
            "01629",
            "03219",
            "07210",
            "08103",
            "01150",
            "01492",
            "01430",
            "01612",
            "01494",
            "09900",
            "01114",
            "07292",
            "07300",
            "01411",
            "01139",
            "08932",
            "01112",
            "01442",
            "03121",
            "01134",
            "07291",
            "01496",
            "01291",
            "01199",
            "25119",
            "02400",
            "03115",
            "03223",
            "03211",
            "01412",
            "07100",
            "01461",
            "08991",
            "01493",
            "03112",
            "08910",
            "01289",
            "08999",
            "01261",
            "03229",
            "01469",
            "01136",
            "08920",
            "02200",
            "01226",
            "03114",
            "01132",
            "01192",
            "03214",
            "01193",
            "08101",
            "01133",
            "01194",
            "01611",
            "03212",
            "01131",
            "01225",
            "01302",
            "01113",
            "08102",
            "01281",
            "01222",
            "01499",
            "03111",
            "01700",
            "25201",
            "01161",
            "01241",
            "01279",
            "01122",
            "01262",
            "06200",
            "01227",
            "01221",
            "01441",
            "01228",
            "01292",
            "01231",
            "01191",
            "01271",
            "01121",
            "03221",
            "05100",
            "01640",
            "01419",
            "01621"
        ]
    },
    "KSIC10 => ISIC4": {
        "KSIC10": [
            "14300",
            "64911",
            "06020"
        ]
    },
    "SKD2002 => SKD2008": {
        "SKD2002": [
            "17.200",
            "64.110",
            "96.000",
            "62.300",
            "15.620",
            "63.400",
            "97.000",
            "20.520"
        ]
    },
    "SKD2008 => SKD2002": {
        "SKD2002": [
            "17.200",
            "64.110",
            "96.000",
            "62.300",
            "15.620",
            "63.400",
            "97.000",
            "20.520"
        ]
    },
    "CNAE2 => ISIC4": {
        "ISIC4": [
            "6022",
            "6021"
        ]
    },
    "NACEBEL2003 => NACEBEL2008": {
        "NACEBEL2003": [
            "",
            "3310"
        ],
        "NACEBEL2008": [
            ""
        ]
    }
}
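
One observation on the NAICS2017 => ISIC4 list: every invalid ISIC4 value there is three digits long, and the example above ("111" vs "0111") suggests the leading zero was dropped somewhere. That is only a guess about the cause. A small sketch for testing the guess, where `suggest_zero_padded` is a hypothetical helper and the set of valid codes is a tiny illustrative subset, not the full standard:

```python
from typing import Dict, Iterable, Set


def suggest_zero_padded(
    invalid_codes: Iterable[str],
    valid_codes: Set[str],
    width: int = 4,
) -> Dict[str, str]:
    """Map each invalid code to its zero-padded form when that form is valid.

    Codes that are already `width` characters long, or whose padded form is
    still not in the standard, are left out and need a manual look.
    """
    suggestions = {}
    for code in invalid_codes:
        padded = code.zfill(width)
        if padded != code and padded in valid_codes:
            suggestions[code] = padded
    return suggestions


# Illustrative subset only; the real check would use the full ISIC4 table.
valid_isic4 = {"0111", "0112", "0899"}
invalid = ["111", "899", "6022"]

suggestions = suggest_zero_padded(invalid, valid_isic4)
print(suggestions)
```

Codes such as "6022" from the CNAE2 => ISIC4 list are already four digits, so padding does not explain them and they are left out of the suggestions.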