datasets/country-codes

Use Goodtables for continuous data validation

Closed this issue · 10 comments

Suggestion: use Goodtables.io. See datasets-br/state-codes for an example that already uses it. You can also run the validation offline with the goodtables-py CLI (goodtables-py#cli).

Warnings

  • Data Package "datapackage.json" has a validation error Descriptor validation error: 'title' is a required property at "sources/0" in descriptor and at "properties/sources/items/required" in profile
  • ... same warning for "sources/1", "sources/2"... "sources/6".
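To see locally which entries trigger this warning, something like the following rough sketch may help. It is plain Python using only the standard library, not the goodtables API; the helper name `sources_missing_title` and the sample descriptor are made up for illustration and are not taken from the real datapackage.json.

```python
import json


def sources_missing_title(descriptor):
    """Return the indexes of entries in descriptor["sources"] with no non-empty "title"."""
    return [
        index
        for index, source in enumerate(descriptor.get("sources", []))
        if not source.get("title")
    ]


# Hypothetical descriptor fragment, NOT the real datapackage.json.
sample = json.loads("""
{
  "name": "example",
  "sources": [
    {"name": "First source", "path": "https://example.org/a"},
    {"title": "Second source", "path": "https://example.org/b"}
  ]
}
""")

# Each index corresponds to a "sources/<index>" position in the warning text.
print(sources_missing_title(sample))
```

To check a real file, load it with `json.load(open("datapackage.json"))` and pass the result to the same helper.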

Errors

The datatype of column ISO4217-currency_minor_unit must be an array or list; likewise for ISO4217-currency_numeric_code. The type "number" allows only a single number per cell.

[27,12] [type-or-format-error] The value "2,2" in row 27 and column 12 is not type "number" and format "default"
[27,14] [type-or-format-error] The value "356,064" in row 27 and column 14 is not type "number" and format "default"
[59,12] [type-or-format-error] The value "2,2" in row 59 and column 12 is not type "number" and format "default"
[59,14] [type-or-format-error] The value "192,931" in row 59 and column 14 is not type "number" and format "default"
[72,12] [type-or-format-error] The value "2,2" in row 72 and column 12 is not type "number" and format "default"
[72,14] [type-or-format-error] The value "222,840" in row 72 and column 14 is not type "number" and format "default"
[101,12] [type-or-format-error] The value "2,2" in row 101 and column 12 is not type "number" and format "default"
[101,14] [type-or-format-error] The value "332,840" in row 101 and column 14 is not type "number" and format "default"
[127,12] [type-or-format-error] The value "2,2" in row 127 and column 12 is not type "number" and format "default"
[127,14] [type-or-format-error] The value "426,710" in row 127 and column 14 is not type "number" and format "default"
[153,12] [type-or-format-error] The value "2,2" in row 153 and column 12 is not type "number" and format "default"
[153,14] [type-or-format-error] The value "516,710" in row 153 and column 14 is not type "number" and format "default"
[169,12] [type-or-format-error] The value "2,2" in row 169 and column 12 is not type "number" and format "default"
[169,14] [type-or-format-error] The value "590,840" in row 169 and column 14 is not type "number" and format "default"
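The failing cells all hold several comma-separated values, which is why a single "number" type rejects them. A minimal sketch of the distinction, in plain Python: `float()` is only a stand-in here for the validator's own number check, and both helper names are hypothetical.

```python
def is_single_number(value):
    """Return True if the cell parses as exactly one number (float() as a stand-in check)."""
    try:
        float(value)
    except ValueError:
        return False
    return True


def split_multi_value(value, separator=","):
    """Split a multi-valued cell into strings, keeping leading zeros such as "064"."""
    return [part.strip() for part in value.split(separator)]


# Values taken from the error report above, plus one single-valued cell.
for cell in ["2", "2,2", "356,064"]:
    if is_single_number(cell):
        print(cell, "-> single number")
    else:
        print(cell, "-> multiple values:", split_multi_value(cell))
```

Keeping the parts as strings is a deliberate choice in this sketch: numeric codes such as "064" would lose their leading zero if converted to numbers.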

PS: please change the URL http://data.okfn.org/data/country-codes (at this project's home) to a correct one.

Thanks for the suggestion and guidance, @ppKrauss

I've got a PR open for this. Will probably go ahead and merge in a couple hours but would welcome your thoughts as I'm not super familiar with goodtables

also #58 fixed the URL

Thanks again!

@ppKrauss I added the goodtables badge, but goodtables has not updated for recent commits. I couldn't find much documentation, so I'm not sure whether any other action is needed to get this working properly.
Any ideas?

Hi @ewheeler, I also see something strange about "title", but there are other errors that psql (COPY) also reports as errors: 2,2 is not a floating-point number. The problem is in CSV columns like ISO4217-currency_minor_unit.

The value "2,2" in row 2 and column 12 is not type "number" and format "default"
The value "332,840" in row 2 and column 14 is not type "number" and format "default"
...

@ppKrauss ISO4217-currency_minor_unit is type "string", so I'm not sure why you're seeing that.

https://github.com/datasets/country-codes/blob/master/datapackage.json#L131
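A quick way to confirm what a local checkout declares for a column is to read the type straight from the descriptor. This is only a sketch in plain Python: the helper `declared_field_type` is hypothetical, and the inline descriptor is a made-up fragment that mirrors the "string" type mentioned above, not a copy of the real file.

```python
def declared_field_type(descriptor, field_name):
    """Return the declared "type" of the first schema field with this name, or None."""
    for resource in descriptor.get("resources", []):
        for field in resource.get("schema", {}).get("fields", []):
            if field.get("name") == field_name:
                return field.get("type")
    return None


# Hypothetical descriptor fragment, NOT the real datapackage.json.
sample = {
    "resources": [
        {
            "path": "data/country-codes.csv",
            "schema": {
                "fields": [
                    {"name": "ISO4217-currency_minor_unit", "type": "string"},
                ]
            },
        }
    ]
}

print(declared_field_type(sample, "ISO4217-currency_minor_unit"))
```

If a local copy reports "number" here while master reports "string", the checkout is probably out of date.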

Oops, sorry... it's OK after a git pull. Everything passes with goodtables datapackage.json in the terminal!

The error messages on the online goodtables.io will perhaps disappear only after your next commit.

... Perhaps it is an internal problem (a version delay) at goodtables.io... They are developing/syncing with v1.0 of Frictionless Data.

Hi @roll, now I see that you were in contact with this dataset... Can you help here?

In the terminal, goodtables datapackage.json is fine; the problem is at https://goodtables.io/github/datasets/country-codes

roll commented

Hi @ppKrauss,

Can you please sync with @anuveyatsu, probably (https://gitter.im/datahubio/chat)? I think this repo just has to be re-activated on goodtables.io, but I don't have access to the datasets org on GitHub.

Hi @ppKrauss and @roll
I've just activated it 😄

roll commented

@anuveyatsu
Thanks! And NOW (upd.) it's green.

I think we can close this issue 😄

Success! Thanks @roll and @anuveyatsu !


Hi @ewheeler, now we can close this and focus on Wikidata, #53