ramSeraph/opendata

LGD: ZIP file not compatible with unzip

Closed this issue · 3 comments

The lgd scraper writes its archives with the ZIP_LZMA compression method, which unzip does not support. unzip is what most Unix environments (GitHub Actions runners, for example) have available by default.

Not a big issue, but perhaps worth documenting?

I am aware of the problem. I let it be because it led to smaller file sizes. 7zip works on the files. I will update the documentation to reflect this.
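For environments where 7zip is not installed, Python's standard zipfile module can also read LZMA-compressed members, provided the interpreter was built with the lzma module. Below is a minimal sketch of that route. The function names and the archive name in the usage note are illustrative assumptions, not part of the lgd scraper.

```python
import zipfile
from pathlib import Path


def compression_methods(archive_path):
    """Return the set of compression method ids used by the archive's members."""
    with zipfile.ZipFile(archive_path) as zf:
        return {info.compress_type for info in zf.infolist()}


def extract_archive(archive_path, dest_dir):
    """Extract every member into dest_dir and return the list of member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        # zipfile picks the decompressor per member, so LZMA members
        # are handled without any extra arguments.
        zf.extractall(dest)
        return zf.namelist()
```

Calling extract_archive("states.zip", "out") on a hypothetical states.zip would unpack it into out/. Checking zipfile.ZIP_LZMA in compression_methods("states.zip") tells you beforehand whether plain unzip is likely to fail on it.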

The documentation has been updated at https://ramseraph.github.io/opendata/lgd/ . UI is not my strong suit, but I hope it is readable enough.

In general, the current form in which the data is exported was supposed to be a stop-gap measure. The final format was meant to be parquet files, one for each component, containing all the historic data. I never got around to completing that, so I decided to at least export the data as is, so that people can build on top of it.

The data has also not really been validated, apart from some spot checks I have done.

If you can provide an example of how the parquet structure was planned, I could perhaps take a stab at it.