Checksum for PubLayNet_PDF.tar.gz
conjuncts opened this issue · 1 comment
conjuncts commented
Hello,
I tried downloading the PDF dataset, but I had only extracted around 10% of it before I ran into a data corruption error. Are checksums or data splits available for PubLayNet_PDF.tar.gz?
themanoftalent commented
It sounds like you're running into problems downloading the PubLayNet dataset. Without knowing where you're downloading it from, it's hard for me to give a precise solution, but I can offer some general advice.
- Check for Official Sources: Ensure that you're downloading the dataset from the official source.
- Checksums: Check if the dataset provider offers checksums for the files.
- Data Splits: Some datasets are split into multiple parts for easier downloading. Ensure that you've downloaded all parts.
- Redownload: If you suspect the downloaded file is corrupted, try downloading it again. An interrupted or incomplete transfer is a common cause of this kind of error.
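If it helps, here is a minimal Python sketch of the checksum and redownload checks above. It is only an illustration: the function names `sha256_of_file` and `archive_is_readable` are made up for this example, and I don't know of a published checksum for this file, so any expected digest would have to come from the dataset maintainers. The readability check only walks the archive's member headers, so it should catch truncation and many kinds of damage, but it is not a substitute for a real checksum.

```python
import hashlib
import tarfile
import zlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks so that
    a very large archive never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_readable(path):
    """Return True if every member header of a .tar.gz can be read.

    Walking the members forces the gzip stream to be decompressed
    sequentially, so a truncated or damaged download usually fails here.
    """
    try:
        with tarfile.open(path, "r:gz") as tar:
            for _ in tar:
                pass
        return True
    except (tarfile.TarError, EOFError, OSError, zlib.error):
        return False


# Example usage (the local path is an assumption):
#   print(sha256_of_file("PubLayNet_PDF.tar.gz"))
#   print(archive_is_readable("PubLayNet_PDF.tar.gz"))
```

If the digest differs between two downloads of the same file, at least one of them is corrupted. If `archive_is_readable` returns False, redownloading is probably the next step.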
Akif, the outlier