Issue with pelias download wof: corrupted SQLite during whosonfirst full planet data download
taminoelgert opened this issue · 2 comments
Describe the bug
When attempting a full planet build using Kubernetes, the pelias download wof command consistently throws the following error after downloading the whosonfirst data:
error: [whosonfirst] error downloading whosonfirst-data-admin-latest.db.bz2
Error: Command failed: curl -sA 'pelias-whosonfirst/0.0.0-development' https://data.geocode.earth/wof/dist/sqlite/whosonfirst-data-admin-latest.db.bz2 | lbunzip2 > /data/whosonfirst/sqlite/whosonfirst-data-admin-latest.db
lbunzip2: stdin: compressed data error: bad block header magic
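The failing command streams curl straight into lbunzip2 and redirects the result to the final .db path, so a transfer that breaks partway can leave a partial database at that path. As a hedged illustration only (this is not how the pelias downloader works, and the function name `decompress_bz2_atomically` is hypothetical), the sketch below assumes the .bz2 archive has already been saved to disk. It decompresses into a temporary file and moves the result into place only if the whole stream decodes cleanly.

```python
import bz2
import os
import tempfile


def decompress_bz2_atomically(src_path, dest_path, chunk_size=1024 * 1024):
    """Decompress src_path to dest_path; never leave a partial dest_path behind.

    Hypothetical helper for illustration, not part of pelias.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # Write next to the destination so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".partial")
    try:
        with os.fdopen(fd, "wb") as out, bz2.open(src_path, "rb") as src:
            while True:
                # Corrupt data raises OSError; a truncated stream raises EOFError.
                chunk = src.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
        # Reached only when the entire stream decoded without error.
        os.replace(tmp_path, dest_path)
    except Exception:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this approach a corrupted or truncated archive surfaces as an exception at download time, and later steps never see a half-written database file.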
Steps to Reproduce
- Use the full planet config
- Start the whosonfirst container with the ./bin/download command (pelias download wof)
Expected behavior
The pelias download wof command should download the whosonfirst data without encountering any errors.
Environment (please complete the following information):
- Kubernetes environment with 32 cores and 64 GB RAM (on Kubernetes nodes).
- Local environment with 24 cores and 32 GB RAM.
- OS: [e.g. Linux]
- Docker version 24.0.7, build afdd53b
Pastebin/Screenshots
pelias config:
{
  "logger": {
    "level": "info",
    "timestamp": true
  },
  "esclient": {
    "apiVersion": "7.x",
    "hosts": [
      {
        "protocol": "https",
        "host": "geocoder-es-http"
      }
    ]
  },
  "acceptance-tests": {
    "endpoints": {
      "docker": "http://pelias-api:4000/v1/"
    }
  },
  "api": {
    "services": {
      "placeholder": {
        "url": "http://pelias-placeholder:4100"
      },
      "interpolation": {
        "url": "http://pelias-interpolation:4300"
      },
      "libpostal": {
        "url": "http://pelias-libpostal:4400"
      }
    }
  },
  "imports": {
    "adminLookup": {
      "enabled": true
    },
    "geonames": {
      "datapath": "/data/geonames",
      "countryCode": "ALL"
    },
    "openstreetmap": {
      "download": [
        {
          "sourceURL": "https://planet.openstreetmap.org/pbf/planet-latest.osm.pbf"
        }
      ],
      "leveldbpath": "/tmp",
      "datapath": "/data/openstreetmap",
      "import": [
        {
          "filename": "planet-latest.osm.pbf"
        }
      ]
    },
    "openaddresses": {
      "datapath": "/data/openaddresses",
      "files": []
    },
    "polyline": {
      "datapath": "/data/polylines",
      "files": [
        "extract.0sv"
      ]
    },
    "whosonfirst": {
      "datapath": "/data/whosonfirst",
      "importPostalcodes": true
    },
    "interpolation": {
      "download": {
        "tiger": {
          "datapath": "/data/tiger"
        }
      }
    }
  }
}
Additional context
The issue can also be reproduced locally in a Docker environment by following the same steps up to the pelias download all command. Subsequent steps, such as placeholder prepare, fail because "the SQLite is corrupted."
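To confirm whether a downloaded database is actually damaged before running later steps, SQLite's built-in `PRAGMA integrity_check` can be run against the file. The sketch below is a minimal, hedged example using Python's standard `sqlite3` module. The helper name `sqlite_is_intact` is hypothetical, and the path in the usage note is inferred from the error message above, not taken from any pelias tooling.

```python
import sqlite3
from pathlib import Path


def sqlite_is_intact(db_path):
    """Return True if SQLite reports the file as structurally sound.

    Hypothetical helper for illustration, not part of pelias.
    """
    # Open read-only so a missing file is reported instead of being created.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError:
        # Raised for unreadable files, non-SQLite content, or malformed images.
        return False
    # A healthy database yields exactly one row containing "ok".
    return rows == [("ok",)]
```

For example, `sqlite_is_intact("/data/whosonfirst/sqlite/whosonfirst-data-admin-latest.db")` could be called after the download finishes. On a database of several gigabytes the full check can take a while; `PRAGMA quick_check` is a faster, less thorough alternative.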
Thank you for your assessment
Hi @taminoelgert, I wasn't able to reproduce this issue.
It might have been an intermittent connection issue with our CDN provider https://bunny.net/
Could you please confirm if the issue has resolved itself?
aria2c https://data.geocode.earth/wof/dist/sqlite/whosonfirst-data-admin-latest.db.bz2
03/11 15:29:45 [NOTICE] Downloading 1 item(s)
*** Download Progress Summary as of Mon Mar 11 15:30:47 2024 ***
=============================================================================
[#b8d2ec 6.2GiB/8.0GiB(78%) CN:1 DL:92MiB ETA:19s]
FILE: /tmp/whosonfirst-data-admin-latest.db.bz2
-----------------------------------------------------------------------------
[#b8d2ec 7.9GiB/8.0GiB(98%) CN:1 DL:108MiB]
03/11 15:31:06 [NOTICE] Download complete: /tmp/whosonfirst-data-admin-latest.db.bz2
Download Results:
gid |stat|avg speed |path/URI
======+====+===========+=======================================================
b8d2ec|OK | 105MiB/s|/tmp/whosonfirst-data-admin-latest.db.bz2
Status Legend:
(OK):download completed.
lbunzip2 -t whosonfirst-data-admin-latest.db.bz2
echo $?
0
Thanks for the reply. I just tried again and it now works without any problems, so I'll close the ticket. Thanks for the help!