Weird error while running parallel import of Rheinland-Pfalz data
Closed this issue · 9 comments
Hey, I'm running import of the ALKIs data Rheinland-Pfalz in parallel, and I'm getting these errors.
Do you know what could be the cause of them?
Below you have config_file that I'm using:
epsg 25832
schema alkis_rheinland_pfalz
historie off
jobs 6
avoiddupes off
update
debug off
log /mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/import_logs/2024_06_06_085030/batch1_import_log_file.log
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1001_Huemmel.xml
…
…
…
And logs:
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1227_Monreal.xml ERROR: Ergebnis 1 bei /mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1227_Monreal.xml (bislang 1 Fehler)
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1227_Monreal.xml TIME: /mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1227_Monreal.xml mit 145MiB in 4m21s importiert (570kiB/s; Gesamt:1248kiB/s).
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1227_Monreal.xml REMAINING: 36GiB 71% 8h30m47s ETA:Thu Jun 6 20:42:12 CEST 2024
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1227_Monreal.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1227_Monreal.xml ERROR 1: CREATE TABLE "alkis_rheinland_pfalz"."ax_besonderergebaeudepunkt" ( "ogc_fid" SERIAL, PRIMARY KEY ("ogc_fid"), "gml_id" VARCHAR(16), "anlass" varchar[], "beginnt" VARCHAR, "endet" VARCHAR, "advstandardmodell" varchar[], "sonstigesmodell" varchar[], "quellobjektid" VARCHAR, "zeigtaufexternes_art" varchar[], "zeigtaufexternes_name" varchar[], "zeigtaufexternes_uri" varchar[], "art" INTEGER, "hoehe" FLOAT8[], "obererbezugspunkt" INTEGER[], "untererbezugspunkt" INTEGER[], "punktkennung" VARCHAR, "sonstigeeigenschaft" varchar[], "zustaendigestelle_land" VARCHAR, "zustaendigestelle_stelle" VARCHAR, "bestehtaus" varchar[], "istteilvon" varchar[] )
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1227_Monreal.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1227_Monreal.xml ERROR 1: Unable to write feature 1 from layer ax_besonderergebaeudepunkt.
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1227_Monreal.xml ERROR 1: Terminating translation prematurely after failed
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1227_Monreal.xml translation of layer ax_besonderergebaeudepunkt (use -skipfailures to skip errors)
parallel: This job failed:
import /mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1227_Monreal.xml
parallel: Starting no more jobs. Waiting for 5 jobs to finish.
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1226_Reudelsterz.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1226_Reudelsterz.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1226_Reudelsterz.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1226_Reudelsterz.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1226_Reudelsterz.xml ERROR: Ergebnis 1 bei /mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1226_Reudelsterz.xml (bislang 2 Fehler)
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1226_Reudelsterz.xml TIME: /mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1226_Reudelsterz.xml mit 63MiB in 4m26s importiert (246kiB/s; Gesamt:1253kiB/s).
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1226_Reudelsterz.xml REMAINING: 36GiB 71% 8h27m54s ETA:Thu Jun 6 20:39:24 CEST 2024
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1226_Reudelsterz.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1226_Reudelsterz.xml ERROR 1: CREATE TABLE "alkis_rheinland_pfalz"."ax_besonderergebaeudepunkt" ( "ogc_fid" SERIAL, PRIMARY KEY ("ogc_fid"), "gml_id" VARCHAR(16), "anlass" varchar[], "beginnt" VARCHAR, "endet" VARCHAR, "advstandardmodell" varchar[], "sonstigesmodell" varchar[], "quellobjektid" VARCHAR, "zeigtaufexternes_art" varchar[], "zeigtaufexternes_name" varchar[], "zeigtaufexternes_uri" varchar[], "art" INTEGER, "hoehe" FLOAT8[], "obererbezugspunkt" INTEGER[], "untererbezugspunkt" INTEGER[], "punktkennung" VARCHAR, "sonstigeeigenschaft" varchar[], "zustaendigestelle_land" VARCHAR, "zustaendigestelle_stelle" VARCHAR, "bestehtaus" varchar[], "istteilvon" varchar[] )
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1226_Reudelsterz.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1226_Reudelsterz.xml ERROR 1: Unable to write feature 1 from layer ax_besonderergebaeudepunkt.
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1226_Reudelsterz.xml ERROR 1: Terminating translation prematurely after failed
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1226_Reudelsterz.xml translation of layer ax_besonderergebaeudepunkt (use -skipfailures to skip errors)
parallel: This job failed:
import /mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1226_Reudelsterz.xml
parallel: Starting no more jobs. Waiting for 4 jobs to finish.
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1236_Allenz.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1236_Allenz.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1236_Allenz.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1236_Allenz.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1228_Kehrig.xml TIME: /mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1228_Kehrig.xml mit 144MiB in 7m5s importiert (349kiB/s; Gesamt:1246kiB/s).
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1228_Kehrig.xml REMAINING: 36GiB 71% 8h28m50s ETA:Thu Jun 6 20:43:28 CEST 2024
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1236_Allenz.xml ERROR: Ergebnis 1 bei /mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1236_Allenz.xml (bislang 3 Fehler)
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1236_Allenz.xml TIME: /mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1236_Allenz.xml mit 92MiB in 5m56s importiert (265kiB/s; Gesamt:1253kiB/s).
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1236_Allenz.xml REMAINING: 36GiB 71% 8h24m28s ETA:Thu Jun 6 20:39:07 CEST 2024
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1236_Allenz.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1236_Allenz.xml ERROR 1: CREATE TABLE "alkis_rheinland_pfalz"."ax_besonderergebaeudepunkt" ( "ogc_fid" SERIAL, PRIMARY KEY ("ogc_fid"), "gml_id" VARCHAR(16), "anlass" varchar[], "beginnt" VARCHAR, "endet" VARCHAR, "advstandardmodell" varchar[], "sonstigesmodell" varchar[], "quellobjektid" VARCHAR, "zeigtaufexternes_art" varchar[], "zeigtaufexternes_name" varchar[], "zeigtaufexternes_uri" varchar[], "art" INTEGER, "hoehe" FLOAT8[], "obererbezugspunkt" INTEGER[], "untererbezugspunkt" INTEGER[], "punktkennung" VARCHAR, "sonstigeeigenschaft" varchar[], "zustaendigestelle_land" VARCHAR, "zustaendigestelle_stelle" VARCHAR, "bestehtaus" varchar[], "istteilvon" varchar[] )
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1236_Allenz.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1236_Allenz.xml ERROR 1: Unable to write feature 1 from layer ax_besonderergebaeudepunkt.
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1236_Allenz.xml ERROR 1: Terminating translation prematurely after failed
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1236_Allenz.xml translation of layer ax_besonderergebaeudepunkt (use -skipfailures to skip errors)
parallel: This job failed:
import /mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1236_Allenz.xml
parallel: Starting no more jobs. Waiting for 2 jobs to finish.
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1238_Hausen.xml TIME: /mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1238_Hausen.xml mit 133MiB in 8m28s importiert (269kiB/s; Gesamt:1233kiB/s).
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1238_Hausen.xml REMAINING: 36GiB 70% 8h30m55s ETA:Thu Jun 6 20:50:46 CEST 2024
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1237_Berresheim.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1237_Berresheim.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1237_Berresheim.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1237_Berresheim.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1237_Berresheim.xml ERROR: Ergebnis 1 bei /mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1237_Berresheim.xml (bislang 4 Fehler)
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1237_Berresheim.xml TIME: /mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1237_Berresheim.xml mit 56MiB in 9m12s importiert (104kiB/s; Gesamt:1238kiB/s).
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1237_Berresheim.xml REMAINING: 36GiB 70% 8h28m17s ETA:Thu Jun 6 20:48:09 CEST 2024
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1237_Berresheim.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1237_Berresheim.xml ERROR 1: CREATE TABLE "alkis_rheinland_pfalz"."ax_besonderergebaeudepunkt" ( "ogc_fid" SERIAL, PRIMARY KEY ("ogc_fid"), "gml_id" VARCHAR(16), "anlass" varchar[], "beginnt" VARCHAR, "endet" VARCHAR, "advstandardmodell" varchar[], "sonstigesmodell" varchar[], "quellobjektid" VARCHAR, "zeigtaufexternes_art" varchar[], "zeigtaufexternes_name" varchar[], "zeigtaufexternes_uri" varchar[], "art" INTEGER, "hoehe" FLOAT8[], "obererbezugspunkt" INTEGER[], "untererbezugspunkt" INTEGER[], "punktkennung" VARCHAR, "sonstigeeigenschaft" varchar[], "zustaendigestelle_land" VARCHAR, "zustaendigestelle_stelle" VARCHAR, "bestehtaus" varchar[], "istteilvon" varchar[] )
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1237_Berresheim.xml
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1237_Berresheim.xml ERROR 1: Unable to write feature 1 from layer ax_besonderergebaeudepunkt.
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1237_Berresheim.xml ERROR 1: Terminating translation prematurely after failed
/mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1237_Berresheim.xml translation of layer ax_besonderergebaeudepunkt (use -skipfailures to skip errors)
parallel: This job failed:
import /mnt/f/project_germany_100/rlp/alkis/ALKIS_Importer_Pipeline/input_batched/batch1/bda_5750_1237_Berresheim.xml
Why avoiddupes off
? I suppose there's duplicate data in the sets (at least AX_Bundesland
). BTW better diagnostics (ie. values of conflicting records) with usecopy off
AX_Bundesland
is a parameter in config file?
AX_Bundesland
is a parameter in config file?
No a conflicting record that probably appears in each set and will cause an error unless avoiddupes
is on.
I don't know, actually I thought that avoiddupes off
only raise warnings, and it will be nice to know if there are duplicates or not, but it's actually throwing errors and stopping the import.
So is it good practice to use avoiddupes on
as I understood correctly? I mean is there any downsides of using it?
Depends, avoiddupes
should not be necessary for normal deliveries, which shouldn't contain any duplicates - if they still contain duplicates then there should be a problem in the data.
In this case each zip is a separate delivery. Importing them into separate schemas should work without error and shouldn't need avoiddupes
. But you want to import them into one schema and hence the objects that reappear in several datasets will produce errors without avoiddupes
(eg. AX_Bundesland, AX_KreisRegion; maybe also other objects that intersects at the edges of the delivery). Hopefully in this case all data is from the same point in time, because otherwise you might find several versions of the same object in different deliveries (ie. same gml_id
different beginnt
), that would not violate any constraints, but might produce strange output later.
In addition to previous comment from above, I did run again and turn off PG_USE_COPY, so I can share here how the logs look like with usecopy off
avoiddupes
only works for already inserted rows. If a the same record appears in parallel transactions, none will see the other, in turn no inserts will be skip it and the conflict will arise, when the second transaction is committed.
So I'm guessing only solution to avoid this situation is run with jobs 1
?
Or is it possible to use parallel jobs and use skipfailures
?