TimSC/pycrocosm

Duplicate nodes appearing in planet dump

Closed this issue · 5 comments

TimSC commented

From a recent dump fosm-planet_20181016020000.o5m.gz provided by @4x4falcon running this query:

SELECT * FROM planet2_static_livenodes WHERE id=1000593496579;

gives me this:

      id       | changeset  | changeset_index | username | uid  | timestamp  | version | tags | geom
---------------+------------+-----------------+----------+------+------------+---------+------+----------------------------------------------------
 1000593496579 | 1000071674 |                 | Rosscoe | 4467 | 1457757963 |       1 | {}   | 0101000020E610000032F499FDCBEB624059750C7ED15440C0
 1000593496579 | 2000001771 |                 | Rosscoe | 4467 | 1539733222 |       2 | {}   | 0101000020E61000009241EE22CCEB6240B22A6794D35440C0
(2 rows)

I suspect a problem with the dump tool...

TimSC commented
SELECT * FROM planet_mod_nodeids WHERE id=1000593496579;

      id
---------------
 1000593496579
(1 row)

postgres is 9.5. This seems to limit the problem to a few SQL queries:

SELECT "planet2_static_livenodes".*, ST_X(geom) as lon, ST_Y(geom) AS lat FROM "planet2_static_livenodes" LEFT JOIN "planet2_mod_nodeids" ON "planet2_static_livenodes".id = "planet2_mod_nodeids".id WHERE "planet2_mod_nodeids".id IS NULL ORDER BY "planet2_static_livenodes".id;
SELECT "planet2_mod_livenodes".*, ST_X(geom) as lon, ST_Y(geom) AS lat FROM "planet2_mod_livenodes" ORDER BY "planet2_mod_livenodes".id;

Each statement is apparently producing one row. Only the second actually should be.

TimSC commented

Not all relevant tables were being locked. Attempted fix in f8ce0f4

TimSC commented

Latest version still has problems

ALTER TABLE "planet2_static_oldnodes" ADD PRIMARY KEY (id, version);
ALTER TABLE "planet2_static_oldways" ADD PRIMARY KEY (id, version);
ALTER TABLE "planet2_static_oldrelations" ADD PRIMARY KEY (id, version);
ALTER TABLE "planet2_static_livenodes" ADD PRIMARY KEY (id);
ERROR:  could not create unique index "planet2_static_livenodes_pkey"
DETAIL:  Key (id)=(1000593506563) is duplicated.

Timestamp of errant node seems to be about the time the dump was made.

TimSC commented

Error probably fixed by updating to latest pgmap. Reopen if problem occurs again.

TimSC commented

Fix is here: TimSC/pgmap#4

Dump fosm-planet_20190501174901.o5m.gz worked without any duplicates :)