Duplicate page and revision entries in iswiki (20140603)
glimmerphoenix opened this issue · 1 comment
glimmerphoenix commented
Found 107 duplicate page entries and 1000 duplicate revision entries in the database for language iswiki, dump date 20140603.
SELECT page_id FROM page GROUP BY page_id HAVING COUNT(*) >= 2;
...
107 rows in set (0.08 sec)
SELECT rev_id FROM revision GROUP BY rev_id HAVING COUNT(*) >= 2;
...
1000 rows in set (6.29 sec)
This should never occur: each page and revision element is parsed only once, and there should be no duplicate elements in the compressed XML dump file.
Further inspection is required to determine whether this is caused by a faulty dump file or by a problem with the parser.
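One way to rule out the dump file is to count ids directly in the XML, independently of the parser and the database. The following is only a minimal sketch, not the project's parser: it assumes the standard MediaWiki export layout (an `<id>` as a direct child of each `<page>` and `<revision>` element), and the helper names `local_name`, `child_text` and `find_duplicate_ids` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def child_text(elem, name):
    """Return the text of the first direct child called `name`, or None."""
    for child in elem:
        if local_name(child.tag) == name:
            return child.text
    return None


def find_duplicate_ids(stream):
    """Scan a MediaWiki XML export and report ids that occur more than once.

    `stream` is a binary file-like object. Returns two sorted lists of id
    strings: duplicated page ids and duplicated revision ids.
    """
    page_ids, rev_ids = Counter(), Counter()
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "revision":
            rev_id = child_text(elem, "id")
            if rev_id is not None:
                rev_ids[rev_id] += 1
            elem.clear()  # drop the revision text, which dominates memory use
        elif name == "page":
            page_id = child_text(elem, "id")
            if page_id is not None:
                page_ids[page_id] += 1
            elem.clear()
    dup_pages = sorted(k for k, n in page_ids.items() if n > 1)
    dup_revs = sorted(k for k, n in rev_ids.items() if n > 1)
    return dup_pages, dup_revs


# Usage (the file name below is a hypothetical placeholder):
#   import bz2
#   with bz2.open("dump.xml.bz2", "rb") as f:
#       dup_pages, dup_revs = find_duplicate_ids(f)
```

If both lists come back empty for the dump, the duplicates were introduced somewhere after the file, in parsing or loading.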
glimmerphoenix commented
Apparently a fictitious bug caused by consecutive executions, not by the dump file or the parser. Closing for now.
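For illustration, here is how consecutive executions can produce this symptom. This is a sketch under assumptions, not a reproduction of the actual setup: it uses an in-memory SQLite database as a stand-in for the real one, a simplified `page` table with no primary key, and a hypothetical `load_pages` helper.

```python
import sqlite3


def load_pages(conn, rows):
    """Insert (page_id, page_title) rows, as one run of a loader might."""
    conn.executemany(
        "INSERT INTO page (page_id, page_title) VALUES (?, ?)", rows
    )
    conn.commit()


def duplicate_page_ids(conn):
    """Run the duplicate check from the report and return the ids found."""
    cur = conn.execute(
        "SELECT page_id FROM page GROUP BY page_id "
        "HAVING COUNT(*) >= 2 ORDER BY page_id"
    )
    return [row[0] for row in cur.fetchall()]


conn = sqlite3.connect(":memory:")
# Assumption: no PRIMARY KEY or UNIQUE constraint on page_id.
conn.execute("CREATE TABLE page (page_id INTEGER, page_title TEXT)")

rows = [(1, "A"), (2, "B"), (3, "C")]  # made-up sample data

load_pages(conn, rows)
after_first_run = duplicate_page_ids(conn)   # no duplicates yet

load_pages(conn, rows)                       # second run, same data
after_second_run = duplicate_page_ids(conn)  # every reloaded id is reported
```

Under these assumptions, a second run against a table that was not emptied is enough to make the query report duplicates even though the dump and the parser are both fine.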