OvertureMaps/data

Investigate WikiData and Wikipedia

Wikidata and Wikipedia are both sources of PoIs, and should be investigated for potential merging with Overture.

@upintheairsheep :

Be cautious with Wikidata's geodata.
I believe Wikidata has not yet been fully cleaned up after the Cebuano Wikipedia import, which created duplicate Wikidata items for geographic places.

see: https://youtu.be/HaKuKRdJojc?t=161
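
As a rough illustration of the kind of check this implies before any merge, here is a minimal sketch that flags pairs of candidate items sharing a normalized label and lying close together. Everything in it is an assumption for illustration: the record fields (`qid`, `label`, `lat`, `lon`), the placeholder IDs, and the 500 m threshold are hypothetical, and this is not a method used by Overture or Wikidata.

```python
import math
from itertools import combinations


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6_371_000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def likely_duplicates(items, max_dist_m=500.0):
    """Return ID pairs with the same normalized label within max_dist_m.

    The threshold is an assumed value; a real pipeline would tune it per
    feature type and review the flagged pairs rather than merge them blindly.
    """
    pairs = []
    for a, b in combinations(items, 2):
        if a["label"].strip().casefold() != b["label"].strip().casefold():
            continue
        if haversine_m(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_dist_m:
            pairs.append((a["qid"], b["qid"]))
    return pairs


# Hypothetical records; the IDs are placeholders, not real Wikidata items.
items = [
    {"qid": "Q_A", "label": "Mount Example", "lat": -41.0000, "lon": 174.0000},
    {"qid": "Q_B", "label": "mount example", "lat": -41.0010, "lon": 174.0005},
    {"qid": "Q_C", "label": "Mount Example", "lat": -37.0000, "lon": 175.0000},
]

print(likely_duplicates(items))
```

The pairwise comparison is quadratic, so this only suits small samples; a larger dataset would need a spatial index or grid bucketing first.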

"Duplicating Everywhere All at Once | Cebuano Wikipedia | Wikimania2023
Alex Lum : 28 Nov 2023
Five years ago, bots created millions of articles on several Wikipedia language editions, notably Cebuano Wikipedia, and corresponding Wikidata items, resulting in thousands of duplicated Wikidata items for geographic places. This session will cover how this happened, use data visualisation to show the scope of the issues, and suggest some novel ways of cleaning up Wikidata, Wikipedia and the original data sources.

Five years ago, Lsjbot created millions of articles on several Wikipedia language editions, for which other bots created corresponding Wikidata items. The result has been hundreds of thousands of duplicated items for geographic places on Wikidata.

This session will look at the history of how this happened, use data visualisation to show the scope and scale of the issue, and propose some ways of cleaning up Wikidata, Wikipedia and even the original data sources. It will concentrate primarily on geographic places in Aotearoa New Zealand and some parts of Australia, but will be relevant to other countries where the issue of bot-created duplicates of geographic entities is significant.
"