Grasia/WikiChron

Provide an option to download and parse the data of a wiki given its URL

Akronix opened this issue · 1 comment

A user can enter a link and WikiChron will take care of the rest: downloading the history dump for that wiki, parsing the data, getting the metadata for the wikis.json file, and making the wiki available in the tool's list of wikis.
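
For reference, here is a minimal sketch of what such a pipeline could look like, assuming the target wiki exposes the standard MediaWiki Special:Export endpoint and that wikis.json is a JSON list of per-wiki metadata entries. The function names and metadata fields below are hypothetical illustrations, not WikiChron's actual internals.

```python
# Hypothetical pipeline sketch: download a history dump, flatten it to CSV,
# and register the wiki in wikis.json. Field names are assumptions.
import csv
import json
import xml.etree.ElementTree as ET

import requests

# The export XML namespace varies with the MediaWiki version.
MW_NS = "{http://www.mediawiki.org/xml/export-0.10/}"


def download_history_dump(wiki_url, titles):
    """Fetch the full revision history of the given pages via Special:Export."""
    resp = requests.post(
        f"{wiki_url}/index.php",
        data={"title": "Special:Export", "pages": "\n".join(titles), "history": "1"},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.content


def parse_dump_to_csv(dump_xml, csv_path):
    """Flatten <page>/<revision> elements of the dump into a revisions CSV."""
    root = ET.fromstring(dump_xml)
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["page_title", "revision_id", "timestamp", "contributor"])
        for page in root.iter(f"{MW_NS}page"):
            title = page.findtext(f"{MW_NS}title")
            for rev in page.iter(f"{MW_NS}revision"):
                contributor = rev.find(f"{MW_NS}contributor")
                user = contributor.findtext(f"{MW_NS}username") if contributor is not None else ""
                writer.writerow([
                    title,
                    rev.findtext(f"{MW_NS}id"),
                    rev.findtext(f"{MW_NS}timestamp"),
                    user,
                ])


def register_wiki(wikis_json_path, name, csv_path, wiki_url):
    """Append the new wiki's metadata so it shows up in the tool's wiki list."""
    with open(wikis_json_path, encoding="utf-8") as f:
        wikis = json.load(f)
    wikis.append({"name": name, "data": csv_path, "url": wiki_url})
    with open(wikis_json_path, "w", encoding="utf-8") as f:
        json.dump(wikis, f, indent=2)
```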

Although this feature would be nice, it is hard to support: there are huge wikis out there (Wikipedia ones, for instance) that would take too long to download and parse, and our server could run out of disk space and bandwidth while downloading and parsing them. Also, the particularities of the different MediaWiki instances and versions that would make the parser fail are hard to predict and handle.
Lastly, users will often want to use their own custom-exported data, for example restricted to a certain time range or to certain namespaces, and they may have different ways of exporting the data and creating the CSV file.
For these reasons, we have decided not to do this and to stick with #20 instead.