itkach/mwscrape

Add support for SQLite database

oliver opened this issue · 1 comment

Since CouchDB is a rather big dependency that requires installation as a system-wide service, would it be possible to add support for an SQLite database as well? I would imagine that SQLite would be able to cope with all Mediawikis except Wikipedia; and it could make mwscrape much easier to set up for new users, since it would simply write to a single local file.

> CouchDB is a rather big dependency that requires installation as a system-wide service

Not really. CouchDB can be installed locally and started manually rather than as a service. Even if that weren't the case, I'm not sure what the issue is with installing a system-wide service.

> as well

If mwscrape ever moves to a different backing storage, it most likely won't be "as well": supporting multiple storage backends is tedious, and only one is needed.

> I would imagine that SQLite would be able to cope with all Mediawikis except Wikipedia;

It probably won't be able to cope with any but the smallest MediaWikis. Even if it were "except Wikipedia", that would be reason enough to rule it out, since being able to handle the biggest Wikipedias is probably the most interesting and valuable feature.

SQLite is probably the wrong tool for the job because:
a) it doesn't deal well with concurrency (mwscrape has some);
b) it doesn't deal well with large (multi-gigabyte) databases;
c) unlike CouchDB, it is not a document data store;
d) it has no network access, so running the scraper on one machine and the database on another is not an option.
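
To illustrate point a), here is a minimal sketch using only Python's standard `sqlite3` module. It is not mwscrape code: the `pages` table and the function name are hypothetical, chosen just for the demonstration. It opens two connections to the same file and shows that, while one holds a write transaction, the other cannot write (with `timeout=0` it fails immediately instead of waiting).

```python
import os
import sqlite3
import tempfile


def second_writer_is_blocked() -> bool:
    """Return True if a second connection cannot write while the first holds the write lock."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "pages.db")  # hypothetical database file
        # isolation_level=None puts the connection in autocommit mode,
        # so transactions are controlled explicitly with BEGIN/ROLLBACK.
        first = sqlite3.connect(path, isolation_level=None)
        # timeout=0: do not wait for the lock, fail right away.
        second = sqlite3.connect(path, isolation_level=None, timeout=0)
        try:
            # Hypothetical schema: one row per scraped page.
            first.execute("CREATE TABLE pages (title TEXT PRIMARY KEY, body TEXT)")
            first.execute("BEGIN IMMEDIATE")  # acquire the write lock
            first.execute("INSERT INTO pages VALUES (?, ?)", ("A", "{}"))
            try:
                second.execute("INSERT INTO pages VALUES (?, ?)", ("B", "{}"))
            except sqlite3.OperationalError as exc:
                # SQLite reports SQLITE_BUSY as "database is locked".
                return "locked" in str(exc)
            return False
        finally:
            if first.in_transaction:
                first.execute("ROLLBACK")
            first.close()
            second.close()


if __name__ == "__main__":
    print(second_writer_is_blocked())
```

A non-zero `timeout` makes the second writer wait rather than fail, but writes are still serialized: SQLite allows only one writer per database file at a time.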