A text wikifier for Hebrew texts, based on WikiMiner 1.2.
Links to the Hebrew version of the data used by WikiMiner:
- CSV files (80MB zip file)
- Database (labels are processed using a dummy TextProcessor which does nothing)
- The XML dump of the Hebrew version of Wikipedia from Feb. 28th (1.5GB unzipped)
- The same XML dump, compressed (305MB bz2 file)
- The first 15% of the full Hebrew dump (280MB XML file)
Prerequisites:
- For the ynet-scraping script: