This project is an AI that plays the Wikipedia race. It provides all the methods needed to parse Wikipedia data, train the model, and play the game.
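Playing the Wikipedia race comes down to following internal links from article to article, so a core parsing task is pulling link targets out of raw wikitext. Below is a minimal, hypothetical sketch of that step using only the standard library; the function name `extract_links` and the regex are assumptions for illustration, not the project's actual code:

```python
import re

# Wikipedia internal links look like [[Target]] or [[Target|display text]].
# This sketch also skips namespaced links such as [[File:...]] or [[Category:...]].
LINK_RE = re.compile(r"\[\[([^\]|#]+)(?:\|[^\]]*)?\]\]")

def extract_links(wikitext):
    """Return the ordered list of article titles linked from `wikitext` (hypothetical helper)."""
    links = []
    for match in LINK_RE.finditer(wikitext):
        target = match.group(1).strip()
        if ":" not in target:  # skip File:, Category:, and other namespaces
            links.append(target)
    return links

sample = "Python is a [[programming language]] created by [[Guido van Rossum|Guido]]."
print(extract_links(sample))  # → ['programming language', 'Guido van Rossum']
```

The extracted titles are exactly the moves available from a given article, which is what a race-playing agent needs to build its search graph.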
This project has been heavily inspired by the following content:
- Wikipedia Data Science: Working with the World’s Largest Encyclopedia
- Word Embedding: Word2Vec With Gensim, NLTK, and t-SNE Visualization
- An AI for the Wikipedia Game - Stanford University
- Your good old `pip install -r requirements.txt`
- Download the Wikipedia data
- Create a `.env` file with the following variables:
- DATA_PATH: The local PATH for your enwiki-[DATE-OF-DUMP]-pages-articles-multistream.xml.bz2 file.
- You're good to go ;)
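As an illustration of how the `DATA_PATH` variable might be consumed, here is a stdlib-only sketch of a `.env` parser (a real project would more likely use a library such as python-dotenv; `parse_env` and the sample path are hypothetical, not taken from this repository):

```python
def parse_env(text):
    """Parse KEY=VALUE lines from .env-style text into a dict.

    Blank lines and `#` comments are ignored; whitespace around keys
    and values is stripped. A stdlib-only stand-in for python-dotenv.
    """
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

# Hypothetical .env contents; the actual path points at your downloaded dump.
config = parse_env("# local paths\nDATA_PATH = /data/enwiki-dump.xml.bz2\n")
print(config["DATA_PATH"])  # → /data/enwiki-dump.xml.bz2
```

The resulting path can then be fed to `bz2.open()` to stream the compressed dump without decompressing it to disk first.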
There are a number of ways you can support the project:
- Use it, star it, build something with it, spread the word!
- Raise issues to improve the project (note: doc typos and clarifications are issues too!)
- Please search existing issues before opening a new one - it may have already been addressed.
- Pull requests: please discuss new code in an issue first, unless the fix is really trivial.
- Make sure new code is tested.
- Be mindful of existing code: PRs that break existing code have a high probability of being declined, unless they fix a serious issue.
This project is released under the BSD 3-Clause license, which is very open and permissive, so use it however you want. Let me know if you build something cool with it!