Primarily text processing scripts in Python
I am trying to learn some Python, so this is where I'm putting my scripts. The scripts mostly focus on trying out code from the NLTK book and learning to process various types of documents (e.g., HTML, SGML).
I have tried to comment the code heavily, both for myself and for anyone who finds this useful. All code in this repository is licenced under the GPLv3. More information can be found in the LICENCE file.