I demonstrate some basic machine learning and data analysis techniques for NLP, along with the theory behind them, in the context of MSKCC's competition on predicting the class of a genetic mutation from clinical text data.
See my blog post for more details.
Datasets can be found on Kaggle.
Scripts: