This is a notebook exploring the text dataset given by Kaggle Competition: "Personalized Medicine: Redefining Cancer Treatment".
NLP concepts are applied to the dataset, such as tokenization, lemmatization, document vectorizing, etc.
With the document vectors in place, a Support Vector Classifier is used to predict the Genetic Mutation class (1 to 9).