/Using-NLP-to-Classify-Genetic-Mutations-based-on-Clinical-Evidence

Classifying genetic mutations from text-based clinical evidence using NLP techniques (word2vec) and a Support Vector Classifier.

Primary language: Jupyter Notebook

Classifying-Genetic-Mutations-based-on-Clinical-Evidence

This notebook explores the text dataset from the Kaggle competition "Personalized Medicine: Redefining Cancer Treatment".

NLP techniques such as tokenization, lemmatization, and document vectorization are applied to the dataset.
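The preprocessing steps above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the notebook's actual code: the `LEMMAS` table and the 3-dimensional `EMBEDDINGS` are hypothetical stand-ins for a real lemmatizer and for trained word2vec vectors, and averaging word vectors is just one common way to build a document vector.

```python
import re

# Hypothetical lemma lookup; a real pipeline would use a proper lemmatizer
# (for example from NLTK or spaCy) instead of a hand-written table.
LEMMAS = {"mutations": "mutation", "tumors": "tumor", "activates": "activate"}

# Toy 3-dimensional embeddings standing in for trained word2vec vectors.
EMBEDDINGS = {
    "mutation": [0.9, 0.1, 0.0],
    "tumor": [0.7, 0.3, 0.1],
    "activate": [0.1, 0.8, 0.2],
    "kinase": [0.2, 0.9, 0.4],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lemmatize(tokens):
    """Map each token to its lemma, leaving unknown tokens unchanged."""
    return [LEMMAS.get(token, token) for token in tokens]


def document_vector(text, embeddings=EMBEDDINGS, dim=3):
    """Average the vectors of all in-vocabulary lemmas in the text."""
    tokens = lemmatize(tokenize(text))
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        # No known words: fall back to a zero vector.
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


doc_vec = document_vector("This mutation activates the kinase.")
```

Out-of-vocabulary tokens such as "this" and "the" are simply skipped here; a real pipeline might instead remove stop words explicitly before vectorizing.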

With the document vectors in place, a Support Vector Classifier is used to predict the Genetic Mutation class (1 to 9).
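The classification step might look something like the sketch below, assuming scikit-learn and NumPy are installed. The data is synthetic: random clustered vectors stand in for the real document vectors, and the kernel and `C` value are illustrative assumptions, not the settings used in the notebook.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-ins for document vectors: 9 classes, 20 documents each,
# 50 dimensions, with each class clustered around its own random center.
n_classes, per_class, dim = 9, 20, 50
centers = rng.normal(scale=5.0, size=(n_classes, dim))
X = np.vstack([center + rng.normal(size=(per_class, dim)) for center in centers])
y = np.repeat(np.arange(1, n_classes + 1), per_class)  # class labels 1..9

# Fit a Support Vector Classifier on the document vectors.
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X, y)

# Predict the mutation class for the first three documents.
pred = clf.predict(X[:3])
```

In practice the model would be evaluated on held-out documents rather than on the training vectors used here.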