Disease Diagnosis Using NLP Algorithms

This project is a collaboration between Hatim Alshehri & Mohammed Alghamedi

Abstract and inspiration

  • Medicine is more advanced than it has ever been, yet early diagnosis remains essential for limiting complications. That inspired this project: we recognized that many people cannot tell which disease they have, especially in its early stages, so their conditions can become severe, complex, and hard to treat. In this project, we try to help people who struggle to identify the cause of their pain by offering a simple tool, grounded in trusted sources, that gives them an idea of their condition and how dangerous it may be.

Project Description:

  • The goal of this project is to cluster the text of a medical textbook in order to categorize diseases by the most common words in each disease description, using NLP algorithms and techniques. We will then look more closely at each disease section and categorize the most common diseases in that section by their signs and symptoms.

Ultimately, the project aims to detect diseases from the written descriptions that patients (users) provide.
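
The idea of characterizing each disease by its most common words, and then matching a patient's written description against them, can be illustrated with a minimal sketch. It is not the project's actual pipeline: it uses only the Python standard library instead of the packages listed below, and the disease descriptions, the stop-word list, and the function names (tokenize, top_words, rank_diseases) are hypothetical placeholders chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical toy descriptions; the real text would come from the textbook.
DESCRIPTIONS = {
    "asthma": "Chronic airway inflammation causing wheezing, cough, and shortness of breath.",
    "migraine": "Recurrent throbbing headache, often with nausea and sensitivity to light.",
    "gastritis": "Inflammation of the stomach lining causing abdominal pain, nausea, and vomiting.",
}

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOPWORDS = {"a", "and", "of", "the", "to", "with", "often", "causing",
             "i", "have", "my", "is", "in", "when"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def top_words(text, n=5):
    """Return the n most frequent non-stop-words in a description."""
    return [word for word, _ in Counter(tokenize(text)).most_common(n)]


def rank_diseases(user_text, descriptions):
    """Score each disease by how many distinct words it shares with the user's text."""
    user_words = set(tokenize(user_text))
    scores = {
        name: len(user_words & set(tokenize(desc)))
        for name, desc in descriptions.items()
    }
    # Highest overlap first; ties are broken alphabetically by disease name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for name, desc in DESCRIPTIONS.items():
        print(name, top_words(desc, n=3))
    print(rank_diseases("I have a throbbing headache and nausea", DESCRIPTIONS))
```

Plain word overlap is a deliberately crude stand-in for the clustering described above. A fuller version would more likely weight words (for example with TF-IDF) and group the descriptions with a clustering algorithm.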

Data:

  • To achieve the project objective, we will use one of the trusted books in the field: "Professional Guide to Diseases, 11th edition". The book contains over 2900 pages, which will serve as the source/reference for this project, and all necessary information will be extracted from it.

Tools and Python packages

  • NumPy
  • pandas
  • scikit-learn (sklearn)
  • PyPDF2
  • gensim
  • NLTK
  • Flask
  • pickle
  • HTML