ML-projects

The datasets for all of these projects are available in the folder.

  1. Gapminder
     • predicts fertility
     • linear model - linear regression
     • tree (overfitting) - decision tree regressor
  2. Titanic
     • predicts survivability
     • linear model - logistic regression
  3. Heart
     • predicts chance of heart disease
     • neighbors - KNeighborsClassifier
     • tree (overfitting) - decision tree classifier
  4. House Voters 84
     • predicts class name
     • ensemble - random forest classifier
  5. Tweets
     • predicts author
     • feature extraction - CountVectorizer
     • naive_bayes - MultinomialNB
  6. Country Data
     • predicts mortality
     • cluster - K-means
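
The "tree (overfitting)" notes above can be illustrated with a minimal sketch. It does not use the Gapminder or Heart data: the synthetic data, variable names, and split ratio are all assumptions made for illustration. It compares a linear regression with an unconstrained decision tree regressor by scoring each on the training split and on a held-out split.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (an assumption, not one of the repository datasets):
# a noisy linear relationship between one feature and the target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 2.0 * X[:, 0] + rng.normal(0, 2.0, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

linear = LinearRegression().fit(X_train, y_train)
# No max_depth is set, so the tree is free to memorise the training rows.
tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)

# score() returns R^2 for regressors.
linear_train = linear.score(X_train, y_train)
linear_test = linear.score(X_test, y_test)
tree_train = tree.score(X_train, y_train)
tree_test = tree.score(X_test, y_test)

print(f"linear: train R^2={linear_train:.3f}, test R^2={linear_test:.3f}")
print(f"tree:   train R^2={tree_train:.3f}, test R^2={tree_test:.3f}")
```

A large gap between the tree's training score and its test score is the usual sign of overfitting. Limiting `max_depth` or `min_samples_leaf` is one common way to narrow it.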

Other techniques used:

  1. one hot encoding
  2. dummy encoding
  3. TF-IDF vectorizer
  4. count vectorizer
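
As a rough sketch of how the items above differ, the following uses toy inputs (the `embarked` column and the two sentences are hypothetical, not taken from the repository data). It assumes the common convention that one hot encoding keeps one column per category, while dummy encoding drops the first category to avoid a redundant column.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Hypothetical categorical column with three categories.
df = pd.DataFrame({"embarked": ["S", "C", "Q", "S"]})

# One hot encoding: one indicator column per category.
one_hot = pd.get_dummies(df, columns=["embarked"])
# Dummy encoding: drop the first category, leaving k-1 columns.
dummy = pd.get_dummies(df, columns=["embarked"], drop_first=True)

print(list(one_hot.columns))  # ['embarked_C', 'embarked_Q', 'embarked_S']
print(list(dummy.columns))    # ['embarked_Q', 'embarked_S']

# Hypothetical two-document corpus.
docs = ["the cat sat", "the cat sat on the mat"]

# Count vectorizer: raw term counts per document.
counts = CountVectorizer().fit_transform(docs)
# TF-IDF vectorizer: counts reweighted so terms common to all documents matter less.
tfidf = TfidfVectorizer().fit_transform(docs)

print(counts.toarray())
print(tfidf.toarray().round(2))
```

Both vectorizers produce a matrix with one row per document and one column per vocabulary term; only the cell values differ.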

Issues in the datasets:

  1. Not all columns are numeric
  2. Some columns contain null (missing) values
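
One possible way to handle both issues is sketched below with pandas. The table, its column names, and the choice of median and mode imputation are assumptions for illustration, not necessarily what the notebooks do.

```python
import numpy as np
import pandas as pd

# Hypothetical table with one numeric and one non-numeric column,
# each containing a null value.
df = pd.DataFrame(
    {
        "age": [22.0, np.nan, 35.0, 58.0],
        "sex": ["male", "female", np.nan, "female"],
        "survived": [0, 1, 1, 0],
    }
)

clean = df.copy()
# Fill numeric nulls with the column median.
clean["age"] = clean["age"].fillna(clean["age"].median())
# Fill categorical nulls with the most frequent value.
clean["sex"] = clean["sex"].fillna(clean["sex"].mode()[0])

# Convert the non-numeric column to indicator columns so models can use it.
encoded = pd.get_dummies(clean, columns=["sex"], drop_first=True)

print(encoded)
```

Dropping rows with `dropna()` is a simpler alternative when only a few rows are affected, at the cost of losing data.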

These projects use both supervised learning (regression and classification) and unsupervised learning (K-means clustering).
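
The difference between the two settings can be sketched on synthetic data (the two blobs below are an assumption, not one of the repository datasets). A supervised model is fit on features and labels, while a clustering model sees only the features.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Two well-separated synthetic groups of points.
rng = np.random.default_rng(0)
blob_a = rng.normal(0.0, 0.5, size=(20, 2))
blob_b = rng.normal(10.0, 0.5, size=(20, 2))
X = np.vstack([blob_a, blob_b])
y = np.array([0] * 20 + [1] * 20)  # labels, used only by the supervised model

# Supervised: the classifier learns from features and known labels.
clf = LogisticRegression().fit(X, y)
predicted = clf.predict(X)

# Unsupervised: K-means groups the points without ever seeing y.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
clusters = km.labels_

print("predicted labels:", predicted)
print("cluster ids:     ", clusters)
```

The cluster ids are arbitrary, so cluster 0 need not correspond to label 0. Only the grouping itself is meaningful.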