Simple machine learning model using scikit-learn

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

ML-Sklearn

This repository uses scikit-learn to implement regression and classification models. Each model is evaluated, and its evaluation metrics are saved so that the models can be compared. The regression problem uses Kaggle's 'Red Wine Quality' dataset, and the classification problem uses Kaggle's 'Heart Failure Prediction' dataset. The repository also includes code showing how to find optimal hyperparameters with 'GridSearchCV'.
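The evaluate, save, and compare workflow described above could look roughly like the sketch below. It is a minimal illustration, not the repository's code: the synthetic data stands in for the Kaggle datasets, and the `models` registry, the chosen metrics, and the `metrics.json` file name are all assumptions.

```python
import json
import os
import tempfile

from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the Kaggle data (assumption, not the repository's loader).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Hypothetical registry keyed by the --model names used in this README.
models = {
    "dtree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=5),
}

# Fit each model and collect its test metrics.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": float(accuracy_score(y_test, y_pred)),
        "f1": float(f1_score(y_test, y_pred)),
    }

# Save the metrics so that runs can be compared later (file name is an assumption).
out_path = os.path.join(tempfile.gettempdir(), "metrics.json")
with open(out_path, "w") as f:
    json.dump(results, f, indent=2)

# Compare the models by test accuracy.
best = max(results, key=lambda name: results[name]["accuracy"])
print(f"best model by accuracy: {best}")
```

For a regression problem the same loop would apply with regressors and metrics such as `mean_squared_error` and `r2_score`.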


Data

  • Regression: Kaggle 'Red Wine Quality'

  • Classification: Kaggle 'Heart Failure Prediction'


Algorithm


D-Tree (Decision Tree)

  • Code DTree.py

  • Hyper parameters

    "dtree": { "max_depth": [1, 2, 3, 4, 5], "min_samples_split": [2, 3] }
  • Usage

    $ python3 main.py --prob={reg or class} --model=dtree
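As an illustration of how the `--prob` flag and the `dtree` grid might fit together, here is a hedged sketch. The argument parser and the `run` helper are hypothetical (the actual `main.py` and `DTree.py` are not shown here), and synthetic data replaces the Kaggle datasets.

```python
import argparse

from sklearn.datasets import make_classification, make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Grid copied from the "dtree" entry above.
PARAM_GRID = {"max_depth": [1, 2, 3, 4, 5], "min_samples_split": [2, 3]}


def parse_args(argv=None):
    """Parse the two flags shown in the usage line (hypothetical parser)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--prob", choices=["reg", "class"], required=True)
    parser.add_argument("--model", default="dtree")
    return parser.parse_args(argv)


def run(args):
    """Pick a regressor or classifier from --prob and grid-search it."""
    if args.prob == "reg":
        X, y = make_regression(n_samples=200, n_features=8, noise=5.0, random_state=0)
        estimator, scoring = DecisionTreeRegressor(random_state=0), "r2"
    else:
        X, y = make_classification(n_samples=200, n_features=8, random_state=0)
        estimator, scoring = DecisionTreeClassifier(random_state=0), "accuracy"
    search = GridSearchCV(estimator, PARAM_GRID, cv=5, scoring=scoring)
    search.fit(X, y)
    return search


search = run(parse_args(["--prob=class", "--model=dtree"]))
print(search.best_params_, round(search.best_score_, 3))
```

`GridSearchCV` tries all 10 combinations (5 depths times 2 split sizes) with 5-fold cross-validation and refits the best one on the full data.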

RF (Random Forest)

  • Code RF.py

  • Hyper parameters

    "rf": {
          "n_estimators": [10, 100],
          "max_depth": [6, 8, 10, 12],
          "min_samples_leaf": [8, 12, 18],
          "min_samples_split": [8, 16, 20]
        }
  • Usage

    $ python3 main.py --prob={reg or class} --model=rf
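For the regression case, the `rf` grid could be searched as in the sketch below. This assumes `RandomForestRegressor` and mean squared error as the selection metric; neither choice is confirmed by the README, and the data is synthetic.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Small synthetic regression set (assumption) to keep the 72-combination search quick.
X, y = make_regression(n_samples=160, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Grid copied from the "rf" entry above.
param_grid = {
    "n_estimators": [10, 100],
    "max_depth": [6, 8, 10, 12],
    "min_samples_leaf": [8, 12, 18],
    "min_samples_split": [8, 16, 20],
}

# scikit-learn maximizes scores, so MSE is passed in its negated form.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="neg_mean_squared_error",
)
search.fit(X_train, y_train)

# Evaluate the refitted best model on the held-out split.
y_pred = search.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(search.best_params_, round(mse, 2), round(r2, 3))
```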

NB (Naive Bayes)

Gaussian Naive Bayes (GNB)

  • Hyper parameters
    "gnb": {
          "var_smoothing": [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]
        }
  • Usage
    $ python3 main.py --prob=class --model=gnb
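To make the role of `var_smoothing` concrete: scikit-learn's `GaussianNB` adds `var_smoothing` times the largest feature variance to every per-class variance, and exposes that amount as `epsilon_`. The sketch below searches the grid above on synthetic data (an assumption) and checks that relationship.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for the classification data (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Grid copied from the "gnb" entry above.
param_grid = {"var_smoothing": [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]}
search = GridSearchCV(GaussianNB(), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

# The best estimator is refitted on all of X, so epsilon_ is derived from X.
best = search.best_estimator_
expected_eps = best.var_smoothing * np.var(X, axis=0).max()
print(search.best_params_, best.epsilon_, expected_eps)
```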

Multinomial Naive Bayes (MNB)

  • Hyper parameters
    "mnb": {
          "var_smoothing": [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]
        }
  • Usage
    $ python3 main.py --prob=class --model=mnb
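One caveat when reproducing this grid directly with scikit-learn: `var_smoothing` is a `GaussianNB` parameter, while `MultinomialNB` is smoothed through `alpha`. The sketch below therefore assumes the same candidate values are meant for `alpha`; that mapping is a guess, not something the README states. `MultinomialNB` also expects non-negative features, so the example uses synthetic count data.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB

# Synthetic count features with class-dependent Poisson rates (assumption).
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
rates = np.where(y[:, None] == 1, [5.0, 1.0, 3.0], [1.0, 5.0, 3.0])
X_counts = rng.poisson(rates)

# Assumed grid: the values listed above, applied to `alpha`.
param_grid = {"alpha": [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]}
search = GridSearchCV(MultinomialNB(), param_grid, cv=5, scoring="accuracy")
search.fit(X_counts, y)

# MultinomialNB does not accept var_smoothing.
print("var_smoothing" in MultinomialNB().get_params())  # False
print(search.best_params_, round(search.best_score_, 3))
```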

K-NN (K Nearest Neighbors)

  • Code KNN.py

  • Hyper parameters

    "knn": {
          "n_neighbors": [1, 2, 3, 4, 5],
          "weights": ["uniform", "distance"]
        }
  • Usage

    $ python3 main.py --prob={reg or class} --model=knn
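Because K-NN is distance-based, feature scaling often matters. The sketch below wraps the classifier in a `Pipeline` with `StandardScaler` and searches the `knn` grid above. The scaling step is an assumption (the README does not say whether `KNN.py` scales features), and the data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the classification data (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Grid copied from the "knn" entry above.
knn_grid = {"n_neighbors": [1, 2, 3, 4, 5], "weights": ["uniform", "distance"]}

# Scaling inside the pipeline is refitted on each training fold,
# so the validation fold never leaks into the scaler.
pipe = Pipeline([("scale", StandardScaler()), ("knn", KNeighborsClassifier())])

# Pipeline parameters are addressed as "<step name>__<parameter>".
param_grid = {f"knn__{name}": values for name, values in knn_grid.items()}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```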

Ada (Adaptive Boosting)

  • Code Ada.py

  • Hyper parameters

    "ada": {
          "n_estimators": [50, 100, 150],
          "learning_rate": [0.01, 0.1]
        }
  • Usage

    $ python3 main.py --prob={reg or class} --model=ada
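Beyond the single best setting, `GridSearchCV` keeps the score of every combination in `cv_results_`, which is handy for comparing settings. A minimal sketch for the `ada` grid, assuming `AdaBoostRegressor`, R² scoring, and synthetic data:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the regression data (assumption).
X, y = make_regression(n_samples=150, n_features=6, noise=10.0, random_state=0)

# Grid copied from the "ada" entry above.
param_grid = {"n_estimators": [50, 100, 150], "learning_rate": [0.01, 0.1]}

search = GridSearchCV(AdaBoostRegressor(random_state=0), param_grid, cv=3, scoring="r2")
search.fit(X, y)

# List every combination from best to worst mean cross-validated score.
results = search.cv_results_
order = results["rank_test_score"].argsort(kind="stable")
for i in order:
    print(results["params"][i], round(results["mean_test_score"][i], 3))
```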

DA (Discriminant Analysis)

Linear Discriminant Analysis (LDA)

  • Hyper parameters
    "lda": {
          "n_components": [6, 8, 10, 12],
          "learning_decay": [0.75, 0.8, 0.85]
        }
  • Usage
    $ python3 main.py --prob=class --model=lda

Quadratic Discriminant Analysis (QDA)

  • Hyper parameters
    "qda": {
          "reg_param": [0.1, 0.2, 0.3, 0.4, 0.5]
        }
  • Usage
    $ python3 main.py --prob=class --model=qda
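A note for anyone reproducing the discriminant-analysis grids with scikit-learn directly: `LinearDiscriminantAnalysis` has no `learning_decay` parameter (that name belongs to `LatentDirichletAllocation`), and its `n_components` cannot exceed `min(n_features, n_classes - 1)`, which is 1 for a binary problem. The sketch below therefore tunes `solver` and `shrinkage` for LDA, which is an assumed substitute grid, and uses the `qda` grid as listed. The data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.model_selection import GridSearchCV

# Synthetic binary data without redundant (collinear) features (assumption).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=6, n_redundant=0, random_state=0
)

# Assumed LDA grid: both solvers listed here support shrinkage.
lda_grid = {"solver": ["lsqr", "eigen"], "shrinkage": [None, "auto", 0.1, 0.5]}
# Grid copied from the "qda" entry above.
qda_grid = {"reg_param": [0.1, 0.2, 0.3, 0.4, 0.5]}

lda_search = GridSearchCV(LinearDiscriminantAnalysis(), lda_grid, cv=5).fit(X, y)
qda_search = GridSearchCV(QuadraticDiscriminantAnalysis(), qda_grid, cv=5).fit(X, y)

print("learning_decay" in LinearDiscriminantAnalysis().get_params())  # False
print(lda_search.best_params_, round(lda_search.best_score_, 3))
print(qda_search.best_params_, round(qda_search.best_score_, 3))
```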

SVM (Support Vector Machine)

  • Code SVM.py

  • Hyper parameters

    "svm": {
          "C": [0.1, 0.8, 0.9, 1, 1.1, 1.2, 1.3, 1.4],
          "kernel": ["linear", "rbf"],
          "gamma": [0.1, 0.8, 0.9, 1, 1.1, 1.2, 1.3, 1.4]
        }
  • Usage

    $ python3 main.py --prob={reg or class} --model=svm
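In scikit-learn, `gamma` only affects the `rbf`, `poly`, and `sigmoid` kernels, so every `gamma` value paired with `kernel="linear"` refits the same model. One possible refinement, shown here as a sketch rather than the repository's approach, is to pass `GridSearchCV` a list of grids so that `gamma` is only varied for `rbf`. `SVC` and the synthetic data are assumptions; `SVR` would be the regression counterpart.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, ParameterGrid
from sklearn.svm import SVC

values = [0.1, 0.8, 0.9, 1, 1.1, 1.2, 1.3, 1.4]

# Grid copied from the "svm" entry above.
full_grid = {"C": values, "kernel": ["linear", "rbf"], "gamma": values}

# Same search space, with gamma varied only where it has an effect.
split_grid = [
    {"kernel": ["linear"], "C": values},
    {"kernel": ["rbf"], "C": values, "gamma": values},
]

print(len(ParameterGrid(full_grid)), len(ParameterGrid(split_grid)))  # 128 72

# Synthetic stand-in for the classification data (assumption).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
search = GridSearchCV(SVC(), split_grid, cv=3, scoring="accuracy")
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```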

Voting

  • Code Voting.py

  • Hyper parameters

    Not yet defined
  • Usage

    $ python3 main.py --prob={reg or class} --model=voting
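No grid is defined for voting yet. Purely as an illustration of what one could look like, the sketch below builds a `VotingClassifier` from three of the models above and tunes both the voting mode and nested estimator parameters. The estimator names, the member models, and every grid value are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the classification data (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hypothetical ensemble; Voting.py may combine different models.
voting = VotingClassifier(
    estimators=[
        ("dtree", DecisionTreeClassifier(random_state=0)),
        ("knn", KNeighborsClassifier()),
        ("gnb", GaussianNB()),
    ]
)

# Hypothetical grid: nested parameters use "<estimator name>__<parameter>".
param_grid = {
    "voting": ["hard", "soft"],
    "dtree__max_depth": [3, 5],
    "knn__n_neighbors": [3, 5],
}

search = GridSearchCV(voting, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Soft voting averages predicted probabilities, so it requires every member to implement `predict_proba`; the three estimators used here do.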

Bagging

  • Code Bagging.py

  • Hyper parameters

    Not yet defined
  • Usage

    $ python3 main.py --prob={reg or class} --model=bagging
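Bagging likewise has no grid yet. A hedged sketch of one possibility, using decision trees as the base estimator and tuning the ensemble size and the bootstrap sample fraction. The base estimator and all grid values are hypothetical, and the data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the classification data (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# The base estimator is passed positionally because its keyword name
# differs between scikit-learn versions (base_estimator vs estimator).
bagging = BaggingClassifier(DecisionTreeClassifier(random_state=0), random_state=0)

# Hypothetical grid: number of trees and fraction of samples drawn per tree.
param_grid = {"n_estimators": [10, 50], "max_samples": [0.5, 1.0]}

search = GridSearchCV(bagging, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```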

Reference