Diabetes Mini Project

  • Explored the correlation between variables, their variation, and tree pruning.
  • Tested whether choosing optimal values for cost and gamma matters for SVM performance.
  • Determined when the variables should be standardised.
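
The cost and gamma question can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. The dataset is an assumption: the Pima.tr / Pima.te diabetes data from MASS stand in for whatever data the project used. The grid values and the helper misclass_rate are also hypothetical choices made for this example.

```r
library(MASS)    # assumed stand-in data: Pima.tr (train) and Pima.te (test)
library(e1071)   # svm() and tune.svm()

set.seed(1)

# Hypothetical helper: share of test rows the model labels incorrectly.
misclass_rate <- function(model, newdata, truth) {
  pred <- predict(model, newdata = newdata)
  mean(pred != truth)
}

# Baseline: radial-kernel SVM with e1071's default cost and gamma.
# svm() standardises the predictors by default (scale = TRUE).
default_fit <- svm(type ~ ., data = Pima.tr, kernel = "radial")

# Grid search over cost and gamma using tune.svm's default cross-validation.
# The grid below is an illustrative assumption, not a recommended range.
tuned <- tune.svm(type ~ ., data = Pima.tr,
                  kernel = "radial",
                  cost   = 10^(-1:2),
                  gamma  = 10^(-3:0))

tuned$best.parameters          # the cost/gamma pair with the lowest CV error

# Compare test-set misclassification for default and tuned models.
err_default <- misclass_rate(default_fit,      Pima.te, Pima.te$type)
err_tuned   <- misclass_rate(tuned$best.model, Pima.te, Pima.te$type)
c(default = err_default, tuned = err_tuned)
```

If the two error rates are close, tuning matters little for this data; a large gap suggests the defaults are a poor fit.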

Packages Used

  • ISLR
  • pls
  • dplyr
  • standardize
  • tree
  • caret
  • e1071
  • MASS

Analyses Performed

  • PCA
  • Correlation between PCs
  • Confusion Matrix
  • Misclassification Rate
  • Cost-Complexity Pruning
  • Plot Pruned Tree
  • Predictions on Test Data
  • randomForest and Node Purity
  • Boosted regression tree
  • Standardising
  • Clustering
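
Several of the steps listed above (PCA, correlation between PCs, cost-complexity pruning, the confusion matrix, and the misclassification rate) could be combined along the following lines. This is a sketch under assumptions, not the project's original code: the Pima.tr / Pima.te data from MASS are used as a stand-in dataset, and the variable names are made up for illustration.

```r
library(MASS)   # assumed stand-in data: Pima.tr (train) and Pima.te (test)
library(tree)   # tree(), cv.tree(), prune.misclass()

set.seed(1)

# PCA on the standardised predictors (the response column "type" is dropped).
predictors <- Pima.tr[, names(Pima.tr) != "type"]
pca <- prcomp(predictors, center = TRUE, scale. = TRUE)

# Principal component scores are uncorrelated by construction,
# so the off-diagonal entries here should be numerically zero.
pc_cor <- cor(pca$x)

# Grow a classification tree, then pick a size by cross-validation
# with cost-complexity pruning guided by misclassification error.
fit <- tree(type ~ ., data = Pima.tr)
cv  <- cv.tree(fit, FUN = prune.misclass)
best_size <- cv$size[which.min(cv$dev)]
best_size <- max(best_size, 2)   # keep at least one split so the tree can be plotted

pruned <- prune.misclass(fit, best = best_size)

# Plot the pruned tree with its split labels.
plot(pruned)
text(pruned, pretty = 0)

# Predictions on test data, confusion matrix, and misclassification rate.
pred     <- predict(pruned, newdata = Pima.te, type = "class")
conf     <- table(predicted = pred, actual = Pima.te$type)
misclass <- 1 - sum(diag(conf)) / sum(conf)
conf
misclass
```

The misclassification rate is one minus the share of test cases on the diagonal of the confusion matrix.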