
Code for Machine Learning course assignments.


Assignment 1

  • Problem Statement:

    • Find cluster centers for k = 2 to 10 using K-Means clustering for every project.
    • Compute the Davies-Bouldin (DB) index and the silhouette value.
    • Find the optimal number of clusters using the DB index and the silhouette value.
    • Store the results in a single Excel file with multiple rows, i.e., one row for each project.
  • Software used: MATLAB
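
The DB index criterion in the problem statement can be illustrated with a short sketch. The assignment itself was done in MATLAB, so the Python below is only an illustration in the language of the later assignments, not the code from this repository. The names `davies_bouldin` and `pick_k`, and the toy points, are hypothetical. A lower DB index indicates more compact, better-separated clusters (for the silhouette value, higher is better).

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def davies_bouldin(points, labels):
    """Davies-Bouldin index for a labelling with at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    keys = list(clusters)
    centers = {k: centroid(clusters[k]) for k in keys}
    # Scatter: mean distance from each member to its cluster center.
    scatter = {
        k: sum(math.dist(p, centers[k]) for p in clusters[k]) / len(clusters[k])
        for k in keys
    }
    total = 0.0
    for i in keys:
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in keys
            if j != i
        )
    return total / len(keys)


def pick_k(db_by_k):
    """Return the k with the smallest DB index."""
    return min(db_by_k, key=db_by_k.get)


# Toy data (assumed): two tight groups far apart.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = davies_bouldin(points, [0, 0, 1, 1])  # groups match the geometry
bad = davies_bouldin(points, [0, 1, 0, 1])   # groups cut across the geometry
print(good < bad)  # True
```

In practice the labels for each k would come from a K-Means run, and `pick_k` would be applied to the resulting DB values for k = 2 to 10.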

Assignment 2

  • Problem Statement:
    • Apply 3 different Naive Bayes classifiers to all data.
    • Apply 5-fold cross-validation.
    • Compute the F-measure and accuracy for all features and for the significant features.
    • Find the best Naive Bayes classifier, and compare the original data with the significant-features data.
    • Store the results in a single Excel file with multiple rows, i.e., one row for each project.
  • Programming language: Python
  • Libraries used: numpy, pandas, scipy, sklearn, matplotlib
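
The evaluation loop described above (5-fold cross-validation, accuracy, F-measure) can be sketched as follows. This is a dependency-free illustration, not the repository's code, which lists sklearn among its libraries. The README does not name the three Naive Bayes variants, so a small Gaussian Naive Bayes is assumed here as one example; all function names and the interleaved fold assignment are hypothetical choices.

```python
import math
from statistics import mean, pvariance


def fit_gaussian_nb(X, y):
    """Return {class: (prior, feature means, feature variances)}."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        cols = list(zip(*rows))
        model[c] = (
            len(rows) / len(X),
            [mean(col) for col in cols],
            [pvariance(col) + 1e-9 for col in cols],  # avoid zero variance
        )
    return model


def predict_gaussian_nb(model, x):
    """Pick the class with the highest log posterior for sample x."""
    def log_posterior(c):
        prior, means, variances = model[c]
        ll = math.log(prior)
        for v, m, s2 in zip(x, means, variances):
            ll += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return ll
    return max(model, key=log_posterior)


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred, positive=1):
    """F1 score: harmonic mean of precision and recall for one class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def cross_validate(X, y, k=5):
    """Mean accuracy and F-measure over k interleaved folds."""
    folds = [list(range(i, len(X), k)) for i in range(k)]
    accs, f1s = [], []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit_gaussian_nb([X[i] for i in train_idx],
                                [y[i] for i in train_idx])
        preds = [predict_gaussian_nb(model, X[i]) for i in test_idx]
        truth = [y[i] for i in test_idx]
        accs.append(accuracy(truth, preds))
        f1s.append(f_measure(truth, preds))
    return mean(accs), mean(f1s)
```

Running `cross_validate` once on all features and once on the significant features gives the two rows of numbers the problem statement asks to compare.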

Assignment 3

  • Problem Statement:

    • Apply feature ranking techniques using Gini split, information gain, and PCA.
    • Apply the same 3 Naive Bayes classifiers to the selected-features data.
    • Compute the F-measure and accuracy.
    • Find the best Naive Bayes classifier, and compare the best sets of features.
    • Store the results in a single Excel file with multiple rows, i.e., one row for each project.
  • Programming language: Python

  • Libraries used: numpy, pandas, scipy, sklearn, matplotlib, graphviz
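
Two of the ranking criteria named above, Gini split and information gain, both score a feature by how much splitting on it reduces label impurity. The sketch below is a minimal, dependency-free illustration under the assumption that features are discrete (continuous features would first need binning); the function names and the toy columns are hypothetical, and PCA is not covered here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a label list."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_gain(feature, labels, impurity):
    """Impurity reduction from partitioning labels by a feature's values."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    weighted = sum(len(g) / n * impurity(g) for g in groups.values())
    return impurity(labels) - weighted


def rank_features(columns, labels, impurity=entropy):
    """Feature names sorted from most to least informative."""
    scores = {name: split_gain(col, labels, impurity)
              for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (assumed): "a" determines the label, "b" is unrelated to it.
labels = [0, 0, 1, 1]
columns = {"a": [0, 0, 1, 1], "b": [0, 1, 0, 1]}
print(rank_features(columns, labels, impurity=entropy))  # ['a', 'b']
print(rank_features(columns, labels, impurity=gini))     # ['a', 'b']
```

Passing `entropy` gives an information-gain ranking and passing `gini` gives a Gini-split ranking, so the same classifiers can be re-run on the top-ranked features from each.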

Assignment 4

  • Problem Statement:
    • Apply different data sampling techniques, such as random sampling, upsampling, and downsampling, to handle the class imbalance problem.
    • Apply logistic regression and a decision tree to the selected data.
    • Compute the F-measure and accuracy, and find the best techniques.
    • Store the results in a single Excel file with multiple rows, i.e., one row for each project.
    • Also validate the null hypothesis, e.g., "There is no significant improvement after applying data sampling techniques."
  • Programming language: Python
  • Libraries used: numpy, pandas, scipy, sklearn, matplotlib, graphviz
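
The resampling and hypothesis-testing steps above can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: `resample` and `paired_sign_flip_p` are hypothetical names, and the README does not say which statistical test was used, so an exact paired sign-flip (permutation) test is assumed here. It enumerates every sign pattern, so it only suits a small number of paired scores.

```python
import random
from itertools import product


def resample(X, y, mode="up", seed=0):
    """Balance classes by random upsampling or downsampling.

    mode="up" grows every class to the largest class size (sampling with
    replacement); mode="down" shrinks every class to the smallest size.
    """
    rng = random.Random(seed)
    by_class = {}
    for x, label in zip(X, y):
        by_class.setdefault(label, []).append(x)
    sizes = [len(rows) for rows in by_class.values()]
    target = max(sizes) if mode == "up" else min(sizes)
    X_out, y_out = [], []
    for label, rows in by_class.items():
        if len(rows) >= target:
            chosen = rng.sample(rows, target)  # without replacement
        else:
            chosen = rows + rng.choices(rows, k=target - len(rows))
        X_out.extend(chosen)
        y_out.extend([label] * target)
    return X_out, y_out


def paired_sign_flip_p(before, after):
    """One-sided exact p-value for H0: sampling gives no improvement.

    Under H0 each paired difference is equally likely to be positive or
    negative, so every sign pattern of the differences is equally likely.
    """
    diffs = [a - b for a, b in zip(after, before)]
    observed = sum(diffs)
    extreme = total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if sum(s * d for s, d in zip(signs, diffs)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Toy data (assumed): 8 samples of class 0 and 2 of class 1.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2
_, y_up = resample(X, y, mode="up")
_, y_down = resample(X, y, mode="down")
print(y_up.count(0), y_up.count(1))      # 8 8
print(y_down.count(0), y_down.count(1))  # 2 2
```

Here `before` and `after` would hold one score per project (for example the F-measure without and with sampling). A small p-value argues against the null hypothesis of no improvement.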