This is a course from my L4T2 (Final Term). As the name suggests, it is a course on machine learning. We are to implement different machine learning algorithms from scratch.

Assignment 1: Decision Tree and AdaBoost
- Decision Tree classifier
- Ensemble learning algorithm AdaBoost using decision stumps
Language Used: Python
Assignment 2: Text Classification using k-Nearest Neighbors and Naive Bayes
- k-NN algorithm for text classification
- Hamming distance: each document is represented as a boolean vector, where each bit represents whether the corresponding word appears in the document.
- Euclidean distance: each document is represented as a numeric vector, where each number represents how many times the corresponding word appears in the document.
- Cosine similarity with TF-IDF weights: each document is represented by a numeric vector, as in the case of Euclidean distance; however, each number is now the TF-IDF (Term Frequency–Inverse Document Frequency) weight of the corresponding word. The similarity between two documents is the dot product of their corresponding vectors divided by the product of their norms.
- Experimented with $k = 1, 3, 5$ and each of the three distance metrics (a minimal sketch of the metrics is given below).
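A minimal sketch of the three representations and metrics described above, assuming each document is a list of tokens; `vocabulary`, `doc_freq` (number of documents containing each word), and `n_docs` are illustrative names, not taken from the original code:

```python
import math
from collections import Counter

def hamming_distance(doc_a, doc_b, vocabulary):
    # Boolean representation: count vocabulary words present in one document but not the other.
    a, b = set(doc_a), set(doc_b)
    return sum((w in a) != (w in b) for w in vocabulary)

def euclidean_distance(doc_a, doc_b, vocabulary):
    # Count representation: distance between raw term-frequency vectors.
    ca, cb = Counter(doc_a), Counter(doc_b)
    return math.sqrt(sum((ca[w] - cb[w]) ** 2 for w in vocabulary))

def tfidf_vector(doc, vocabulary, doc_freq, n_docs):
    # TF-IDF weight of each vocabulary word: term frequency times log inverse document frequency.
    counts = Counter(doc)
    return [counts[w] * math.log(n_docs / doc_freq[w]) if doc_freq[w] else 0.0
            for w in vocabulary]

def cosine_similarity(vec_a, vec_b):
    # Dot product of the two vectors divided by the product of their norms.
    dot = sum(x * y for x, y in zip(vec_a, vec_b))
    norms = math.sqrt(sum(x * x for x in vec_a)) * math.sqrt(sum(y * y for y in vec_b))
    return dot / norms if norms else 0.0
```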
- Naive Bayes for text classification
- Considered all the words of a document independently, calculated the probability of the document belonging to each topic, and then picked the topic with the highest probability score.
- Tried $10$ different smoothing factors and calculated the accuracy for each value to find the best-performing smoothing factor (see the sketch below).
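A minimal sketch of a multinomial Naive Bayes classifier with a tunable smoothing factor, assuming the training data comes as `(tokens, topic)` pairs; all names are illustrative:

```python
import math
from collections import Counter, defaultdict

def train_nb(train_docs, vocabulary, smoothing):
    # train_docs: list of (token_list, topic) pairs.
    topic_counts = defaultdict(int)
    word_counts = defaultdict(Counter)
    for tokens, topic in train_docs:
        topic_counts[topic] += 1
        word_counts[topic].update(tokens)
    log_prior = {t: math.log(c / len(train_docs)) for t, c in topic_counts.items()}
    log_lik = {}
    for t in topic_counts:
        # Smoothed per-word probabilities for topic t.
        total = sum(word_counts[t].values()) + smoothing * len(vocabulary)
        log_lik[t] = {w: math.log((word_counts[t][w] + smoothing) / total)
                      for w in vocabulary}
    return log_prior, log_lik

def predict_nb(tokens, log_prior, log_lik):
    # Treat words independently: sum per-word log-probabilities, pick the best topic.
    scores = {t: log_prior[t] + sum(log_lik[t][w] for w in tokens if w in log_lik[t])
              for t in log_prior}
    return max(scores, key=scores.get)
```

The smoothing experiment then amounts to training and evaluating once per candidate smoothing value and keeping the most accurate one.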
- T-test for comparison
- Ran $50$ iterations with test documents.
- Compared kNN and NB using a paired t-test with significance levels $\alpha = 0.005, 0.01, 0.05$ (a minimal sketch of the test statistic is given below).
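A minimal sketch of the paired t-statistic over the per-iteration accuracies, where `knn_acc` and `nb_acc` are hypothetical lists of the $50$ accuracies:

```python
import math

def paired_t_statistic(knn_acc, nb_acc):
    # Paired t-test: mean of the per-iteration differences over its standard error.
    n = len(knn_acc)
    diffs = [a - b for a, b in zip(knn_acc, nb_acc)]
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance of differences
    return mean / math.sqrt(var / n)

# The statistic is compared against the critical value of the t-distribution
# with n - 1 degrees of freedom at each significance level alpha.
```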
Language Used: Python
Assignment 3: Dimensionality Reduction using Principal Component Analysis and Clustering using Expectation-maximization Algorithm
- Principal Component Analysis (PCA) implementation: Let $X$ be an $N \times D$ data matrix, where $D$ is the number of dimensions and $N$ is the number of instances. The data is projected onto the principal components, i.e., the eigenvectors of the sample covariance matrix with the largest eigenvalues, to obtain the two-dimensional data used for clustering (a minimal sketch is given below).
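A minimal sketch of PCA via the eigendecomposition of the covariance matrix, assuming `X` is the $N \times D$ NumPy array described above:

```python
import numpy as np

def pca(X, n_components=2):
    # Center the data so each dimension has zero mean.
    X_centered = X - X.mean(axis=0)
    # D x D sample covariance matrix (rows of X are instances).
    cov = np.cov(X_centered, rowvar=False)
    # Eigendecomposition; eigh is appropriate since the covariance is symmetric.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Keep the eigenvectors with the largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:n_components]
    # Project onto the top principal components: N x n_components.
    return X_centered @ eigvecs[:, order]
```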
- Expectation-Maximization (EM) Algorithm implementation: Now we cluster the two-dimensional data, assuming a Gaussian mixture model, using the EM algorithm. A vector $x$ of dimension $D$ can be generated from any one of $K$ Gaussian distributions, where the probability of selecting Gaussian distribution $k$ is $w_k$, with

$$\sum_{k=1}^{K} w_k = 1, \qquad 0 \le w_k \le 1,$$

and the probability of generating $x$ from Gaussian distribution $k$ is given as

$$\mathcal{N}(x \mid \mu_k, \Sigma_k) = \frac{1}{(2\pi)^{D/2} |\Sigma_k|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu_k)^\top \Sigma_k^{-1} (x - \mu_k)\right).$$
To learn a Gaussian mixture model using the EM algorithm, we need to maximize the log-likelihood

$$\ln p(X \mid w, \mu, \Sigma) = \sum_{n=1}^{N} \ln \sum_{k=1}^{K} w_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)$$

with respect to the parameters. The steps are given below:
- Initialize the means, covariances, and mixing coefficients, and evaluate the initial value of the log-likelihood.
- E step: Evaluate the conditional distribution of the latent factors (the responsibilities) using the current parameter values:

$$\gamma(z_{nk}) = \frac{w_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} w_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}$$
- M step: Re-estimate the parameters using the conditional distribution of the latent factors:

$$N_k = \sum_{n=1}^{N} \gamma(z_{nk}), \qquad w_k^{\text{new}} = \frac{N_k}{N}, \qquad \mu_k^{\text{new}} = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) \, x_n,$$

$$\Sigma_k^{\text{new}} = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) \, (x_n - \mu_k^{\text{new}})(x_n - \mu_k^{\text{new}})^\top$$
- Evaluate the log-likelihood and check for convergence. If the convergence criterion is not satisfied, return to the E step.
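A minimal sketch of the EM loop above for a Gaussian mixture on the 2-D data; the random initialization, the small regularization term on the covariances, and the tolerance are illustrative choices:

```python
import numpy as np

def gaussian_pdf(X, mean, cov):
    # Multivariate normal density N(x | mean, cov) evaluated at every row of X.
    D = X.shape[1]
    diff = X - mean
    quad = np.einsum('nd,de,ne->n', diff, np.linalg.inv(cov), diff)
    return np.exp(-0.5 * quad) / np.sqrt((2 * np.pi) ** D * np.linalg.det(cov))

def gmm_em(X, K, n_iter=100, tol=1e-6, seed=0):
    # Fit a K-component Gaussian mixture to X (N x D) with EM.
    rng = np.random.default_rng(seed)
    N, D = X.shape
    # Step 1: initialize means, covariances, and mixing coefficients.
    mu = X[rng.choice(N, size=K, replace=False)]
    sigma = np.array([np.cov(X, rowvar=False) + 1e-6 * np.eye(D) for _ in range(K)])
    w = np.full(K, 1.0 / K)
    log_lik_old = -np.inf
    for _ in range(n_iter):
        # E step: responsibilities gamma(z_nk) under the current parameters.
        dens = np.stack([w[k] * gaussian_pdf(X, mu[k], sigma[k]) for k in range(K)], axis=1)
        gamma = dens / dens.sum(axis=1, keepdims=True)  # N x K
        # M step: re-estimate the parameters from the responsibilities.
        Nk = gamma.sum(axis=0)
        mu = (gamma.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            sigma[k] = (gamma[:, k, None] * diff).T @ diff / Nk[k] + 1e-6 * np.eye(D)
        w = Nk / N
        # Convergence check on the log-likelihood (evaluated with the E-step parameters).
        log_lik = np.log(dens.sum(axis=1)).sum()
        if abs(log_lik - log_lik_old) < tol:
            break
        log_lik_old = log_lik
    return w, mu, sigma
```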