- Tom Zhi Hern 1068268
- Peter Qian Ziyu 1067810
- Read the train and test data from data/train.csv and data/test.csv and load them into dataframes (df_train & df_test)
- Run preprocess(df_train, df_test) to preprocess data and convert df_test into numpy array (np_test)
- Run train(df_train) to compute the per-pose statistics from the train data (prior probability, mean, standard deviation) and store them in pose_dict
- Run predict(np_test, pose_dict) to predict the pose of each test instance and store the predictions in a list (results)
- Run evaluate(results) to get the accuracy of the model
- Run get_con_matrix(results, poses) to get the confusion matrix (con_matrix) for the results obtained in the previous step
- Run print_model_eval(con_matrix) to evaluate the model, using micro and macro averaging and print the evaluation
- Load all data from data/all.csv, which combines the data from data/train.csv & data/test.csv, and store it in a dataframe (df_all)
- Add headers to df_all
- Run plot_qq(df_all, pose, remove=True) to draw a QQ plot for each attribute (x1 to y11) of the given pose
- Choose pose from [mountain, downwarddog, childs]
- Read the train and test data from data/train.csv and data/test.csv and load them into dataframes (df_train & df_test)
- Run preprocess(df_train, df_test) to preprocess data and convert df_test into numpy array (np_test)
- Run predict_kde(np_test, df_train, SIGMA=i) twice (sigma = 0.1 and sigma = 5) in a for loop
- The loop also runs get_con_matrix(results, poses) to get the confusion matrix for each result and prints it
- Run this cell to plot the PDFs of the Gaussian and KDE estimates (with the given sigma) for the train dataset
- Repeat with sigma=0.1 and sigma=0.5
- Read the train and test data from data/train.csv and data/test.csv and load them into dataframes (df_train & df_test)
- Run predict_kde_rs(df, num) to run the KDE Naive Bayes prediction under random holdout, repeated num times. 5 is used here.
- The result for each prediction will be printed out
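
The KDE Naive Bayes and random holdout steps above can be sketched roughly as follows. This is only an illustrative sketch, not the notebook's actual code: the function names (`kde_log_likelihood`, `predict_kde_sketch`, `random_holdout_accuracy`), the array-based inputs, the 20% test fraction, and the choice to skip missing (NaN) attribute values are all assumptions made for the example.

```python
import numpy as np

def kde_log_likelihood(x, train_values, sigma):
    """Log of a Gaussian-kernel density estimate at scalar x (hypothetical helper)."""
    train_values = train_values[~np.isnan(train_values)]
    if np.isnan(x) or train_values.size == 0:
        return 0.0  # assumption: a missing attribute contributes nothing to the score
    z = (x - train_values) / sigma
    kernels = np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2 * np.pi))
    # Average one kernel per training point; a tiny constant avoids log(0).
    return float(np.log(kernels.mean() + 1e-300))

def predict_kde_sketch(X_test, X_train, y_train, sigma):
    """Predict a class for each test row: log prior plus summed per-attribute KDE log-likelihoods."""
    classes = np.unique(y_train)
    predictions = []
    for row in X_test:
        scores = []
        for c in classes:
            X_c = X_train[y_train == c]
            score = np.log(len(X_c) / len(X_train))  # log prior probability
            for j, x in enumerate(row):
                score += kde_log_likelihood(x, X_c[:, j], sigma)
            scores.append(score)
        predictions.append(classes[int(np.argmax(scores))])
    return np.array(predictions)

def random_holdout_accuracy(X, y, num, sigma, test_fraction=0.2, seed=0):
    """Repeat a random train/test split num times and return the accuracy of each run."""
    rng = np.random.default_rng(seed)
    accuracies = []
    for _ in range(num):
        idx = rng.permutation(len(X))
        n_test = max(1, int(len(X) * test_fraction))
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        preds = predict_kde_sketch(X[test_idx], X[train_idx], y[train_idx], sigma)
        accuracies.append(float(np.mean(preds == y[test_idx])))
    return accuracies
```

In this sketch, a call such as `random_holdout_accuracy(X, y, num=5, sigma=0.5)` would return one accuracy per holdout run, where `X` is a 2-D array of attributes and `y` a 1-D array of pose labels.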