Tsinghua Shenzhen International Graduate School Big Data Analysis Course

Tsinghua-Big-Data-Analysis

This repository records the homework for the Big Data Analysis course at Tsinghua Shenzhen International Graduate School.

Homework-1: ANOVA

This homework is mainly about the ANOVA test. Details are in homework1.pdf.

Run Homework 1 with:

python anova.py

anova.py includes helper functions for one-way ANOVA, such as a normality test and the one-way ANOVA procedure itself.
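As a rough illustration of the kind of workflow described above, here is a minimal sketch of a one-way ANOVA with assumption checks, built on scipy. The function name `one_way_anova`, the synthetic data, and the choice of the Shapiro-Wilk and Levene tests are assumptions for illustration only; they are not taken from anova.py or homework1.pdf.

```python
import numpy as np
from scipy import stats


def one_way_anova(groups, alpha=0.05):
    """Hypothetical helper: check assumptions, then run a one-way ANOVA F-test."""
    # Shapiro-Wilk normality test per group (H0: the sample is normally distributed)
    normality_p = [float(stats.shapiro(g)[1]) for g in groups]
    # Levene test for homogeneity of variance (H0: all groups share one variance)
    levene_p = float(stats.levene(*groups)[1])
    # One-way ANOVA F-test (H0: all group means are equal)
    f_stat, p_value = stats.f_oneway(*groups)
    return {
        "normality_p": normality_p,
        "levene_p": levene_p,
        "f_stat": float(f_stat),
        "p_value": float(p_value),
        "reject_h0": bool(p_value < alpha),
    }


# Synthetic example data (assumed, not from the homework): three groups,
# the third with a clearly shifted mean.
rng = np.random.default_rng(0)
groups = [
    rng.normal(loc=0.0, scale=1.0, size=30),
    rng.normal(loc=0.0, scale=1.0, size=30),
    rng.normal(loc=3.0, scale=1.0, size=30),
]
result = one_way_anova(groups)
print(result)
```

If the normality or equal-variance p-values are small, the F-test result should be read with caution, and a non-parametric alternative such as `scipy.stats.kruskal` may be a better fit.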

Requirements

  • pandas, numpy, matplotlib, scipy

Homework-2: Matrix Factorization

The task is to implement a collaborative filtering algorithm and a matrix factorization algorithm, and to compare their performance. More details are in homework2.pdf.

Run Homework 2 with:
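For orientation, matrix factorization approximates the rating matrix as a product of low-rank user and item factors, fitted only on observed entries. The sketch below uses full-batch gradient descent with L2 regularization on a small made-up rating matrix. The function name `factorize`, the hyperparameters, and the data are all assumptions for illustration; the optimizer and settings actually used in train.py may differ.

```python
import numpy as np


def factorize(ratings, mask, k=2, lr=0.01, reg=0.02, epochs=2000, seed=0):
    """Hypothetical helper: fit ratings ~ U @ V.T on observed entries only."""
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    U = rng.normal(scale=0.1, size=(n_users, k))  # user factors
    V = rng.normal(scale=0.1, size=(n_items, k))  # item factors
    for _ in range(epochs):
        # Residuals on observed entries; unobserved entries contribute zero
        err = mask * (ratings - U @ V.T)
        # Descent directions for the regularized squared error
        U_step = err @ V - reg * U
        V_step = err.T @ U - reg * V
        U += lr * U_step
        V += lr * V_step
    return U, V


# Toy rating matrix (assumed data); 0 marks a missing rating.
ratings = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)
mask = (ratings > 0).astype(float)

U, V = factorize(ratings, mask)
pred = U @ V.T
# RMSE over observed entries only
rmse_observed = float(np.sqrt(np.sum(mask * (ratings - pred) ** 2) / mask.sum()))
print(rmse_observed)
```

The entries of `pred` at positions where `mask` is 0 are the model's predictions for the missing ratings. A fair comparison against collaborative filtering would measure RMSE on held-out ratings rather than on the training entries used here.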

python train.py

train.py includes an RMSE function, a cosine similarity function, and the two main algorithms.
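The two utility computations mentioned above can be sketched in a few lines of numpy. The names `rmse` and `cosine_similarity_matrix` are hypothetical and not necessarily those used in train.py; the row-wise similarity assumes each row is one user's (or one item's) rating vector.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two arrays of equal shape."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def cosine_similarity_matrix(X, eps=1e-12):
    """Pairwise cosine similarity between the rows of X."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows
    X_unit = X / np.maximum(norms, eps)
    return X_unit @ X_unit.T


# Small example inputs (assumed data)
sim = cosine_similarity_matrix([[1, 0], [0, 1], [1, 1]])
error = rmse([0, 0, 0, 0], [2, 2, 2, 2])
print(sim)
print(error)
```

In a user-based collaborative filter, a matrix like `sim` is typically used to weight other users' ratings when predicting a missing one, and `rmse` then scores those predictions against held-out ratings.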

Requirements

  • numpy, matplotlib, tqdm, scikit-learn (imported as sklearn)