Class project for MATH 60603A Statistical Learning, Fall 2020.
Credit card fraud can have a major impact on enterprises, and predictive models for detecting it are widely used in practice. The choice of machine learning algorithm, however, affects both the prediction process and its results. This paper evaluates four algorithms for credit card fraud prediction: logistic regression, random forest, k-nearest-neighbours classification, and naive Bayes classification. We compare their training speed, prediction speed, and accuracy across training sets of different sizes, and we also examine how each model reacts to changes in the nature of the data. Ultimately, we seek to determine which model is most suitable for a latency-sensitive application such as credit card fraud detection. The base data set for this project is randomly generated using Monte Carlo simulation.
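The comparison described above can be illustrated with a minimal sketch. It is not the project's actual code: it assumes scikit-learn and NumPy are available, and the generator `simulate_transactions`, its two features (`amount`, `hour`), the 5% fraud rate, and all hyperparameters are hypothetical choices made only for illustration. See ResearchPaper.pdf for the real simulation design and results.

```python
import time

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier


def simulate_transactions(n, fraud_rate=0.05, seed=0):
    """Hypothetical Monte Carlo generator (an assumption, not the paper's)."""
    rng = np.random.default_rng(seed)
    # Draw the fraud label first, then class-dependent features.
    y = (rng.random(n) < fraud_rate).astype(int)
    amount = rng.normal(loc=50 + 100 * y, scale=30)  # fraud: larger amounts
    hour = rng.normal(loc=14 - 8 * y, scale=4)       # fraud: earlier hours
    return np.column_stack([amount, hour]), y


# Factories so that every run starts from a fresh, unfitted model.
MODELS = {
    "logistic_regression": lambda: LogisticRegression(max_iter=1000),
    "random_forest": lambda: RandomForestClassifier(n_estimators=50, random_state=0),
    "knn": lambda: KNeighborsClassifier(n_neighbors=5),
    "naive_bayes": lambda: GaussianNB(),
}


def benchmark(model, X_train, y_train, X_test, y_test):
    """Time fit and predict separately and report test accuracy."""
    t0 = time.perf_counter()
    model.fit(X_train, y_train)
    t1 = time.perf_counter()
    pred = model.predict(X_test)
    t2 = time.perf_counter()
    return {
        "train_s": t1 - t0,
        "predict_s": t2 - t1,
        "accuracy": float((pred == y_test).mean()),
    }


def compare(sizes=(1000, 5000)):
    """Run every model on simulated data sets of each requested size."""
    results = []
    for n in sizes:
        X, y = simulate_transactions(n)
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.3, random_state=0, stratify=y
        )
        for name, make_model in MODELS.items():
            row = benchmark(make_model(), X_tr, y_tr, X_te, y_te)
            results.append({"model": name, "n": n, **row})
    return results


if __name__ == "__main__":
    for row in compare():
        print(row)
```

Timing `fit` and `predict` separately matters for a latency-sensitive setting: k-nearest-neighbours, for example, does little work at training time and defers most of its cost to prediction, so a single combined timing would hide that difference.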
The final paper is available to read: ResearchPaper.pdf
Data Generation: Saba Daftari + Xinyuan Qi
Modeling + Visualizations: Norina Sun
Report: Saba Daftari + Xinyuan Qi
Norina Sun, Saba Daftari, Xinyuan Qi. November 2020