Fraud Detection with Paysim Data

Primary Language: Jupyter Notebook

Title: Fraud Detection on Financial Data

Dataset link: https://www.kaggle.com/ntnu-testimon/paysim1

Project Explanation

The financial services industry, and other industries that handle financial transactions, suffer from fraud-related losses and damages. The number of fraudulent customers has reached a high level in recent years, motivated by the money that can be stolen from banks. The shift to the digital space has opened new channels for delivering financial services, but it has also created a rich environment for scammers. As a consequence, there is a need for automatic systems that can detect and fight fraudsters.

Fraud detection is a notably challenging problem because:

• Fraud strategies change over time, and customers' spending habits evolve as well.
• Few examples of fraud are available, so it is hard to build a model of fraudulent behavior.
• Not all frauds are reported, and some are reported only after a long delay.
• Few transactions can be investigated in a timely manner.

Where criminals once had to counterfeit client IDs, obtaining a person's account password may now be all that is needed to steal money. As fraudsters become more adept at finding and exploiting loopholes in systems, fraud management has become painful for the banking and finance industry. Fraud also hurts customer loyalty and conversions, so financial services firms need to detect it correctly and rapidly. Machine Learning and Deep Learning systems can help detect changing fraud strategies quickly and accurately.

There are four main reasons to use Machine Learning and Deep Learning for fraud detection. Compared with manual review, these systems are:

• Scalable
• Faster
• Efficient
• More accurate

Our data has 11 attributes and approximately 6.3 million instances. The attributes are: step, type, amount, nameOrig, oldbalanceOrg, newbalanceOrig, nameDest, oldbalanceDest, newbalanceDest, isFraud, isFlaggedFraud. Our main purpose is to detect fraudulent activity in transactions between accounts. Fraud is the bleeding wound of the financial sector, and this project aims to take the sector one step further by using machine learning models, trained on the data set we have obtained, to identify it.
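As a minimal sketch of working with these attributes, the snippet below checks that a CSV header matches the 11 attribute names listed above and converts each row to typed values. It uses only the Python standard library. The helper names (`parse_row`, `read_transactions`), the choice of which columns are integers or floats, and the two sample rows are assumptions made for illustration; the rows are synthetic placeholders, not records from the dataset.

```python
import csv
import io

# Column names as listed in the project description.
EXPECTED_COLUMNS = [
    "step", "type", "amount", "nameOrig", "oldbalanceOrg", "newbalanceOrig",
    "nameDest", "oldbalanceDest", "newbalanceDest", "isFraud", "isFlaggedFraud",
]

# Assumed types: integer counters and flags, float money amounts, strings otherwise.
INT_COLUMNS = {"step", "isFraud", "isFlaggedFraud"}
FLOAT_COLUMNS = {"amount", "oldbalanceOrg", "newbalanceOrig",
                 "oldbalanceDest", "newbalanceDest"}


def parse_row(row):
    """Convert one csv.DictReader row (all strings) into typed values."""
    typed = {}
    for name, value in row.items():
        if name in INT_COLUMNS:
            typed[name] = int(value)
        elif name in FLOAT_COLUMNS:
            typed[name] = float(value)
        else:
            typed[name] = value
    return typed


def read_transactions(handle):
    """Read transactions from an open text handle, validating the header first."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    return [parse_row(row) for row in reader]


# Synthetic placeholder rows, made up for illustration only.
SAMPLE_CSV = (
    ",".join(EXPECTED_COLUMNS) + "\n"
    "1,PAYMENT,100.0,C1,500.0,400.0,M1,0.0,0.0,0,0\n"
    "1,TRANSFER,250.0,C2,250.0,0.0,C3,0.0,0.0,1,0\n"
)

transactions = read_transactions(io.StringIO(SAMPLE_CSV))
print(len(transactions), transactions[1]["isFraud"])
```

To read the real file, the same function could be given an open file handle instead of the in-memory sample, assuming the downloaded CSV uses exactly these column names.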

Data Statistics:

There are 11 columns and 6,362,622 rows in our dataset.
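Statistics of this kind can be computed by counting rows, columns, and the share of transactions labeled as fraud. The sketch below does this on a handful of synthetic records; `summarize` is a hypothetical helper and the records are placeholders, so the printed values say nothing about the real dataset.

```python
def summarize(records, label="isFraud"):
    """Return row count, column count, fraud count and fraud rate."""
    n_rows = len(records)
    n_cols = len(records[0]) if records else 0
    n_fraud = sum(1 for record in records if record[label] == 1)
    fraud_rate = n_fraud / n_rows if n_rows else 0.0
    return {
        "rows": n_rows,
        "columns": n_cols,
        "fraud": n_fraud,
        "fraud_rate": fraud_rate,
    }


# Synthetic placeholder records with only three of the eleven attributes.
records = [
    {"step": 1, "amount": 100.0, "isFraud": 0},
    {"step": 1, "amount": 250.0, "isFraud": 1},
    {"step": 2, "amount": 75.5, "isFraud": 0},
    {"step": 3, "amount": 12.0, "isFraud": 0},
]

stats = summarize(records)
print(stats)
```

The fraud rate is worth checking early, since the scarcity of fraud examples affects how a model should be trained and evaluated.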

Classification Experiments:

(Figure: classification experiments; image not available)
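The text does not state how the experiments were scored. Because few examples of fraud are available, accuracy alone can look good even when no fraud is caught, so one common choice is to report precision, recall, and F1 on the fraud class. The sketch below is an assumption about a reasonable evaluation, not the project's actual procedure; `precision_recall_f1`, `accuracy`, and the label lists are hypothetical and synthetic.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for the positive (fraud) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Synthetic labels: 8 legitimate transactions and 2 frauds.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_model = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]   # hypothetical model output
y_never = [0] * len(y_true)                # baseline that never flags fraud

print(accuracy(y_true, y_model), precision_recall_f1(y_true, y_model))
print(accuracy(y_true, y_never), precision_recall_f1(y_true, y_never))
```

In this toy case both prediction lists reach the same accuracy, but the baseline that never flags fraud has zero recall, which is why class-specific metrics are more informative on imbalanced data.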