📒 About this Project | Brief description of the project |
💾 Introduction | Introduction |
📖 Problem Statements | Problem Statements |
📊 About the Dataset | About the Dataset |
🖥 The Flow | Workflow |
Because GitHub limits the size of individual files (100 MB per file), the project export is too large to store in this repository. To explore this project in Dataiku DSS, download the file from the link provided below:
After downloading the file, you can import it directly into your Dataiku DSS instance. Happy exploring! ;)
In this project, a machine learning model predicts the probability that an online transaction is fraudulent, as indicated by the binary target isFraud.
The data is divided into two files, identification and transaction, which are linked by TransactionID. Not all transactions have corresponding identity information.
This ML model was developed end to end on the Dataiku DSS platform.
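The link between the two files can be illustrated with a small left join on TransactionID. This is only a minimal sketch in plain Python, not the Dataiku join recipe used in the project: the rows and values are made up, the `left_join` helper is hypothetical, and only the column names (TransactionID, isFraud, TransactionAmt, DeviceType) are taken from the dataset.

```python
# Toy rows standing in for the two files; all values are made up for illustration.
transactions = [
    {"TransactionID": 1001, "TransactionAmt": 59.0, "isFraud": 0},
    {"TransactionID": 1002, "TransactionAmt": 500.0, "isFraud": 1},
    {"TransactionID": 1003, "TransactionAmt": 12.5, "isFraud": 0},
]
identities = [
    {"TransactionID": 1002, "DeviceType": "mobile"},
    {"TransactionID": 1003, "DeviceType": "desktop"},
]


def left_join(left, right, key):
    """Keep every row of `left`; columns from `right` are None when no match exists."""
    right_by_key = {row[key]: row for row in right}
    right_cols = sorted({col for row in right for col in row if col != key})
    joined = []
    for row in left:
        match = right_by_key.get(row[key], {})
        merged = dict(row)
        for col in right_cols:
            merged[col] = match.get(col)  # None marks missing identity information
        joined.append(merged)
    return joined


joined = left_join(transactions, identities, "TransactionID")
for row in joined:
    print(row)
```

A left join keeps every transaction, so rows without identity information stay in the training data with empty identity columns instead of being dropped.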
Imagine standing at the grocery store check-out counter with a long line behind you, and the cashier not-so-quietly announces that your card has been declined. In that moment, you probably aren't thinking about the data science that determined your fate.
Embarrassed, but convinced that you have enough money for an awesome nacho party for 50 of your best friends, you try your card again. Same result. As you step aside to let the cashier help the next customer, you receive a text message from your bank: "Press 1 if you really tried to spend $500 on cheddar cheese."
With this project, built on the Dataiku DSS platform, I wanted to improve fraud detection accuracy while also improving the customer experience. With more accurate fraud detection, customers can get back to business with their chips.
The goals of this ML model:
- Build machine learning models on a challenging, large-scale e-commerce transaction dataset
- Help businesses reduce fraud losses and increase their revenue
- Provide effective solutions for fraud prevention
You can download the dataset from here.
The dataset is provided by Vesta Corporation, a company offering guaranteed e-commerce payment solutions. Retrieved from here.
The data comes from Vesta's real-world e-commerce transactions and contains a wide range of variables, from device type to product specifications.
In DSS, the Flow is the visual representation of how data, recipes, and models work together to move data through an analytical pipeline. The Flow in DSS has an awareness of the relationships and dependencies between datasets in the project.
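The idea of a dependency-aware pipeline can be sketched as a small directed graph. The example below is an illustration only, not the Dataiku API: the dataset names in `flow` are hypothetical, and the standard-library `graphlib` module (Python 3.9+) is used to derive an order in which every item is built after the items it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical flow: each key depends on the items in its set.
# The names are illustrative and not taken from the actual project.
flow = {
    "joined": {"transaction", "identity"},  # join of the two input files
    "prepared": {"joined"},                 # cleaned and prepared data
    "model": {"prepared"},                  # model trained on prepared data
}

# static_order() yields every node after all of its dependencies.
build_order = list(TopologicalSorter(flow).static_order())
print(build_order)
```

Because the dependencies are explicit, a change to one input dataset tells you exactly which downstream items need to be rebuilt.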