In this assignment I am able to use my new SQL skills learned from the Rice University FinTech Bootcamp to analyze historical credit card transactions and possibly uncover fraudulent activity.[1]
- General info
- Screenshots
- Technologies
- Installation Guide
- Code Examples
- Usage
- Sources
- Status
- Contributors
Unfortunately, fraud comes in multiple forms and is found in various technology. This is where I come in to help. The three main tasks for this project are:
1. Data Modeling: Define a database model to store the credit card transactions data and create a new PostgreSQL database using your model.
2. Create a database schema on PostgreSQL and populate your database from the CSV files provided.
3. Analyze the data to identify possible fraudulent transactions.
[1]
- Python - Version 3.8.5
- VS Code - Version 1.49.1
- Jupyter Notebook - Version 6.1.1
- Windows 10
- pgAdmin4 - Version 4.2
- Library - pandas
- Library - datetime
- Library - plotly.express
- Library - hvplot.pandas
- Library - sqlalchemy
- Download the entire suspicious_transactions repository
- Open Git Terminal
- Navigate into the repository file path where you stored the files during the download.
- The files should be visible and ready to run.
*See the Usage section below for instructions on how to run the notebook.
- Section of code to select the data using a sql statement from pgAdmin4.
query_card_holder = """
SELECT card_holder.card_holder_id, transaction.transaction_date, transaction.transaction_amount, merchant.merchant_id, merchant_category.merchant_category_name
FROM card_holder
LEFT JOIN credit_card ON card_holder.card_holder_id = credit_card.card_holder_id
LEFT JOIN transaction ON credit_card.credit_card_number = transaction.credit_card_number
LEFT JOIN merchant ON transaction.merchant_id = merchant.merchant_id
LEFT JOIN merchant_category ON merchant.merchant_category_id = merchant_category.merchant_category_id
"""
# read in all data as dataframe
df_card_holder = pd.read_sql(query_card_holder, engine)
- To run the analysis process, navigate to the directory where visual_data_analysis.ipynb is located using Git Terminal within the suspicious transactions directory or where ever you save the repository.
- Execute the command 'code .' in the terminal to open VS Code.
- VS Code opens. Select the visual_data_analysis.ipynb file in the suspicious_transactions directory found in the left side navigation pane or where you saved the repository.
- Click the Run All Cells button, double arrows, found at the top of the main workspace to run all cells in the Jupyter Notebook file.
- All cells in the notbook run. The simulations can take a few minutes to run, so please be patient while they process and display the data on the screen.
Project is: finished
- Jonathan Owens
- LinkedIn: www.linkedin.com/in/jonowens