Documenting the data cleaning process on a bank statement dataset using the Python libraries NumPy and pandas.

Bank Statement Analysis - Data Cleaning

Introduction

Welcome to the data cleaning documentation for a bank statement dataset covering June 2021 to January 2022. I was provided with the bank statement to gain insights into spending habits during that period. The dataset includes transactional data along with other relevant details. This documentation gives a clear, structured overview of the data cleaning process undertaken to prepare the dataset for analysis.

Dataset Description

The Bank Statement dataset contains 1396 rows and 8 columns, with data ranging from June 1st, 2021 to January 9th, 2022. The columns in the dataset are defined as follows:

  • Trans. Date: The date on which a transaction occurred.
  • Value. Date: The date on which the transaction was posted to the account.
  • Reference: The reference number or additional information associated with the transaction.
  • Debits: The amount debited from the account for the transaction.
  • Credits: The amount credited to the account for the transaction.
  • Balance: The account balance after the transaction.
  • Originating Branch: The branch where the transaction originated.
  • Remarks: Any additional notes or comments associated with the transaction.
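As a quick illustration of the structure described above, the following is a minimal sketch of checking that a loaded DataFrame carries the eight listed columns. It is not taken from the notebook: the helper `missing_columns` and the two-row `sample` are hypothetical, and the sample values are made up purely to keep the example self-contained. In practice the DataFrame would come from the actual statement file.

```python
import pandas as pd

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = [
    "Trans. Date", "Value. Date", "Reference", "Debits",
    "Credits", "Balance", "Originating Branch", "Remarks",
]


def missing_columns(df: pd.DataFrame) -> list:
    """Return the expected columns that are absent from df (hypothetical helper)."""
    return [col for col in EXPECTED_COLUMNS if col not in df.columns]


# Made-up two-row sample standing in for the real statement, which
# would normally be loaded with pd.read_csv or pd.read_excel.
sample = pd.DataFrame(
    [
        ["01/06/2021", "01/06/2021", "REF001", "1,500.00", "",
         "8,500.00", "HEAD OFFICE", "POS purchase"],
        ["02/06/2021", "02/06/2021", "REF002", "", "20,000.00",
         "28,500.00", "HEAD OFFICE", "Transfer in"],
    ],
    columns=EXPECTED_COLUMNS,
)

print(sample.shape)             # (rows, columns) of the sample
print(missing_columns(sample))  # an empty list means all columns are present
```

On the full dataset, the same `shape` check should report the 1396 rows and 8 columns stated above.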

Data Cleaning and Preparation

The data cleaning and preparation process involved:

  • Visual and programmatic assessments to identify any data quality issues.
  • Rectifying and cleaning the data to address the identified issues.

By following this process, I ensured the accuracy and reliability of the dataset for further analysis.
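The two steps can be sketched roughly as follows. This is only an illustration under stated assumptions, not the notebook's actual code: the functions `assess` and `clean`, the day/month/year date format, the comma-separated amount strings, and the choice to fill blank debits and credits with zero are all assumptions, and the `raw` rows are made-up sample values.

```python
import pandas as pd

DATE_COLS = ["Trans. Date", "Value. Date"]
AMOUNT_COLS = ["Debits", "Credits", "Balance"]


def assess(df: pd.DataFrame) -> dict:
    """Programmatic assessment: row count, blank cells, and duplicate rows."""
    blank = df.isna() | df.eq("")  # treat empty strings as missing too
    return {
        "rows": len(df),
        "missing_per_column": {col: int(n) for col, n in blank.sum().items()},
        "duplicate_rows": int(df.duplicated().sum()),
    }


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate rows, then parse the date and amount columns."""
    out = df.drop_duplicates().copy()
    for col in DATE_COLS:
        # Assumed format: day/month/year. Unparseable values become NaT.
        out[col] = pd.to_datetime(out[col], format="%d/%m/%Y", errors="coerce")
    for col in AMOUNT_COLS:
        # Assumed format: strings with thousands separators, e.g. "1,500.00".
        stripped = out[col].astype(str).str.replace(",", "", regex=False)
        out[col] = pd.to_numeric(stripped, errors="coerce")
    # Assumption: a blank debit or credit means no money moved on that side.
    out[["Debits", "Credits"]] = out[["Debits", "Credits"]].fillna(0.0)
    return out.reset_index(drop=True)


# Made-up sample rows, including one exact duplicate.
raw = pd.DataFrame(
    [
        ["01/06/2021", "01/06/2021", "REF001", "1,500.00", "",
         "8,500.00", "HEAD OFFICE", "POS purchase"],
        ["01/06/2021", "01/06/2021", "REF001", "1,500.00", "",
         "8,500.00", "HEAD OFFICE", "POS purchase"],
        ["02/06/2021", "02/06/2021", "REF002", "", "20,000.00",
         "28,500.00", "HEAD OFFICE", "Transfer in"],
    ],
    columns=DATE_COLS + ["Reference"] + AMOUNT_COLS
    + ["Originating Branch", "Remarks"],
)

report = assess(raw)
cleaned = clean(raw)
print(report)
print(cleaned.dtypes)
```

Running `assess` before `clean` keeps the list of issues separate from the fixes, which mirrors the assess-then-rectify order described above.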

The cleaning and preparation process was done using Python, and the Jupyter notebook containing the steps can be found here.