
📊 Pre-processed and analyzed daily rainfall climate data provided by the Australian Government - Bureau of Meteorology


Analyzing-Daily-Rainfall-Data: Data Cleaning and Summarization

Overview

This project is part of the Practical Data Science with Python course at RMIT University. The assignment covers the initial stages of the data science process: data cleaning, exploration, and summarization.

Project Structure

  • assignment1.ipynb: Jupyter Notebook containing Python code for data cleaning and exploration.
  • Data.csv: The given dataset.
  • cleaned_data.csv: The cleaned dataset after processing.
  • report.pdf: A detailed report summarizing the findings and methodologies used.

Tasks Overview

1. Data Preparation

  • Objective: Load, clean, and process daily rainfall climate data provided by the Australian Government - Bureau of Meteorology.
  • Steps:
    • Loaded the CSV data file using pandas.
    • Identified and corrected issues such as typos, extra whitespace, and missing values.
    • Saved the cleaned data into cleaned_data.csv.
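
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (Year, Month, Day, Rainfall), the sample rows, and the choice to drop rows with missing or unparseable values are all assumptions made for the example.

```python
import io

import pandas as pd

# Hypothetical sample standing in for Data.csv; the real columns may differ.
RAW_CSV = """Year,Month,Day,Rainfall
2014, 1,1, 0.0
2014,1,2,5.2
2014,1,3,
2014,1,4,abc
2014,1,2,5.2
"""


def clean_rainfall(raw: pd.DataFrame) -> pd.DataFrame:
    """Strip whitespace, coerce to numbers, drop bad and duplicate rows."""
    df = raw.copy()
    df.columns = df.columns.str.strip()
    for col in df.columns:
        # Remove stray whitespace, then turn typos such as "abc" into NaN.
        df[col] = pd.to_numeric(df[col].str.strip(), errors="coerce")
    # Assumption: rows with any missing value are dropped rather than imputed.
    df = df.dropna().drop_duplicates()
    df = df.astype({"Year": int, "Month": int, "Day": int})
    return df.reset_index(drop=True)


# Read every column as text so that whitespace and typos are visible.
raw = pd.read_csv(io.StringIO(RAW_CSV), dtype=str)
cleaned = clean_rainfall(raw)
print(cleaned)

# Uncomment to save the result, as the project does:
# cleaned.to_csv("cleaned_data.csv", index=False)
```

Reading every column as text first keeps whitespace and typos visible, so they can be handled explicitly instead of being silently misparsed.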

2. Data Exploration

  • Objective: Analyze the cleaned data to derive insights.
  • Explorations:
    • Analyzed the highest daily rainfall in each month of 2014.
    • Performed yearly and monthly analysis of data from 2015 to 2017, including visualizations.
    • Compared the three years with the highest rainfall totals against the three with the lowest.
    • Explored rainfall trends in ABC City over the last 10 years.
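
As a rough sketch of how explorations like these might be expressed in pandas, the example below uses a small made-up table. The column names (Year, Month, Day, Rainfall) and all values are hypothetical, and the notebook may take a different approach.

```python
import pandas as pd

# Hypothetical cleaned data; values are invented for illustration only.
data = pd.DataFrame({
    "Year":     [2014, 2014, 2014, 2014, 2015, 2015, 2016, 2016, 2017, 2017],
    "Month":    [1, 1, 2, 2, 1, 2, 1, 2, 1, 2],
    "Day":      [1, 2, 1, 2, 1, 1, 1, 1, 1, 1],
    "Rainfall": [0.0, 12.4, 3.0, 1.2, 20.0, 5.0, 2.0, 1.0, 8.0, 9.5],
})

# Highest daily rainfall in each month of 2014.
monthly_max_2014 = data[data["Year"] == 2014].groupby("Month")["Rainfall"].max()

# Monthly totals for 2015 to 2017, one row per year and one column per month.
recent = data[data["Year"].between(2015, 2017)]
monthly_totals = recent.groupby(["Year", "Month"])["Rainfall"].sum().unstack("Month")

# Yearly totals, then the three wettest and three driest years.
yearly_totals = data.groupby("Year")["Rainfall"].sum()
wettest = yearly_totals.nlargest(3)
driest = yearly_totals.nsmallest(3)

print(monthly_max_2014)
print(monthly_totals)
print(wettest)
print(driest)
```

With matplotlib installed, a call such as `yearly_totals.plot(kind="bar")` is one way to turn these summaries into a chart.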

3. Report

  • Objective: Document the findings and the process.
  • Content:
    • A brief explanation of the data cleaning process.
    • Justifications for the methods used in data exploration.
    • Visualizations and comparisons to support the findings.

How to Run

  1. Ensure you have Anaconda installed with the necessary packages (pandas, matplotlib, etc.).
  2. Open assignment1.ipynb in Jupyter Notebook.
  3. Run all cells to load the data, clean it, and perform the analyses.
  4. Review the output for data insights and visualizations.

Requirements

  • Python 3
  • Jupyter Notebook
  • pandas, matplotlib, and other libraries as specified in the notebook.
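
Before opening the notebook, a quick check along these lines can confirm the environment is ready. The helper name `missing_packages` is hypothetical, and the list covers only the libraries named above; the notebook may import others.

```python
import importlib.util
import sys

# Only the packages named in this README; extend the list as needed.
REQUIRED = ["pandas", "matplotlib"]


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if sys.version_info.major < 3:
    print("Python 3 is required.")

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages are available.")
```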