This project is part of the Practical Data Science with Python course at RMIT University. The assignment focuses on the initial stages of the data science process: data cleaning, exploration, and summarization.
- `assignment1.ipynb`: Jupyter Notebook containing Python code for data cleaning and exploration.
- `Data.csv`: The given dataset.
- `cleaned_data.csv`: The cleaned dataset after processing.
- `report.pdf`: A detailed report summarizing the findings and methodologies used.
- Objective: Load, clean, and process daily rainfall climate data provided by the Australian Government Bureau of Meteorology.
- Steps:
  - Loaded the CSV data file using `pandas`.
  - Identified and corrected issues such as typos, extra whitespace, and missing values.
  - Saved the cleaned data into `cleaned_data.csv`.
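The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`Year`, `Month`, `Day`, `Rainfall`, `Quality`), the inline sample data, and the choice to drop rows with missing rainfall are all assumptions made for the example.

```python
import io

import pandas as pd

# Hypothetical sample standing in for Data.csv; the real column names may differ.
RAW_CSV = """Year,Month,Day, Rainfall ,Quality
2014,1,1,0.0,Y
2014,1,2, 5.2 ,y
2014,1,3,,N
2014,1,4,12.4, Y
"""


def clean_rainfall(raw: pd.DataFrame) -> pd.DataFrame:
    """Strip whitespace, normalise labels, and drop rows with no rainfall value."""
    df = raw.copy()
    # Remove stray whitespace from column names and from every text cell.
    df.columns = df.columns.str.strip()
    for col in df.columns:
        df[col] = df[col].str.strip()
    # Assumed fix for inconsistent labels: upper-case the quality flag.
    df["Quality"] = df["Quality"].str.upper()
    # Convert numeric columns; entries that cannot be parsed become NaN.
    for col in ["Year", "Month", "Day", "Rainfall"]:
        df[col] = pd.to_numeric(df[col], errors="coerce")
    # Assumed policy: drop days with no recorded rainfall.
    return df.dropna(subset=["Rainfall"]).reset_index(drop=True)


# Read every column as text so stray whitespace is visible before conversion.
raw = pd.read_csv(io.StringIO(RAW_CSV), dtype=str)
cleaned = clean_rainfall(raw)

if __name__ == "__main__":
    cleaned.to_csv("cleaned_data.csv", index=False)
```

With the real file, `pd.read_csv("Data.csv", dtype=str)` would replace the in-memory sample. Whether missing values should be dropped, filled, or flagged depends on the analysis; dropping is used here only to keep the sketch short.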
- Objective: Analyze the cleaned data to derive insights.
- Explorations:
  - Analyzed the highest daily rainfall in each month of 2014.
  - Performed yearly and monthly analysis of the data between 2015 and 2017, including visualizations.
  - Compared the three years with the highest rainfall totals and the three with the lowest.
  - Explored rainfall trends in ABC City over the last 10 years.
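These explorations map naturally onto `pandas` group-by operations. The sketch below uses a small made-up DataFrame with assumed column names (`Year`, `Month`, `Rainfall`); the notebook's real data and column names may differ.

```python
import pandas as pd

# Hypothetical cleaned data; column names and values are illustrative only.
df = pd.DataFrame(
    {
        "Year": [2014, 2014, 2014, 2015, 2015, 2016, 2016, 2017],
        "Month": [1, 1, 2, 1, 2, 1, 2, 1],
        "Rainfall": [3.0, 10.5, 7.2, 1.0, 4.0, 20.0, 2.0, 0.5],
    }
)

# Highest daily rainfall in each month of 2014.
max_2014 = df[df["Year"] == 2014].groupby("Month")["Rainfall"].max()

# Yearly and monthly totals for 2015 to 2017 (bounds inclusive).
period = df[df["Year"].between(2015, 2017)]
yearly_total = period.groupby("Year")["Rainfall"].sum()
monthly_total = period.groupby(["Year", "Month"])["Rainfall"].sum()

# Three wettest and three driest years by total rainfall.
totals = df.groupby("Year")["Rainfall"].sum()
wettest = totals.nlargest(3)
driest = totals.nsmallest(3)
```

For the visualizations, calling `yearly_total.plot(kind="bar")` would draw a bar chart through `matplotlib`; the plotting call is left out of the block so that it runs without a display.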
- Objective: Document the findings and the process.
- Content:
  - A brief explanation of the data cleaning process.
  - Justifications for the methods used in data exploration.
  - Visualizations and comparisons to support the findings.
- Ensure you have Anaconda installed with the necessary packages (`pandas`, `matplotlib`, etc.).
- Open `assignment1.ipynb` in Jupyter Notebook.
- Run all cells to load the data, clean it, and perform the analyses.
- Review the output for data insights and visualizations.
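Before running the notebook, it can help to confirm that the packages are importable. The snippet below is an optional check using only the standard library; the `REQUIRED` list covers just the packages named in this README, and the `find_missing` helper is a hypothetical name introduced for the example.

```python
import importlib.util

# Packages named in this README; the notebook may import others as well.
REQUIRED = ["pandas", "matplotlib"]


def find_missing(packages):
    """Return the packages that cannot be located in the current environment."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages are available.")
```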
- Python 3
- Jupyter Notebook
- `pandas`, `matplotlib`, and other libraries as specified in the notebook.