
Climate Analysis for a holiday vacation in Honolulu, Hawaii


SQLAlchemy Challenge - Surfs Up!

Where did the data come from?

The hawaii.sqlite database is provided by the Monash University Data Analytics Bootcamp.

Climate Analysis and Exploration

Use Python and SQLAlchemy to do basic climate analysis and data exploration of the climate database.

Precipitation Analysis

  • Design a query to retrieve the last 12 months of precipitation data.

  • Select only the date and prcp values.

  • Load the query results into a Pandas DataFrame and set the index to the date column.

  • Sort the DataFrame values by date.

  • Plot the results using the DataFrame plot method.

  • Use Pandas to print the summary statistics for the precipitation data.
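
The date window behind these steps can be sketched as follows. This is a minimal illustration, not the notebook's code: it uses the standard-library sqlite3 module instead of SQLAlchemy so it runs on its own, and the in-memory `measurement` table, its columns, and the sample rows are assumptions standing in for hawaii.sqlite.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical stand-in for hawaii.sqlite: table and column names are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurement (station TEXT, date TEXT, prcp REAL)")
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?)",
    [
        ("S1", "2016-08-01", 0.10),
        ("S1", "2016-08-23", 0.00),
        ("S1", "2017-01-15", 0.50),
        ("S1", "2017-08-23", 0.20),
    ],
)


def last_12_months_prcp(conn):
    """Return (date, prcp) rows for the 365 days ending at the latest date."""
    # Anchor the window on the most recent date in the data, not on today.
    latest = conn.execute("SELECT MAX(date) FROM measurement").fetchone()[0]
    cutoff = date.fromisoformat(latest) - timedelta(days=365)
    # ISO-formatted date strings sort chronologically, so text comparison works.
    return conn.execute(
        "SELECT date, prcp FROM measurement WHERE date >= ? ORDER BY date",
        (cutoff.isoformat(),),
    ).fetchall()


rows = last_12_months_prcp(conn)
```

From here the rows can be loaded with `pandas.DataFrame(rows, columns=["date", "prcp"]).set_index("date")`, after which `plot()` and `describe()` give the chart and the summary statistics.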

Station Analysis

  • Design a query to calculate the total number of stations.

  • Design a query to find the most active station.

  • Design a query to retrieve the last 12 months of temperature observation data (TOBS) of the most active station.

    • Plot the results as a histogram.
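
The three station queries can be sketched as below. It is an illustration under assumptions: plain sqlite3 is used in place of SQLAlchemy, "most active" is read as "most observation rows", and the table, columns, and sample rows are hypothetical.

```python
import sqlite3

# Hypothetical stand-in for hawaii.sqlite: table and column names are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurement (station TEXT, date TEXT, tobs REAL)")
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?)",
    [
        ("S1", "2017-08-01", 78.0),
        ("S1", "2017-08-02", 80.0),
        ("S1", "2017-08-03", 79.0),
        ("S2", "2017-08-01", 75.0),
        ("S2", "2017-08-02", 76.0),
    ],
)


def station_count(conn):
    """Total number of distinct stations."""
    return conn.execute(
        "SELECT COUNT(DISTINCT station) FROM measurement"
    ).fetchone()[0]


def most_active_station(conn):
    """Station with the most observation rows (assumed meaning of 'most active')."""
    return conn.execute(
        "SELECT station FROM measurement "
        "GROUP BY station ORDER BY COUNT(*) DESC LIMIT 1"
    ).fetchone()[0]


def station_tobs(conn, station, since):
    """Temperature observations for one station on or after the `since` date."""
    cursor = conn.execute(
        "SELECT tobs FROM measurement WHERE station = ? AND date >= ? ORDER BY date",
        (station, since),
    )
    return [tobs for (tobs,) in cursor]


busiest = most_active_station(conn)
temps = station_tobs(conn, busiest, "2017-08-02")
```

The list returned by `station_tobs` is the input for the histogram, for example through `pandas.Series(temps).plot.hist()`.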

Climate App

Design a Flask API based on the queries developed.

Routes

  • /

    • Home page.

    • List all routes that are available.

  • /api/v1.0/precipitation

    • Convert the query results to a dictionary using date as the key and prcp as the value.

    • Return the JSON representation of your dictionary.

  • /api/v1.0/stations

    • Return a JSON list of stations from the dataset.

  • /api/v1.0/tobs

    • Query the dates and temperature observations of the most active station for the last year of data.

    • Return a JSON list of temperature observations (TOBS) for the previous year.

  • /api/v1.0/<start> and /api/v1.0/<start>/<end>

    • Return a JSON list of the minimum temperature, the average temperature, and the max temperature for a given start or start-end range.

    • When given the start only, calculate TMIN, TAVG, and TMAX for all dates greater than or equal to the start date.

    • When given the start and the end date, calculate the TMIN, TAVG, and TMAX for dates between the start and end date inclusive.
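
The data handling behind the precipitation and start/end routes can be sketched without Flask. The helper names `precipitation_dict` and `temp_stats`, the table layout, and the sample rows are assumptions made for illustration; sqlite3 stands in for SQLAlchemy to keep the sketch self-contained.

```python
import json
import sqlite3

# Hypothetical stand-in for hawaii.sqlite: table and column names are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurement (date TEXT, prcp REAL, tobs REAL)")
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?)",
    [
        ("2017-01-01", 0.0, 62.0),
        ("2017-01-02", 0.1, 70.0),
        ("2017-01-03", 0.3, 66.0),
    ],
)


def precipitation_dict(conn):
    """Map each date to its prcp value (a later row for the same date overwrites)."""
    cursor = conn.execute("SELECT date, prcp FROM measurement ORDER BY date")
    return {day: prcp for day, prcp in cursor}


def temp_stats(conn, start, end=None):
    """TMIN, TAVG, TMAX from `start` onward, or between `start` and `end` inclusive."""
    sql = "SELECT MIN(tobs), AVG(tobs), MAX(tobs) FROM measurement WHERE date >= ?"
    params = [start]
    if end is not None:
        sql += " AND date <= ?"
        params.append(end)
    tmin, tavg, tmax = conn.execute(sql, params).fetchone()
    return {"TMIN": tmin, "TAVG": tavg, "TMAX": tmax}


payload = json.dumps(precipitation_dict(conn))
open_ended = temp_stats(conn, "2017-01-02")
bounded = temp_stats(conn, "2017-01-01", "2017-01-02")
```

In the Flask app, each route's view function would call a helper like these and pass the result to `flask.jsonify`. Making `end` optional lets both the `<start>` and `<start>/<end>` routes share one function.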

Temperature Analysis I

  • Identify the average temperature in June at all stations across all available years in the dataset. Do the same for December.

  • Use the t-test to determine whether the difference in the means, if any, is statistically significant.
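
The README does not say which t-test variant is used. As one possible reading, the sketch below computes Welch's unpaired t statistic and its degrees of freedom by hand with the standard library; the temperature values are made up for illustration and are not taken from the dataset.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and degrees of freedom for two independent samples."""
    # Squared standard error contributed by each sample (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up temperatures, for illustration only.
june = [74.0, 76.0, 75.0, 77.0]
december = [70.0, 71.0, 69.0, 72.0]

t_stat, dof = welch_t(june, december)
```

A p-value needs the t distribution, which the standard library lacks. With SciPy installed, `scipy.stats.ttest_ind(june, december, equal_var=False)` returns both the statistic and the p-value.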

Temperature Analysis II

  • Calculate the min, avg, and max temperatures for the holiday using the matching dates from the previous year.

  • Plot the min, avg, and max temperature as a bar chart.

    • Use the average temperature as the bar height.

    • Use the peak-to-peak (TMAX-TMIN) value as the y error bar (YERR).
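
The date matching and the bar chart inputs can be sketched as follows. The trip dates and temperatures are hypothetical, and the helper names are introduced here only for illustration.

```python
from datetime import date


def previous_year(day):
    """Return the same calendar day one year earlier."""
    try:
        return day.replace(year=day.year - 1)
    except ValueError:
        # 29 February has no match in a non-leap year; fall back to 28 February.
        return day.replace(year=day.year - 1, day=28)


def bar_values(tmin, tavg, tmax):
    """Bar height is the average; the error bar is the peak-to-peak range."""
    return {"height": tavg, "yerr": tmax - tmin}


# Hypothetical holiday dates.
trip_start, trip_end = date(2018, 1, 1), date(2018, 1, 7)
query_start, query_end = previous_year(trip_start), previous_year(trip_end)

# Hypothetical min, avg, and max for the matching dates.
values = bar_values(62.0, 68.5, 74.0)
```

With Matplotlib, these values would feed a call such as `plt.bar(0, values["height"], yerr=values["yerr"])`.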

Daily Rainfall Average

  • Calculate the rainfall per weather station using the previous year's matching dates.

  • Calculate the daily normals. Normals are the averages for the min, avg, and max temperatures.

  • Use Pandas to plot an area plot for the daily normals.
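
One way to compute a daily normal is sketched below. It assumes that a daily normal is the min, average, and max temperature for a given month and day across all years in the data. The table, columns, and rows are hypothetical, and sqlite3 stands in for SQLAlchemy.

```python
import sqlite3

# Hypothetical stand-in for hawaii.sqlite: table and column names are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurement (date TEXT, tobs REAL)")
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?)",
    [
        ("2015-01-01", 62.0),
        ("2016-01-01", 70.0),
        ("2017-01-01", 66.0),
        ("2017-01-02", 75.0),
    ],
)


def daily_normals(conn, month_day):
    """Return (tmin, tavg, tmax) for a 'MM-DD' day across every year in the table."""
    return conn.execute(
        "SELECT MIN(tobs), AVG(tobs), MAX(tobs) FROM measurement "
        "WHERE strftime('%m-%d', date) = ?",
        (month_day,),
    ).fetchone()


normals = daily_normals(conn, "01-01")
```

Calling `daily_normals` for each day of the trip and collecting the tuples into a DataFrame gives the input for `DataFrame.plot.area`. Passing `stacked=False` to that call keeps the three series from being summed.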


Contact:

Email: thao.ph.ha@gmail.com