To help plan my trip to Honolulu, Hawaii, I will do some climate analysis on the area.
I used Python and SQLAlchemy for climate analysis and data exploration of the climate database, working with SQLAlchemy ORM queries, Pandas, and Matplotlib.
I connected to the SQLite database with SQLAlchemy's create_engine and used automap_base() to reflect the tables into classes, saving references to those classes as Station and Measurement.
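
The connection and reflection step might look roughly like the sketch below. It is not the project's actual code: an in-memory SQLite database stands in for the real database file, the column names are assumptions, and `Base.prepare(autoload_with=engine)` assumes SQLAlchemy 1.4 or newer.

```python
from sqlalchemy import create_engine, text
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session
from sqlalchemy.pool import StaticPool

# Stand-in for the project's SQLite file: an in-memory database that is
# shared across connections. Table columns here are assumptions.
engine = create_engine(
    "sqlite://",
    poolclass=StaticPool,
    connect_args={"check_same_thread": False},
)
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE station (id INTEGER PRIMARY KEY, station TEXT, name TEXT)"
    ))
    conn.execute(text(
        "CREATE TABLE measurement (id INTEGER PRIMARY KEY, station TEXT, "
        "date TEXT, prcp FLOAT, tobs FLOAT)"
    ))
    conn.execute(text(
        "INSERT INTO station (station, name) "
        "VALUES ('STATION_A', 'Example station')"
    ))

# Reflect the existing tables into mapped classes.
# automap only maps tables that have a primary key.
Base = automap_base()
Base.prepare(autoload_with=engine)

# Save references to the reflected classes.
Station = Base.classes.station
Measurement = Base.classes.measurement

# Quick check that the mapping works: count the station rows.
with Session(engine) as session:
    station_count = session.query(Station).count()
```

With the real database, only the URL passed to `create_engine` would change; the table-creation statements exist here purely to make the sketch self-contained.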
- Precipitation Analysis
- Designed a query to retrieve the last 12 months of precipitation data and selected only the date and prcp values
- Loaded the query results into a Pandas DataFrame and set the index to the date column
- Sorted the DataFrame values by date
- Plotted the results using the DataFrame plot method
- Printed the summary statistics for the precipitation data
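
As an illustration of the precipitation steps, here is a minimal sketch using Pandas. The rows are invented sample data standing in for the query results, and the 12-month cutoff is assumed to mean 365 days before the most recent date.

```python
import datetime as dt

import pandas as pd

# Invented (date, prcp) rows standing in for the query results.
rows = [
    ("2017-08-23", 0.45),
    ("2016-08-23", 0.00),
    ("2017-01-15", 0.10),
    ("2016-12-01", None),   # missing reading
    ("2016-03-10", 1.20),   # older than 12 months, dropped below
]

# The cutoff is one year before the most recent date in the data.
latest = max(dt.date.fromisoformat(d) for d, _ in rows)
cutoff = latest - dt.timedelta(days=365)
recent = [(d, p) for d, p in rows if dt.date.fromisoformat(d) >= cutoff]

# Load into a DataFrame, index by date, and sort by date.
df = pd.DataFrame(recent, columns=["date", "prcp"]).set_index("date").sort_index()

# Summary statistics for the precipitation column.
summary = df.describe()
```

With Matplotlib installed, calling `df.plot()` on the sorted frame would draw the precipitation chart.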
- Station Analysis
- Designed a query to calculate the total number of stations
- Designed a query to find the most active stations
- Listed the stations and observation counts in descending order
- Designed a query to identify the most active station, which is WAIHEE 837.5, HI US
- Designed a query to retrieve the last 12 months of temperature observation data (TOBS)
- Filtered by the station with the highest number of observations, WAIHEE 837.5, HI US
- Plotted the results as a histogram with bins=12
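
The station queries above can be sketched as follows. This is an assumption-laden example rather than the project's code: the model is declared by hand instead of reflected, the station IDs, readings, and the cutoff date are invented, and the imports assume SQLAlchemy 1.4 or newer.

```python
from sqlalchemy import Column, Float, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base
from sqlalchemy.pool import StaticPool

Base = declarative_base()


class Measurement(Base):
    """Hand-declared stand-in for the reflected measurement table."""
    __tablename__ = "measurement"
    id = Column(Integer, primary_key=True)
    station = Column(String)
    date = Column(String)
    tobs = Column(Float)


engine = create_engine(
    "sqlite://", poolclass=StaticPool, connect_args={"check_same_thread": False}
)
Base.metadata.create_all(engine)

# Invented sample observations: (station, date, tobs).
sample = [
    ("STATION_A", "2015-05-01", 68.0),
    ("STATION_A", "2016-09-01", 72.0),
    ("STATION_A", "2017-03-01", 75.0),
    ("STATION_A", "2017-08-01", 79.0),
    ("STATION_B", "2017-01-01", 70.0),
    ("STATION_B", "2017-02-01", 71.0),
    ("STATION_C", "2017-01-01", 65.0),
]

with Session(engine) as session:
    session.add_all(
        [Measurement(station=s, date=d, tobs=t) for s, d, t in sample]
    )
    session.commit()

    # Observation counts per station, most active first.
    count_col = func.count(Measurement.id)
    station_counts = (
        session.query(Measurement.station, count_col)
        .group_by(Measurement.station)
        .order_by(count_col.desc())
        .all()
    )
    most_active = station_counts[0][0]

    # Last 12 months of TOBS for the most active station.
    # The cutoff date is a hypothetical value for this sample data.
    temps = [
        row.tobs
        for row in session.query(Measurement.tobs)
        .filter(Measurement.station == most_active)
        .filter(Measurement.date >= "2016-08-23")
    ]
```

Passing `temps` to Matplotlib's `plt.hist(temps, bins=12)` would then produce the histogram.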
Designed a Flask API based on the queries developed in Part I.
Created 6 routes in a Flask app:
- Home page
- /api/v1.0/precipitation: Converted the query results to a dictionary using date as the key and prcp as the value, and returned the JSON representation of the dictionary.
- /api/v1.0/stations: Returned a JSON list of stations from the dataset.
- /api/v1.0/tobs: Queried the dates and temperature observations of the most active station, WAIHEE 837.5, HI US, and returned a JSON list of temperature observations (TOBS) for the previous year.
- /api/v1.0/<start>: Given only a start date, calculated TMIN, TAVG, and TMAX for all dates greater than or equal to the start date, and returned them as a JSON list.
- /api/v1.0/<start>/<end>: Given both a start and an end date, calculated TMIN, TAVG, and TMAX for the dates between the start and end date, inclusive, and returned them as a JSON list.
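
To show how the two date routes could be wired up, here is a minimal Flask sketch. It is illustrative only: the observations are an invented in-memory list rather than a database query, the helper name `temperature_stats` is hypothetical, and dates are assumed to be ISO-formatted strings (YYYY-MM-DD) so that they compare correctly as text.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Invented (date, tobs) observations standing in for database query results.
OBSERVATIONS = [
    ("2017-01-01", 62.0),
    ("2017-01-02", 70.0),
    ("2017-01-03", 78.0),
    ("2017-01-04", 74.0),
]


def temperature_stats(start, end=None):
    """Return [TMIN, TAVG, TMAX] for dates >= start and, if given, <= end."""
    temps = [
        t for d, t in OBSERVATIONS
        if d >= start and (end is None or d <= end)
    ]
    if not temps:
        return None
    return [min(temps), sum(temps) / len(temps), max(temps)]


# One view function serves both the start-only and the start/end route.
@app.route("/api/v1.0/<start>")
@app.route("/api/v1.0/<start>/<end>")
def stats(start, end=None):
    result = temperature_stats(start, end)
    if result is None:
        return jsonify({"error": "no data for the requested dates"}), 404
    return jsonify(result)
```

Flask's built-in test client (`app.test_client()`) can exercise these routes without starting a server, for example by requesting `/api/v1.0/2017-01-01/2017-01-02`.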
- Clone this repository
- Open a new terminal in the root of the repository
- Activate the Python environment with 'conda activate PythonData'
- Run 'python app.py'
- Open localhost in a web browser
- Investigate the routes by following the links
- For routes 5 and 6 (/api/v1.0/<start> and /api/v1.0/<start>/<end>), remember to enter the dates into the URL bar in the correct format
Hawaii is reputed to enjoy mild weather all year. Is there a meaningful difference between the temperature in, for example, June and December?
- Temperature Analysis I
- Identified the average temperature in June and December at all stations across all available years in the dataset
- Used an unpaired t-test to determine whether the difference in the means, if any, is statistically significant
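
An unpaired t-test of this kind could be run with SciPy's `ttest_ind`, as in the sketch below. The temperatures are invented sample values, not results from the dataset, and the 0.05 significance threshold is an assumption.

```python
from scipy import stats

# Invented temperature samples (degrees F) standing in for the
# June and December observations pulled from the database.
june = [74.0, 76.0, 75.0, 77.0, 78.0, 75.0, 76.0, 77.0]
december = [70.0, 71.0, 69.0, 72.0, 73.0, 70.0, 71.0, 72.0]

june_mean = sum(june) / len(june)
december_mean = sum(december) / len(december)

# Unpaired (independent two-sample) t-test on the two groups.
t_stat, p_value = stats.ttest_ind(june, december)

# Assumed significance threshold of 0.05.
significant = p_value < 0.05
```

An unpaired test fits here because the June and December readings are separate groups of observations rather than matched pairs. Passing `equal_var=False` would switch to Welch's variant, which does not assume equal variances.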
- Temperature Analysis II
- Used a function called calc_temps to calculate the min, avg, and max temperatures for my trip
- Plotted the min, avg, and max temperature from the previous query as a bar chart
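
A function like calc_temps might be written along these lines. The signature (a session plus start and end date strings), the hand-declared model, and the sample readings are all assumptions made for this sketch; only the function name comes from the project.

```python
from sqlalchemy import Column, Float, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base
from sqlalchemy.pool import StaticPool

Base = declarative_base()


class Measurement(Base):
    """Hand-declared stand-in for the reflected measurement table."""
    __tablename__ = "measurement"
    id = Column(Integer, primary_key=True)
    date = Column(String)
    tobs = Column(Float)


engine = create_engine(
    "sqlite://", poolclass=StaticPool, connect_args={"check_same_thread": False}
)
Base.metadata.create_all(engine)

# Invented sample readings; the last one falls outside the trip window.
session = Session(engine)
session.add_all([
    Measurement(date="2017-02-01", tobs=65.0),
    Measurement(date="2017-02-02", tobs=70.0),
    Measurement(date="2017-02-03", tobs=75.0),
    Measurement(date="2017-03-15", tobs=90.0),
])
session.commit()


def calc_temps(session, start_date, end_date):
    """Return (TMIN, TAVG, TMAX) for dates between start and end, inclusive.

    Hypothetical signature: dates are 'YYYY-MM-DD' strings.
    """
    row = (
        session.query(
            func.min(Measurement.tobs),
            func.avg(Measurement.tobs),
            func.max(Measurement.tobs),
        )
        .filter(Measurement.date >= start_date)
        .filter(Measurement.date <= end_date)
        .one()
    )
    return tuple(row)


tmin, tavg, tmax = calc_temps(session, "2017-02-01", "2017-02-03")
```

The three returned values are what a bar chart of min, avg, and max would be drawn from.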
- Daily Rainfall Average
- Calculated the rainfall per weather station using the previous year's matching dates
- Calculated the daily normals. Normals are the averages for the min, avg, and max temperatures.
- Used a function called daily_normals, which calculates the daily normals for a specific date.
- Created a list of dates for my trip in the format %m-%d
- Used the daily_normals function to calculate the normals for each date string and appended the results to a list
- Loaded the list of daily normals into a Pandas DataFrame and set the index equal to the date
- Used Pandas to plot an area plot for the daily normals
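
The daily normals steps can be sketched as below. The historical readings and trip dates are invented, and this daily_normals works on an in-memory list rather than querying the database, so treat it as an illustration of the idea only.

```python
import datetime as dt

import pandas as pd

# Invented historical (date, tobs) observations across several years.
HISTORY = [
    ("2015-01-01", 62.0), ("2016-01-01", 66.0), ("2017-01-01", 70.0),
    ("2015-01-02", 60.0), ("2016-01-02", 68.0), ("2017-01-02", 73.0),
]


def daily_normals(month_day):
    """Return (tmin, tavg, tmax) across all years for a '%m-%d' string."""
    temps = [t for d, t in HISTORY if d[5:] == month_day]
    return (min(temps), sum(temps) / len(temps), max(temps))


# Build the list of trip dates in %m-%d format (hypothetical two-day trip).
trip_start = dt.date(2018, 1, 1)
trip_dates = [
    (trip_start + dt.timedelta(days=i)).strftime("%m-%d") for i in range(2)
]

# Calculate the normals for each date and collect them in a list.
normals = [daily_normals(d) for d in trip_dates]

# Load into a DataFrame indexed by date.
normals_df = pd.DataFrame(
    normals, columns=["tmin", "tavg", "tmax"], index=trip_dates
)
normals_df.index.name = "date"
```

With Matplotlib installed, `normals_df.plot.area(stacked=False)` would draw the area plot with the three series overlaid rather than stacked.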