codeforboston/safe-water

Extract data for map prototype

Closed this issue · 4 comments

We need to extract all the data necessary to reproduce the map here: https://www.crwa.org/water-quality-data.html

Save the data here: projects/crwa/data/sampling_datasets/ecoli_november19.csv

Use this data as input: https://github.com/codeforboston/safe-water/blob/master/projects/crwa/data/charles_river_samples_interim/results_merged_cleaned.csv

Save the notebook that you write for this extraction here: projects/crwa/notebooks/dataset_extraction

Aim: be able to use this data to design a map prototype later.

questions:

Hi Team,

Here are my comments on building the prototype:

  1. All the columns needed to build the prototype are in the database.

  2. These are the tables and columns that I think are required to build the prototype:

Monitoring_Sites --> Site_ID, Site_Name, Town, River_Mile_Headwaters, Latitude_DD, Longitude_DD
Results --> Date_Collected, Actual Result, Reporting Result
Actual_Result_Type --> Result_Type
Reporting_Result_Type --> Result_Type
Actual_Result_Units --> Unit_Abbreviation
Reporting_Result_Units --> Unit_Abbreviation
Rainfall --> Precipitation
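To make the join concrete, here is a minimal sketch of attaching the site columns to each result row. It assumes that rows in Results carry a `Site_ID` key matching Monitoring_Sites, which is not confirmed above, and the sample rows are invented for illustration, not real CRWA data. The function name `join_results_to_sites` is hypothetical.

```python
# Sketch only: inner join of Results onto Monitoring_Sites.
# ASSUMPTION: Results has a Site_ID column; the thread does not confirm this.

def join_results_to_sites(results, sites):
    """Return result rows enriched with the matching site's columns."""
    sites_by_id = {site["Site_ID"]: site for site in sites}
    joined = []
    for row in results:
        site = sites_by_id.get(row["Site_ID"])
        if site is None:
            continue  # no matching site: drop the row, as an inner join would
        joined.append({**site, **row})
    return joined


# Invented illustrative rows, not real CRWA data.
sites = [
    {"Site_ID": "S1", "Site_Name": "Example Site",
     "Latitude_DD": 42.0, "Longitude_DD": -71.0},
]
results = [
    {"Site_ID": "S1", "Date_Collected": "10/16/2018", "Reporting Result": 120},
    {"Site_ID": "S9", "Date_Collected": "10/16/2018", "Reporting Result": 50},
]

joined = join_results_to_sites(results, sites)
```

Rows whose site is missing are dropped here; keeping them with empty coordinates would also be reasonable, but they could not be placed on a map anyway.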

  3. Issues with the data

a) The "Results" table contains collection dates only up to 12/11/2018, but this file https://www.crwa.org/uploads/1/2/6/7/126781580/e.colitable_11-19-2019.pdf

has an earliest collection date of 1/15/19. So we need updated data.

b) The same goes for the "Rainfall" table. It has dates only up to 4/9/2013, but the earliest date in the given map is 1/15/19.
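A coverage check like this can be sketched in a few lines. The `%m/%d/%Y` date format and the helper name `date_range` are assumptions, and the first sample row is invented; only the 12/11/2018 and 1/15/19 dates come from the comparison above.

```python
from datetime import datetime

DATE_FORMAT = "%m/%d/%Y"  # ASSUMPTION: how dates are stored in the export


def date_range(rows, column="Date_Collected", fmt=DATE_FORMAT):
    """Return the (earliest, latest) collection dates found in rows."""
    dates = [datetime.strptime(row[column], fmt).date() for row in rows]
    return min(dates), max(dates)


# Illustrative rows; only the 12/11/2018 date is taken from this thread.
rows = [
    {"Date_Collected": "10/16/2018"},
    {"Date_Collected": "12/11/2018"},
]

first, last = date_range(rows)

# Earliest collection date in the CRWA E. coli file referenced above.
map_start = datetime.strptime("1/15/2019", DATE_FORMAT).date()

# True only if our data reaches the period shown on the map.
covers_map = last >= map_start
```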

Finally, I think we need updated data. I will be able to provide a joined table with the required columns soon so that we can start building the visualization.

Thanks,
Bhushan

But with the current data we can build a prototype for this file:
https://www.crwa.org/uploads/1/2/6/7/126781580/crwa_ammonia_web_2017_updated.xlsx

Thanks, @bhushan-choudhari! Good to know.
I think it is probably not worth asking CRWA to resend us their whole Access database for now.
I would then propose extracting an E. coli water quality dataset for one of the last months in the database that we have (e.g. October 2018).
And I don’t think that we need rainfall data for the map, right? Thus we could leave that for now.
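As a starting point for the notebook, the monthly extraction could look roughly like the sketch below. The `Parameter` column name, the `E. coli` label, the `%m/%d/%Y` date format, and the function name `extract_month` are all assumptions that need checking against results_merged_cleaned.csv; the inline sample rows are invented so the snippet runs on its own.

```python
import csv
import io
from datetime import datetime


def extract_month(rows, year, month, parameter="E. coli",
                  date_col="Date_Collected", param_col="Parameter",
                  fmt="%m/%d/%Y"):
    """Keep rows for one parameter collected in the given year and month.

    ASSUMPTION: param_col, parameter and fmt are guesses at the real schema.
    """
    kept = []
    for row in rows:
        collected = datetime.strptime(row[date_col], fmt)
        same_month = (collected.year, collected.month) == (year, month)
        if same_month and row[param_col] == parameter:
            kept.append(row)
    return kept


# Invented sample standing in for results_merged_cleaned.csv.
sample_csv = """Site_ID,Date_Collected,Parameter,Reporting Result
S1,10/16/2018,E. coli,120
S1,11/13/2018,E. coli,80
S2,10/16/2018,Ammonia,0.1
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
october = extract_month(rows, 2018, 10)

# Write the extract as CSV; in the notebook, `out` would be a file
# opened with open(path, "w", newline="") instead of an in-memory buffer.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(october)
```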