The purpose of the code in this repository is to maintain, in the `data/` folder, an archive of the registered groundwater wells data provided by the Government of British Columbia.
Three CSVs appear in the `data/` folder. All three are updated daily:

- `gwells_data_first_appearance.csv` keeps a record of each well as it appeared on the day it was added to the gwells CSV. We never update a well's information, which allows us to go back in time and reconstruct a record for any time period. Wells are identified by their `well_tag_number`. The columns are the same as in `gwells.csv`, with the addition of a `date_added` column: the first date a `well_tag_number` was spotted by this script.
- `wells_geocoded.csv` is the result of passing `gwells_data_first_appearance.csv` through the `python gwells_locationqa geocode` script.
- `gwells_locationqa.csv` is the result of passing `gwells_data_first_appearance.csv` and `wells_geocoded.csv` through the `python gwells_locationqa qa` script.
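The append-only logic behind the first-appearance archive can be sketched as follows. This is an illustrative sketch, not the repo's actual script: only `well_tag_number` and `date_added` come from the description above, and the sample column `city` is invented for the example.

```python
def update_first_appearance(archive_rows, daily_rows, today):
    """Append rows from today's gwells extract whose well_tag_number has
    never been seen before, stamping them with a date_added column.
    Existing archive rows are never modified, so the archive can be
    filtered by date_added to reconstruct any past state."""
    seen = {row["well_tag_number"] for row in archive_rows}
    added = []
    for row in daily_rows:
        if row["well_tag_number"] not in seen:
            new_row = dict(row)
            new_row["date_added"] = today
            added.append(new_row)
            seen.add(row["well_tag_number"])
    return archive_rows + added

# Example: the archive already holds well 100; today's extract has 100 and 101.
archive = [{"well_tag_number": "100", "city": "Victoria", "date_added": "2023-01-01"}]
daily = [{"well_tag_number": "100", "city": "Victoria"},
         {"well_tag_number": "101", "city": "Nanaimo"}]
updated = update_first_appearance(archive, daily, "2023-06-01")
# Well 100 keeps its original date_added; well 101 is stamped with today's date.
```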
The updates run daily via a scheduled GitHub Action that depends on a Docker image created specifically for this project. The image is built on the `rocker/geospatial:4.1.2` Docker image and was tailored to include all the R, Python, and spatial dependencies required to run the Python scripts created by Simon Norris.
The three CSVs will then be used to feed the Shiny app (code) created for this Code With Us project.
Below is some additional information gathered from external sources.

From https://docs.python.org/3/library/csv.html:
The so-called CSV (Comma Separated Values) format is the most common import and export format for spreadsheets and databases. CSV format was used for many years prior to attempts to describe the format in a standardized way in RFC 4180. The lack of a well-defined standard means that subtle differences often exist in the data produced and consumed by different applications. These differences can make it annoying to process CSV files from multiple sources. Still, while the delimiters and quoting characters vary, the overall format is similar enough that it is possible to write a single module which can efficiently manipulate such data, hiding the details of reading and writing the data from the programmer.
Well extracts are generated in Python 3 using the csv library "excel" dialect (see https://docs.python.org/3/library/csv.html#csv.excel)
It is suggested that you use a mature library rather than attempting to write bespoke code to read CSV data. For example, Python 3 comes with a module (`csv`) to read and write CSV data.
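Since the extracts are generated with the `csv` library's "excel" dialect, they can be read back the same way. A minimal sketch, assuming a field with an embedded comma; the column names here are invented for the example:

```python
import csv
import io

# A small in-memory extract in the "excel" dialect (the default for
# Python's csv module): comma-delimited, fields quoted when needed.
raw = 'well_tag_number,street_address\n100,"123 Main St, Victoria"\n'

with io.StringIO(raw) as f:
    rows = list(csv.DictReader(f, dialect="excel"))

# The quoted comma is handled by the library, not by hand-rolled splitting.
print(rows[0]["street_address"])  # 123 Main St, Victoria
```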
It may be that your application does not correctly handle escaped line-break characters. See RFC 4180, Section 2, point 6:
Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes. For example:

```
"aaa","b CRLF
bb","ccc" CRLF
zzz,yyy,xxx
```
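Python's `csv` module handles such escaped line breaks correctly, as a quick check with the RFC 4180 example above shows (the `newline=""` argument follows the `csv` docs' advice to disable newline translation):

```python
import csv
import io

# The RFC 4180 example: a literal CRLF inside the quoted second field
# of the first record, then an ordinary second record.
raw = '"aaa","b\r\nbb","ccc"\r\nzzz,yyy,xxx\r\n'

with io.StringIO(raw, newline="") as f:
    records = list(csv.reader(f))

# The embedded CRLF stays inside the field instead of splitting the record.
print(records)  # [['aaa', 'b\r\nbb', 'ccc'], ['zzz', 'yyy', 'xxx']]
```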
The data in the well export closely matches the current state of data in the GWELLS web application; as such, the structure may change from time to time.
- Column names and positions may change at any time.
- Columns may be added or removed at any time.
Data extracts should be generated daily but may fail to be generated for various reasons.
Other ways to obtain the same data:
- An XLSX extract, also generated by the GWELLS web application.
- DataBC, which provides information sourced from the GWELLS application in various formats.
- API calls directly to the GWELLS application: https://apps.nrs.gov.bc.ca/gwells/api/.
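A query URL against that API base can be assembled with the standard library. Note that the `v2/wells` path and the `limit`/`offset` parameters below are assumptions for illustration; check the live API documentation for the real endpoints:

```python
from urllib.parse import urlencode, urljoin

BASE = "https://apps.nrs.gov.bc.ca/gwells/api/"

def wells_url(limit=100, offset=0):
    """Build a paginated query URL against the GWELLS API.
    NOTE: the 'v2/wells' path and 'limit'/'offset' parameter names are
    hypothetical; only the base URL comes from the text above."""
    query = urlencode({"limit": limit, "offset": offset})
    return urljoin(BASE, "v2/wells") + "?" + query

print(wells_url(limit=10))
# https://apps.nrs.gov.bc.ca/gwells/api/v2/wells?limit=10&offset=0
```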