Contains files and scripts for creating an NPS data package. The main areas are metadata (EML) and the final products (Package). The working example is the HTLN breeding bird protocol and database. The 2022 breeding bird protocol revision is located in Documentation. The directory ./src is my dev area. Files stored in other directories are functioning executable scripts and data products. Thanks for reading!
Additional background info on HTLN breeding bird monitoring project is here:
https://www.nps.gov/im/htln/birds.htm
https://doi.org/10.57830/2300410.
https://irma.nps.gov/DataStore/Reference/Profile/2300410
[DRAFT]: Heartland Inventory and Monitoring Network Breeding Land Bird Data Package
20230816
pubDate error - eml_validate
20230807
make_eml() running, cleaned comments, corrected categorical attribs, date and time with "T" delimiter
20230801
make_eml() crashing. Still need date and time column needs 'T' delimiter.
20230728
Completed last EML function - taxonomic covereage. Start generating EML.
20230724
Run EML functions Move EML assemblyline back to src Bird and tree species names into EML script
20230720
Set up EML and Package. Initial EML script edited. Need taxonomy.
20230717
Fixes to date and time format.
20230713
Updated NPSdataverse libraries. EML script work is in the Package directory because all .csv's are located there.
20230710
Habitat QC completed.
20230705
All subplot 0's edited to 1's. Webapp no longer allows 0's. Rerun all habitat SQL export scripts and resume R - QC scripts.
20230703
BirdobservationsThru2022_2.csv passed all the QC tests. See the file in ./Package called BirdobservationsThru2022_2.csv. The QC tests are all in the script ./QCscripting called BirdObservationsQC.R
Started on habitat-BasalArea and immediately found 0 values for SubPlot. Fixing SubPlot values that were accidentally set to 0 back in 2017. They should all be 1's. Making database corrections using HTLN_Landbirds IRMA app. There are n = 152 records to fix. See the list under src/Subplot0s.csv
20230627
Put DRR on hold. Need to develop QA/QC for all csvs. Something weird with .csv exports in Sites/BirdObservations. Develop R code to test every field. Problems associated with flattening all the LUTs in site/birdobservations. Possible commas in LUT values that will create problems with exports.
20230613
Need to complete QA/QC and DRR prior to completing data package. This is on-hold until HTLN-BreedingBird-DDR is done.
20230606
Listing fate of LUTs in spreadsheet "Lookup_tables.csv"
20230605
Make a list of look-up-tables Create a csv file in Excel with the following: LUT_table_name, processing_step (join/download/describe), join_table_name(s)
20230427
Finished csv exports from SQL Server Reviewed csv files lookup tables still need to be addressed join LUTs into data tables whereever possible
Lookup table - csv files affected cover class codes: foliar-cover.csv horiz-distance-profile.csv horiz-vegetation-profile.csv veg-type.csv
Following all apply to site-birdobs.csv site conditions: wind codes rain codes noise codes bird observations: AOUCodes and species names Detection Type codes
Tree tally needs species as well as common name
20230403
Get site / bird observations in final form; habitat data in dataset flat file format from database.
20230331
Review these repos and docs
https://github.com/nationalparkservice/NPS_EML_Script
https://github.com/nationalparkservice/EMLeditor
https://github.com/nationalparkservice/NPSdataverse
Documentation for EML process at NPS
https://nationalparkservice.github.io/NPS_EML_Script/
Cleanup...
Moved all EML dev files back to src. Archived all of the itis work. EMLAssembly should auto-generate ITIS higher taxonomy.
20230310
Removed Validate component of repo. Validation will be included in a separate repo.
20230227
Installed SQL Server, loaded ITIS database. Created HTLN_Sandbox db. Ran join on TSN, ScientificName. TSNs and scientific names in HTLN_Landbirds agree with Itis.
20230213
Created branch called 'itis'. Downloaded copy of itis. Need to install SQL Server on new computer.
20230130
Created species list based on observation data. List includes TSN, Family, Genus, Species, AOUCode, CommonName. Need copy of ITIS to join Orders to TSNs
20230120
Created branch for taxonomy. Write T-SQL to create species list including TSN, Family, Genus, Species, CommonName. Need write R script to insert Kingdom, Phylum, Order into each species record.
20221222
Re-ran T-SQL to pull .csv file. Removed CUVA data for now due to formatting issues with PlotID. Come back to this at some future point.
20221114
Loaded the .csv into R dataframe using fread.
20221216
Ran first part of EML creation script to create blank text files for the abstract, additional information, custom units, intellectual rights, keywords, methods, and personnel. Then ran function to create "attributes_datafilename.txt" to describe the dataset attributes.
20221110
Downloaded EML_Creation_Script.R from here:
https://github.com/nationalparkservice/NPS_EML_Script
Created T-SQL script called BirdObservationThru2022.sql Created .csv file called HTLNBreedingBirds_BirdObs.csv
Copied EML_Creation_Script.R to EML_Creation_Script_HTLNBreedingBirds.R
20221031
Installed packages associated with NPSdataverse from here: