The QC application will use the tools:
- CEDA-CC
- CF-checker
- Additional routines built here
- Time checks
to provide a QC summary for all the data in the CP4CDS qcapp
- Checkout
export DJANGO_SETTINGS_MODULE=qcproj.settings
- To run a parallel process
- Run qc_db_starter.py
- This reads in a "data request"
- For each variable, at a given frequency and CMOR table, this sets off a lotus process by calling submit-lotus.sh
- submit-lotus.sh submits jobs to lotus through run_qc_lotus.sh
- run_qc_lotus.sh sets up the correct environment on the lotus node and calls qc_db_builder.py
- qc_db_builder.py takes three arguments (variable, cmor_table, frequency) and applies the selected functions for all experiments: historical, piControl, amip, rcp26, rcp45, rcp60, rcp85
- To run a single process, qc_db_builder.py currently needs to be called directly with its arguments:
qc_db_builder.py <var> <table> <frequency>
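
The fan-out described above (one lotus job per variable, CMOR table and frequency) can be sketched roughly as follows. This is only an illustration, not the code in qc_db_starter.py: the `build_jobs` helper, the shape of the data request (a list of tuples), and the assumption that submit-lotus.sh accepts the three values as positional arguments are all hypothetical.

```python
# Experiments that qc_db_builder.py loops over, as listed in this README.
EXPERIMENTS = ["historical", "piControl", "amip", "rcp26", "rcp45", "rcp60", "rcp85"]


def build_jobs(data_request, submit_script="submit-lotus.sh"):
    """Return one command (as an argument list) per data request entry.

    `data_request` is assumed to be an iterable of
    (variable, cmor_table, frequency) tuples; the real data request
    format may differ.
    """
    jobs = []
    for variable, cmor_table, frequency in data_request:
        # Assumption: the submit script takes the three values positionally.
        jobs.append([submit_script, variable, cmor_table, frequency])
    return jobs


if __name__ == "__main__":
    request = [("tas", "Amon", "mon"), ("pr", "day", "day")]
    for cmd in build_jobs(request):
        # Dry run only: print the command instead of submitting it.
        print(" ".join(cmd))
```

To actually submit, the `print` call could be swapped for `subprocess.run(cmd, check=True)` on a host where the lotus submission script is available.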
- Populates the following tables from a search of esgf-index1.ceda.ac.uk restricted to the local data node:
- DataRequester
- DataSpecification
- Dataset
- DataFile
Also:
- Calculates the local md5sum
- Checks if the directory holding the data has more than one file, i.e. the data is part of a timeseries
- Performs a distributed search (latest versions only, no replicas) to check whether this is the most recent version of the data
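
The three extra checks above could look something like the sketch below. It is a minimal illustration rather than the code used in qc_db_builder.py: the function names are hypothetical, and the query keywords `distrib`, `latest` and `replica` are assumed to be the ones the ESGF search service expects for a distributed, latest-only, no-replica search.

```python
import hashlib
import os
from urllib.parse import urlencode


def local_md5sum(path, chunk_size=1024 * 1024):
    """Compute the md5 checksum of a file, reading it in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def is_timeseries(path):
    """Return True if the directory holding `path` contains more than one file."""
    directory = os.path.dirname(os.path.abspath(path))
    files = [
        name for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    ]
    return len(files) > 1


def latest_version_query(**facets):
    """Build a query string for a distributed, latest-only, no-replica search.

    The keyword names are an assumption about the search service; any
    extra facets (e.g. variable="tas") are passed through unchanged.
    """
    params = {"distrib": "true", "latest": "true", "replica": "false"}
    params.update(facets)
    return urlencode(sorted(params.items()))
```

The checksum returned by `local_md5sum` can then be compared with the one recorded in the search results, and the query string appended to the search endpoint of the index node.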
The time checker will automatically adjust any NetCDF file where the units of time are given as "days since YYYY-MM" to "days since YYYY-MM-DD".
If a file start date is 0001-01-01 then the time checker performs only one consistency check.
If the file is valid at one time only (len(times) == 1), the check in time_checks/file_time_checks.py returns True.
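
As a rough illustration of the two behaviours described above, the sketch below normalises a year-month-only units string and short-circuits on a single time step. It is not the code in time_checks/file_time_checks.py: both function names are hypothetical, and filling in the missing day as 01 is an assumption.

```python
import re

# Matches units such as "days since 1850-01" that lack a day component.
_YEAR_MONTH_ONLY = re.compile(r"^days since \d{4}-\d{2}$")


def normalise_time_units(units):
    """Return units in "days since YYYY-MM-DD" form.

    Assumption: a missing day is taken to be the first of the month.
    Units that already carry a day (or anything else) are returned unchanged.
    """
    units = units.strip()
    if _YEAR_MONTH_ONLY.match(units):
        return units + "-01"
    return units


def single_time_is_valid(times):
    """Return True when the file holds exactly one time value."""
    return len(times) == 1
```

For example, `normalise_time_units("days since 1850-01")` gives `"days since 1850-01-01"`, while a string that already includes the day passes through untouched.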