DMS and Chlorophyll Notebook issue with Chlorophyll
barronh opened this issue · 2 comments
Description
When running the Jupyter Notebook in PYTOOLS/dmschlo/CMAQ_DMS_ChlorA.ipynb, the notebook fails while locating the chlorophyll climatology files.
Scope and Impact
This causes the notebook to fail and the DMS/CHLO variables cannot be created.
Solution
The error is caused by a reorganization of the files on the server at the OBPG DAAC.
- The directory structure has changed from Julian day of year (%j) to month-day (%m%d).
- The naming structure also changed:
  - old> A%Y%j%Y%j.L3m_MC_CHL_chlor_a_9km.nc
  - new> AQUA_MODIS.%Y%m%d_%Y%m%d.L3m.MC.CHL.chlor_a.9km.nc
- The climatology files previously used 2003-08-01 as the start of climatology data, but now they are starting with 2002-07-01.
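To make the naming change concrete, here is a minimal sketch that builds both file names from a date range. It assumes the two date fields in each pattern are the start and end of the averaging period. The helper `chlor_filename` and the example dates are hypothetical and do not come from the notebook.

```python
from datetime import date

# Templates copied from the issue text; each holds a start date then an end date.
OLD_TEMPLATE = "A{start:%Y%j}{end:%Y%j}.L3m_MC_CHL_chlor_a_9km.nc"
NEW_TEMPLATE = "AQUA_MODIS.{start:%Y%m%d}_{end:%Y%m%d}.L3m.MC.CHL.chlor_a.9km.nc"


def chlor_filename(start, end, new_style=True):
    """Hypothetical helper: build a climatology file name from its date range."""
    template = NEW_TEMPLATE if new_style else OLD_TEMPLATE
    return template.format(start=start, end=end)


# Illustrative dates only; the real end date depends on what the server holds.
start, end = date(2002, 7, 1), date(2002, 7, 31)
old_name = chlor_filename(start, end, new_style=False)
new_name = chlor_filename(start, end)
print(old_name)
print(new_name)
```

The old form encodes each date as year plus day of year (7 digits), while the new form uses year, month, and day (8 digits) separated by an underscore.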
This requires two changes to the notebook. Both are in the cell that starts with “if getlatestchlo”.
First, change dates in the for loop
< for prefix in ['2003/0801', '2003/0901', '2003/1001', '2003/1101', '2003/1201', '2004/0101', '2004/0201', '2004/0301', '2004/0401', '2004/0501', '2004/0601', '2004/0701']:
> for prefix in ['2002/0701', '2002/0801', '2002/0901', '2002/1001', '2002/1101', '2002/1201', '2003/0101', '2003/0201', '2003/0301', '2003/0401', '2003/0501', '2003/0601']:
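If the start of the climatology shifts again, the list could be generated instead of typed out. The sketch below assumes the prefixes are always the first day of twelve consecutive months. `monthly_prefixes` is a hypothetical helper, not part of the notebook.

```python
def monthly_prefixes(start_year, start_month, count=12):
    """Hypothetical helper: return 'YYYY/MM01' prefixes for consecutive months."""
    prefixes = []
    for offset in range(count):
        # Count months from zero so the year rolls over after December.
        months = (start_month - 1) + offset
        year = start_year + months // 12
        month = months % 12 + 1
        prefixes.append(f"{year:04d}/{month:02d}01")
    return prefixes


prefixes = monthly_prefixes(2002, 7)
print(prefixes)
```

Calling `monthly_prefixes(2003, 8)` reproduces the old list, so only the start year and month would need to change.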
Second, change the regular expression that finds the files
< mostrecent = sorted(re.compile('(?<=>).+L3m_MC_CHL_chlor_a_9km.nc(?=</)').findall(htmltxt))[-1]
> mostrecent = sorted(re.compile('(?<=>).+.L3m.MC.CHL.chlor_a.9km.nc(?=</)').findall(htmltxt))[-1]
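The unescaped dots in the new pattern match any character, which works but is looser than needed. A stricter variant, offered only as a sketch, escapes the dots and keeps the match inside a single tag. The `sample_html` string is a made-up stand-in for the directory listing the notebook stores in `htmltxt`, and the file names in it are illustrative.

```python
import re

# Escaping the dots makes them match a literal "." rather than any character.
CHLOR_PATTERN = re.compile(r"(?<=>)[^<>]+\.L3m\.MC\.CHL\.chlor_a\.9km\.nc(?=</)")

# Made-up stand-in for the directory listing the notebook downloads as htmltxt.
sample_html = (
    '<a href="x">AQUA_MODIS.20020701_20210731.L3m.MC.CHL.chlor_a.9km.nc</a>\n'
    '<a href="y">AQUA_MODIS.20020701_20230731.L3m.MC.CHL.chlor_a.9km.nc</a>\n'
    '<a href="z">AQUA_MODIS.20020701_20230731.L3m.MC.CHL.chlor_a.4km.nc</a>\n'
)

matches = CHLOR_PATTERN.findall(sample_html)
# File names sort by their embedded dates, so the last one is the most recent.
mostrecent = sorted(matches)[-1]
print(mostrecent)
```

Sorting works here because the names share a prefix and the dates are zero-padded, so lexical order equals chronological order.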
I have successfully run for a new domain with the latest climatology files.
Additional context
A PR that also updates the documentation will be forthcoming.
This kind of breakage is inherent in making the download part of the process.
- We could move download out of the notebook, but the problem of reorganization will continue -- just outside the notebook.
- We could avoid this by using the CMR to dynamically query, but the CMR is not always up-to-date.
- Open to other proposals.
P.S. You also need to change the loop statement and the date diagnostic line in the later loop.
Change from:
for chloutpath in sorted(glob(f'chlor_a/{dom}/A2*_{dom}.nc')):
    mydate = datetime.strptime(os.path.basename(chloutpath)[1:8], '%Y%j')
Change to:
for chloutpath in sorted(glob(f'chlor_a/{dom}/A*.{dom}.nc')):
    mydate = datetime.strptime(os.path.basename(chloutpath)[11:19], '%Y%m%d')
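The slice indices change because the new names start with the 11-character prefix "AQUA_MODIS." instead of a single "A". As a sketch, a helper that accepts either style might look like the following. `chlor_start_date`, the domain name "12US1", and the example paths are all hypothetical.

```python
import os
from datetime import datetime


def chlor_start_date(path):
    """Hypothetical helper: parse the start date from an old- or new-style name."""
    name = os.path.basename(path)
    if name.startswith("AQUA_MODIS."):
        # New style: 8 digits (%Y%m%d) follow the 11-character "AQUA_MODIS." prefix.
        return datetime.strptime(name[11:19], "%Y%m%d")
    # Old style: 7 digits (%Y%j) follow the leading "A".
    return datetime.strptime(name[1:8], "%Y%j")


# Illustrative paths; "12US1" is a made-up domain name.
new_path = "chlor_a/12US1/AQUA_MODIS.20020701_20230731.L3m.MC.CHL.chlor_a.9km.12US1.nc"
old_path = "chlor_a/12US1/A20032132003243.L3m_MC_CHL_chlor_a_9km_12US1.nc"
new_date = chlor_start_date(new_path)
old_date = chlor_start_date(old_path)
print(new_date, old_date)
```

Checking the prefix rather than hard-coding one slice would let previously downloaded files and new files coexist in the same directory.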