NOAA-GFDL/MDTF-diagnostics

Error running framework on workstation on ESM4 test data using GFDL site settings

wrongkindofdoctor opened this issue · 0 comments

I'm trying to run the framework on the ESM4 dataset using the GFDL site-specific settings for workstations in the current develop branch. There are issues finding the required files during the gcp stage. The settings are are the same as those in the default_gfdl.jsonc file, with MDTF_TMPDIR defined as '/net2/$USER/tmp'. This issue does not occur on PPAN.

I'm using my own conda installation so that the environments are up-to-date. The command used to run framework from inside my MDTF-diagnostics repo is:
mdtf -f sites/NOAA_GFDL/default_gfdl.jsonc --site NOAA_GFDL -v

The detailed output log from the EOF_500hPa case is below:

MDTF PACKAGE LOG

Started logging at 2021-04-22 11:18:37.711046
git hash/branch: c35e65d4d2ffd7b1762d3177fd545bb5f8d3a959 (on regenerate_figs)
sys.platform: 'linux'
sys.version: '3.7.10 | packaged by conda-forge | (default, Feb 19 2021, 16:07:37) 
[GCC 9.3.0]'
--------------------------------------------------------------------------------

11:18:37 DEBUG: Using selector: EpollSelector
11:18:37 DEBUG: Adding variable name convention 'NCAR'
11:18:37 DEBUG: Adding variable name convention 'GFDL'
11:18:37 DEBUG: Adding variable name convention 'CMIP'
11:18:37 DEBUG: Empty obs data directory at '/net2/$USER/tmp/inputdata/obs_data'.
11:18:37 INFO: 	Symlinking obs data dir to /home/oar.gfdl.mdtf/mdtf/inputdata/obs_data.
11:18:37 ERROR: Expected symlink at /net2/$USER/tmp/inputdata/obs_data
11:18:37 INFO: PACKAGE SETTINGS:
11:18:37 INFO:   case_list(0): 
    CASENAME: ESM4_historical_D1
    model: GFDL-ESM4
    convention: CMIP
    FIRSTYR: 1996
    LASTYR: 1999
    CASE_ROOT_DIR: /archive/oar.gfdl.cmip6/ESM4/DECK/ESM4_historical_D1/gfdl.ncrc4-intel16-prod-openmp/pp
    pod_list: 
      EOF_500hPa
    experiment: ""
  paths: 
    CODE_ROOT: /net/$USER/mdtf/MDTF-diagnostics
    OBS_DATA_ROOT: /net2/$USER/tmp/inputdata/obs_data
    MODEL_DATA_ROOT: /net2/$USER/tmp/inputdata/model
    WORKING_DIR: /net2/$USER/tmp/wkdir
    OUTPUT_DIR: /net2/$USER/tmp/mdtf_out
    conda_root: /net/$USER/anaconda3
    conda_env_root: /net/$USER/anaconda3/envs
    OBS_DATA_REMOTE: /home/oar.gfdl.mdtf/mdtf/inputdata/obs_data
    GFDL_PPAN_TEMP: $TMPDIR
    GFDL_WS_TEMP: /net2/$USER/tmp
    TEMP_DIR_ROOT: /net2/$USER/tmp
  settings: 
    convention: CMIP
    data_manager: GFDL_auto
    verbose: 1
    file_transfer_timeout: 900
    output_manager: html
    environment_manager: conda
    runtime_manager: subprocess
  env_vars: 
    RGB: /net/$USER/mdtf/MDTF-diagnostics/shared/rgb
    MPLBACKEND: Agg
    MDTF_TMPDIR: /net2/$USER/tmp
11:18:37 INFO: ### Framework: initialize ESM4_historical_D1
11:18:37 DEBUG: GfdlautoDataManager: Dispatched DataManager to GfdlppDataManager.
11:18:37 DEBUG: Contents of log cache transferred.
11:18:37 DEBUG: Copied value of Z slice (=500 hPa) of 'plev' (identical units).
11:18:37 DEBUG: Renaming 500 hPa Z slice of 'zg' to 'zg500'.
11:18:37 DEBUG: ExtractLevelFunction for <#2.EOF_500hPa:zg>: add translated <zg (=geopotential_height)> as alternate for <zg500 (=geopotential_height) @ 500 hPa>
11:18:37 DEBUG: ######################################################################
11:18:37 DEBUG: Pre-query varlists for ESM4_historical_D1:
11:18:37 DEBUG: <#2.EOF_500hPa:zg (=zg500) @ 1mo, 500 hPa; active:INITED, ok, required>
    Translation: <zg500 (=geopotential_height) @ 500 hPa>
    Alternate set #1: [<#5.EOF_500hPa:zg (=zg) @ 1mo, 500 hPa>]
    Alternate set #2: [<#3.EOF_500hPa:zg_hybrid_sigma (=zg) @ 1mo>, <#4.EOF_500hPa:ps (=ps) @ 1mo>]
11:18:37 DEBUG: <#5.EOF_500hPa:zg (=zg) @ 1mo, 500 hPa; inactive:INITED, ok, alternate>
    Translation: <zg (=geopotential_height)>
    Alternate set #1: [<#3.EOF_500hPa:zg_hybrid_sigma (=zg) @ 1mo>, <#4.EOF_500hPa:ps (=ps) @ 1mo>]
11:18:37 DEBUG: <#3.EOF_500hPa:zg_hybrid_sigma (=zg) @ 1mo; inactive:INITED, ok, alternate>
    Translation: <zg (=geopotential_height)>
11:18:37 DEBUG: <#4.EOF_500hPa:ps (=ps) @ 1mo; inactive:INITED, ok, alternate>
    Translation: <ps (=surface_air_pressure)>
11:18:37 DEBUG: ######################################################################
11:18:37 INFO: ### Framework: request data for ESM4_historical_D1
11:18:37 DEBUG: Starting catalog directory crawl at /archive/oar.gfdl.cmip6/ESM4/DECK/ESM4_historical_D1/gfdl.ncrc4-intel16-prod-openmp/pp
11:19:27 INFO: Couldn't parse path land_cmip/land_cmip.static_old.nc
11:20:46 INFO: Couldn't parse path atmos_plev39_6hr/ts/daily/5yr/atmos_plev39.20100101-20141231.utendwtem.nc
11:20:46 INFO: Couldn't parse path atmos_plev39_6hr/ts/daily/5yr/atmos_plev39.20100101-20141231.vtem.nc
and so on for the remaining variables ...
11:21:33 DEBUG: Directory crawl found 107289 files.
11:21:40 DEBUG: Updating active PODs for CASENAME 'ESM4_historical_D1'
11:21:40 DEBUG: Updating active vars for POD EOF_500hPa
11:21:40 DEBUG: Query batch: [<#2.EOF_500hPa:zg>]
11:21:40 INFO: Querying <zg500 (=geopotential_height) @ 500 hPa>
11:21:44 INFO: No data found for <zg500 (=geopotential_height) @ 500 hPa>.
11:21:44 DEBUG: Updating active PODs for CASENAME 'ESM4_historical_D1'
11:21:44 DEBUG: Updating active vars for POD EOF_500hPa
11:21:44 INFO: Request for <#2.EOF_500hPa:zg (=zg500) @ 1mo, 500 hPa> failed; finding alternate vars.
11:21:44 DEBUG: Query batch: [<#5.EOF_500hPa:zg>]
11:21:44 INFO: Querying <zg (=geopotential_height)>
11:21:48 DEBUG: Query found <expt_key=('', 'aerosol_cmip', '5yr'), data_key=(61278,)> for <#5.EOF_500hPa:zg>
11:21:48 DEBUG: Query found <expt_key=('', 'atmos_cmip', '5yr'), data_key=(82676,)> for <#5.EOF_500hPa:zg>
11:21:48 DEBUG: Query found <expt_key=('', 'atmos_plev39_cmip', '5yr'), data_key=(70836,)> for <#5.EOF_500hPa:zg>
11:21:48 DEBUG: Updating active PODs for CASENAME 'ESM4_historical_D1'
11:21:48 DEBUG: Updating active vars for POD EOF_500hPa
11:21:48 DEBUG: 3 expt attr choices for all PODs from <#5.EOF_500hPa:zg>
11:21:48 DEBUG: 3 expt attr choices for all PODs
11:21:48 DEBUG: Selected experiment attribute component='atmos_cmip' for ESM4_historical_D1 (out of ['atmos_plev39_cmip', 'atmos_cmip', 'aerosol_cmip']).
11:21:48 DEBUG: Setting experiment_key for EOF_500hPa to 'atmos_cmip'
11:21:48 DEBUG: 1 expt attr choices for all POD's variables from <#5.EOF_500hPa:zg>
11:21:48 DEBUG: 1 expt attr choices for all POD's variables
11:21:48 DEBUG: Setting experiment_key for zg to '5yr'
11:21:48 DEBUG: Updating active PODs for CASENAME 'ESM4_historical_D1'
11:21:48 DEBUG: Updating active vars for POD EOF_500hPa
11:21:48 DEBUG: Fetch batch: [<#5.EOF_500hPa:zg>]
11:21:48 INFO: Start dmget of 1 files.
11:23:39 INFO: Successful exit of dmget.
11:23:39 INFO: Fetching <#5.EOF_500hPa:zg (=zg) @ 1mo, 500 hPa>
11:23:39 DEBUG: Fetching data_key=(82676,)
11:23:39 DEBUG: Created GCP fetch temp dir at /net2/$USER/tmp/MDTF_temp_903ozibc.
11:23:39 DEBUG: Selected fetch method 'gcp'.
11:23:39 INFO: 	Fetching /atmos_cmip/ts/monthly/5yr/atmos_cmip.199501-199912.zg.nc
11:24:30 ERROR: run_command on gcp --sync -v -cd gfdl:/archive/oar.gfdl.cmip6/ESM4/DECK/ESM4_historical_D1/gfdl.ncrc4-intel16-prod-openmp/pp/atmos_cmip/ts/monthly/5yr/atmos_cmip.199501-199912.zg.nc gfdl:/net2/$USER/tmp/MDTF_temp_903ozibc/ (pid 8047) exit status=1:gcp 2.3.22  by $USER at Thu Apr 22 11:23:40 2021
Unique log session id is 526741f0-9aba-4c52-b460-998cfefd06b7 at 2021-04-22Z15:23


11:24:30 DEBUG: Eliminating data_key=(82676,) for all vars.
11:24:30 INFO: Fetch failed for <#5.EOF_500hPa:zg (=zg) @ 1mo, 500 hPa>.
11:24:30 DEBUG: Updating active PODs for CASENAME 'ESM4_historical_D1'
11:24:30 DEBUG: Updating active vars for POD EOF_500hPa
11:24:30 INFO: Request for <#5.EOF_500hPa:zg (=zg) @ 1mo, 500 hPa> failed; finding alternate vars.
11:24:30 DEBUG: Query batch: [<#3.EOF_500hPa:zg_hybrid_sigma>, <#4.EOF_500hPa:ps>]
11:24:30 INFO: Querying <zg (=geopotential_height)>
11:24:34 DEBUG: Query found <expt_key=('', 'aerosol_cmip', '5yr'), data_key=(61278,)> for <#3.EOF_500hPa:zg_hybrid_sigma>
11:24:34 DEBUG: Query found <expt_key=('', 'atmos_cmip', '5yr'), data_key=(82676,)> for <#3.EOF_500hPa:zg_hybrid_sigma>
11:24:34 DEBUG: Query found <expt_key=('', 'atmos_plev39_cmip', '5yr'), data_key=(70836,)> for <#3.EOF_500hPa:zg_hybrid_sigma>
11:24:34 INFO: Querying <ps (=surface_air_pressure)>
11:24:38 DEBUG: Query found <expt_key=('', 'aerosol_cmip', '5yr'), data_key=(57585,)> for <#4.EOF_500hPa:ps>
11:24:38 DEBUG: Query found <expt_key=('', 'atmos', '25yr'), data_key=(77293,)> for <#4.EOF_500hPa:ps>
11:24:38 DEBUG: Query found <expt_key=('', 'atmos', '5yr'), data_key=(79484,)> for <#4.EOF_500hPa:ps>
11:24:38 DEBUG: Query found <expt_key=('', 'atmos_cmip', '5yr'), data_key=(82614,)> for <#4.EOF_500hPa:ps>
11:24:38 DEBUG: Query found <expt_key=('', 'atmos_co2_level', '5yr'), data_key=(36227,)> for <#4.EOF_500hPa:ps>
11:24:38 DEBUG: Query found <expt_key=('', 'atmos_level', '5yr'), data_key=(76062,)> for <#4.EOF_500hPa:ps>
11:24:38 DEBUG: Query found <expt_key=('', 'atmos_level_cmip', '5yr'), data_key=(77143,)> for <#4.EOF_500hPa:ps>
11:24:38 DEBUG: Query found <expt_key=('', 'atmos_month_aer', '5yr'), data_key=(69483,)> for <#4.EOF_500hPa:ps>
11:24:38 DEBUG: Query found <expt_key=('', 'tracer_level', '5yr'), data_key=(64981,)> for <#4.EOF_500hPa:ps>
11:24:38 DEBUG: Updating active PODs for CASENAME 'ESM4_historical_D1'
11:24:38 DEBUG: Updating active vars for POD EOF_500hPa
11:24:38 DEBUG: 2 expt attr choices for all PODs from <#3.EOF_500hPa:zg_hybrid_sigma>
11:24:38 DEBUG: 8 expt attr choices for all PODs from <#4.EOF_500hPa:ps>
11:24:38 DEBUG: 1 expt attr choices for all PODs
11:24:38 DEBUG: Setting experiment_key for EOF_500hPa to 'aerosol_cmip'
11:24:38 DEBUG: 1 expt attr choices for all POD's variables from <#3.EOF_500hPa:zg_hybrid_sigma>
11:24:38 DEBUG: 1 expt attr choices for all POD's variables from <#4.EOF_500hPa:ps>
11:24:38 DEBUG: 1 expt attr choices for all POD's variables
11:24:38 DEBUG: Setting experiment_key for zg_hybrid_sigma to '5yr'
11:24:38 DEBUG: Setting experiment_key for ps to '5yr'
11:24:38 DEBUG: Updating active PODs for CASENAME 'ESM4_historical_D1'
11:24:38 DEBUG: Updating active vars for POD EOF_500hPa
11:24:38 DEBUG: Fetch batch: [<#3.EOF_500hPa:zg_hybrid_sigma>, <#4.EOF_500hPa:ps>]
11:24:38 INFO: Start dmget of 2 files.
11:27:22 INFO: Successful exit of dmget.
11:27:22 INFO: Fetching <#3.EOF_500hPa:zg_hybrid_sigma (=zg) @ 1mo>
11:27:22 DEBUG: Fetching data_key=(61278,)
11:27:22 DEBUG: Created GCP fetch temp dir at /net2/$USER/tmp/MDTF_temp_g4n1rmyb.
11:27:22 DEBUG: Selected fetch method 'gcp'.
11:27:22 INFO: 	Fetching /aerosol_cmip/ts/monthly/5yr/aerosol_cmip.199501-199912.zg.nc
11:28:14 ERROR: run_command on gcp --sync -v -cd gfdl:/archive/oar.gfdl.cmip6/ESM4/DECK/ESM4_historical_D1/gfdl.ncrc4-intel16-prod-openmp/pp/aerosol_cmip/ts/monthly/5yr/aerosol_cmip.199501-199912.zg.nc gfdl:/net2/$USER/tmp/MDTF_temp_g4n1rmyb/ (pid 8877) exit status=1:gcp by $USER at Thu Apr 22 11:27:23 2021
Unique log session id is bb3b0308-cd95-45ff-bd68-f2d2eaad3636 at 2021-04-22Z15:27
11:28:14 DEBUG: Eliminating data_key=(61278,) for all vars.
11:28:14 INFO: Fetch failed for <#3.EOF_500hPa:zg_hybrid_sigma (=zg) @ 1mo>.
11:28:14 INFO: Fetching <#4.EOF_500hPa:ps (=ps) @ 1mo>
11:28:14 DEBUG: Fetching data_key=(57585,)
11:28:14 DEBUG: Created GCP fetch temp dir at /net2/$USER/tmp/MDTF_temp_yegr_f_f.
11:28:14 DEBUG: Selected fetch method 'gcp'.
11:28:14 INFO: 	Fetching /aerosol_cmip/ts/monthly/5yr/aerosol_cmip.199501-199912.ps.nc
11:29:05 ERROR: run_command on gcp --sync -v -cd gfdl:/archive/oar.gfdl.cmip6/ESM4/DECK/ESM4_historical_D1/gfdl.ncrc4-intel16-prod-openmp/pp/aerosol_cmip/ts/monthly/5yr/aerosol_cmip.199501-199912.ps.nc gfdl:/net2/$USER/tmp/MDTF_temp_yegr_f_f/ (pid 9052) exit status=1:gcp 2.3.22  by $USER at Thu Apr 22 11:28:15 2021
Unique log session id is 0bfaff13-74a7-4d0f-a6da-4aabe705fa46 at 2021-04-22Z15:28

11:29:05 DEBUG: Eliminating data_key=(57585,) for all vars.
11:29:05 INFO: Fetch failed for <#4.EOF_500hPa:ps (=ps) @ 1mo>.
11:29:05 DEBUG: Updating active PODs for CASENAME 'ESM4_historical_D1'
11:29:05 DEBUG: Updating active vars for POD EOF_500hPa
11:29:05 INFO: Request for <#3.EOF_500hPa:zg_hybrid_sigma (=zg) @ 1mo> failed; finding alternate vars.
11:29:05 INFO: No alternates available for <#3.EOF_500hPa:zg_hybrid_sigma>.
11:29:05 INFO: Request for <#4.EOF_500hPa:ps (=ps) @ 1mo> failed; finding alternate vars.
11:29:05 INFO: No alternates available for <#4.EOF_500hPa:ps>.
11:29:05 WARNING: Execution of POD EOF_500hPa couldn't be completed:
***** Caught exception #1:
Traceback (most recent call last):
  File "/net/$USER/mdtf/MDTF-diagnostics/src/data_manager.py", line 629, in fetch_data
    raise util.DataFetchError("Fetch failed.", data_key)
src.util.exceptions.DataFetchError: Data fetch error for data in (61278,): Fetch failed.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/net/$USER/mdtf/MDTF-diagnostics/src/diagnostic.py", line 563, in update_active_vars
    self) from v.exception
src.util.exceptions.PodDataError: Requested data not available for POD 'EOF_500hPa': No alternates available for <#3.EOF_500hPa:zg_hybrid_sigma (=zg) @ 1mo>.

***** Caught exception #2:
Traceback (most recent call last):
  File "/net/$USER/mdtf/MDTF-diagnostics/src/data_manager.py", line 629, in fetch_data
    raise util.DataFetchError("Fetch failed.", data_key)
src.util.exceptions.DataFetchError: Data fetch error for data in (57585,): Fetch failed.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/net/$USER/mdtf/MDTF-diagnostics/src/diagnostic.py", line 563, in update_active_vars
    self) from v.exception
src.util.exceptions.PodDataError: Requested data not available for POD 'EOF_500hPa': No alternates available for <#4.EOF_500hPa:ps (=ps) @ 1mo>.
11:29:05 DEBUG: Execution for CASENAME 'ESM4_historical_D1' couldn't be completed successfully.
11:29:05 DEBUG: Updating active PODs for CASENAME 'ESM4_historical_D1'
11:29:05 DEBUG: Execution for CASENAME 'ESM4_historical_D1' couldn't be completed successfully.
11:29:05 DEBUG: Updating active PODs for CASENAME 'ESM4_historical_D1'
11:29:05 DEBUG: Execution for CASENAME 'ESM4_historical_D1' couldn't be completed successfully.
11:29:05 DEBUG: Updating active PODs for CASENAME 'ESM4_historical_D1'
11:29:05 DEBUG: Execution for CASENAME 'ESM4_historical_D1' couldn't be completed successfully.
11:29:05 DEBUG: Updating active PODs for CASENAME 'ESM4_historical_D1'
11:29:05 DEBUG: Execution for CASENAME 'ESM4_historical_D1' couldn't be completed successfully.
11:29:05 DEBUG: Execution for CASENAME 'ESM4_historical_D1' couldn't be completed successfully.
11:29:05 INFO: ### Framework: Data request for ESM4_historical_D1 failed, skipping execution.
11:29:05 DEBUG: Writing file /net2/$USER/tmp/wkdir/MDTF_ESM4_historical_D1_1996_1999/config_save.json
11:29:05 INFO: 	GCP gfdl:/net2/$USER/tmp/wkdir/MDTF_ESM4_historical_D1_1996_1999/ -> gfdl:/net2/$USER/tmp/mdtf_out/
11:29:56 ERROR: run_command on gcp --sync -v -cd -r gfdl:/net2/$USER/tmp/wkdir/MDTF_ESM4_historical_D1_1996_1999/ gfdl:/net2/$USER/tmp/mdtf_out/ (pid 9231) exit status=1:gcp 2.3.22  by $USER at Thu Apr 22 11:29:06 2021
Unique log session id is a29c76bf-f871-4602-b500-ed746e39eb95 at 2021-04-22Z15:29

11:29:56 CRITICAL: **********************************************************************
Uncaught exception:
Traceback (most recent call last):
  File "/net/$USER/mdtf/MDTF-diagnostics/mdtf_framework.py", line 66, in <module>
    exit_code = main()
  File "/net/$USER/mdtf/MDTF-diagnostics/mdtf_framework.py", line 62, in main
    exit_code = framework.main()
  File "/net/$USER/mdtf/MDTF-diagnostics/src/core.py", line 282, in main
    out_mgr.make_output()
  File "/net/$USER/mdtf/MDTF-diagnostics/src/output_manager.py", line 393, in make_output
    self.copy_to_output(self._case)
  File "/net/$USER/mdtf/MDTF-diagnostics/sites/NOAA_GFDL/gfdl.py", line 734, in copy_to_output
    gfdl_util.gcp_wrapper(self.WK_DIR, self.OUT_DIR)
  File "/net/$USER/mdtf/MDTF-diagnostics/sites/NOAA_GFDL/gfdl_util.py", line 121, in gcp_wrapper
    dry_run=dry_run
  File "/net/$USER/mdtf/MDTF-diagnostics/src/util/processes.py", line 130, in run_command
    returncode=retcode, cmd=cmd_str, output=stderr)
src.util.exceptions.MDTFCalledProcessError: Command 'gcp --sync -v -cd -r gfdl:/net2/$USER/tmp/wkdir/MDTF_ESM4_historical_D1_1996_1999/ gfdl:/net2/$USER/tmp/mdtf_out/' returned non-zero exit status 1.