ufs-community/ufs-weather-model

Problem running a C96 GEFS case on Hercules

Closed this issue · 2 comments

Description

The UFS model fails during the forecast with a FATAL error reporting extreme surface values (log excerpt under Output).

To Reproduce:

Run on Hercules, using the branch from this remote:

Fetch URL: git@github.com:NeilBarton-NOAA/global-workflow.git
Push URL: git@github.com:NeilBarton-NOAA/global-workflow.git

Work dir:
/work/noaa/epic/weihuang/src/sfs-global-workflow-neil

Set up with:
HPC_ACCOUNT=epic \
IDATE=2011110100 \
pslot=c96sfs \
RUNTESTS=/work2/noaa/epic/weihuang/run \
./workflow/create_experiment.py \
--yaml SFS_baseline/SFS.yaml

Additional context

[weihuang@hercules-login-1 sfs-global-workflow-neil]$ cat SFS_baseline/SFS.yaml
experiment:
  system: gefs
  mode: forecast-only

arguments:
  pslot: {{ 'pslot' | getenv }}
  app: S2S
  resdetatmos: 96
  resensatmos: 96
  resdetocean: 1.0
  nens: 10
  gfs_cyc: 1
  start: cold
  comroot: {{ 'RUNTESTS' | getenv }}/COMROOT
  expdir: {{ 'RUNTESTS' | getenv }}/EXPDIR
  idate: 2011110100
  edate: 2011110100
  yaml: {{ HOMEgfs }}/SFS_baseline/SFS_options.yaml

skip_ci_on_hosts:
  - wcoss2

[weihuang@hercules-login-1 sfs-global-workflow-neil]$ cat SFS_baseline/SFS_options.yaml
base:
  DO_JEDIATMVAR: "NO"
  DO_JEDIATMENS: "NO"
  DO_JEDIOCNVAR: "NO"
  DO_JEDISNOWDA: "NO"
  DO_MERGENSST: "NO"
  DO_BUFRSND: "NO"
  DO_GEMPAK: "NO"
  DO_AWIPS: "NO"
  DO_GENESIS_FSU: "NO"
  KEEPDATA: "YES"
  FHMAX_GFS: 120
  FHMAX_HF_GFS: 0
  KEEPDATA: "NO"
  DO_EXTRACTVARS: "NO"
  HPSSARCH: "NO"
  LOCALARCH: "NO"
  USE_OCN_PERTURB_FILES: "false"
  REPLAY_ICS: "NO"
  FCST_BREAKPOINTS: 48
  FLTFILEGFS: "postxconfig-NT-SFS.txt"
  FLTFILEGFSF00: "postxconfig-NT-SFS.txt"
  ACCOUNT: {{ 'HPC_ACCOUNT' | getenv }}
  BASE_IC: /work2/noaa/epic/weihuang/ICs/REPLAY_ICs/C96mx100
fcst:
  TYPE: "hydro"
  MONO: "mono"

Output

EXPDIR: /work2/noaa/da/weihuang/run/EXPDIR/c96sfs
ROTDIR: /work2/noaa/da/weihuang/run/COMROOT/c96sfs

37: WARNING from PE 7: Extreme surface sfc_state detected: i= 213 j= 81 lon= -87.500 lat= -43.206 x= -87.500 y= -43.206 D= 3.6334E+03 SSH= 6.6060E+01 SST= 1.0067E+01 SSS= 3.3952E+01 U-=-3.9181E+00 U+=-3.0035E+00 V-=-1.7761E+00 V+=-1.4790E+00
37:
37:
37: WARNING from PE 7: Extreme surface sfc_state detected: i= 214 j= 81 lon= -86.500 lat= -43.206 x= -86.500 y= -43.206 D= 3.5225E+03 SSH= 7.0271E+01 SST= 1.0239E+01 SSS= 3.3951E+01 U-=-3.0035E+00 U+=-1.1783E+00 V-=-2.9644E+00 V+=-2.4425E+00
37:
37:
37: WARNING from PE 7: Extreme surface sfc_state detected: i= 215 j= 81 lon= -85.500 lat= -43.206 x= -85.500 y= -43.206 D= 3.5360E+03 SSH= 7.2328E+01 SST= 1.0400E+01 SSS= 3.3952E+01 U-=-1.1783E+00 U+=-9.1210E-02 V-=-3.1631E+00 V+=-3.0908E+00
37:
37:
37: WARNING from PE 7: Extreme surface sfc_state detected: i= 216 j= 81 lon= -84.500 lat= -43.206 x= -84.500 y= -43.206 D= 3.4371E+03 SSH= 7.2940E+01 SST= 1.0296E+01 SSS= 3.3949E+01 U-=-9.1210E-02 U+= 6.2099E-01 V-=-2.7393E+00 V+=-2.9584E+00
37:
37:
37: WARNING from PE 7: Extreme surface sfc_state detected: i= 211 j= 82 lon= -89.500 lat= -42.472 x= -89.500 y= -42.472 D= 3.4979E+03 SSH= 5.4525E+01 SST= 1.0343E+01 SSS= 3.3946E+01 U-=-3.7249E+00 U+=-4.6761E+00 V-= 1.8267E-01 V+= 3.1809E-01
37:
37:
37: WARNING from PE 7: There were more unreported extreme events!
37:
30:
30: FATAL from PE 0: There were a total of 43197 locations detected with extreme surface values!
30:
37:
31:
31: FATAL from PE 1: There were a total of 43197 locations detected with extreme surface values!
31:
32:
32: FATAL from PE 2: There were a total of 43197 locations detected with extreme surface values!
32:
33:
33: FATAL from PE 3: There were a total of 43197 locations detected with extreme surface values!
33:
38:
38: FATAL from PE 8: There were a total of 43197 locations detected with extreme surface values!
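
To get a quick overview of these warnings without reading the log by eye, something like the following could be used. It is only a minimal sketch: the function names `parse_extreme_line` and `summarize` are made up for illustration, and the regular expression assumes the `NAME= value` layout seen in the excerpt above.

```python
import re

# Matches "NAME= value" pairs such as "SSH= 6.6060E+01" or "U-=-3.9181E+00".
FIELD_RE = re.compile(r"([A-Za-z][A-Za-z+\-]*)=\s*(-?\d+(?:\.\d+)?(?:E[+-]\d+)?)")


def parse_extreme_line(line):
    """Return the numeric fields of one 'Extreme surface' warning, or None."""
    marker = "Extreme surface sfc_state detected:"
    if marker not in line:
        return None
    tail = line.split(marker, 1)[1]
    return {name: float(value) for name, value in FIELD_RE.findall(tail)}


def summarize(log_lines):
    """Count parsed warnings and report the largest |SSH| among them."""
    records = [r for r in map(parse_extreme_line, log_lines) if r]
    max_ssh = max((abs(r["SSH"]) for r in records if "SSH" in r), default=None)
    return {"warnings": len(records), "max_abs_ssh": max_ssh}
```

In the five warnings shown, SSH ranges from about 54.5 to 72.9 while SST and SSS look ordinary. If SSH is in meters (an assumption here), that points at the ocean initial state rather than at the surface forcing.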

@weihuang-jedi can you take a look at #2447? It sounds like a similar issue.

Neil and I figured out that this is due to a MOM6 IC issue and found a solution:
we need to set MOM6_RESTART_SETTING to "n".
Closing the issue now.
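
For anyone hitting the same failure, here is a minimal sketch for checking which value actually reached the run directory. It assumes that MOM6_RESTART_SETTING is substituted into `input_filename` in the `&MOM_input_nml` group of `input.nml`, as I believe the ufs-weather-model namelist templates do. That mapping is not stated in this thread, so please verify it against your own rendered namelist. The function name and the sample text are hypothetical.

```python
import re


def mom6_input_filename(namelist_text):
    """Return input_filename from the &MOM_input_nml group, or None if absent."""
    # Capture everything between the group header and the closing "/" line.
    group = re.search(r"&MOM_input_nml(.*?)^\s*/", namelist_text,
                      re.S | re.M | re.I)
    if group is None:
        return None
    value = re.search(r"input_filename\s*=\s*['\"]([^'\"]*)['\"]",
                      group.group(1), re.I)
    return value.group(1) if value else None


# Hypothetical namelist fragment, for illustration only.
sample = """&MOM_input_nml
  output_directory = 'MOM6_OUTPUT/',
  input_filename = 'r'
  restart_input_dir = 'INPUT/',
/
"""

setting = mom6_input_filename(sample)
if setting != "n":
    print(f"input_filename is {setting!r}; this thread's fix expects 'n'")
```

With real output, `sample` would be replaced by the contents of the `input.nml` in the forecast run directory.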