ORNL-Fusion/ips-wrappers

Environment variables

parkjm opened this issue · 9 comments

Please document the environment variables for your components/workflows, including the modules needed. At this moment we don't need to make the naming conventions consistent; let's just collect the variables as used in your production runs.

IPS-FASTRAN:

module load python cray-netcdf gcc

export ATOM=/project/projectdirs/atom/atom-install-edison
export LOCAL=/global/project/projectdirs/atom/atom-install-edison/cesol
export IPSCONFIG_DIR=$LOCAL/conf

export IPS_ROOT=/global/project/projectdirs/atom/atom-install-cori/ips-python3
export DAKOTA_ROOT=$ATOM/dakota
export FASTRAN_ROOT=$LOCAL/ips-fastran
export EPED_ROOT=$LOCAL/ips-eped
export DATA_ROOT=$ATOM/data

export LD_LIBRARY_PATH=$DAKOTA_ROOT/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/opt/gcc/7.1.0/snos/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/opt/gcc/6.1.0/snos/lib64:$LD_LIBRARY_PATH
export PATH=$DAKOTA_ROOT/bin:$IPS_ROOT/bin:$PATH

export PYTHONPATH=$EPED_ROOT/src:$PYTHONPATH
export PYTHONPATH=$EPED_ROOT/lib:$PYTHONPATH
export PYTHONPATH=$FASTRAN_ROOT/lib:$PYTHONPATH
export PYTHONPATH=$FASTRAN_ROOT/utils:$PYTHONPATH
export PYTHONPATH=$FASTRAN_ROOT/src:$PYTHONPATH
export PYTHONPATH=$IPS_ROOT/bin:$PYTHONPATH

export PYTHONPATH=$LOCAL/share/pyps3:$PYTHONPATH

export ATOM_BIN_DIR=$ATOM/binaries
export PSTOOL_BIN_DIR=$ATOM_BIN_DIR/pstool/default
export PSTOOL_BIN_NAME=pstool
export WGEQDSK_BIN_DIR=$ATOM_BIN_DIR/wgeqdsk/default
export WGEQDSK_BIN_NAME=wgeqdsk
export FASTRAN_BIN_DIR=$ATOM_BIN_DIR/fastran/default
export FASTRAN_BIN_NAME=xfastran_ver0.93
export FASTRAN_SERIAL_BIN_NAME=xfastran_ver0.93_ser
export EFIT_BIN_DIR=$ATOM_BIN_DIR/efit/default
export EFIT_BIN_NAME=efitd90
export ESC_BIN_DIR=$ATOM_BIN_DIR/esc/default
export ESC_BIN_NAME=xesc
export NUBEAM_BIN_DIR=$ATOM_BIN_DIR/nubeam/default
export NUBEAM_BIN_NAME=mpi_nubeam_comp_exec
export TORAY_BIN_DIR=$ATOM_BIN_DIR/toray/default
export TORAY_BIN_NAME=xtoray
export CURRAY_BIN_DIR=$ATOM_BIN_DIR/curray/default
export CURRAY_BIN_NAME=xcurray
export GENRAY_BIN_DIR=$ATOM_BIN_DIR/genray/default
export GENRAY_BIN_NAME=xgenray.intel.edison
export NFREYA_BIN_PATH=$FASTRAN_ROOT/bin
export NFREYA_BIN_NAME=onetwo_129_201
export NFREYA_DATA_ROOT=/project/projectdirs/atom/users/parkjm

export PSTOOL=$PSTOOL_BIN_DIR/$PSTOOL_BIN_NAME
export PS_BACKEND=pyps
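
For reference, a setup like this is typically consumed by sourcing it and then launching the framework. A minimal sketch, assuming the block above is saved as env.fastran.cori and that the usual ips.py invocation with --config and --platform is used (both file names here are hypothetical):

source env.fastran.cori
ips.py --config=fastran_scenario.config --platform=cori.conf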

I sent out emails on August 30, 2019 at 5:27 PM and on September 8, 2019 at 3:31 PM detailing the problems I have had with the env.ips.cori file. The main points are:

export IPS_PATH=$IPS_PATH_CORI, but there is no definition for IPS_PATH_CORI

export IPS_WRAPPER_PATH=$IPS_WRAPPER_PATH_CORI, but there is no definition for IPS_WRAPPER_PATH_CORI

Merge_plasma_state failed because the executable was not on the path. JM found the executable in a build left over from Edison and put in a link.

Both the netCDF4 module and matplotlib are missing from the default Python 2.7 on Cori. It was necessary to add "module load python/2.7-anaconda-4.4", which is an older version. I don't know what happens with Python 3.

I have to add $IPS_WRAPPER_PATH/utilities and $IPS_WRAPPER_PATH/generic-drivers to PYTHONPATH to get the examples to work.
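
Concretely, that addition looks like the following (this mirrors what the env.ips.cori file quoted later in this thread ends up doing):

export PYTHONPATH=$IPS_WRAPPER_PATH/utilities:$IPS_WRAPPER_PATH/generic-drivers:$PYTHONPATH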

If you source env.ips.edison from the master branch you get:

ModuleCmd_Switch.c(179):ERROR:152: Module 'PrgEnv-intel/6.0.4' is currently not loaded
cmake(6):ERROR:105: Unable to locate a modulefile for 'cmake/3.8.2'
ModuleCmd_Load.c(244):ERROR:105: Unable to locate a modulefile for 'java'
ModuleCmd_Load.c(244):ERROR:105: Unable to locate a modulefile for 'hdf5-parallel/1.10.1'
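
Errors like these usually mean the pinned module versions no longer exist on the system. Before editing the script, the available versions can be checked with the standard environment-modules query (shown here only as a sketch for the modules above):

module avail cmake
module avail java
module avail hdf5-parallel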

As far as I know, all of the platform config files are extremely obsolete. They work, but they refer to hopper, edison, and project m876.

Here is the env.ips.cori file that I use and that works for me:

MYDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"

export ATOM=/project/projectdirs/atom
export TEMPLATE_DATA_DIR=$ATOM/template-data

################
# Cori specific
################

export ATOM_CORI=$ATOM/atom-install-cori
#export IPS_DIR=$ATOM_CORI

# The following line is a hack to make IPS_DIR point to my dbb4 github, which is in
# $U2115 -> /project/projectdirs/atom/users/u2115/ips-wrappers. This is necessary
# because there are problems with the env.ips.cori file in atom-install-cori.

export IPS_DIR=$U2115
export IPS_PATH_CORI=$ATOM_CORI/ips-gnu-sf # N.B. This points to atom-install-cori
export IPS_WRAPPER_PATH_CORI=$IPS_DIR/ips-wrappers
export IPS_EXAMPLES_PATH_CORI=$IPS_DIR/ips-examples
export SOLPS_DATA_ITERP_PATH=$ATOM_CORI/solps-iter-data
export FTRIDYN_LIBRARY_PATH=$ATOM_CORI/FtridynFromPython/build
export GITR_PATH=$ATOM_CORI/GITR
export XOLOTL_PATH=$ATOM_CORI/xolotl/xolotl-trunk-build
export JAVA_XOLOTL_EXE=/opt/java/jdk1.8.0_51/bin/java
export JAVA_XOLOTL_LIBRARY=/project/projectdirs/atom/atom-install-cori/xolotl/xolotl-trunk-source/gov.ornl.xolotl.preprocessor/deps
export XOLOTL_PREPROCESSOR_DIR=$XOLOTL_PATH/gov.ornl.xolotl.preprocessor/preprocessor/CMakeFiles/xolotlPreprocessor.dir/
export FTRIDYN_PATH=$ATOM_CORI/fractal-tridyn
export FTRIDYN_PYTHON=$FTRIDYN_PATH/utils
export TRANSP_BIN_DIR_CORI=$ATOM_CORI/transp-build/transp/cori/intel/exe
export NTCC_BIN_DIR_CORI=$ATOM_CORI/ntcc-gnu/bin
export IPS_FASTRAN_DIR_CORI=$IPS_WRAPPER_PATH_CORI/ips-fastran

export TORIC_BIN_DIR_CORI=$ATOM_CORI/binaries/toric/default
export TORIC_BIN_NAME_CORI=xtoric.intel.cori

export GENRAY_BIN_DIR_CORI=$ATOM_CORI/binaries/genray/mks_units
export GENRAY_BIN_NAME_CORI=xgenray_mpi_intel.cori

export PSTOOL_BIN_DIR_CORI=$ATOM_CORI/binaries/pstool/dlg
export PSTOOL_BIN_NAME_CORI=pstool

export UPDATE_STATE_BIN_DIR_CORI=$ATOM_CORI/binaries/update-state/default
export UPDATE_STATE_BIN_NAME_CORI=xupdate-state.intel.cori

export NUBEAM_BIN_DIR_CORI=$ATOM_CORI/binaries/nubeam/ntcc-gnu-23-oct-15
export NUBEAM_BIN_NAME_CORI=mpi_nubeam_comp_exec

export ESC_BIN_DIR_CORI=$ATOM_CORI/binaries/esc/default
export ESC_BIN_NAME_CORI=xesc

export GEQXPL_BIN_DIR_CORI=$ATOM_CORI/binaries/geqxpl/default
export GEQXPL_BIN_NAME_CORI=geqxpl

export WGEQDSK_BIN_DIR_CORI=$ATOM_CORI/binaries/wgeqdsk/default
export WGEQDSK_BIN_NAME_CORI=wgeqdsk

export SOLPS5_SOURCE_DIR=$ATOM_CORI/solps-5-eirene99

#################################################
# Generalize for machine independent config files
#################################################

export IPS_PATH=$IPS_PATH_CORI
export TRANSP_BIN_DIR=$TRANSP_BIN_DIR_CORI
export NUBEAM_BIN_DIR=$NUBEAM_BIN_DIR_CORI
export NTCC_BIN_DIR=$NTCC_BIN_DIR_CORI
export TORIC_BIN_DIR=$TORIC_BIN_DIR_CORI
export IPS_FASTRAN_DIR=$IPS_FASTRAN_DIR_CORI
export FASTRAN_ROOT=$IPS_FASTRAN_DIR
export IPS_CSWIM_WRAPPER_PATH=$IPS_CSWIM_WRAPPER_PATH_CORI
export IPS_WRAPPER_PATH=$IPS_WRAPPER_PATH_CORI
export IPS_EXAMPLES_PATH=$IPS_EXAMPLES_PATH_CORI
export IPS_NESTED_COMP_PATH=$ATOM_CORI/ips-wrappers/ips-nested_ftridyn_gitr_xolotl/component_driver

export ECHO_INSTALL_PATH=/bin
export ECHO_INSTALL_NAME=echo

export GENRAY_BIN_NAME=$GENRAY_BIN_NAME_CORI
export GENRAY_BIN_DIR=$GENRAY_BIN_DIR_CORI

export TORIC_BIN_NAME=$TORIC_BIN_NAME_CORI
export TORIC_BIN_DIR=$TORIC_BIN_DIR_CORI

export PSTOOL_BIN_NAME=$PSTOOL_BIN_NAME_CORI
export PSTOOL_BIN_DIR=$PSTOOL_BIN_DIR_CORI

export UPDATE_STATE_BIN_DIR=$UPDATE_STATE_BIN_DIR_CORI
export UPDATE_STATE_BIN_NAME=$UPDATE_STATE_BIN_NAME_CORI

export NUBEAM_BIN_NAME=$NUBEAM_BIN_NAME_CORI
export NUBEAM_BIN_DIR=$NUBEAM_BIN_DIR_CORI

export ESC_BIN_NAME=$ESC_BIN_NAME_CORI
export ESC_BIN_DIR=$ESC_BIN_DIR_CORI

export GEQXPL_BIN_DIR=$GEQXPL_BIN_DIR_CORI
export GEQXPL_BIN_NAME=$GEQXPL_BIN_NAME_CORI

export WGEQDSK_BIN_DIR=$WGEQDSK_BIN_DIR_CORI
export WGEQDSK_BIN_NAME=$WGEQDSK_BIN_NAME_CORI

################################################################
# update_state has to be in the path for some stupid hack reason
################################################################

export PATH=$UPDATE_STATE_BIN_DIR:$PATH

###################################
# Not sure what I need this for yet
###################################

EXTERN_CORI=/project/projectdirs/atom/users/elwasif/extern-edison
export LD_LIBRARY_PATH=$EXTERN_CORI/lib:$EXTERN_CORI/lib64:$LD_LIBRARY_PATH

#####################
# Python module paths
#####################

export PYTHONPATH=$IPS_WRAPPER_PATH/utilities:$IPS_WRAPPER_PATH/generic-drivers/:$IPS_FASTRAN_DIR/share/python/:$IPS_FASTRAN_DIR/src/:$IPS_CSWIM_WRAPPER_PATH/bin:$XOLOTL_PATH:$FTRIDYN_PYTHON:$IPS_WRAPPER_PATH_CORI/ips-iterative-xolotlFT/python_scripts_for_coupling/:$PYTHONPATH:$EXTERN_CORI/lib/python2.7/site-packages/Numeric:$EXTERN_CORI/lib/python2.7/site-packages:$EXTERN_CORI/lib/python2.7/site-packages/Scientific/linux2/

export PYPLASMASTATE_PATH=$ATOM/users/elwasif/plasma_state_build/pyplasmastate
export PYTHONPATH=$PYPLASMASTATE_PATH:$PYTHONPATH

#export PYTHONPATH=$GITR_PATH/ftridyn:$PYPLASMASTATE_PATH:$PYTHONPATH:$FTRIDYN_PATH/utils:$GITR_PATH/python:/global/homes/t/tyounkin/code/libconfPython/lib/python2.7/site-packages:/global/homes/t/tyounkin/code/periodic-2.1.1:$FTRIDYN_LIBRARY_PATH

#delete the paths exported in GITR's env
export PYTHONPATH=$PYPLASMASTATE_PATH:$PYTHONPATH:$FTRIDYN_PATH/utils:/global/homes/t/tyounkin/code/libconfPython/lib/python2.7/site-packages:/global/homes/t/tyounkin/code/periodic-2.1.1:$FTRIDYN_LIBRARY_PATH

#################
# Dakota Settings
#################

export DAKOTA_ROOT=$ATOM_CORI/dakota
export PATH=$DAKOTA_ROOT/bin:$PATH
export LD_LIBRARY_PATH=$DAKOTA_ROOT/lib:$LD_LIBRARY_PATH

##############
# Load modules
##############
#new ones
module swap PrgEnv-intel/6.0.4 PrgEnv-gnu
module unload darshan
module load cmake/3.8.2
module load java
module load boost/1.70.0
module load hdf5-parallel/1.10.1
module load python/2.7-anaconda-4.4
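
Given the undefined IPS_PATH_CORI / IPS_WRAPPER_PATH_CORI problems described above, a small guard at the end of a file like this can catch unset variables early. A minimal sketch (the variable list is illustrative, not exhaustive):

# Hypothetical sanity check: warn about any required variable that is unset.
for v in IPS_PATH IPS_WRAPPER_PATH IPS_EXAMPLES_PATH DAKOTA_ROOT
do
    if [ -z "${!v}" ]
    then
        echo "WARNING: $v is not set"
    fi
done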

People have used their own environment variables. One of the issues is that the assumptions have been interpreted differently in different places. For example, setting up IPS_PATH_CORI and IPS_WRAPPER_PATH_CORI is designed to be the user's responsibility (@cianciosa, right?), while @batchelordb thought they were simply never defined. The module loads may (or should) depend on the workflow.

The goal of this thread is to collect examples of the IPS environment setups people are using, in order to build a more transparent one.

I haven't been using any environment variables; instead, all my workflows were set up to reference everything in a platform file specific to that workflow.

@cianciosa, this is another example of how people set up IPS runs in different ways - you put the setup in a platform file, kind of genius. I set up my workflows in a job submission file. Do you still have IPS_DIR, DAKOTA_ROOT, etc. (or similar) in your platform file?

@anelasa @tyounkin @dlg0 how about you?

Here's one example
https://github.com/ORNL-Fusion/ips-wrappers/blob/master/ips-cariddi/Test/platform.conf
I only reference what I need for that particular workflow.
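
For context, a platform file along these lines holds the machine settings plus whatever workflow-specific entries the components need. The sketch below uses key names that follow common IPS platform-file conventions and illustrative values only; defer to the linked file for the authoritative contents:

HOST = cori
MPIRUN = srun
CORES_PER_NODE = 32
NODE_ALLOCATION_MODE = EXCLUSIVE

# Workflow-specific entries referenced from the simulation config
# (values illustrative, based on paths discussed in this thread):
PSTOOL_BIN_DIR = /project/projectdirs/atom/atom-install-cori/binaries/pstool/default
PSTOOL_BIN_NAME = pstool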

However, doing it this way is not exactly optimal. This method requires me to make a new platform file for each workflow on each machine I want to run on. For simple workflows, this is easy. But for deeply nested workflows, or workflows that contain a lot of components, it would require setting up every variable needed by every component.

I think a better approach would be something more akin to a C header file, where each component sources the environment scripts of its direct dependencies. I'm not exactly sure how to set that up yet, but I think it would look something like this:

#!/bin/bash

#  Check if this component has already been sourced.
if [ -z "${component_loaded}" ]
then
    #  Mark component as sourced.
    export component_loaded=1

    #  Import direct dependencies. Each sourced dependency
    #  will import its own dependencies.
    source dependent_component.sh

    #  Get the current device id.
    MACHINE_ID=$(uname -n)

    #  Set platform dependent variables.
    if [ "$MACHINE_ID" == "cori" ]
    then

        #  Set cori dependent variables.
        export install_dependent_variable=value
        ...

    elif [ "$MACHINE_ID" == "othermachine" ]
    then

        #  Set othermachine dependent variables.
        ...

    else
        echo "Error: $MACHINE_ID unrecognized."
    fi

    #  Set platform independent variables.
    export install_independent_variable=value
    ...

fi
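
One wrinkle in this pattern: the guard variable would have to be uniquely named per component, otherwise the first script sourced would suppress all the others. A minimal concrete instance (the file names and guard variable are hypothetical; the binary name is taken from the IPS-FASTRAN setup above):

#!/bin/bash
#  env.fastran.sh - hypothetical per-component environment script.
if [ -z "${fastran_env_loaded}" ]
then
    #  Unique guard for this component.
    export fastran_env_loaded=1

    #  Direct dependency (hypothetical file name).
    source env.plasma_state.sh

    #  Component-specific variables.
    export FASTRAN_BIN_NAME=xfastran_ver0.93
fi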

@cianciosa Looks like a good idea. Until we develop more complete mechanics with help from SW engineers, I would suggest starting with the simple approach already in place, as @batchelordb pointed out, but with the goals of improving transparency, minimizing collisions/confusion, and making the ips-wrappers repo work out of the box on Cori:

  1. env.ips.cori
  • have a limited/controlled number of essential variables like IPS_DIR, ...
  • set up the key common libraries and binaries
  • remove all physics code paths
  2. cori.conf
  • machine configuration file

At this moment this is only for Cori, but it can serve as a template for other machines.

We would move the other local variables (such as library paths specific to a particular code) into the components/workflows themselves, located under each workflow/component subdirectory, for example cql3d/env.ips.cori_cql3d (a sketch follows the list below). This file would contain:

  1. module loads specific to the code (in addition to those in env.ips.cori)
  2. library paths specific to the code
  3. binary paths
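
As a sketch, cql3d/env.ips.cori_cql3d might then look like this (the module name, paths, and variable names are hypothetical; they simply follow the *_BIN_DIR/*_BIN_NAME convention used elsewhere in this thread):

#  1. modules specific to the code (in addition to those in env.ips.cori)
module load cray-netcdf
#  2. library paths specific to the code
export LD_LIBRARY_PATH=$ATOM/binaries/cql3d/lib:$LD_LIBRARY_PATH
#  3. binary path
export CQL3D_BIN_DIR=$ATOM/binaries/cql3d/default
export CQL3D_BIN_NAME=xcql3d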

Hopefully no strong objections?