xsdk-project/xsdk-examples

AMReX + SUNDIALS test takes a long time to run

v-dobrev opened this issue · 5 comments

Compared to the other tests, the AMReX + SUNDIALS test takes a very long time:

      Start  1: AMREX-amrex_sundials_advection_diffusion
 1/20 Test  #1: AMREX-amrex_sundials_advection_diffusion ...   Passed  620.17 sec
      Start  2: HYPRE-ij_laplacian
 2/20 Test  #2: HYPRE-ij_laplacian .........................   Passed    0.46 sec
      Start  3: MFEM-mfem_ex22_gko
 3/20 Test  #3: MFEM-mfem_ex22_gko .........................   Passed    1.76 sec
      Start  4: MFEM-magnetic-diffusion--cpu
 4/20 Test  #4: MFEM-magnetic-diffusion--cpu ...............   Passed    0.44 sec
      Start  5: MFEM-convdiff--hypre-boomeramg
 5/20 Test  #5: MFEM-convdiff--hypre-boomeramg .............   Passed    0.33 sec
      Start  6: MFEM-convdiff--superlu
 6/20 Test  #6: MFEM-convdiff--superlu .....................   Passed    1.23 sec
      Start  7: MFEM-obstacle
 7/20 Test  #7: MFEM-obstacle ..............................   Passed   19.39 sec
      Start  8: MFEM-transient-heat
 8/20 Test  #8: MFEM-transient-heat ........................   Passed    1.04 sec
      Start  9: MFEM-advection--cpu
 9/20 Test  #9: MFEM-advection--cpu ........................   Passed    0.82 sec
      Start 10: MFEM-diffusion-eigen--strumpack
10/20 Test #10: MFEM-diffusion-eigen--strumpack ............   Passed    6.21 sec
      Start 11: MFEM-diffusion-eigen--superlu
11/20 Test #11: MFEM-diffusion-eigen--superlu ..............   Passed    0.49 sec
      Start 12: MFEM-diffusion-eigen--hypre-boomeramg
12/20 Test #12: MFEM-diffusion-eigen--hypre-boomeramg ......   Passed    0.26 sec
      Start 13: MFEM-HIOP-adv
13/20 Test #13: MFEM-HIOP-adv ..............................   Passed    3.71 sec
      Start 14: PETSc-ex19_1
14/20 Test #14: PETSc-ex19_1 ...............................   Passed    0.57 sec
      Start 15: PETSc-ex19_hypre
15/20 Test #15: PETSc-ex19_hypre ...........................   Passed    0.27 sec
      Start 16: PETSc-ex19_superlu_dist
16/20 Test #16: PETSc-ex19_superlu_dist ....................   Passed    0.25 sec
      Start 17: SUNDIALS-cv_petsc_ex7_1
17/20 Test #17: SUNDIALS-cv_petsc_ex7_1 ....................   Passed    0.21 sec
      Start 18: SUNDIALS-cv_petsc_ex7_2
18/20 Test #18: SUNDIALS-cv_petsc_ex7_2 ....................   Passed    0.21 sec
      Start 19: SUNDIALS-ark_brusselator1D_FEM_sludist
19/20 Test #19: SUNDIALS-ark_brusselator1D_FEM_sludist .....   Passed    0.74 sec
      Start 20: STRUMPACK-sparse
20/20 Test #20: STRUMPACK-sparse ...........................   Passed    0.47 sec

It would be best if we could get the runtime under one minute, or even shorter.
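One way to shorten it (a sketch only; `n_cell` and `max_steps` are hypothetical AMReX ParmParse-style parameter names, not confirmed for this example) would be to have the test run a coarser grid for fewer steps:

```shell
# Hypothetical sketch: run the example on a smaller problem to check how the
# runtime scales (parameter names are assumptions; check the example's inputs
# for the real ones before changing the registered ctest command).
cd build/amrex/sundials
./amrex_sundials_advection_diffusion n_cell=64 max_steps=100
```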

This is a separate issue, but since it is related to the same example, I'll post it here.

When trying to run this example with HIP enabled, I get the following error from `ctest -V`:

...
test 1
      Start  1: AMREX-amrex_sundials_advection_diffusion

1: Test command: /dev/shm/dobrev1/xsdk-examples/build/amrex/sundials/amrex_sundials_advection_diffusion
1: Working Directory: /dev/shm/dobrev1/xsdk-examples/build/amrex/sundials
1: Test timeout computed to be: 1500
1: Initializing HIP...
1: HIP initialized.
1: amrex::Abort::0::GPU last error detected in file /dev/shm/dobrev1/spack/var/spack/stage/spack-stage-amrex-22.09-huuzz4saz73j72hmwd6wupsmfpx62owg/spack-src/Src/Base/AMReX_GpuLaunchFunctsG.H line 809: shared object initialization failed !!!
1: SIGABRT
1: /usr/bin/addr2line: Dwarf Error: Invalid or unhandled FORM value: 0x25.
1: /usr/bin/addr2line: Dwarf Error: Invalid or unhandled FORM value: 0x25.
1: /usr/bin/addr2line: Dwarf Error: Invalid or unhandled FORM value: 0x25.
1: /usr/bin/addr2line: Dwarf Error: Invalid or unhandled FORM value: 0x25.
1: /usr/bin/addr2line: Dwarf Error: Invalid or unhandled FORM value: 0x25.
1: /usr/bin/addr2line: Dwarf Error: Invalid or unhandled FORM value: 0x25.
1: /usr/bin/addr2line: Dwarf Error: Invalid or unhandled FORM value: 0x25.
1: See Backtrace.0 file for details
1: MPICH ERROR [Rank 0] [job id 364801714345214976] [Fri May 12 21:03:56 2023] [tioga23] - Abort(6) (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 6) - process 0
1: 
 1/26 Test  #1: AMREX-amrex_sundials_advection_diffusion ...***Failed    0.92 sec
...

Does anyone have any suggestions?

The Spack spec for AMReX is:

[+]  amrex@22.09%gcc@12.2.0~amrdata~cuda~eb~fortran~hdf5~hypre~ipo+linear_solvers+mpi~openmp~particles~petsc~pic~plotfile_tools+rocm~shared+sundials~sycl~tiny_profile amdgpu_target=gfx90a build_system=cmake build_type=RelWithDebInfo dimensions=3 generator=make precision=double arch=linux-rhel8-zen3
[+]      ^cmake@3.24.2%gcc@12.2.0~doc+ncurses+ownlibs~qt build_system=generic build_type=Release arch=linux-rhel8-zen3
[+]      ^cray-mpich@8.1.25%gcc@12.2.0+wrappers build_system=generic arch=linux-rhel8-zen3
[+]      ^gmake@4.2.1%gcc@12.2.0~guile build_system=autotools patches=ca60bd9,fe5b60d arch=linux-rhel8-zen3
[+]      ^hip@5.4.3%gcc@12.2.0~cuda~ipo+rocm build_system=cmake build_type=Release generator=make patches=ca523f1 arch=linux-rhel8-zen3
[+]      ^hsa-rocr-dev@5.4.3%gcc@12.2.0+image~ipo+shared build_system=cmake build_type=Release generator=make patches=71e6851 arch=linux-rhel8-zen3
[+]      ^llvm-amdgpu@5.4.3%gcc@12.2.0~ipo~link_llvm_dylib~llvm_dylib~openmp+rocm-device-libs build_system=cmake build_type=Release generator=ninja patches=a08bbe1 arch=linux-rhel8-zen3
[+]      ^rocprim@5.4.3%gcc@12.2.0~ipo amdgpu_target=auto build_system=cmake build_type=Release generator=make arch=linux-rhel8-zen3
[+]      ^rocrand@5.4.3%gcc@12.2.0+hiprand~ipo amdgpu_target=auto build_system=cmake build_type=Release generator=make patches=a35e689 arch=linux-rhel8-zen3
[+]      ^sundials@6.4.1%gcc@12.2.0+ARKODE+CVODE+CVODES+IDA+IDAS+KINSOL~cuda+examples+examples-install~f2003~fcmix+generic-math+ginkgo+hypre~int64~ipo~klu~kokkos~kokkos-kernels~lapack+magma~monitoring+mpi~openmp+petsc~profiling~pthread~raja+rocm+shared+static+superlu-dist~superlu-mt~sycl+trilinos amdgpu_target=gfx90a build_system=cmake build_type=RelWithDebInfo cstd=99 cxxstd=14 generator=make logging-level=0 logging-mpi=OFF precision=double arch=linux-rhel8-zen3
[+]          ^ginkgo@1.5.0%gcc@12.2.0~cuda~develtools~full_optimizations~hwloc~ipo+mpi~oneapi~openmp+rocm+shared amdgpu_target=gfx90a build_system=cmake build_type=RelWithDebInfo generator=make patches=ba0956e arch=linux-rhel8-zen3
[+]              ^hipblas@5.4.3%gcc@12.2.0~cuda~ipo+rocm amdgpu_target=auto build_system=cmake build_type=Release generator=make arch=linux-rhel8-zen3
[+]              ^hipsparse@5.4.3%gcc@12.2.0~cuda~ipo+rocm amdgpu_target=auto build_system=cmake build_type=Release generator=make patches=c447537 arch=linux-rhel8-zen3
[+]              ^rocthrust@5.4.3%gcc@12.2.0~ipo amdgpu_target=auto build_system=cmake build_type=Release generator=make arch=linux-rhel8-zen3
[+]          ^hypre@2.26.0%gcc@12.2.0~complex~cuda~debug+fortran~gptune~int64~internal-superlu~mixedint+mpi~openmp+rocm+shared+superlu-dist~sycl~umpire~unified-memory amdgpu_target=gfx90a build_system=autotools arch=linux-rhel8-zen3
[+]              ^cray-libsci@23.02.1.1%gcc@12.2.0+mpi~openmp+shared build_system=generic arch=linux-rhel8-zen3
[+]              ^rocsparse@5.4.3%gcc@12.2.0~ipo~test amdgpu_target=auto build_system=cmake build_type=Release generator=make arch=linux-rhel8-zen3
[+]          ^magma@2.7.0%gcc@12.2.0~cuda+fortran~ipo+rocm+shared amdgpu_target=gfx90a build_system=cmake build_type=RelWithDebInfo generator=make arch=linux-rhel8-zen3
[+]          ^petsc@3.18.1%gcc@12.2.0~X+batch~cgns~complex~cuda~debug+double~exodusii~fftw+fortran~giflib+hdf5~hpddm~hwloc+hypre~int64~jpeg~knl~kokkos~libpng~libyaml~memkind+metis~mkl-pardiso~mmg~moab~mpfr+mpi~mumps~openmp~p4est~parmmg~ptscotch~random123+rocm~saws~scalapack+shared~strumpack~suite-sparse+superlu-dist~tetgen~trilinos~valgrind amdgpu_target=gfx90a build_system=generic clanguage=C arch=linux-rhel8-zen3
[+]              ^diffutils@3.6%gcc@12.2.0 build_system=autotools arch=linux-rhel8-zen3
[+]              ^hdf5@1.14.0%gcc@12.2.0~cxx+fortran+hl~ipo~java~map+mpi+shared~szip~threadsafe+tools api=default build_system=cmake build_type=RelWithDebInfo generator=make patches=0b5dd6f arch=linux-rhel8-zen3
[+]                  ^pkgconf@1.8.0%gcc@12.2.0 build_system=autotools arch=linux-rhel8-zen3
[+]              ^hipsolver@5.4.3%gcc@12.2.0~cuda~ipo+rocm amdgpu_target=auto build_system=cmake build_type=Release generator=make arch=linux-rhel8-zen3
[+]              ^metis@5.1.0%gcc@12.2.0~gdb~int64~ipo~real64+shared build_system=cmake build_type=RelWithDebInfo generator=make patches=4991da9,93a7903,b1225da arch=linux-rhel8-zen3
[+]              ^parmetis@4.0.3%gcc@12.2.0~gdb~int64~ipo+shared build_system=cmake build_type=RelWithDebInfo generator=make patches=4f89253,50ed208,704b84f arch=linux-rhel8-zen3
[+]              ^python@3.10.10%gcc@12.2.0+bz2+crypt+ctypes+dbm~debug+libxml2+lzma~nis~optimizations+pic+pyexpat+pythoncmd+readline+shared+sqlite3+ssl~tkinter+uuid+zlib build_system=generic patches=0d98e93,7d40923,f2fd060 arch=linux-rhel8-zen3
[+]                  ^bzip2@1.0.6%gcc@12.2.0~debug~pic+shared build_system=generic arch=linux-rhel8-zen3
[+]                  ^expat@2.5.0%gcc@12.2.0+libbsd build_system=autotools arch=linux-rhel8-zen3
[+]                      ^libbsd@0.11.7%gcc@12.2.0 build_system=autotools arch=linux-rhel8-zen3
[+]                          ^libmd@1.0.4%gcc@12.2.0 build_system=autotools arch=linux-rhel8-zen3
[+]                  ^gdbm@1.23%gcc@12.2.0 build_system=autotools arch=linux-rhel8-zen3
[+]                  ^gettext@0.19.8.1%gcc@12.2.0+bzip2+curses+git~libunistring+libxml2+tar+xz build_system=autotools patches=9acdb4e arch=linux-rhel8-zen3
[+]                  ^libffi@3.4.4%gcc@12.2.0 build_system=autotools arch=linux-rhel8-zen3
[+]                  ^libxcrypt@4.4.33%gcc@12.2.0~obsolete_api build_system=autotools arch=linux-rhel8-zen3
[+]                      ^perl@5.26.3%gcc@12.2.0+cpanm+open+shared+threads build_system=generic patches=8cf4302 arch=linux-rhel8-zen3
[+]                  ^ncurses@6.1%gcc@12.2.0~symlinks+termlib abi=none build_system=autotools arch=linux-rhel8-zen3
[+]                  ^openssl@1.1.1k%gcc@12.2.0~docs~shared build_system=generic certs=mozilla arch=linux-rhel8-zen3
[+]                  ^readline@8.2%gcc@12.2.0 build_system=autotools patches=bbf97f1 arch=linux-rhel8-zen3
[+]                  ^sqlite@3.40.1%gcc@12.2.0+column_metadata+dynamic_extensions+fts~functions+rtree build_system=autotools arch=linux-rhel8-zen3
[+]                  ^util-linux-uuid@2.38.1%gcc@12.2.0 build_system=autotools arch=linux-rhel8-zen3
[+]                  ^xz@5.2.4%gcc@12.2.0~pic build_system=autotools libs=shared,static arch=linux-rhel8-zen3
[+]              ^rocblas@5.4.3%gcc@12.2.0~ipo+tensile amdgpu_target=auto build_system=cmake build_type=Release generator=make patches=81591d9 arch=linux-rhel8-zen3
[+]              ^rocsolver@5.4.3%gcc@12.2.0~ipo+optimal amdgpu_target=auto build_system=cmake build_type=Release generator=make patches=8067bfb arch=linux-rhel8-zen3
[+]              ^zlib@1.2.13%gcc@12.2.0+optimize+pic+shared build_system=makefile arch=linux-rhel8-zen3
[+]          ^superlu-dist@8.1.2%gcc@12.2.0~cuda~int64~ipo~openmp~rocm+shared build_system=cmake build_type=RelWithDebInfo generator=make arch=linux-rhel8-zen3
[+]          ^trilinos@13.4.1%gcc@12.2.0~adelus~adios2+amesos+amesos2+anasazi+aztec~basker+belos+boost~chaco~complex~cuda~cuda_rdc~debug~dtk+epetra+epetraext~epetraextbtf~epetraextexperimental~epetraextgraphreorderings~exodus+explicit_template_instantiation~float+fortran~gtest+hdf5+hypre+ifpack+ifpack2~intrepid+intrepid2~ipo~isorropia+kokkos~mesquite~minitensor+ml+mpi+muelu~mumps+nox~openmp~panzer~phalanx~piro~python~rocm~rocm_rdc~rol~rythmos+sacado~scorec+shards+shared~shylu~stk~stokhos+stratimikos~strumpack~suite-sparse~superlu+superlu-dist~teko~tempus+thyra+tpetra~trilinoscouplings~wrapper~x11+zoltan+zoltan2 build_system=cmake build_type=RelWithDebInfo cxxstd=14 generator=make gotype=int arch=linux-rhel8-zen3
[+]              ^boost@1.79.0%gcc@12.2.0+atomic+chrono~clanglibcpp~container~context~contract~coroutine+date_time~debug+exception~fiber+filesystem+graph~graph_parallel~icu+iostreams~json+locale+log+math~mpi+multithreaded~nowide~numpy~pic+program_options~python+random+regex+serialization+shared+signals~singlethreaded+stacktrace+system~taggedlayout+test+thread+timer~type_erasure~versionedlayout+wave build_system=generic cxxstd=14 patches=a440f96 visibility=hidden arch=linux-rhel8-zen3
[+]                  ^zstd@1.5.5%gcc@12.2.0~programs build_system=makefile libs=shared,static arch=linux-rhel8-zen3
[+]              ^hwloc@2.8.0%gcc@12.2.0~cairo~cuda~gl~libudev+libxml2~netloc~nvml~oneapi-level-zero~opencl+pci~rocm build_system=autotools libs=shared,static arch=linux-rhel8-zen3

Note that hypre is built with `+rocm` -- could that be a problem?

Here are the contents of the `Backtrace.0` file mentioned in the error message above:

=== If no file names and line numbers are shown below, one can run
            addr2line -Cpfie my_exefile my_line_address
    to convert `my_line_address` (e.g., 0x4a6b) into file name and line number.
    Or one can use amrex/Tools/Backtrace/parse_bt.py.

=== Please note that the line number reported by addr2line may not be accurate.
    One can use
            readelf -wl my_exefile | grep my_line_address'
    to find out the offset for that line.

 0: /dev/shm/dobrev1/xsdk-examples/build/amrex/sundials/amrex_sundials_advection_diffusion() [0x314b19]
    amrex::BLBackTrace::print_backtrace_info(_IO_FILE*) at ??:?

 1: /dev/shm/dobrev1/xsdk-examples/build/amrex/sundials/amrex_sundials_advection_diffusion() [0x314707]
    amrex::BLBackTrace::handler(int) at ??:?

 2: /dev/shm/dobrev1/xsdk-examples/build/amrex/sundials/amrex_sundials_advection_diffusion() [0x2902cf]
    amrex::Gpu::ErrorCheck(char const*, int) at ??:?

 3: /dev/shm/dobrev1/xsdk-examples/build/amrex/sundials/amrex_sundials_advection_diffusion() [0x2cf532]
    amrex::InitRandom(unsigned long, int, unsigned long) at ??:?

 4: /dev/shm/dobrev1/xsdk-examples/build/amrex/sundials/amrex_sundials_advection_diffusion() [0x29b7ed]
    amrex::Initialize(int&, char**&, bool, int, std::function<void ()> const&, std::ostream&, std::ostream&, void (*)(char const*)) at ??:?

 5: /dev/shm/dobrev1/xsdk-examples/build/amrex/sundials/amrex_sundials_advection_diffusion() [0x288efe]
    main at ??:?

 6: /lib64/libc.so.6(__libc_start_main+0xe5) [0x15553f6d3d85]

 7: /dev/shm/dobrev1/xsdk-examples/build/amrex/sundials/amrex_sundials_advection_diffusion() [0x286ade]
    _start at ??:?
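As an aside on why every frame shows `??:?`: the repeated `Dwarf Error: Invalid or unhandled FORM value: 0x25` messages suggest the system `addr2line` (binutils) is too old to parse DWARF 5 debug info (form 0x25 is `DW_FORM_strx1`); `llvm-addr2line` usually handles it. A sketch, following the instructions at the top of `Backtrace.0` and using the frame-0 address printed above:

```shell
# Resolve frame 0 of the backtrace manually; 0x314b19 is the address printed
# above. llvm-addr2line understands DWARF 5, unlike older binutils addr2line.
llvm-addr2line -Cpfie \
    /dev/shm/dobrev1/xsdk-examples/build/amrex/sundials/amrex_sundials_advection_diffusion \
    0x314b19
```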

I also tried a build of `xsdk+rocm` with (the default) `^hypre~rocm`, and I see the same issue.

ping: @gardner48, @balos1

Can you please help with the two issues above: (1) the long-running example with the CPU build, and (2) the error with the HIP build?

Thanks!

balos1 commented

I opened a new issue for the HIP part, since it's a distinct problem. The original topic of this issue is resolved by #49.