NERSC/buildtest-nersc

[Bug]: superlu-dist test failing due to Permission Issues. Broken spack test

Closed this issue · 4 comments

CDASH Build

https://my.cdash.org/test/66436440

Link to buildspec file

https://github.com/buildtesters/buildtest-nersc/blob/devel/buildspecs/e4s/E4S-Testsuite/perlmutter/22.05/superlu-dist.yml

Please describe the issue?

@wspear

This test appears to run the spack test from the E4S Testsuite, according to https://github.com/E4S-Project/testsuite/blob/master/validation_tests/superlu-dist/run.sh#L3

I ran spack test directly and it fails too. It looks like it is trying to write to the production path:

 ~/ ml e4s/22.05
 ~/ spack test run superlu-dist
==> Spack test ex35odmsu7qrlczudfsrtggjxt4zz3ds
==> Testing package superlu-dist-7.2.0-p7xxexy
==> Error: PermissionError: [Errno 13] Permission denied: '/global/common/software/spackecp/perlmutter/e4s-22.05/73973/spack/opt/spack/cray-sles15-zen3/cce-13.0.2/superlu-dist-7.2.0-p7xxexym4hmuqf5kwmni4biwfphyfjmx/.spack/test/make.inc~'

/global/common/software/spackecp/perlmutter/e4s-22.05/73973/spack/var/spack/repos/builtin/packages/superlu-dist/package.py:144, in test:
        141    def test(self):
        142        mk_file = join_path(self.install_test_root, self.mk_hdr)
        143        # Replace 'SRC' with 'lib' in the library's path
  >>    144        filter_file(r'^(DSUPERLULIB.+)SRC(.+)', '\\1lib\\2', mk_file)
        145        # Set library flags for all libraries superlu-dist depends on
        146        filter_file(r'^LIBS.+\+=.+', '', mk_file)
        147        filter_file(r'^LIBS[^\+]+=.+', 'LIBS = $(DSUPERLULIB)' +

See test log for details:
  /global/homes/s/siddiq90/.spack/test/ex35odmsu7qrlczudfsrtggjxt4zz3ds/superlu-dist-7.2.0-p7xxexy-test-out.txt

==> Testing package superlu-dist-7.2.0-mozwjf3
==> Error: PermissionError: [Errno 13] Permission denied: '/global/common/software/spackecp/perlmutter/e4s-22.05/73973/spack/opt/spack/cray-sles15-zen3/gcc-11.2.0/superlu-dist-7.2.0-mozwjf33ihxdsqn5zv2d5llovrg2zotp/.spack/test/make.inc~'

/global/common/software/spackecp/perlmutter/e4s-22.05/73973/spack/var/spack/repos/builtin/packages/superlu-dist/package.py:144, in test:
        141    def test(self):
        142        mk_file = join_path(self.install_test_root, self.mk_hdr)
        143        # Replace 'SRC' with 'lib' in the library's path
  >>    144        filter_file(r'^(DSUPERLULIB.+)SRC(.+)', '\\1lib\\2', mk_file)
        145        # Set library flags for all libraries superlu-dist depends on
        146        filter_file(r'^LIBS.+\+=.+', '', mk_file)
        147        filter_file(r'^LIBS[^\+]+=.+', 'LIBS = $(DSUPERLULIB)' +

See test log for details:
  /global/homes/s/siddiq90/.spack/test/ex35odmsu7qrlczudfsrtggjxt4zz3ds/superlu-dist-7.2.0-mozwjf3-test-out.txt

==> Testing package superlu-dist-7.2.0-rpngmxn
==> Error: PermissionError: [Errno 13] Permission denied: '/global/common/software/spackecp/perlmutter/e4s-22.05/73973/spack/opt/spack/cray-sles15-zen3/gcc-11.2.0/superlu-dist-7.2.0-rpngmxniqhgn6qfegb5vta7nu4hmq32i/.spack/test/make.inc~'

/global/common/software/spackecp/perlmutter/e4s-22.05/73973/spack/var/spack/repos/builtin/packages/superlu-dist/package.py:144, in test:
        141    def test(self):
        142        mk_file = join_path(self.install_test_root, self.mk_hdr)
        143        # Replace 'SRC' with 'lib' in the library's path
  >>    144        filter_file(r'^(DSUPERLULIB.+)SRC(.+)', '\\1lib\\2', mk_file)
        145        # Set library flags for all libraries superlu-dist depends on
        146        filter_file(r'^LIBS.+\+=.+', '', mk_file)
        147        filter_file(r'^LIBS[^\+]+=.+', 'LIBS = $(DSUPERLULIB)' +

See test log for details:
  /global/homes/s/siddiq90/.spack/test/ex35odmsu7qrlczudfsrtggjxt4zz3ds/superlu-dist-7.2.0-rpngmxn-test-out.txt

======================== 3 failed, 0 passed of 3 specs =========================
==> Error: 3 test(s) in the suite failed.

The problem is on these lines: https://github.com/spack/spack/blob/e4s-22.05/var/spack/repos/builtin/packages/superlu-dist/package.py#L142-L144. The test replaces SRC with lib by editing make.inc in place under the install prefix, which fails for any user without write access to that directory. We should not be updating anything in the production path.
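One way to avoid touching the install prefix is to copy make.inc into a scratch directory and apply the substitution to the copy. The sketch below illustrates that idea only and is not the fix adopted upstream. The function name `stage_and_patch` and the sample file contents are hypothetical, and it uses the Python standard library instead of Spack's `filter_file`. The `make.inc~` in the error above appears to be the backup file that `filter_file` writes next to the file it edits, which is why a read-only prefix triggers the PermissionError.

```python
import re
import shutil
import tempfile
from pathlib import Path


def stage_and_patch(mk_file: Path, work_dir: Path) -> Path:
    """Copy mk_file into work_dir and rewrite the copy; the original is not modified."""
    work_dir.mkdir(parents=True, exist_ok=True)
    staged = work_dir / mk_file.name
    # copyfile copies contents only, so a read-only source still yields a writable copy.
    shutil.copyfile(mk_file, staged)

    text = staged.read_text()
    # Same pattern as package.py line 144: replace 'SRC' with 'lib' in the library path.
    text = re.sub(r"^(DSUPERLULIB.+)SRC(.+)", r"\1lib\2", text, flags=re.MULTILINE)
    staged.write_text(text)
    return staged


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for <install prefix>/.spack/test/make.inc
        prefix = Path(tmp) / "prefix"
        prefix.mkdir()
        original = prefix / "make.inc"
        original.write_text("DSUPERLULIB = /opt/superlu/SRC/libsuperlu_dist.a\n")

        staged = stage_and_patch(original, Path(tmp) / "scratch")
        print("staged:  ", staged.read_text(), end="")
        print("original:", original.read_text(), end="")
```

The remaining substitutions in the package's test method (the LIBS and LOADOPTS filters) could be applied to the staged copy in the same way, and the build would then have to point at the staged file instead of the one in the prefix.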

Relevant log output

superlu-dist %gcc: mozwjf3
Running /global/cfs/cdirs/m3503/buildtest/runs/perlmutter_scheduled_test/2022-11-07/perlmutter.slurm.regular/superlu-dist/superlu-dist_e4s_testsuite_22.05/5a5b1d41/stage/testsuite/validation_tests/superlu-dist
Skipping load: Environment already setup
==> Error: ProcessError: Command exited with status 2:
    'make' '-j16' 'pddrive'

5 errors found in test log:
     3     ==> [2022-11-07-18:29:11.961086] FILTER FILE: /global/common/software/spackecp/perlmutter/e4s-22.05/73973/spack/opt/spack/cray-sles15-zen3/gcc-11.2.
           0/superlu-dist-7.2.0-mozwjf33ihxdsqn5zv2d5llovrg2zotp/.spack/test/make.inc [replacing "^(DSUPERLULIB.+)SRC(.+)"]
     4     ==> [2022-11-07-18:29:11.985133] FILTER FILE: /global/common/software/spackecp/perlmutter/e4s-22.05/73973/spack/opt/spack/cray-sles15-zen3/gcc-11.2.
           0/superlu-dist-7.2.0-mozwjf33ihxdsqn5zv2d5llovrg2zotp/.spack/test/make.inc [replacing "^LIBS.+\+=.+"]
     5     ==> [2022-11-07-18:29:11.995036] FILTER FILE: /global/common/software/spackecp/perlmutter/e4s-22.05/73973/spack/opt/spack/cray-sles15-zen3/gcc-11.2.
           0/superlu-dist-7.2.0-mozwjf33ihxdsqn5zv2d5llovrg2zotp/.spack/test/make.inc [replacing "^LIBS[^\+]+=.+"]
     6     ==> [2022-11-07-18:29:11.999343] FILTER FILE: /global/common/software/spackecp/perlmutter/e4s-22.05/73973/spack/opt/spack/cray-sles15-zen3/gcc-11.2.
           0/superlu-dist-7.2.0-mozwjf33ihxdsqn5zv2d5llovrg2zotp/.spack/test/make.inc [replacing "^LOADOPTS.+"]
     7     ==> [2022-11-07-18:29:12.001788] 'make' '-j16' 'pddrive'
     8     /opt/cray/pe/mpich/8.1.15/ofi/gnu/9.1/bin/mpicxx -Wl,-rpath,/global/common/software/spackecp/perlmutter/e4s-22.05/73973/spack/opt/spack/cray-sles15-
           zen3/gcc-11.2.0/superlu-dist-7.2.0-mozwjf33ihxdsqn5zv2d5llovrg2zotp/lib,-rpath,/global/common/software/spackecp/perlmutter/e4s-22.05/73973/spack/opt
           /spack/cray-sles15-zen3/gcc-11.2.0/openblas-0.3.20-kpue5bnywwglz4wnssagsb7wko2mpamg/lib,-rpath,/global/common/software/spackecp/perlmutter/e4s-22.05
           /73973/spack/opt/spack/cray-sles15-zen3/gcc-11.2.0/openblas-0.3.20-kpue5bnywwglz4wnssagsb7wko2mpamg/lib,-rpath,/global/common/software/spackecp/perl
           mutter/e4s-22.05/73973/spack/opt/spack/cray-sles15-zen3/gcc-11.2.0/parmetis-4.0.3-r7ltmqs2igjzfmv7fhtg67x7vflmd47o/lib,-rpath,/global/common/softwar
           e/spackecp/perlmutter/e4s-22.05/73973/spack/opt/spack/cray-sles15-zen3/gcc-11.2.0/metis-5.1.0-iawwq32vzsvijnkdegvxs6fcinz6s5pp/lib pddrive.o dcreate
           _matrix.o  /global/common/software/spackecp/perlmutter/e4s-22.05/73973/spack/opt/spack/cray-sles15-zen3/gcc-11.2.0/superlu-dist-7.2.0-mozwjf33ihxdsq
           n5zv2d5llovrg2zotp/lib/libsuperlu_dist_fortran.so /global/common/software/spackecp/perlmutter/e4s-22.05/73973/spack/opt/spack/cray-sles15-zen3/gcc-1
           1.2.0/superlu-dist-7.2.0-mozwjf33ihxdsqn5zv2d5llovrg2zotp/lib/libsuperlu_dist.so /global/common/software/spackecp/perlmutter/e4s-22.05/73973/spack/o
           pt/spack/cray-sles15-zen3/gcc-11.2.0/openblas-0.3.20-kpue5bnywwglz4wnssagsb7wko2mpamg/lib/libopenblas.so -lm -lcupti -lcudart -lcuda -lmpifort_gnu_9
           1 -lmpi_gnu_91 -ldsmml -lgfortran -lquadmath -lpthread -lgfortran -lm -lgcc_s -lgcc -lquadmath -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc -lm -o pddrive
  >> 9     /usr/bin/ld: cannot find -lcupti
  >> 10    /usr/bin/ld: cannot find -lcudart
  >> 11    /usr/bin/ld: cannot find -ldsmml
  >> 12    collect2: error: ld returned 1 exit status
  >> 13    make: *** [Makefile:81: pddrive] Error 1

See test log for details:
  /global/homes/e/e4s/.spack/test/u3cvyk2rzbkddvrm4566sw2wcurodnqv/superlu-dist-7.2.0-mozwjf3-test-out.txt

==> Error: 1 test(s) in the suite failed.

Run failed
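The log above shows a second failure mode: when the filters succeed, the link step still fails because ld cannot find three libraries. As a triage aid, a small script can pull the missing library names out of a test log. This is a minimal sketch; the helper name `missing_libraries` is hypothetical and the sample text is copied from the log lines above.

```python
import re
from typing import List

# Matches GNU ld messages such as "/usr/bin/ld: cannot find -lcupti".
MISSING_LIB = re.compile(r"ld: cannot find -l(\S+)")


def missing_libraries(log_text: str) -> List[str]:
    """Return the library names ld could not find, in first-seen order, without duplicates."""
    found: List[str] = []
    for match in MISSING_LIB.finditer(log_text):
        name = match.group(1)
        if name not in found:
            found.append(name)
    return found


if __name__ == "__main__":
    sample = (
        "/usr/bin/ld: cannot find -lcupti\n"
        "/usr/bin/ld: cannot find -lcudart\n"
        "/usr/bin/ld: cannot find -ldsmml\n"
        "collect2: error: ld returned 1 exit status\n"
    )
    print(missing_libraries(sample))
```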

This issue is also reported/discussed here: spack/spack#29688

We decided to remove this test since it won't get fixed in the e4s/22.05 stack. This is being addressed in https://software.nersc.gov/NERSC/buildtest-nersc/-/merge_requests/123