andreasmang/claire

Installation with CUDA

Opened this issue · 4 comments

@andreasmang:
I am hoping that you can help me with the installation of CLAIRE. Any advice would be very welcome and appreciated. I have performed the following steps. Thanks @tjasaki

  1. Installed Nvidia CUDA Toolkit.
    Graphics Card: NVIDIA Corporation TU117GLM [Quadro T2000 Mobile / Max-Q] / Quadro T2000/PCIe/SSE2.
    Processor: Intel® Core™ i9-9880H CPU @ 2.30GHz × 16.
    OS: Ubuntu 20.04.2 LTS.

  2. cmake, niftilib, python all operational

  3. Installed mpich 3.3.2-2build1 (Ubuntu) (MPI-3.1 standard)

  4. mpicc, mpicxx, and nvcc all recognized, with directories
    /usr/bin/mpicc
    /usr/bin/mpicxx
    /usr/bin/nvcc

  5. Running make from claire-gpu/deps results in the following message:

===============================================================================
Configuring PETSc to compile on your system

===============================================================================
***** WARNING: MAKEFLAGS (set to ) found in environment variables - ignori
use ./configure MAKEFLAGS=$MAKEFLAGS if you really want to use that value
===============================================================================
TESTING: check from config.libraries(config/BuildSystem/config/libraries.py:157)
*******************************************************************************
UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details):

Unable to find cuda in default locations!
Perhaps you can specify with --with-cuda-dir=
If you do not want cuda, then give --with-cuda=0


  6. Changing the makefile to "WITH_CUDA_MPI = no" results in the same message.

  7. Changing the makefile to "WITH_CUDA_MPI = no" and "BUILD_GPU = no" results in a completed make.

  8. Moving forward with this configuration, I set the environment variables using: source env_source.sh

  9. I modified the top-level makefile to specify "BUILD_GPU = no" and "WITH_CUDA_MPI = no". Executing "make -j" results in the error message:

    config.mk:68: *** This branch only supports GPU build. Stop.

  10. I reverted to "BUILD_GPU = yes". Running "make VERBOSE=1 VVERBOSE=1 config" shows the following:

fatal: not a git repository (or any of the parent directories): .git

Options

BUILD_GPU: yes; [yes, no]
BUILD_TEST: no; [yes, no]
BUILD_PYTHON: no; [yes, no]

WITH_NIFTI: yes; [yes, no]
WITH_PNETCDF: no; [yes, no]
WITH_DOUBLE: no; [yes, no]
WITH_DEBUG: no; [yes, no]
WITH_DEVELOP: no; [yes, no]
WITH_CUDA_MPI: no; [yes, no]

BUILD_DIR: ./bin

CXX: mpicxx
NVCC: nvcc

internal build options

BUILD_SHARED: no; [yes, no]
BUILD_TARGET: X86; [POWER9, X86]

MPI_DIR: /usr
CUDA_DIR: /usr
PETSC_DIR: /home/asaki/Software/claire-gpu/deps/lib
NIFTI_DIR: /home/asaki/Software/claire-gpu/deps/lib
ZLIB_DIR: ./
PNETCDF_DIR:
PYTHON_DIR: /usr/include/python3.5

GPU_VERSION:
CPP_VERSION: c++11

APP_DIR: ./apps
SRC_DIR: ./src
OBJ_DIR: ./obj
LIB_DIR: ./lib
EXSRC_DIR: ./3rdparty

CXX_FLAGS:
NVCC_FLAGS:
LD_FLAGS:

@tjasaki We will look into it. Thanks! [ cc @naveenaero @MalteBrunn ]

@tjasaki To me this looks like PETSc cannot find the CUDA package. Did you try to add the CUDA library to your environment variables?
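
A minimal sketch of what adding CUDA to the environment could look like, assuming the toolkit root sits two directories above the nvcc binary. The helper name cuda_root_from_nvcc is hypothetical. Whether the CLAIRE makefiles or PETSc's configure actually read CUDA_DIR, PATH, or LD_LIBRARY_PATH from the environment is an assumption, and so is the lib64 subdirectory:

```shell
# Hypothetical helper: print the directory two levels above the given nvcc path.
# Assumption: nvcc lives in <cuda root>/bin/nvcc.
cuda_root_from_nvcc() {
  dirname "$(dirname "$1")"
}

# Look up nvcc on PATH; leave the environment untouched if it is missing.
nvcc_path="$(command -v nvcc || true)"
if [ -n "$nvcc_path" ]; then
  CUDA_DIR="$(cuda_root_from_nvcc "$nvcc_path")"
  export CUDA_DIR
  export PATH="$CUDA_DIR/bin:$PATH"
  # lib64 is an assumed layout; some distributions place the libraries elsewhere.
  export LD_LIBRARY_PATH="$CUDA_DIR/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
  echo "CUDA root guessed as: $CUDA_DIR"
else
  echo "nvcc not found on PATH" >&2
fi
```

With nvcc at /usr/bin/nvcc, as reported in the original post, the helper yields /usr, which matches the CUDA_DIR shown in the config output. The configure message itself names --with-cuda-dir= as the option for passing such a path to PETSc.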

@tjasaki There are two independent makefiles: one in the main folder for the claire project and one in the ./deps folder for all dependencies. The switches (like BUILD_GPU) are not shared between the two makefiles.

PETSc is a bit picky about the CUDA version installed/used. The PETSc version can also be adjusted with makefile switches.
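
Since the switches are not shared, it can help to print what each makefile actually sets before building. A small sketch, assuming both files are named makefile and that the commands are run from the repository root; show_switch is a hypothetical helper:

```shell
# Hypothetical helper: print the line that assigns a switch in a makefile,
# or a short note if no such assignment is found.
# $1: path to the makefile, $2: switch name (e.g. BUILD_GPU)
show_switch() {
  grep -E "^[[:space:]]*$2[[:space:]]*[?:]?=" "$1" || echo "$2 not set in $1"
}

# Assumed file locations: the top-level makefile and the one under ./deps.
for mk in makefile deps/makefile; do
  if [ -f "$mk" ]; then
    echo "== $mk =="
    for sw in BUILD_GPU WITH_CUDA_MPI; do
      show_switch "$mk" "$sw"
    done
  fi
done
```

If the two files disagree on a switch, each build simply follows its own file, which would explain a completed make in ./deps followed by a failure at the top level.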

@tjasaki following up on the compatibility issues mentioned by @MalteBrunn:

Here's a table showing which PETSc version worked with which CUDA version:
https://github.com/andreasmang/claire/blob/gpu/doc/README-INSTALL.md#detailed-installation-guide-

As @MalteBrunn has already pointed out, we observed several compatibility issues with PETSc and CUDA.
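
To find the matching row in that table, the installed toolkit release is needed. A sketch that pulls it out of nvcc --version, assuming the usual "release X.Y" wording in that output; parse_cuda_release is a hypothetical helper:

```shell
# Hypothetical helper: read nvcc --version text on stdin and print the
# "X.Y" that follows the word "release" (prints nothing if it is absent).
parse_cuda_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Report the toolkit release if nvcc is available on PATH.
if command -v nvcc >/dev/null 2>&1; then
  echo "CUDA toolkit release: $(nvcc --version | parse_cuda_release)"
else
  echo "nvcc not found on PATH" >&2
fi
```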

If you believe this is not the issue, let us know.