High performance radio astronomy data processing prototyping

Primary language: C++. License: Apache License 2.0 (Apache-2.0).

Hyperion

A prototype high-performance radio astronomy data calibration, imaging and analysis project. This project provides a basis for exploring the suitability and performance of various modern high-performance computing methods, libraries, frameworks and hardware architectures for radio astronomy data imaging and analysis.

An initial implementation of MeasurementSet data access using the Legion programming system is underway. A major short-term goal is to implement an application for the gridding of visibility data, in order to gain an understanding of the performance, flexibility, and complexity of a Legion-based solution for some of the significant computational challenges faced by next-generation radio telescope arrays, such as the ngVLA.

Building the software

Prerequisites

Several dependencies are optional; the software will build without them, but functionality may be limited as a result. The flags passed to cmake determine whether a dependency is required. In the list below, each optional dependency is followed, in brackets, by the CMake condition under which it is used.

  • Required
    • CMake, version 3.14 or later
    • zlib
    • git
    • git-lfs [for test data, currently no way to avoid this]
    • gcc >= v9.3, or Clang >= 10.0 (other versions of either compiler may work, but have not been fully tested)
  • Optional
    • HDF5®, version 1.10.5 or later [with USE_HDF5=ON]
    • LLVM, any version acceptable to Legion [with Legion_USE_LLVM=ON]
    • GASNet, any version acceptable to Legion [with Legion_USE_GASNET=ON]
    • yaml-cpp, version 0.6.2 or later [with USE_YAML=ON]
    • casacore [with USE_CASACORE=ON]
    • pkgconf or pkg-config [with USE_CASACORE=ON]
    • CUDA, version 10.2.89 or later [with USE_CUDA=ON]
    • Kokkos, version 3.2.00 or later [with USE_KOKKOS=ON]
    • Kokkos Kernels [with hyperion_USE_KOKKOS_KERNELS=ON, or USE_KOKKOS=ON]
    • nvcc_wrapper [with USE_CUDA=ON, USE_KOKKOS=ON and gcc compiler]
    • OpenMP, provided by gcc [with USE_OPENMP=ON]
  • Currently unimplemented, but planned for when the in-project casacore build is available again
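Before configuring, it can help to confirm that the required command-line tools are on the PATH. The following is a minimal sketch and not part of hyperion: the have_tool and missing_tools helpers are hypothetical names, and the check covers only presence, not the minimum versions listed above.

```shell
# have_tool NAME: succeeds if NAME resolves to a command on the PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# missing_tools NAME...: prints each NAME that is not available, one per line.
missing_tools() {
  for tool in "$@"; do
    have_tool "$tool" || echo "$tool"
  done
}

# Required command-line tools from the list above. zlib is a library, and the
# compiler may be either gcc or Clang, so neither is checked here.
REQUIRED_TOOLS="cmake git git-lfs"

# Example (commented out so the sketch prints nothing by itself):
#   missing_tools $REQUIRED_TOOLS
```

An empty result from missing_tools means every listed tool was found; any names printed still need to be installed.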

In-project casacore build

The following paragraph is not correct for the master branch of this project. However, it describes the functionality available in a previous version, which may become available again in the near future.

The dependence on casacore is optional, but hyperion can be built using casacore whether or not casacore is already installed on your system. To build hyperion without any dependency on casacore, use -DUSE_CASACORE=OFF in the arguments to cmake. When the cmake arguments include -DUSE_CASACORE=ON and casacore is found by CMake, hyperion will be built against your casacore installation. To provide a hint for locating casacore, you may use the CASACORE_ROOT_DIR CMake variable.

If -DUSE_CASACORE=ON is given and no casacore installation is found on your system (which can also be forced by setting -DCASACORE_ROOT_DIR=""), casacore will be downloaded and built as a CMake external project for hyperion. One variable to consider when building casacore through hyperion is casacore_DATA_DIR, which provides the path to an instance of the casacore data directory. If casacore is built through hyperion and casacore_DATA_DIR is left undefined, the build script will automatically download a recent copy of the geodetic and ephemerides data and install it within the build directory.

For Ubuntu systems, the required casacore components and the casacore data are available in the casacore-dev and casacore-data packages, respectively.
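The configurations described above can be summarized as sets of cmake arguments. This is a sketch for the earlier version that this section describes, not for the current master branch. The casacore_args helper and its mode names are hypothetical; the variable names are the ones given in the text.

```shell
# casacore_args MODE [PATH]: prints the cmake arguments for one of the cases
# described above.
#   none      build hyperion without casacore
#   system    use an installed casacore, with PATH as the location hint
#   internal  force the in-project build; PATH (optional) names an existing
#             casacore data directory
casacore_args() {
  case "$1" in
    none)
      echo "-DUSE_CASACORE=OFF" ;;
    system)
      echo "-DUSE_CASACORE=ON -DCASACORE_ROOT_DIR=$2" ;;
    internal)
      # An empty CASACORE_ROOT_DIR is the documented way to force this case.
      if [ -n "$2" ]; then
        echo "-DUSE_CASACORE=ON -DCASACORE_ROOT_DIR= -Dcasacore_DATA_DIR=$2"
      else
        echo "-DUSE_CASACORE=ON -DCASACORE_ROOT_DIR="
      fi ;;
    *)
      echo "unknown mode: $1" >&2
      return 1 ;;
  esac
}
```

The output is meant to be spliced into a configure command, for example cmake $(casacore_args system /usr) /my/hyperion/source/directory, where /usr is only an illustrative hint.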

In-project Legion

Legion itself is always built in-project using CMake's FetchContent module. A few of its configuration variables have been lifted to the top level CMakeLists.txt file, whereas others are set from the hyperion configuration to ensure consistency. Other Legion configuration variables should be available directly when invoking cmake.

Build instructions

First, clone the repository.

$ cd /my/hyperion/source/directory
$ git clone https://github.com/mpokorny/hyperion .

Create a build directory, and invoke cmake.

$ cd /my/hyperion/build/directory
$ cmake [CMAKE OPTIONS] /my/hyperion/source/directory

Note that both Unix Makefile and Ninja project file generators are known to work. If using Ninja, in the following instructions for build and test, substitute "ninja" for "make".
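As an illustration of what [CMAKE OPTIONS] might contain, the sketch below assembles a configure command from option names that appear in the prerequisites list. The ON/OFF choices, the generator, and the CONFIGURE_CMD variable are assumptions made for the example, not recommended settings; -G is CMake's standard generator option.

```shell
# Illustrative locations and choices; adjust to your system.
SRC_DIR=/my/hyperion/source/directory
GENERATOR=Ninja            # or "Unix Makefiles"

# Option names come from the prerequisites list; the ON/OFF values here are
# only an example.
CMAKE_OPTS="-DUSE_HDF5=ON -DUSE_CASACORE=ON -DUSE_YAML=OFF -DUSE_CUDA=OFF -DUSE_OPENMP=ON"

# Assemble the command and print it rather than running it, so that it can be
# reviewed first. The generator name is quoted because it may contain a space.
CONFIGURE_CMD="cmake -G \"$GENERATOR\" $CMAKE_OPTS $SRC_DIR"
echo "$CONFIGURE_CMD"
```

Once the printed command looks right, it can be run from the build directory, for example with eval "$CONFIGURE_CMD".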

Build the software.

$ cd /my/hyperion/build/directory
$ make [-j N]

Test the software.

$ cd /my/hyperion/build/directory
$ make test
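Since either generator may have been used, a script that drives the build and test steps needs to know which tool to call. A minimal sketch, assuming only that CMake's Ninja generator writes a build.ninja file into the build directory; pick_build_tool is a hypothetical helper name.

```shell
# pick_build_tool DIR: prints "ninja" if DIR contains a Ninja project file
# (build.ninja, as written by CMake's Ninja generator), otherwise "make".
pick_build_tool() {
  if [ -f "$1/build.ninja" ]; then
    echo ninja
  else
    echo make
  fi
}

# Example (commented out so the sketch has no side effects):
#   BUILD_DIR=/my/hyperion/build/directory
#   TOOL=$(pick_build_tool "$BUILD_DIR")
#   (cd "$BUILD_DIR" && "$TOOL" && "$TOOL" test)
```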

Install instructions

Executable binaries are installed into CMAKE_INSTALL_PREFIX/bin, header files into CMAKE_INSTALL_PREFIX/include/hyperion, and libraries into CMAKE_INSTALL_PREFIX/lib64. Legion header files are also installed into CMAKE_INSTALL_PREFIX/include/hyperion, and libraries into CMAKE_INSTALL_PREFIX/lib64/hyperion. Note that the RPATH of the main hyperion library is set to include the directory into which the Legion libraries are installed.

$ make install
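The layout described above can be written out for a concrete prefix. In this sketch the prefix /opt/hyperion and the shell variable names are illustrative assumptions; CMAKE_INSTALL_PREFIX itself is the standard CMake variable, set at configure time.

```shell
# PREFIX is an illustrative choice; pass it to cmake at configure time as
# -DCMAKE_INSTALL_PREFIX="$PREFIX".
PREFIX=/opt/hyperion

# Install locations as described above.
HYPERION_BIN="$PREFIX/bin"
HYPERION_INCLUDE="$PREFIX/include/hyperion"
HYPERION_LIB="$PREFIX/lib64"
LEGION_LIB="$PREFIX/lib64/hyperion"

# Make the installed executables available in the current shell.
PATH="$HYPERION_BIN:$PATH"
export PATH
```

Because the RPATH of the main hyperion library already includes the Legion library directory, this sketch does not touch LD_LIBRARY_PATH.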

Build and install via Spack

A preliminary Spack package file for hyperion exists in this repository.

Using the software

There are currently two complete applications, ms2h5 and cfcompute. A third application, gridder, is still under development.

ms2h5

This application converts a MeasurementSet in standard format (CTDS, or "casacore table data system") into a custom HDF5 format. Note that the format of the HDF5 files created by this application is almost certainly not the same as that expected by casacore Table libraries, and trying to access the HDF5 files through those libraries will fail.

Usage: ms2h5 [OPTION...] MS [TABLE...] OUTPUT, where MS is the path to the MeasurementSet, and OUTPUT is the name of the HDF5 file to be created. If OUTPUT already exists, the application will not overwrite or delete that file, and will exit immediately. [TABLE...] is a list of tables in the MeasurementSet to write into the HDF5 file; if this list is not present on the command line, all tables will be written. [OPTION...] is a list of options, primarily Legion and GASNet options. Please be aware that the argument parsing done by ms2h5 is currently not robust; for the time being, specifying flags before options that take values may reduce problems.
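The usage line can be wrapped in a small helper that puts the arguments in the documented order and checks for an existing OUTPUT up front. This is a sketch under assumptions: ms2h5_command and MS2H5_OPTIONS are hypothetical names, the helper only prints the command, and the table names in the example are standard MeasurementSet subtables whose exact spelling on the ms2h5 command line is assumed rather than confirmed.

```shell
# ms2h5_command MS OUTPUT [TABLE...]: prints an ms2h5 command line, or fails
# if OUTPUT already exists (ms2h5 itself would exit without writing).
ms2h5_command() {
  ms=$1
  output=$2
  shift 2
  if [ -e "$output" ]; then
    echo "refusing: $output already exists" >&2
    return 1
  fi
  # Argument order follows the usage line: options, MS, tables, OUTPUT.
  # MS2H5_OPTIONS is a hypothetical variable holding Legion or GASNet options;
  # it is left unquoted on purpose so that it splits into separate words.
  echo ms2h5 $MS2H5_OPTIONS "$ms" "$@" "$output"
}

# Example (commented out so the sketch prints nothing by itself):
#   MS2H5_OPTIONS="-ll:cpu 4"
#   ms2h5_command obs.ms obs.h5 ANTENNA FIELD
```

Here -ll:cpu is Legion's option for the number of CPU processors; obs.ms and obs.h5 are placeholder file names.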

cfcompute

This application computes convolution functions for use in gridding of visibilities for imaging; it is intended for performance testing and analysis of parallelization and mapping strategies, and relies strongly on Legion and Kokkos in its design and implementation. Because it is a performance-testing tool, the computed convolution functions are not retained when the program completes. Instead, performance measures are collected during program execution, and metrics are reported when the program completes. Currently the program supports fixed parallelization and mapping strategies, but this is expected to change as the program is developed. Note, however, that some variation in the program execution is supported indirectly through the choice of options employed by Legion and Kokkos and through hyperion compile-time options, and directly through a few easily modified compile-time definitions.

Usage: cfcompute [Legion args]
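Because the Legion arguments are one of the few ways to vary execution, a simple experiment is to repeat the run with different processor counts. The sketch below only prints the commands. The cfcompute_sweep helper and the log file names are hypothetical, and redirecting standard output to a log assumes that the metrics are written there, which the text above does not state.

```shell
# cfcompute_sweep N...: prints one cfcompute command per CPU processor count.
# -ll:cpu is Legion's option for the number of CPU processors; the log file
# names are hypothetical.
cfcompute_sweep() {
  for n in "$@"; do
    echo "cfcompute -ll:cpu ${n} > cfcompute_cpu${n}.log"
  done
}

# Example (commented out so the sketch prints nothing by itself):
#   cfcompute_sweep 1 2 4 8
```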

gridder

This application will implement a gridding code using A-projection and W-projection, with the possible extensions of FFT/IFFT and de-gridding. It is intended to measure the performance of algorithms implemented using Legion for the gridding of visibility data, as would be needed by currently planned radio telescope arrays. The algorithms should be very close to what would be expected to be required for a real instrument, and should not contain any computationally significant shortcuts or workarounds.

gridder is a work in progress, and is not yet functional.