
A library to coordinate the concurrent evaluation of dynamic ensembles of calculations.


libEnsemble



Introduction to libEnsemble

libEnsemble is a Python library to coordinate the concurrent evaluation of dynamic ensembles of calculations. The library is developed to use massively parallel resources to accelerate the solution of design, decision, and inference problems and to expand the class of problems that can benefit from increased concurrency levels.

libEnsemble aims for the following:

  • Extreme scaling
  • Resilience/fault tolerance
  • Monitoring/killing of tasks (and recovering resources)
  • Portability and flexibility
  • Exploitation of persistent data/control flow

The user selects or supplies a function that generates simulation input as well as a function that performs and monitors the simulations. For example, the generation function may contain an optimization routine to generate new simulation parameters on the fly based on the results of previous simulations. Examples and templates of such functions are included in the library.
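The interplay between the two functions can be sketched in plain Python. This is a minimal illustration only, not libEnsemble's interface: the names `sim_f`, `gen_f`, and `run_ensemble`, their signatures, and the quadratic objective are all assumptions made for the example. The generator proposes new inputs near the best result seen so far, which mirrors the idea of generating parameters on the fly from previous simulations.

```python
import random


def sim_f(x):
    """Hypothetical simulation: a cheap quadratic standing in for an expensive run."""
    return (x - 3.0) ** 2


def gen_f(history, rng, batch_size=4, width=1.0):
    """Hypothetical generator: propose new inputs near the best point seen so far."""
    if not history:
        # No results yet, so sample the whole search interval.
        return [rng.uniform(-10.0, 10.0) for _ in range(batch_size)]
    best_x, _ = min(history, key=lambda pair: pair[1])
    return [best_x + rng.uniform(-width, width) for _ in range(batch_size)]


def run_ensemble(n_rounds=25, seed=0):
    """Alternate generation and simulation, returning the best (input, output) pair."""
    rng = random.Random(seed)
    history = []  # list of (input, output) pairs
    for _ in range(n_rounds):
        for x in gen_f(history, rng):
            history.append((x, sim_f(x)))
    return min(history, key=lambda pair: pair[1])


best_x, best_f = run_ensemble()
print(f"best input {best_x:.3f} gives output {best_f:.5f}")
```

Here the loop runs serially; in libEnsemble the simulations would be handed to workers and evaluated concurrently.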

libEnsemble employs a manager/worker scheme that can run on various communication media (including MPI, multiprocessing, and TCP); interfacing with user-provided executables is also supported. Each worker can control and monitor any level of work, from small subnode tasks to huge many-node simulations. An executor interface is provided to ensure that scripts are portable, resilient, and flexible; it also enables automatic detection of the nodes and cores in a system and can split up tasks automatically if resource data isn't supplied.
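The manager/worker pattern itself can be illustrated with the standard library. The sketch below is an assumption-laden stand-in: it uses a thread pool in place of libEnsemble's MPI, multiprocessing, or TCP communications, and the `simulate` and `manager` functions are hypothetical names that do not come from the library.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def simulate(x):
    """Hypothetical worker task; a real worker might launch an executable instead."""
    return x * x


def manager(inputs, n_workers=3):
    """Hand each input to a pool of workers and collect results as they finish."""
    results = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Map each pending future back to the input that produced it.
        futures = {pool.submit(simulate, x): x for x in inputs}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results


results = manager(range(6))
print(sorted(results.items()))
```

Results arrive in completion order rather than submission order, which is why they are keyed by input rather than appended to a list.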

Dependencies

Required dependencies:

To run libEnsemble with mpi4py parallelism:

  • A functional MPI 1.x/2.x/3.x implementation, such as MPICH, built with shared/dynamic libraries
  • mpi4py v2.0.0 or above

Optional dependency:

From v0.2.0, libEnsemble has the option of using the Balsam job manager. Balsam is required in order to run libEnsemble on the compute nodes of some supercomputing platforms that do not support launching tasks from compute nodes. As of v0.5.0, libEnsemble can also be run on launch nodes using multiprocessing.

The example simulation and generation functions and tests require the following:

PETSc and NLopt must be built with shared libraries enabled and present in sys.path (e.g., via setting the PYTHONPATH environment variable). NLopt should produce a file nlopt.py if Python is found on the system. See the NLopt documentation for information about building NLopt with shared libraries. NLopt may also require SWIG to be installed on certain systems.
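One quick way to check whether these packages are visible to the interpreter is to ask Python's import machinery directly. The sketch below assumes the module names `nlopt` (matching the nlopt.py file mentioned above) and `petsc4py` (an assumed name for the PETSc Python bindings); adjust them to match your installation. The helper `missing_modules` is hypothetical and not part of libEnsemble.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on sys.path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names are assumptions; see the lead-in above.
missing = missing_modules(["nlopt", "petsc4py"])
if missing:
    print("Not importable (check PYTHONPATH):", ", ".join(missing))
else:
    print("All optional modules were found.")
```

This only confirms that a module can be located, not that its shared libraries load correctly; an actual `import` is the definitive test.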

Installation

libEnsemble can be installed or accessed from a variety of sources.

Install libEnsemble and its dependencies from PyPI using pip:

pip install libensemble

Install libEnsemble with Conda from the conda-forge channel:

conda config --add channels conda-forge
conda install -c conda-forge libensemble

Install libEnsemble using the Spack distribution:

spack install py-libensemble

libEnsemble is included in the xSDK Extreme-scale Scientific Software Development Kit from xSDK version 0.5.0 onward. Install the xSDK and load the environment with

spack install xsdk
spack load -r xsdk

The codebase, tests and examples can be accessed in the GitHub repository. If necessary, you may install all optional dependencies (listed above) at once with

pip install libensemble[extras]

A tarball of the most recent release is also available.

Testing

The provided test suite includes both unit and regression tests and is run regularly on Travis CI.

The test suite requires the mock, pytest, pytest-cov, and pytest-timeout packages to be installed and can be run from the libensemble/tests directory of the source distribution by running

./run-tests.sh

Further options are available. To see a complete list of options, run

./run-tests.sh -h

If you have the source distribution, you can download (but not install) the testing prerequisites and run the tests with

python setup.py test

in the top-level directory containing the setup script.

Coverage reports are produced separately for unit tests and regression tests under the relevant directories. For parallel tests, coverage is taken as the union over all processors. In addition, a combined coverage report is created at the top level, which can be viewed at libensemble/tests/cov_merge/index.html after run-tests.sh has completed. The Travis CI coverage results are available online at Coveralls.

Note

The executor tests can be run by using the direct-launch or Balsam executors. Balsam integration with libEnsemble is now tested via test_balsam_hworld.py.

Basic Usage

The examples directory contains example libEnsemble calling scripts, simulation functions, generation functions, allocation functions, and libEnsemble submission scripts.

The default manager/worker communications mode is MPI. The user script is launched as

mpiexec -np N python myscript.py

where N is the number of processors. This will launch one manager and N-1 workers.

If running in local mode, which uses Python's multiprocessing module, the local comms option and the number of workers must be specified. The script can then be run as a regular Python script:

python myscript.py

These options may be specified via the command line by using the parse_args() convenience function within libEnsemble's tools module.
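To show the general idea of selecting the communications mode and worker count from the command line, here is a hypothetical stand-in built on argparse. It does not reproduce libEnsemble's parse_args(): the function name `parse_run_options` and the option names `--comms` and `--nworkers` are assumptions made for this illustration, so consult the user guide for the options the library actually accepts.

```python
import argparse


def parse_run_options(argv=None):
    """Hypothetical option parser; not libEnsemble's parse_args()."""
    parser = argparse.ArgumentParser(description="Select comms mode and worker count.")
    parser.add_argument("--comms", choices=["mpi", "local", "tcp"], default="mpi")
    parser.add_argument("--nworkers", type=int, default=None)
    args = parser.parse_args(argv)
    # Local mode cannot infer the worker count from an MPI launcher.
    if args.comms == "local" and args.nworkers is None:
        parser.error("--nworkers is required when --comms is local")
    return args


opts = parse_run_options(["--comms", "local", "--nworkers", "4"])
print(opts.comms, opts.nworkers)
```

With MPI the worker count follows from the number of launched processes, which is why this sketch only insists on an explicit count in local mode.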

See the user guide for more information.

Resources

Support:

  • The best way to receive support is to email questions to libEnsemble@lists.mcs.anl.gov.
  • Communicate (and establish a private channel, if desired) at the libEnsemble Slack page.
  • Join the libEnsemble mailing list for updates about new releases.

Further Information:

  • Documentation is hosted on Read the Docs.
  • A visual overview of libEnsemble is given in this poster.

Citation:

  • Please use the following to cite libEnsemble in a publication:
@techreport{libEnsemble,
  author      = {Stephen Hudson and Jeffrey Larson and Stefan M. Wild and
                 David Bindel and John-Luke Navarro},
  title       = {{libEnsemble} Users Manual},
  institution = {Argonne National Laboratory},
  number      = {Revision 0.7.1},
  year        = {2020},
  url         = {https://buildmedia.readthedocs.org/media/pdf/libensemble/latest/libensemble.pdf}
}