PGAPack
PGAPack is a general-purpose, data-structure-neutral, parallel genetic algorithm library originally developed by David Levine at Argonne National Laboratory. It has libraries for C and Fortran. There are companion projects:
- PGAPy, a Python wrapper for PGApack https://github.com/schlatterbeck/pgapy
- debian-pgapack, a project for building debian packages (for all MPI libraries supported by Debian): https://github.com/schlatterbeck/debian-pgapack
Documentation is on Read the Docs.
Updates
3rd update Jan 2023:
- Now the user-guide is converted to sphinx. The old LaTeX version can still be built but will vanish eventually
- You can access the documentation on Read the Docs now.
2nd update Jan 2023:
- Generalize mean hamming distance reporting to mean genetic distance reporting. This now works with all data types not just binary. This uses the already-existing genetic distance user function which is implemented for all builtin data types and can be overridden to use euclidian distance instead of the default manhattan distance for integer and real data types.
- Deprecated PGAHammingDistance in favor of PGAGeneDistance. There is a backward-compatible define for C but not for Fortran. It's doubful anybody has ever used this in custom code (it was mainly used in built-in reporting of the hamming distance of binary strings).
Update Jan 2023:
- Add Sphinx documentation
- Generate manual pages from Sphinx docs, this has a lot of bug-fixes in the manpages (e.g. wrong documentation of return value) and documents some functions that did previously not have a manual page
- Default for PGASetSumConstraintsFlag is now PGA_TRUE, works more reliable, this is only relevant when using one of the NSGA multi objective algorithms
Update Dez 2022:
- Bug fixes discovered during implementation of a regression test for the python wrapper
- Output of string printing and result printing can now be redirected to a file
Update Oct 2022:
- Use hashing for comparing individuals when the NoDuplicates flag is
turned on. Previously the comparison operations were O(n²) in the
population size. Now the effort is linear. The downside is that when
you have a custom comparison function with PGA_USERFUNCTION_DUPLICATE
you also need a hash function with PGA_USERFUNCTION_HASH. This always
applies when you have user-defined datatypes (and want to use the
NoDuplicates flag). Examples for C are in
examples/c/namefull.c
and inexamples/c/udtstr.c
(for a user defined datatype) and for Fortran inexamples/fortran/namefull.f
(there are no user-defined datatypes in Fortran). The good news is that there is a utility functionPGAUtilHash
that you can use when implementing a custom hashing function. - Factor MPI serialization: When serializing an Individual, pgapack needs
certain fields to be sent together with the gene, this is now in its
own function. This function can also be used for MPI serialization
when using user-defined datatypes with MPI: In that case a user
function PGA_USERFUNCTION_BUILDDATATYPE has to be written. The new
code substantially reduces the boilerplate code for writing a function
for building an MPI datatype and will probably need no updates when
the pgapack internal information changes. An example is in
examples/c/udtstr.c
and in the user guide. - Add a new serialization API for MPI serialization that can be used instead of PGA_USERFUNCTION_BUILDDATATYPE. This is especially useful when the user-defined datatype is variable length. We send the length of the serialization in a first MPI message before sending the (variable length) individual. Since we're not using multicast, this works fine for transferring variable-size information with MPI. This new API will be used in the companion-project PGAPy for user defined datatypes in python.
- Bug-fix in multi-objective optimization: When evaluations are exactly equal the ranking would not correctly compute the dominance relation
- Bug-fix in multi-objective optimization: The crowding metric was not properly initialized resulting sometimes in different optimization paths when compiled with/without optimization (-O2 and -O3 in gcc)
- Fix feature interaction between multi-objective optimization and the NoDuplicates flag: When combining two populations in the multi objective optimization algorithms (NSGA-II and NSGA-III) where both populations contain instances of the same indidivual, duplicates would result.
- The script for plotting the pareto front for 3-dimensional problems
used to be in
examples/nsgaiii/crowdingplot3
(this was already a symbolic link in the latest releases) and is now gone, use the-3
option forexamples/nsgaii/crowdingplot
.
The bug-fixes in multi-objective optization will result in different optimization paths being taken compared to previous versions (because of different sorting).
Update Aug 2022:
- Add a crossover method for permutations (e.g. traveling salesman)
- Add Epsilon-Constrained optimization, see blogpost on epsilon constrained optimization
Update Mar 2022:
- Attempt to get everything compiled with visual studio compiler. This compiler is stuck in the 1990s of the last millenium because it does not support dynamically sized arrays on the stack. This is part of the C99 standard. The workarounds involve some ugly macros.
- Bug-Fix in the genetic distance function PGARealGeneDistance which is used for RTR population replacement. This converted the distances to int which is wrong.
Update Jan 2022:
Now the tournament size can be a floating-point value implementing fine-grained tournaments (the fractional part is used to add an additional tournament participant probabilistically). See the Selection chapter in the user guide and the citations on the topic. For Fortran this could mean changes to the constant passed to PGASetTournamentSize.
Implement simulated binary crossover (SBX) and polynomial mutation, see user guide.
NSGA-III for many-objective optimization is now implemented
There is a small plotting-utility
examples/nsgaiii/crowdingplot3
written in python that can plot three function values in a 3D-plot. It can directly use the output of an optimization, e.g.:examples/nsgaiii/crowdingplot3 test/nsgaiii_optimize_13_1.data
Second Update December 2021:
- Now the multiobjective optimization algorithm NSGA-II (Nondominated Sorting Genetic Algorithm) by Deb et. al. is implemented. Like for constrained optimization this uses multiple objective functions.
- There are examples from the original paper (see README.rst) in the
directory
examples/nsgaii
, both with and without constraints. - Note that multiobjective optimization is considered experimental:
There are interaction with other parts of the API of the library,
e.g., functions dealing with the best evaluation like
PGAGetBestIndex
currently no longer have a valid semantic interpretation with multiobjective optimization, they sort by nondominance-rank now. And reporting has been rewritten to provide a meaningful output, in particular the optimization result prints all non-dominated solutions. - A Fortran example with constraints and multi-objective optimization
can be found in
examples/fortran/constr.f
- There is a small plotting-utility
examples/nsgaii/crowdingplot
written in python that can plot one function value (in the objective space) against a second function value, similar to the graphics in the NSGA-II paper. - You also want to check the next section for news.
First Update December 2021:
- If you're upgrading: The signature of your evaluation function has changed, it has grown a new parameter at the end. If you're not using constrained optimization you will only have to change your objective function to add this parameter, it is unused in that configuration. In Fortran you can get away without any changes.
- This release probably changes the path an optimization takes because we use a new (stable) sort for sorting populations during copying of individuals for elitist algorithms. This can result in different individuals being copied (which have the same evaluation but might have different genetic material).
- Add auxiliary evaluations, currently only used for constrained optimization from a paper by Deb, 2000 (see user guide for citation). To find out about the new feature see the user guide, section 4.9 "String Evaluation and Fitness". You may also want to look at the examples in examples/deb.
- Fixes for Fortran on 64-bit machines: The context variable is a pointer that didn't fit into a 4-byte integer on these machines resulting in a core-dump.
- Regression tests that use the alreay-coded examples as tests, this includes the Fortran examples. You can run them with "make test". Or, e.g., "make MPI=openmpi test" The default for MPI is to run with 4 processors and use the machine file .mpi-${MPI}-machinefile in your home directory (${MPI} is replaced by the mpi implementation given to the make command, openmpi in this example).
- New examples for constrained optimization using all the examples from Deb 2000.
- Tested MPI on a multiprocessor machine (a bunch of Orange-Pi computers acting as a (slow :-) multiprocessor). Works fine with Debian's OpenMPI and MPICH MPI implementations. Does not work for me with LAM, there is a debian bug-report #1000446.
Updated September 2020:
- Add Differential Evolution (DE) as a new Mutation Strategy
- Add more options to fully emulate Differential Evolution
- Update Docs for DE
Updated May 2020:
- Add Tournament Selection without replacement as an option
- Add Truncation Selection
- Update Documentation and manual pages
Updated March 2020:
- Add restricted tournament replacement, see updated user guide for details and references
- Fix some compiler warnings
- Implement Tournament Selection with more than 2 individuals, new
parameter settable with
PGASetTournamentSize
, the default is the old default of 2.
Updated Sept 2017: new installation instructions, availability:
- Bug fixes in MPI code: Now compiles against all MPI implementations shipped with Debian Linux (openmpi, mpich, lam).
- Bug fix in
PGAChange
that did not callPGASetEvaluationUpToDateFlag
: This would result in occasional wrong evaluation of individuals, noteably the evaluation went down even with an elitist strategy. - Bug fix for restart with an integer gene: According to the user guide
this should use
PGA_MUTATION_CONSTANT
but tried to usePGA_MUTATION_UNIFORM
which is undefined for integer genes - Fixes to the user guide with new documentation, the old original postscript is still available. Notably documentation bugs reported via the debian project were fixed. The user guide can be built from source again (after probably a very long time).
- Make Fortran compile again
Updated March 2008:
- PGAPack has also been built successfully against LAM/MPI and Open MPI.
Copyright
See the file COPYING for Copyright and disclaimer information.
Introduction
PGAPack is a general-purpose, data-structure-neutral, parallel genetic algorithm library developed at Argonne National Laboratory. Key features are:
- Callable from Fortran or C.
- Runs on uniprocessors, parallel computers, and workstation networks.
- Binary-, integer-, real-, and character-valued native data types.
- Object-oriented data structure neutral design.
- Parameterized population replacement.
- Multiple choices for selection, crossover, and mutation operators.
- An implementation of Differential Evolution
- Optimization with constraints
- Epsilon-constrained optimization
- Multi-objective optimization with NSGA-II
- Many-objective optimization with NSGA-III
- Easy integration of hill-climbing heuristics.
- Easy-to-use interface for novice and application users.
- Fully extensible to support custom operators and new data types.
- Extensive debugging facilities.
- A large set of example problems.
- It is released under the MPICH2 license (also used by the MPICH2 MPI implementation from Argonne National Laboratory).
- A separate package with Python bindings PGAPy
Availability
PGAPack is freely available.
The latest version can be obtained from github at https://github.com/schlatterbeck/pgapack
The distribution contains all source code, installation instructions, users guide, and a collection of examples in C and Fortran.
Older versions of the distribution are still available by anonymous ftp from ftp://ftp.mcs.anl.gov/pub/pgapack
Note that the github project contains all older releases in the git repo.
Computational Environment
PGAPack is written in ANSI C and uses the MPI message passing interface and should run on most uniprocessors, parallel computers, and workstation networks. PGAPack has been tested on the workstations and parallel computers specified by the ARCH_TYPE variable below.
Documentation
- Documentation is now on Read the Docs.
- The PGAPack users guide which used to be in LaTeX is now converted to Sphinx with cross-links to a reference documentation.
- The old LaTeX version is still available in the directory
docs
but no longer built by default. The ancient original documentation is still preserved asdocs/user_guide-orig.ps
for historical reasons. It is not recommended for a reference. - Man pages for PGAPack functions are in the
./man
directory. They are created automatically from the Sphinx documentation indocs/sphinx
using some postprocessing from the manual page export of Sphinx. But the man-pages are still checked into git and only rebuilt when something changes. The reason is that the manpages should be easily installable. - For building the man page sources a Sphinx setup is needed, see below in Building the documentation.
- Installation instructions are in this
README.rst
file. - Example problems are in the
./examples
directory.
Building the documentation
To build the Sphinx documentation you should install into a Sphinx
virtual environment: This uses a Python virtual environment and
installs Sphinx and all the necessary addons into this environment.
In addition to Sphinx proper you also need the additional packages in
docs/sphinx/requirements.txt
. You can install with:
pip install -r docs/sphinx/requirements.txt
But be sure that you have activated the virtual environment before issuing this command, otherwise you install into the global python interpreter or your user configuration.
After this you can change to docs/sphinx
directory and build the
html documentation with:
make html
Alternatively you can build manual pages with the target fixedman
and a pdf file with the target latexpdf
. The default if no target is
given is to build all three. The latter can also be achieved by:
make documentation
from the top-level. Note that you need to have the sphinx virtual environment activated for this to work. This is also the reason why the documentation is no longer built by default with the default make target from the top-level Makefile.
Currently the Sphinx documentation uses some hacks by modifying subprograms in memory while building the documentation. The Python community calls this monkey patching. This is because exhale hard-codes some of the section headings in the documentation and I did not want to have 'Classes' when the code is in C which doesn't have classes. And I like the functions in the function groups sorted by name which originally was not supported by breathe but a patch from me has been accepted and I expect this to be available in a future version. In short this means that you may be unable to build the documentation when a new version comes along. Please open a bug report on github if this occurs to you.
Installation Requirements
To compile you must have an ANSI C compiler that includes a full implementation of the Standard C library and related header files. To build a parallel version of PGAPack you must provide an implementation of MPI (Message Passing Interface) for the parallel computer or workstation network you are running on.
Most of our testing and development was done using MPICH, a freely available
implementation of MPI. MPICH runs on many parallel computers and
workstation networks and is publicly available and free. The complete
distribution is available by anonymous ftp from ftp://ftp.mcs.anl.gov.
Take the file mpich.tar.gz
from the directory pub/mpi
. Additional
information about MPICH is avaliable on the World Wide Web at
http://www.mcs.anl.gov/mpi. Note that MPI today is shipped with some
Linux distributions, noteably Debian Linux.
In addition to MPICH, the current installation was compiled successfully with openmpi and lam.
Installation Instructions
When installing PGAPack you make two choices: whether to build a sequential (the default) or parallel version, and whether to build a debug or optimized (the default) version. In broad outline, the installation steps are as follows.
Check out from github
Run
make MPI=$MPIVERSION
replacing
$MPIVERSION
with eitherserial
,openmpi
,mpich
, orlam
. If this doesn't work, you can specifyMPI_LIB
and/orMPI_INCLUDE
in addition. The original targets of the old configure were preserved for historical reasons, so you may want to build with:make ARCH_TYPE=$ARCHITECTURE
replacing
$ARCHITECTURE
with one of the following:Architecture
Description
sun4
for Sun SparcStations workstations,
next
for NeXT workstations,
rs600
for IBM RS6000 workstations,
irix
for Silicon Graphics workstations,
hpux
for Hewlett Packard workstations,
alpha
for DEC Alpha workstations,
linux
for machines running Linux,
freebsd
for machines running FreeBSD,
generic
for generic 32-bit machines,
powerchallenge
for the Silicon Graphics Power Challenge Array,
challenge
for the Silicon Graphics Challenge,
t3d
for the Cray T3D,
sp2
for the IBM SP2,
paragon
for the Intel Paragon, or
exemplar
for the Convex Exemplar.
The full make options are
ARCH_TYPE
,CC
,CFLAGS
,FC
,FFLAGS
,DEBUG
,MPI_INC
,MPI_LIB
In addition it is now possible to add C-compiler options with ADD_CFLAGS and Fortran compiler options with ADD_FFLAGS. The latter may be needed with Gnu Fortran compilers prior to major version 10 because of a bug in constant declarations. Use:
make MPI=$MPIVERSION ADD_FFLAGS=-fno-range-check
All parameters are optional and do the following:
Parameter
Description
CC
The name of the ANSI C compiler, cc by default.
CPPFLAGS
C Preprocessor flags (later appended to
CFLAGS
)CFLAGS
Options passed to the C compiler including necessary options for include file location.
ADD_CFLAGS
Additional options passed to C compiler. This is easier to use than FFLAGS because no knowledge of include directives is necessary.
DEBUG
If specified, enables the debugging features and compiles the source code with the
-g
flag.FC
The name of the Fortran 77 compiler, f77 by default. (The Fortran compiler is used only to compile the Fortran examples in the
./examples/
directory.)FFLAGS
Options passed to the Fortran compiler including necessary options for include file location.
ADD_FFLAGS
Additional options passed to the Fortran compiler. This is easier to use than FFLAGS because no knowledge of include directives is necessary.
INCLUDES
Include options (usually
-I directory
) but see theMPI_INC
belowLDFLAGS
Linker options
ADD_LDFLAGS
Additional linker options (in addition to to the defaults computed for the current architecture)
LIBS
Additional libraries, note that you probably have to include the math library with
-lm
MPI
Specify one of the known MPI types, one of
openmpi
,mpich
,lam
, orserial
(for a non-MPI implementation)MPI_INC
The Include-Option where MPI include files are located.
MPI_LIB
The Linker options for the MPI library, can also be the library file to link.
OPT
The optimization option your compiler understands
SHAREDLIBS
If set to something different from
yes
will not build shared librariesIf the
MPI
orMPI_INC
,MPI_LIB
options are specified, a parallel version of PGAPack will be built, unless you explicitly specifyMPI=serial
. If these flags are not specified, a rudimentary check for a default MPI installation is done. If no MPI installation is found, a sequential version of PGAPack will be built.Note that older versions required to set the
WL
(word length) preprocessor define. This is no longer required, unless you have a very unusual machine where the C-expression:sizeof(unsigned long) * 8
is not the number of bits in an unsigned long (e.g. if you have a different size of character).
Add PGAPack's man pages to your man page path:
setenv MANPATH "$MANPATH"":/home/pgapack/man"
Execute a simple test problem
Sequential version:
C: ``/usr/local/pga/examples/c/maxbit`` Fortran: ``/usr/local/pga/examples/fortran/maxbit``
Parallel version:
C: ``mpirun -np 4 /usr/local/pga/examples/c/maxbit`` Fortran: ``mpirun -np 4 /usr/local/pga/examples/fortran/maxbit``
If a parallel version of PGAPack was used, the actual commands to execute a parallel program depend on the particular MPI implementation and parallel computer. If the MPICH implementation was used the
mpirun
command can be used to execute a parallel program on most systems.
Compiling without Fortran
Note that Fortran is used only for the Fortran examples in
examples/fortran
and examples/mgh
. But these are also used in
the tests. If you can live without all test tests passing you can simply
override the FC
(Fortran Compiler) Makefile variable like so:
make MPI=serial FC=
This will set the Fortran compiler to an empty string and no attempt to
compile fortran code is made. Of course you may chose a different
setting for the MPI variable (e.g. MPI=openmpi
).
If you add the test
target:
make MPI=serial FC= test
Only the tests that do not need a Fortran compiler are run.
Structure of the Distribution Directory
File/Dir | Description |
---|---|
CHANGES | Changes new to this release of PGAPack. |
COPYING | Copyright and disclaimer information. |
README.rst | This file. |
Makefile | Makefile to build everything |
docs | Directory containing documentation. This builds the manual from LaTeX sources |
examples | A directory containing C and Fortran examples. |
include | The PGAPack include directory. |
lib | The directory the library will be installed in. |
man | The directory containing the PGAPack man pages. |
source | The source code for the PGAPack system. |
test | A directory containing programs to verify the installation. This now runs all the examples including the Fortran examples. With no Fortran compiler only the C-Tests are run. |
Contributions
PGAPack was written to be extensible in two ways: adding new operators that work with existing data types, and defining new data types. Enhancements of either type that you wish to share are welcome for possible inclusion in future versions of PGAPack.
Acknowledgment
Users of PGAPack are asked to acknowledge its use in any document referencing work based on the program, such as published research. Also, please supply to us a copy of any published research referencing work based on the software.
History
David Levine is the principal author of pgagpack and wrote most of the code during the mid-1990s. Dirk Eddelbuettel became its Debian maintainer in 2008, organised a relicensing by Argonne National Laboratories under the MPICH2 license and was the effective upstream maintainer until 2017.
In 2017 maintenance (and some development) was taken over be Ralf Schlatterbeck, who maintains the github project at https://github.com/schlatterbeck/pgapack
This repository contains the original 1996, 2008, and 2009 releases as distributed by Argonne National Laboratories as the first commits. It then has changes from the google code project (now archived by google at https://code.google.com/archive/p/pgapack/source) which later became the git repo of Dirk Eddelbuettel at https://github.com/eddelbuettel/pgapack Note that the changes by Allan Clark in that repository that introduced a new automake/autoconf configuration is currently on the autoconf branch – it did not work to build against different variants of MPI implementations (or against the serial version without MPI). There are currently no plans to incorporate automake again – computer architectures have become more similar in recent years so that the effort of maintaining a working automake environment seems not justified.