/MultiNest

MultiNest is a Bayesian inference tool which calculates the evidence and explores the parameter space which may contain multiple posterior modes and pronounced (curving) degeneracies in moderately high dimensions.

Primary LanguageFortranOtherNOASSERTION

MultiNest v 3.10
Farhan Feroz, Mike Hobson
f.feroz@mrao.cam.ac.uk
arXiv:0704.3704, arXiv:0809.3437 & arXiv:1306.2144
Released Jul 2015

---------------------------------------------------------------------------

MultiNest Licence

Users are required to accept to the licence agreement given in LICENCE file.

Users are also required to cite the MultiNest papers (arXiv:0704.3704, arXiv:0809.3437 & arXiv:1306.2144) in their publications.

---------------------------------------------------------------------------

Required Libraries:

MultiNest requires lapack. To use MPI support, some MPI library must also be installed.
The CMake compilation script will automatically find both libraries, or stop compilation
if they are not found.

---------------------------------------------------------------------------

Building and installing MultiNest CMake version:

Brian Kloppenborg has provided a CMake building enviornment which automatically 
builds MultiNest as both static and shared libraries with MPI enabled and 
disabled. The script will automatically detect lapack and any compiler quirks 
that needed to be manually configured in previous version of MultiNest.

The script is setup to do an out-of-source build which is invoked by:

$ cd build
$ cmake ..
$ make

This builds the libraries, binaries, and modules placing them in multinest/lib, 
multinest/bin, and multinest/modules respectively.  The installer is invoked 
via:

$ sudo make install

This will install the libraries, headers, modules, and binaries into the
corresponding directories in /usr/local by default. 

--------------
Special notes:
--------------

1. Changing the default installation path
If you wish to change the install directory either edit the 
CMAKE_INSTALL_PREFIX  variable in the multinest/CMakeLists.txt file or specify 
the directory when CMake is first invoked, i.e.

$ cmake -DCMAKE_INSTALL_PREFIX=/path/to/local/install ..

If the installation directory is changed, you will need to add the installation
directory to your shell's PATH variable. Please see your system's documentation
for how this can be accomplished.

2. Apple machines and 64-bit libraries
On Apple machines it appears only 32-bit binaries/libraries are built by
default. If you wish to have 64-bit libraries, you need to set the 64-bit
compiler flags when CMake is first invoked:

$ cmake -DCMAKE_{C,CXX}_FLAGS="-arch x86_64" -DCMAKE_Fortran_FLAGS="-m64" ..

3. Parallel building not functional
At present parallel building is not possible due to the compilation
of several Fortran modules required by the core MultiNest code.

4. Library not found / non-standard library locations
If you have a library that resides in a non-standard location or is not found by
CMake, try specifying the absolute path to the library using the 
CMAKE_PREFIX_PATH when first invoking CMake:

$ cmake -DCMAKE_PREFIX_PATH=/non/standard/directory ..

---------------------------------------------------------------------------

The subtoutine to begin MultiNest are as follows:

subroutine nestRun(IS, mmodal, ceff, nlive, tol, efr, ndims, nPar, nCdims, maxModes, updInt, Ztol, root, seed,
pWrap, feedback, resume, outfile, initMPI, logZero, maxiter, loglike, dumper, context)

logical IS 	 						!do Importance Nested Sampling (INS)?
logical mmodal 	 						!do mode separation?
integer nlive 	 						!number of live points
logical ceff							!run in constant efficiency mode
double precision tol 		 				!evidence tolerance factor
double precision efr 		 				!sampling efficiency
integer ndims	 						!number of dimensions
integer nPar	 						!total no. of parameters
integer nCdims							!no. of parameters on which clustering should be performed (read below)
integer maxModes						!maximum no. of modes (for memory allocation)
integer updInt							!iterations after which the output files should be written
double precision Ztol						!null log-evidence (read below)
character(LEN=100) root  					!root for MultiNest output files
integer seed 		 					!random no. generator seed, -ve value for seed from the sys clock
integer pWrap[ndims]						!wraparound parameters?
logical feedback						!need update on sampling progress?
logical resume							!resume from a previous run?
logical outfile							!write output files?
logical initMPI							!initialize MPI routines?, relevant only if compiling with MPI
								!set it to F if you want your main program to handle MPI initialization
double precision logZero					!points with loglike < logZero will be ignored by MultiNest
integer maxiter							!max no. of iterations, a non-positive value means infinity. MultiNest will terminate if either it 
								!has done max no. of iterations or convergence criterion (defined through tol) has been satisfied
loglike(Cube,ndims,nPar,lnew) 					!subroutine which gives lnew=loglike(Cube(ndims))
dumper(nSamples,nlive,nPar,physLive,posterior,
	paramConstr,maxloglike,logZ,INSlogZ,logZerr,c)		!subroutine called after every updInt*10 iterations with the posterior 
								!distribution, parameter constraints, max loglike & log evidence values
integer context							!not required by MultiNest, any additional information user wants to pass
      
--------------------------------------------------------------------------- 

likelihood routine: slikelihood(Cube,ndims,nPar,lnew,context)

Cube(1:nPar) has nonphysical parameters.

scale Cube(1:n_dim) & return the scaled parameters in Cube(1:n_dim) & additional parameters that you want to
be returned by MultiNest along with the actual parameters in Cube(n_dim+1:nPar)

Return the log-likelihood in lnew.

---------------------------------------------------------------------------

dumper routine: dumper(nSamples,nlive,nPar,physLive,posterior,paramConstr,maxloglike,logZ,INSlogZ,logZerr,context)

This routine is called after every updInt*10 iterations & at the end of the sampling allowing the posterior 
distribution & parameter constraints to be passed on to the user in the memory. The argument are as follows:

nSamples						= total number of samples in posterior distribution
nlive						     	= total number of live points
nPar						     	= total number of parameters (free + derived)
physLive(nlive, nPar+1) 		     		= 2D array containing the last set of live points (physical parameters plus derived parameters) along with their loglikelihood values
posterior(nSamples, nPar+2)		     		= posterior distribution containing nSamples points. Each sample has nPar parameters (physical + derived) along with the their loglike 
							value & posterior probability
paramConstr(1, 4*nPar):
     paramConstr(1, 1) to paramConstr(1, nPar)	     	= mean values of the parameters
     paramConstr(1, nPar+1) to paramConstr(1, 2*nPar)   = standard deviation of the parameters
     paramConstr(1, nPar*2+1) to paramConstr(1, 3*nPar) = best-fit (maxlike) parameters
     paramConstr(1, nPar*4+1) to paramConstr(1, 4*nPar) = MAP (maximum-a-posteriori) parameters
maxLogLike					    	= maximum loglikelihood value
logZ						     	= log evidence value from the default (non-INS) mode
INSlogZ						     	= log evidence value from the INS mode
logZerr						     	= error on log evidence value
context							not required by MultiNest, any additional information user wants to pass

The 2D arrays are Fortran arrays which are different to C/C++ arrays. In the example dumper routine provided with C & C++
eggbox examples, the Fortran arrays are copied on to C/C++ arrays.

---------------------------------------------------------------------------

Tranformation from hypercube to physical parameters:

MultiNest native space is unit hyper-cube in which all the parameter are uniformly distributed in [0, 1]. User
is required to transform the hypercube parameters to physical parameters. This transformation is described in
Sec 5.1 of arXiv:0809.3437. The routines to tranform hypercube parameters to most commonly used priors are
provided in module priors (in file priors.f90).

---------------------------------------------------------------------------

Checkpointing:

MultiNest is able to checkpoint. It creates [root]resume.dat file & stores information in it after every
updInt iterations to checkpoint, where updInt is set by the user. If you don't want to resume your program from
the last run run then make sure that you either delete [root]resume.dat file or set the parameter resume to F
before starting the sampling.

---------------------------------------------------------------------------

Periodic Boundary Conditions:

In order to sample from parameters with periodic boundary conditions (or wraparound parameters), set pWrap[i],
where i is the index of the parameter to be wraparound, to a non-zero value. If pWrap[i] = 0, then the ith
parameter is not wraparound.

---------------------------------------------------------------------------

Constant Efficiency Mode:

If ceff is set to T, then the enlargement factor of the bounding ellipsoids are tuned so that the sampling
efficiency is as close to the target efficiency (set by efr) as possible. This does mean however, that the
evidence value may not be accurate.

---------------------------------------------------------------------------

Sampling Parameters:

The recommended paramter values to be used with MultiNest are described below. For detailed description please
refer to the paper arXiv:0809.3437

nPar: 
Total no. of parameters, should be equal to ndims in most cases but if you need to store some additional
parameters with the actual parameters then you need to pass them through the likelihood routine.


efr:
defines the sampling efficiency. 0.8 and 0.3 are recommended for parameter estimation & evidence evalutaion
respectively.


tol:
A value of 0.5 should give good enough accuracy.

           
nCdims:
If mmodal is T, MultiNest will attempt to separate out the modes. Mode separation is done through a clustering
algorithm. Mode separation can be done on all the parameters (in which case nCdims should be set to ndims) & it
can also be done on a subset of parameters (in which case nCdims < ndims) which might be advantageous as
clustering is less accurate as the dimensionality increases. If nCdims < ndims then mode separation is done on
the first nCdims parameters.


Ztol:
If mmodal is T, MultiNest can find multiple modes & also specify which samples belong to which mode. It might be
desirable to have separate samples & mode statistics for modes with local log-evidence value greater than a
particular value in which case Ztol should be set to that value. If there isn't any particularly interesting
Ztol value, then Ztol should be set to a very large negative number (e.g. -1.d90).

---------------------------------------------------------------------------

Progress Monitoring:

MultiNest produces [root]physlive.dat & [root]ev.dat files after every updInt iterations which can be used to
monitor the progress. The format & contents of  these two files are as follows:

[root]physlive.dat:
This file contains the current set of live points. It has nPar+2 columns. The first nPar columns are the ndim
parameter values along with the (nPar-ndim)  additional parameters that are being passed by the likelihood
routine for MultiNest to save along with the ndim parameters. The nPar+1 column is the log-likelihood value &
the last column is the node no. (used for clustering).

[root]ev.dat:
This file contains the set of rejected points. It has nPar+3 columns. The first nPar columns are the ndim
parameter values along with the (nPar-ndim)  additional parameters that are being passed by the likelihood
routine for MultiNest to save along with the ndim parameters. The nPar+1 column is the log-likelihood value,
nPar+2 column is the log(prior mass) & the last column  is the node no. (used for clustering).

---------------------------------------------------------------------------

Posterior Files:

These files are created after every updInt*10 iterations of the algorithm & at the end of sampling.

MultiNest will produce five posterior sample files in the root, given by the user, as following


[root].txt
Compatable with getdist with 2+nPar columns. Columns have sample probability, -2*loglikehood, parameter values. 
Sample probability is the sample prior mass multiplied by its likelihood & normalized by the evidence.


[root]post_separate.dat
This file is only created if mmodal is set to T. Posterior samples for modes with local log-evidence value
greater than Ztol, separated by 2 blank lines. Format is the same as [root].txt file.


[root]stats.dat
Contains the global log-evidence, its error & local log-evidence with error & parameter means & standard
deviations as well as the  best fit & MAP parameters of each of the mode found with local log-evidence > Ztol.


[root]post_equal_weights.dat
Contains the equally weighted posterior samples. Columns have parameter values followed by loglike value.


[root]summary.txt
There are nmode+1 (nmode = number of modes) rows in this file. First row has the statistics for the global 
posterior. After the first line there is one row per mode with nPar*4+2 values in each line in this file. 
Each row has the following values in its column mean parameter values, standard deviations of the parameters, 
bestfit (maxlike) parameter values, MAP (maximum-a-posteriori) parameter values, local log-evidence, maximum 
loglike value. If IS = T (i.e. INS being used), first row has an additional value right at the end, INS 
log-evidence estimate.

---------------------------------------------------------------------------

INS Output Files:

In INS mode (when IS = T), MultiNest will produce 3 additional binary files: [root]IS.iterinfo, IS.points, IS.ptprob
These files are used for resuming job with IS (Importance Nested sampling) mode set to T. They can be quite
large and can be deleted once the job has finished.

---------------------------------------------------------------------------

C/C++ Interface:

Starting in MultiNest v2.18 there is a common C/C++ interface to MultiNest.  The file is located in the 
"includes" directory.  Examples of how it may be used are found in the example_eggboxC and 
example_eggboxC++ directories.

You may need to check the symbol table for your platform (nm libmulitnest.a | grep nestrun) and edit the 
multinest.h file to define NESTRUN.  Please let us know of any modifications made so they may be included
in future releases.


---------------------------------------------------------------------------

Python Interface:

Johannes Buchner has written an easy-to-use Python interface to MultiNest called PyMultiNest which 
provides integration with existing scientific Python code (numpy, scipy). It allows you to write Prior & 
likelihood functions in Python. It is available from:

https://github.com/JohannesBuchner/PyMultiNest


---------------------------------------------------------------------------

R Interface:

Johannes Buchner has written an R bridge for MultiNest called RMultiNest. It allows likelihood functions 
written in R (http://www.r-project.org) to be used by MultiNest. It is available from:

https://github.com/JohannesBuchner/RMultiNest


---------------------------------------------------------------------------

Matlab version of MultiNest:

Matthew Pitkin and Joe Romano have created a simple Matlab version of MultiNest. It doesn't have all the 
bells-and-whistles of the full implementation and just uses the basic Algorithm 1 from arXiv:0809.3437. It
is available for download the MultiNest website:

http://ccpforge.cse.rl.ac.uk/gf/project/multinest/


---------------------------------------------------------------------------

Visualization of MultiNest Output:

[root].txt file created by MultiNest is compatable with the format required by getdist package which is 
part of CosmoMC package. Refer to the following website in order to download or get more information about
getdist:
http://cosmologist.info/cosmomc/readme.html#Analysing


Johannes Buchner's PyMultiNest can also be used on exisiting MultiNest output to plot & visualize results.
https://github.com/JohannesBuchner/PyMultiNest


---------------------------------------------------------------------------

Toy Problems

There are 8 toy programs included with MultiNest.

example_obj_detect: The object detection problem discussed in arXiv:0704.3704. The positions, amplitudes &
widths of the Gaussian objects can be modified through params.f90 file. Sampling parameters are also set in
params.f90.

example_gauss_shell: The Gaussian shells problem discussed in arXiv:0809.3437. The dimensionality, positions and
thickness of these shells can be modified through params.f90 file. Sampling parameters are also set in
params.f90.

example_gaussian: The Gaussian shells problem discussed in arXiv:1001.0719. The dimensionality of the problem can 
be modified through params.f90 file. Sampling parameters are also set in params.f90.

example_eggboxC/example_eggboxC++: The C/C++ interface includes the egg box problem discussed in arXiv:0809.3437. 
The toy problem and sampling parameters are set in eggbox.c & eggbox.cc files.

example_ackley: The Ackley mimimization problem (see T. Bäck, Evolutionary Algorithms in Theory and Practice, 
Oxford University Press, New York (1996).)

example_himmelblau: The Himmelblau's minimization problem. (see http://en.wikipedia.org/wiki/Himmelblau's_function)

example_rosenbrock: Rosenbrock minimization problem. (see http://en.wikipedia.org/wiki/Rosenbrock_function)

example_gaussian: Multivariate Gaussian with uncorrelated paramters.

---------------------------------------------------------------------------

Common Problems & FAQs:


1. MultiNest crashes after segmentation fault.

Try increasing the stack size (ulimit -s unlimited on Linux) & resume your job.


2. Output files (.txt & post_equal_weights.dat) files have very few (of order tens) points.

If tol is set to a reasonable value (tol <= 1) & the job hasn't finished then it is possible for these files to 
have very few points in them. If even after the completion of job these files have very few points then increase the stack
size (ulimit -s unlimited on Linux) & resume the job again.


3. Not all modes are being reported in the stats file.

stats file reports all the modes with local log-evidence value greater than Ztol. Set Ztol to a very large 
negative number (e.g. -1.d90) in order for the stats file to report all the modes found.


4. Compilation fails with error something like 'can not find nestrun function'.

Check the symbol table for your platform (nm libnest3.a | grep nestrun) & edit multinest.h file to define NESTRUN
appropriately.


5. When to use the constant efficiency mode?

If the sampling efficiency with the standard MultiNest (when ceff = F) is very low & no. of free parameters is relatively 
high (roughly greater than 30) then constant efficiency mode can be used to get higher sampling efficiency. Users should use
ask for around 5% or lower targer efficiency (efr <= 0.05) in constant efficiency mode & ensure that posteriors are stable 
when a slightly lower value is used for target efficiency. Evidence values obtained with constant efficiency mode will not be
accurate in the default mode (IS = F). With INS (IS = T) accurate evidences can be calculated (read arXiv:1306.2144) for more
details.

---------------------------------------------------------------------------

Importance Nested Sampling:


Sampling Parameters:

INS can calculate an order of magnitude more accurate evidence values than the vanilla nested sampling, for the same number
of live points & target efficiency (efr). INS can also calculate reasonably accurate evidence values in constant efficiency
mode (when ceff = T). However, users should always check that the posterior distributions & evidence values are stable for
reasonable change in these sampling parameters. Potentially users can use fewer live points &/or higher target efficiency
(efr) with INS to get the same level of accuracy on evidence & therefore analysis can be sped-up.


Quoted Error on Log-Evidence:

The quoted error on log-evidence in INS mode is generally an overestimate. Users should use the quoted error on log-evidence
from vanilla nested sampling as an upper bound on the error on INS log-evidence.


Output Files:

In INS mode, MultiNest will produce 3 additional binary output files (see output files section) in which it will record 
information about all the points it has collected. These files can be quite large. It is recommended that the user sets 
updint parameter (which sets the number of iterations after which posterior files are updated) to 1000 or more otherwise 
a lot of time might be wasted in writing these large files. These binary files are only used while resuming jobs & can be
deleted once the job has finished.


Memory Requirements:

INS requires a lot more memory than the defult mode (when IS = F). With nlive <= 1000, memory requirements are feasible on
modern computers but with nlive >= 4000, segfaults may results due to not enough memory available.


Multi-Modal Mode:

Multi-modal mode is not supported yet when IS = T. If mmodal = T along with IS = T, MultiNest will set mmodal = F.

---------------------------------------------------------------------------