/openQCD_force-gradient

adapted version of the openQCD package (based on version 2.4) enabling the use of Hessian-free force-gradient integrators

Primary LanguageCGNU General Public License v3.0GPL-3.0

********************************************************************************

                        openQCD Simulation Programs

     extended version including Hessian-free force-gradient integrators

********************************************************************************

INTEGRATORS FOR MOLECULAR-DYNAMICS EQUATIONS

This adapted version of the openQCD 2.4 code allows for nested hierarchical integrators 
for the molecular-dynamics equations, based on any combination of Hessian-free 
force-gradient integrators introduced in <https://doi.org/10.48550/arXiv.2403.10370>. 

This also includes the non-gradient algorithms of Omelyan-Mryglod-Folk (OMF) up to 6th order
<https://doi.org/10.1016/S0010-4655(02)00754-3>. 

The current coefficient sets are minimizing the (weighted) norm of the leading error coefficients. 
The investigations have been made for Hessian-free force-gradient integrators on a single level, 
similar to the investigations made in the aforementioned paper by Omelyan, Mryglod ad Folk. 
For nested integrators, the optimization of the degrees of freedoms will most likely result in 
different coefficients and is part of future research.


CHANGES TO THE VERSION 2.4

Compared to the openQCD package v2.4, the following files have been changed in order to enable 
the use of Hessian-free force-gradient integrators:
/include/flags.h
/include/update.h
/modules/flags/mdint_parms.c
/modules/forces/force0.c
/modules/update/mdint.c
/modules/update/mdsteps.c
/modules/update/wsize.c
/modules/update/update.c

Moreover, the check functions in /devel/update have been adapted so that the check functions 
check1, check2 and check3 make use of the added Hessian-free force-gradient integrators.


HOW TO USE THE NEW INTEGRATORS?

The syntax of the input parameter files and the meaning of the various parameters did not change 
and is described in some detail in main/README.infiles and doc/parms.pdf. To use the new 
integrators in a nested integrator, one just has to change the "integrator" parameter in the 
input parameter file. The integrators are named according to their abbreviations in 
[<https://doi.org/10.48550/arXiv.2403.10370>,Table 1]. For example, "BADAB" is the frequently used 
five-stage Hessian-free force-gradient integrator with just one approximated gradient evaluation.
The idea of approximating the force-gradient term has been introduced for this particular 
integrator in <https://doi.org/10.48550/arXiv.1111.5059> and a nested integrator based on this 
Hessian-free force-gradient integrator has already been applied to the two-dimensional Schwinger 
model, see <https://doi.org/10.4208/cicp.OA-2016-0048>.


LATTICE THEORY

Currently the common features of the supported lattice theories are the
following:

* 4-dimensional hypercubic N0xN1xN2xN3 lattice with even sizes N0,N1,N2,N3.
  Open, Schrödinger functional (SF), open-SF or periodic boundary conditions
  in the time direction. Periodic boundary conditions in the space directions.

* SU(3) gauge group, plaquette plus planar double-plaquette gauge action
  (Wilson, Symanzik, Iwasaki,...).

* O(a)-improved Wilson quarks in the fundamental representation of the gauge
  group. Among the supported quark multiplets are the classical ones (pure
  gauge, two-flavour theory, 2+1 and 2+1+1 flavour QCD), but doublets with a
  twisted mass and theories with many doublets, for example, are also
  supported.

The O(a)-improvement includes the boundary counterterms required for the
improvement of the correlation functions near the boundaries of the lattice
in the time direction if open, SF or open-SF boundary conditions are chosen.
Either the traditional implementation of the O(a) Pauli term in the quark
action or an "exponential" variant (which offers some technical advantages)
may be used. For the quark fields phase-periodic boundary conditions in the
space directions are implemented too.


SIMULATION ALGORITHM

The simulation programs are based on the HMC or the SMD (Stochastic Molecular
Dynamics) algorithm. For the strange and the heavier quarks, a rational
approximation of the exact pseudo-fermion action is used as in the RHMC
algorithm.

Several advanced techniques are implemented that can be configured at run
time:

* Nested hierarchical integrators for the molecular-dynamics equations, based
  on any combination of the leapfrog, 2nd order Omelyan-Mryglod-Folk (OMF) and
  4th order OMF elementary integrators, are supported.

* Twisted-mass Hasenbusch frequency splitting, with any number of factors
  and twisted masses. Optionally with even-odd preconditioning.

* Twisted-mass determinant reweighting.

* Deflation acceleration and chronological solver along the molecular-dynamics
  trajectories.

* A choice of solvers (CGNE, MSCG, SAP+GCR, deflated SAP+GCR) for the Dirac
  equation, separately configurable for each force component and
  pseudo-fermion action.

* Parallel I/O of gauge, momentum and quark fields.

All of these depend on a number of parameters, whose values are passed to the
simulation program together with those of the action parameters (coupling
constants, quark masses, etc.) through a structured input parameter file.


MASTER-FIELD SIMULATIONS

Since version 2.0, openQCD includes various technical improvements (quadruple-
precision accumulation of large sums, uniform-norm solver stopping criterion,
etc.), which are expected to make master-field simulations of very large
lattices numerically safe.


PROGRAM FEATURES

All programs parallelize in 0,1,2,3 or 4 dimensions, depending on what is
specified at compilation time. They are highly optimized for machines with
current Intel and AMD processors, but will run correctly on any system that
complies with the ISO C89, MPI 1.2 and OpenMP 4.5 standards.

For the purpose of testing and code development, the programs can also be run
on a desktop or laptop computer. All what is needed for this is a compliant C
compiler supporting OpenMP and a local MPI installation such as Open MPI.


DOCUMENTATION

The simulation program has a modular form, with strict prototyping and a
minimal use of external variables. Each program file contains a small number
of externally accessible functions whose functionality is described at the top
of the file.

The data layout is explained in various README files and detailed instructions
are given on how to run the main programs. Further documentation, specifying
the normalization conventions and algorithms used, is included in the doc
directory.


IEEE 754 COMPLIANCE

The program is portable to a wide range of architectures. It assumes, however,
that the machine and the compiler support a 4 byte integer type, an 8 byte
integer type and the standard IEEE 754 floating-point types float (4 bytes)
and double (8 bytes). Moreover, the floating-point arithmetic performed by the
CPUs must comply with the IEEE 754 standard too and the default rounding rules
(rounding to nearest, ties to even) must be used.

These conditions are usually satisfied on machines with 64bit OS, but on x86
machines with 32bit OS some particular compiler options (such as the gcc
option -mfpmath=sse) may have to be set. The main programs check whether the
machine complies with the standard and abort the program with an informative
error message if not.


COMPILATION

The current version (2.4) of openQCD is the first using both MPI communication
functions and OpenMP parallel processing directives. The compilation of the
programs requires an ISO C89 compliant compiler supporting OpenMP 4.5 and a
compatible MPI installation complying with the MPI standard 1.2. The OpenMP
4.5 standard is supported since GCC version 7, for example.

In the main and devel directories, GNU-style Makefiles are included, which
compile and link the programs (type "make" to compile everything; "make clean"
removes the files generated by "make"). The compiler options can be set by
editing the CFLAGS line in the Makefiles.

The Makefiles assume that the following environment variables are set:

  GCC             GNU C compiler command [Example: /usr/bin/gcc].

  MPI_HOME        MPI home directory [Example: /usr/lib64/mpi/gcc/openmpi].
                  The MPI libraries are expected in $MPI_HOME/lib.

  MPI_INCLUDE     Directory where the mpi.h file is to be found.

All programs are then compiled and linked using the mpicc command.

The compiler command to be used for the compilation of the modules and the
link step can be changed by editing the CC and CLINKER lines in the Makefiles.
Independently of what is specified there, the GCC compiler is used to resolve
the dependencies on the include files.


SSE/AVX ACCELERATION

Current Intel and AMD processors are able to perform arithmetic operations on
short vectors of floating-point numbers in just one or two machine cycles,
using SSE and/or AVX instructions.

Many programs in the module directories include inline-assembly SSE and AVX
code. Inline assembly is a GCC extension of the C language that may not be
supported by other compilers. On 64bit systems the code can be activated by
setting the compiler flags -Dx64 or -DAVX, respectively (the option -DAVX
implies -Dx64).

The latest x86 processors furthermore support fused multiply-add (FMA3)
instructions. OpenQCD makes use of these if the option -DFMA3 is set in
addition to -DAVX (setting -DFMA3 alone has no effect).

In addition, SSE prefetch instructions will be used if one of the following
options is specified:

  -DP4     Assume that prefetch instructions fetch 128 bytes at a time
           (Pentium 4 and related Xeons).

  -DPM     Assume that prefetch instructions fetch 64 bytes at a time
           (Athlon, Opteron, Pentium M, Core, Core 2 and related Xeons).

  -DP3     Assume that prefetch instructions fetch 32 bytes at a time
           (Pentium III).

These options have an effect only if -Dx64 or -DAVX is set. If none of these
two options is set, the programs do not make use of any C language extensions
and are fully portable.

On recent x86-64 machines the recommended compiler flags are thus

  -std=c89 -O -mfpmath=sse -mno-avx -DAVX -DFMA3 -DPM

For older machines that only support the SSE3 instruction set, the recommended
flags are

  -std=c89 -O -mfpmath=sse -mno-avx -Dx64 -DPM

Aggressive optimization levels such as -O2 and -O3 tend to have little effect
on the execution speed of the programs, but the risk of generating wrong code
is higher.

AVX instructions and the option -mno-avx may not be known to old versions of
the GCC compiler, in which case one may be limited to SSE accelerations with
option string -std=c89 -O -Dx64 -DPM (or no acceleration at all).

If compilers other than GCC are used together with the option -Dx64 or -DAVX,
it is strongly recommended to verify the correctness of the compilation using
the check programs in the devel directory.


DEBUGGING FLAGS

For troubleshooting and parameter tuning, it may helpful to switch on some
debugging flags at compilation time. The simulation program then prints a
detailed report to the log file on the progress made in specified subprogram.

The available flags are:

-DCGNE_DBG         CGNE solver for the Dirac equation.

-DMSCG_DBG         MSCG solver for the Dirac equation.

-DFGCR_DBG         FGCR solver for the Dirac equation.

-DFGCR4VD_DBG      FGCR solver for the little Dirac equation.

-DGCR4V_DBG        GCR preconditioner for the little Dirac equation.

-DDFL_MODES_DBG    Deflation subspace generation.

-DFORCE_DBG        Quark forces in the molecular-dynamics evolution.

-DMDINT_DBG        Integration of the molecular-dynamics equations.

-DRWRAT_DBG        Computation of the rational function reweighting
                   factor.

-DIGNORE_ERRORS    Error messages are printed as usual, but the program
                   execution continues.


RUNNING A SIMULATION

The simulation programs reside in the directory "main". For each program,
there is a README file in this directory which describes the program
functionality and its parameters.

Before compilation, the lattice sizes and a few other basic parameters must be
set in the file include/global.h. Explanations of what these macros mean are
provided in the file main/REAME.global.

Running a simulation for the first time requires its parameters to be chosen,
which tends to be a non-trivial task. The syntax of the input parameter files
and the meaning of the various parameters is described in some detail in
main/README.infiles and doc/parms.pdf. Examples of valid parameter files are
contained in the directory main/examples.


OPENMP THREAD AFFINITY

The number of OpenMP threads per MPI process is determined at compilation time
by the parameters in the file include/global.h (see main/README.global). If
the number is greater than 1, the performance of the program is strongly
influenced by the mapping of the threads to the physical processing cores.

The best results are usually obtained if each thread gets its private core.
Moreover, the threads launched by a given MPI process should occupy a set of
cores with an efficiently shared cache. If possible, the association of thread
to core number should be static.


EXPORTED AND BLOCK-EXPORTED STORAGE

The field configurations generated in the course of a simulation can be saved
to disk in machine-independent formats (see main/README.io). Independently of
the machine endianness, the data are converted to little endian byte order
before they are written to disk.


AUTHORS

The initial release of the openQCD package was written by Martin Lüscher and
Stefan Schaefer. Support for Schrödinger functional boundary conditions was
added by John Bulava. Phase-periodic boundary conditions for the quark fields
were introduced by Isabel Campos and the implementation of the "exponential"
variant of the O(a) Pauli term was developed by Antonio Rago. Several modules
were taken over from the DD-HMC program tree, which includes contributions
from Luigi Del Debbio, Leonardo Giusti, Björn Leder and Filippo Palombi.
This adapted version enabling the use of Hessian-free force-gradient integrators 
is based on version 2.4 of the openQCD package and was written by Jacob Finkenrath
and Kevin Schäfers. 


ACKNOWLEDGEMENTS

In the course of the development of the openQCD code, many people suggested
corrections and improvements or tested preliminary versions of the programs.
The authors are particularly grateful to Isabel Campos, Dalibor Djukanovic,
Georg Engel, Patrick Fritzsch, Leonardo Giusti, Björn Leder, Daniel Mohler,
Carlos Pena and Hubert Simma for their communications and help. 


LICENSE

The software may be used under the terms of the GNU General Public Licence
(GPL).


BUG REPORTS

If a general bug in the openQCD package is discovered, please send a report 
to <Martin.Luescher@cern.ch>. If a bug in the adapted files is discovered, 
please send a report to <kschaefers@uni-wuppertal.de>.


ALTERNATIVE PACKAGES AND COMPLEMENTARY PROGRAMS

A public BG/Q version of openQCD that takes advantage of the machine-specific
features of IBM BlueGene/Q computers can be downloaded from
<http://hpc.desy.de/simlab/codes/>.

As explained in <https://arxiv.org/abs/1806.06043>, the most time-consuming
parts of the programs can be further accelerated using AVX-512 intrinsics on
machines supporting these instructions. A public code implementing this in
openQCD-1.6 can be downloaded from <https://github.com/sa2c/OpenQCD-AVX512>
and may work with later versions of openQCD too.

The openQCD programs currently do not support reweighting in the quark
masses, but a module providing this functionality can be downloaded from
<http://www-ai.math.uni-wuppertal.de/~leder/mrw/>.

Another extension of openQCD-1.6, available at <https://fastsum.gitlab.io/>,
supports simulations of anisotropic lattices and pseudo-fermion actions, where
the gauge-field variables are replaced by stout-smeared ones.

Full-fledged QCD simulation programs tend to have many adjustable parameters.
In the case of openQCD, most parameters are passed to the programs through a
human-readable structured file. Liam Keegan's sleek graphical editor for these
parameter files offers some guidance and complains when inconsistent parameter
values are entered (see <http://lkeegan.github.io/openQCD-input-file-editor>).