An adaptation for python 3. ___________________PyCOOL README____________________ PyCOOL v. 0.997300203937 Copyright (C) 2011 Jani Sainio <jani.sainio at utu.fi> Distributed under the terms of the GNU General Public License http://www.gnu.org/licenses/old-licenses/gpl-2.0.txt Please cite http://arxiv.org/abs/1201.5029 if you use this code in your research. See also http://www.physics.utu.fi/tiedostot/theory/particlecosmology/pycool/ I have done my best to add comments and info texts into different functions in the Python and CUDA codes in order to make the program more understandable. If you still have questions please email me or post them at https://github.com/jtksai/PyCOOL/ . Please submit any errors at https://github.com/jtksai/PyCOOL/issues . ------------------------------------------------------ 1 Introduction PyCOOL is a Python + CUDA program that solves the evolution of interacting scalar fields in an expanding universe. PyCOOL uses modern GPUs to solve this evolution and to make the computation much faster. See http://arxiv.org/abs/0911.5692 for more information on the GPU algorithm. The program has been written in the Python language in order to make it really easy to adapt the code to different scalar field models. What this means is that in an ideal situation only a list of potential functions in a model file has to be changed to study a new preheating model. Everything else should be done automatically. Please see the included model files for guidance on how to write a model file. Files included: - Python files: - calc_spect.pyx - f_init.pyx - field_init.py - lattice.py - main_program.py - misc_functions.py - procedures.py - setup.py - solvers.py - spectrum.py - symp_integrator.py - CUDA and template files: - gpu_3dconv.cu - kernel_gws.cu - kernel_gws_new.cu - kernel_H2.cu - kernel_H3.cu - kernel_H3_new.cu - kernel_k2.cu - kernel_linear_evo.cu - pd_kernel.cu - rho_pres.cu - spatial_corr.cu - Model files: - AD.py - AD2.py - chaotic.py - chaotic_massless.py - curvaton.py - curvaton_si.py - curvaton_single.py - oscillon.py - q_ball.py - README-file - compile_libs script ------------------------------------------------------ 2 Installation and dependencies PyCOOL naturally needs CUDA drivers installed into your machine and preferably a fast NVIDIA GPU (GTX 470 and Tesla C2050 cards tested) in order to achieve the speed advantage it can give. PyCOOL is built on PyCUDA http://mathema.tician.de/software/pycuda which needs to be installed. The data output also uses Pyvisfile http://mathema.tician.de/software/pyvisfile to write SILO files that are visualized in LLNL VisIt program. Short description of the installation process: It is highly recommended to use SILO files for the output. HDF5 files are supported in theory but they have not been tested properly and the program writes much more data in the SILO files. SILO can be downloaded from https://wci.llnl.gov/codes/silo/downloads.html and installed with the given instructions. The easiest way to install PyCUDA and Pyvisfile is to use git and the following commands: git clone http://git.tiker.net/trees/pycuda.git git clone http://git.tiker.net/trees/pyublas.git git clone http://git.tiker.net/trees/pyvisfile.git which will create new folders and download the necessary files. Before proceeding it is recommended to create .aksetup-defaults.py file in your $HOME folder or in /etc folder (in Linux). I have included an example of an .aksetup-defaults.py file at end of this README. The next step is to install the python libraries by running python setup.py build sudo python setup.py install in the downloaded package folders. Further PyCUDA install instructions are available in http://wiki.tiker.net/PyCuda/Installation . The field initialization also uses pyfftw package to calculate DST-II. This can be downloaded from https://launchpad.net/pyfftw . Note that there was an error in older version of pyfftw that caused an error in DST-II calculations https://bugs.launchpad.net/pyfftw/+bug/585809 . In newer version this is fixed. The analytical calculations needed when calculating field potentials and potential derivatives are done with SymPy http://sympy.org/ . This can be installed with easy_install or downloaded from https://github.com/sympy/sympy . PyCOOL uses then textual templating to write the necessary CUDA files. This is done with jinja2 library http://jinja.pocoo.org/docs/ . 'easy_install jinja2' should install this library. The spectrum calculations are done with Cython to speed the calculations (~200 times compared to ordinary Python code). Therefore Cython has to be also installed. This can done for example with 'sudo easy_install cython' in Linux. Finally run the compile_libs script that runs 'python setup.py build_ext --inplace' to make the calc_spect.pyx into a shared library file that the PyCOOL can use to calculate the spectra. After all these necessary steps the program is ready to be used. In summary, install - Numpy - Scipy - SymPy - SILO libraries - PyCUDA (and CUDA) - Pyublas - Pyvisfile - pyfftw - jinja2 - VisIt Then run compile_libs script to build the shared library file for the spectrum calculations. Current version has been tested in Debian Squeeze and Ubuntu 10.04 but it should work also in other operating systems. ------------------------------------------------------ 3 Running Typing 'python main_program.py' runs the code for a specific model that is imported from the models directory. The current program prints to the screen information of current step number, scale factor, Hubble parameter, physical time and the numerical error. How frequently these are calculated is determined by flush and flush_hom parameters in the model file. The initial field perturbations are created with the method that was also used in Defrost. The main program is divided into - Homogeneous system solver - Non-linear solver that includes a linearized perturbation solver The program creates a new folder by time stamp in the data folder. For large lattices and frequent data writing the generated data can be several hundred gigabytes if the fields are also stored. The created folder will also include a simple info file that tells basic information about the model used to generate the data. These include the initial values and the potential functions. During a simulation run PyCOOL can also calculate various spectra used previously in LatticeEasy and Defrost that are written into the generated SILO files. In VisIt these are available under Add -> Curve. See section 5 for details. ------------------------------------------------------ 4 Running your own models The code has been tested with the included models. In order to simulate your own model create a model file in models folder that has the model object and then import this python file in main_program.py. Possible variables to change/consider include: - n controls the number of points per side of lattice - L is the length of the side of the lattice - mpl is the reduced Planck mass - Mpl is the unreduced Planck mass - number of fields - initial field values - field potential - flush_freq determines how frequently to calculate the selected postprocessing functions and to write data to disk if saveQ is True. We've done our best to test the functionality of the program with different potential functions. It might however fail to write CUDA compatible code in some cases. In this case the code in lattice.py and misc_functions.py should be studied carefully to understand why the program failed. One cause of errors is exponentiation e.g. terms of the form f**n. Currently the program uses format_to_cuda function (found in misc_functions.py) to write these terms open in to form f**n = f*f*···*f. The code is also able to write powers of functions into suitable CUDA form, e.g. Cos(f1)**n = Cos(f1)*Cos(f1)*...*Cos(f1). User has to include the function into a power_list in the used model file. ------------------------------------------------------ 5 Output and Post-processing functions PyCOOL writes variety of variables into the Silo files during run time. Which of these and how often to write are determined in the scalar field model object. Note that the different spectra are calculated on the CPU and therefore the data must be transferred from the device to the host memory. This can significantly hinder the performance of the code. The variables include: - scale factor a - Hubble parameter H - Comoving horizon in units of lattice length 1/(a*H)/L - field f - canonical momentum of field pi - energy density rho - fractional energy densities of fields rho_field(n)/rho_total (omega_field(n) in Silo output) - fractional energy density of the interaction term between fields (omega_int in Silo output) - Absolute and relative numerical errors - spatial correlation length of the total energy density (l_p in Silo output) (See Defrost paper for the definition.) When solving homogeneous equations for the fields the program writes - scale factor a - Hubble parameter H - Comoving horizon in units of lattice length 1/(a*H)/L - homogeneous field f - homogeneous canonical momentum of field pi - energy density rho These are labeled by adding '_hom' at the end of the variable name in Silo output. In addition the program writes - field_i as a function field_j where i and j label the different fields. This can be used to see if the (field_i(t),field_j(t)) space is confined to some area. PyCOOL has also various post-processing functions. These include: - field spectrum (S_k in Silo output) - number density spectrum (n_k in Silo output) - energy density spectrum (rho_k in Silo output) - effective mass of the field(s) (m_eff in Silo output) - comoving number density (n_cov in Silo output) - the fraction of particles in relativistic regime (n_rel in Silo output) - empirical CDF and PDF from energy densities (rho_CDF and rho_PDF respectively) (This is still experimental!) - skewness and kurtosis of the scalar field(s) (field(n)_skew and field(n)_kurt in Silo output) Most of these functions are defined/used in 'Dmitry I. Podolsky et al. : Equation of state and Beginning of Thermalization After Preheating' http://arxiv.org/abs/hep-ph/0507096v1 . N.B. These functions have been tested but not thoroughly. If you notice that the output is clearly wrong please submit this to the issues page at github and/or create a functioning fork. N.B.2 Currently PyCOOL uses effective momenta when calculating the number spectra and this can't be turned off without modifying lattice.py file. ------------------------------------------------------ 6 Curvature perturbation \zeta To calculate the curvature perturbation \zeta zetaQ should be set to True in the model file. It is also recommended to switch off all the post-processing functions and saveQ should be set to False (See for example curvaton_single.py in models folder). The number of simulations to run with the same initial values is controlled by sim_num parameter in the used model object. superfolderQ term should be set to True so that all the curvature perturbation simulation data is stored in a same folder. The name of this folder is controlled by superfolder parameter in the model file. The range of the homogeneous initial field values over which the calculations are performed is determined by fields0 list in the main_program.py. This uses delta_f10 parameter from the model which has been included only with the curvaton model files. How this parameter is calculated has been explained in 'Non-Gaussianity from resonant curvaton decay' http://arxiv.org/abs/0909.4535v3 on page 13 starting from equation (55). N.B. The currently used method has been applied only to the curvaton model. Therefore when applying this to a different model the way \zeta is calculated might be quite different. The relevant files that might have to be edited are main_program.py (f0list and r_decay), procedures.py in postprocess folder (calc_zeta function) and solvers.py (run_non_linear function starting from line 154 onwards). N.B.2 If for some reason the calculation of curvature perturbation zeta has terminated before completion it is possible to continue the calculations from the last completed homogeneous initial value onwards. All the completed values should be removed from f0list (Check info.txt files from the data folders for the last values) and after the remaining f0_list values have been also computed zeta_data_from_file function in the postprocess procedures can read the necessary data from zeta_data.csv files stored in the used superfolder. ------------------------------------------------------ 7 Tensor perturbations Tensor perturbations are solved with the method presented in 'A Gravitational Wave Background from Reheating after Hybrid Inflation' http://arxiv.org/abs/0707.0839v2 meaning that the tensor components are solved separately and the transverse-traceless part is extracted only when needed. Gravitational waves are solved if self.gwsQ == True in the model file. Current output of the program is the gravitational wave spectrum (gw_spectrum) and the fractional energy density of the gravitational waves (omega_gw). N.B. Note that this method has not been tested thoroughly. Especially the used projector operators should be verified. (These are in postprocess/procedures.py and calc_spect.pyx files.) ------------------------------------------------------ 8 Different dicretizations PyCOOL has three different dicretizations implemented: the one presented in DEFROST, the one implemented in Latticeeasy and the one used in HLattice. The last two of these use an identical CUDA implementation and are also more suitable for a multi-GPU implementation. The used stencil can be currently chosen with the self.discQ parameter in the model file by setting it equal to 'defrost', 'latticeeasy' or 'hlattice'. !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! N.B. Note that currently the 'hlattice' discretization leads to higher errors than the 'Defrost' stencil. Haven't figured what's causing this (could be anisotropic errors?). In simulations this also causes very clear artifacts in the fields. Therefore it is recommended to use 'Defrost' stencil. !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ------------------------------------------------------ 9 To Do list/Improvements - Multi-GPU support should be included. This might however take some work due to the periodicity of the lattice. - OpenCL support would allow to run the simulations also on AMD Radeon cards. Kernel functions would have to be however rewritten (completely?) and also tested on Radeon cards. This might take a long time and a lot of work. ------------------------------------------------------ This is an example of an .aksetup-defaults.py file. Remember to modify this to your systems settings. .aksetup-defaults.py: ------------------------------------------------------ BOOST_COMPILER = 'gcc43' BOOST_INC_DIR = ['/usr/include'] BOOST_LIB_DIR = ['/usr/lib'] BOOST_PYTHON_LIBNAME = ['boost_python-mt-py26'] BOOST_THREAD_LIBNAME = ['boost_thread-mt'] CUDADRV_LIBNAME = ['cuda'] CUDADRV_LIB_DIR = ['/usr/lib'] CUDA_ENABLE_GL = False CUDA_ROOT = '/usr/local/cuda' CUDA_TRACE = False CXXFLAGS = [] LDFLAGS = [] SILO_INC_DIR = ['/usr/local/silo/include'] SILO_LIBNAME = ['silo'] SILO_LIB_DIR = ['/usr/local/silo/lib'] USE_SHIPPED_BOOST = True USE_SILO = True ------------------------------------------------------ Jani Sainio jani.sainio@utu.fi http://www.physics.utu.fi/tiedostot/theory/particlecosmology/pycool/ Department of Physics and Astronomy University of Turku