patchy-ecosterics

PATCHY ECOsystem cluSTER dynamICS: Investigating cluster dynamics in models of patchy ecosystems

Requirements

This repository requires installation of compilers/interpreters of the following programming languages, and the corresponding packages/modules:

Python: matplotlib, multiprocessing, numba, numpy, pickle, scikit-image, tqdm
R: ggplot, reticulate, spatialwarnings (from Genin et. al)

Models

This repository contains code to simulate the following models/automatons:

Static null model: A 2D lattice that is randomly assigned with 0's and 1's, based on some occupational probability 'p'. Used as benchmark for percolation transitions
Stochastic null model: A model that is initialized with some fractional occupancy 'f', and evolves in a completely random manner (0 -> 1, 1 -> 0) whilst hovering around the initialized fractional occupancy. Proposed in Kefi et. al. Used as control for cluster dynamics
Tricritical Directed Percolation (TDP): The central model of this project. Proposed in this paper. Initialized with two parameters, (p, q), the working of this model is decrbied by the following schematic from Sankaran et. al

Scanlon Model: This model is from Scanlon et. al. The only parameter in this model is rainfall. Ubiquitously produces power-law distribution in cluster sizes. Used in addition to gauge how cluster dynamics varies across models. This model is simulated using Markov chain Monte Carlo method, with the following transition rates:

Contact Process: Tricritical Directed Percolation with q = 0 is known as Contact Process. Infact, TDP is actually an extension of this model

Code for all models can be found inside the models folder. Each model has different versions of similar code that are intended for different purposes:

in_place_processing.py: For tracking cluster dynamics. After equilibriation, this script will pass consecutive landscapes that differ by a single update to cluster_dynamics.py
coarse_dynamics.py: Utilizes a difference map technique to approximate cluster dynamics across larger timescales
dumper.py: Returns the final density associated with initialized parameter value(s), averaged across all ensembles. Ideal for plotting phase transitions/diagrams
final_lattice.py: Saves all final lattices associated with initialized parameter value(s). Required for probing cluster size distribution
spanning_cluster.py: Returns the percolation probability associated with initialized parameter value(s). Basically, it is the fraction of ensembles that have a percolation/spanning cluster

Cluster Dynamics

Given two lattices that differ by a single update, cluster_dynamics.py returns the following information:

Size of the participating cluster(s)
Size of the resulting cluster(s)
Overall change associated with the process

The following schematic showcases all possible processes. The caption describes how the inferred data is used to obtain distribution of changes in cluster sizes, along with approximations to drift and diffusion:

How to run

The entire repository needs to be downloaded since none of the scripts are self-contained. All versions of all models are imported in the terminal.py file. Use this file to run jobs. The 'code_snippets' folder contains lines of code for specific purposes. A particular snippet has been explained below:

set_start_method("spawn")
num_simulations = cpu_count() - 1
p_values = [0.70, 0.71]
q = 0.25

for p in p_values:
    purge_data()
    print(f"\n---> Simulating p = {p} <---")
    file_string = str(p).replace('.', 'p')
    tricritical(p, q, num_simulations, save_series=False, save_cluster=True, calc_residue=True)
    compile_changes("tricritical", range(num_simulations), plot_name=file_string)
    plot_changes(file_string)

set_start_method("spawn"): required for proper parallelization
num_simulations = cpu_count() - 1: specifies the number of simulations that will be run parallely
p_values = [0.70, 0.71] and q = 0.25: we will simulate p = 0.7 and 0.71 for q = 0.25 (TDP)
purge_data(): required to get rid of transient data created by step 5 from previous simulation
file_string = str(p).replace('.', 'p'): when multiple realisations are being run (in this case, 2), then file_string allows the program to distinguish between code generated by distinct realisations
tricritical(p, q, num_simulations, save_series=False, save_cluster=True, calc_residue=True): simulates the TDP model using in_place_processing.py (hence, cluster dynamics will be tracked). Will not save time evolution of landscape, but will save cluster dynamics and also calculate residues (for deducing nature of noise). The saved files are dumped in the corresponding model's folder. If you intend to modify secondary model parameters like length of the landscape or time of the simulation/equilibration phase, then it can be done in the bottom of in_place_processing.py
compile_changes("tricritical", range(num_simulations), plot_name=file_string): Goes through the tricritical model folder and compiles relevant data into text files (prefixed with file_string) which is saved in the outputs folder.
plot_changes(file_string): Goes through the outputs folder and plots graphs from the text files beginning with file_string. The plots are saved in the outputs folder itself.

Interpret results

Many text files will be generated by compile_changes.py. Many figures will be generated be plot_changes.py. Although the figures are mostly self-explanatory, the text files are not. Hence, this is a description of all text files:

<prefix>_changes.txt: the first column depicts the change in cluster size. The second column denotes the number of times that has taken place during the simulation, across all ensembles
<prefix>_cluster_size_distribution.txt: the first column depicts the cluster size. The second column denotes the number of occurences of clusters of this size in the final iteration of the simulation, across all ensembles
<prefix>_cluster_ds.txt: the first column depicts the cluster size. The second and third column contain the approximated drift and diffusion values associated with this cluster size. The fourth column contains the number of samples which were used in the approximation
<prefix>_cluster_growth_probabilities.txt: cluster size vs probability that it has grown in the next frame, given that it has changed in size
<prefix>_cluster_processes.txt: Cluster size | Number of growths recorded | Number of decays recorded | Number of merges | Number of splits
<prefix>_densities.txt: Ensemble index vs final density
<prefix>_residue_info.txt: Contains information to reconstruct the histogram of residue, which describes the nature of noise in the system. Cluster size : (lower value) (higher value) : (frequency of occurrences from lower value to higher value)

Additional information

benchmarking folder: contains plots used to ascertain that the models have been correctly implemented
consistency_checks folder: contains scripts that try to reproduce the cluster size distribution using the approximated drift and diffusion terms of the stochastic differential equation
fitters folder: contains scripts used to fit power-law, truncated power-law and exponential distributions to cluster sizes as well as cluster dynamics
plotters folder: contains scripts to plot each figure in the manuscript
depth_first_clustering.py: when given a lattice, returns the cluster size distribution assuming Von-Neumann neighbourhood
organizer.py: groups text files and figures with same prefix into same folder
peer_data.py: allows the user to look at various aspects of simulation results. Useful for debugging