pyFoF

Python package to perform group finding in redshift surveys.

Primary language: Python. License: MIT.

Installation

Via Pip

Note that pyFoF requires Python >= 3.10. pyFoF is pip installable; simply run,

pip3 install FoFpy

Take care to spell the package name exactly as shown, including its capitalization, so that you do not install a similarly named package from another domain. Specifically, while the repository is named pyFoF, the deployed and installable package is named FoFpy. We are refactoring the repository to resolve this naming mismatch.

Via Source

To install pyFoF from source, first we download the repository using git,

git clone https://github.com/TrystanScottLambert/pyFoF.git

Then once we've cloned the repo, we will change directory into the pyFoF folder using,

cd pyFoF

Following this, we need only run,

pip3 install .

Remember to test your installation using the example provided in the sections below.

How pyFoF Works

pyFoF finds galaxy groups using a modified version of the friends-of-friends algorithm (Huchra et al., 1982). This is the same modified version that was used to create the 2MRS galaxy-group catalog (Lambert et al., 2020). It tries to correct two main shortcomings of the original friends-of-friends algorithm: the over-reliance on a single, ultimately unjustifiable, pair of linking lengths, and the fact that the order in which the algorithm is run changes the output.

The modified version runs the original algorithm multiple times, varying the linking lengths for each run, and averages the results using graph theory. This lets the algorithm build a group catalog from a region of parameter space instead of trying to justify any particular pair of linking lengths. Each run of the original algorithm produces a group catalog, and these catalogs are combined into one master graph in which each node represents a galaxy and each edge records how many times the two connected galaxies were found in the same group. Once this graph has been generated, all edges below a certain threshold are removed. The threshold is left to the user; Lambert et al. (2020) justified a cut of 0.5. Once the cut is done, the final group catalog is generated by finding all connected subgraphs of the main graph and keeping only groups with 3 or more members.
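The averaging step described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not pyFoF's actual implementation: the function name `stabilise_groups`, the toy catalogs, and the choice to normalise edge weights by the number of runs (so that a 0.5 cut means "grouped together in at least half of the runs") are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def stabilise_groups(catalogs, cutoff=0.5, min_members=3):
    """Combine several friends-of-friends runs into one group catalog.

    Hypothetical helper for illustration; not part of pyFoF.
    """
    n_runs = len(catalogs)

    # Count how many runs place each pair of galaxies in the same group.
    pair_counts = defaultdict(int)
    for groups in catalogs:
        for group in groups:
            for a, b in combinations(sorted(group), 2):
                pair_counts[(a, b)] += 1

    # Keep only edges whose normalised weight reaches the cutoff
    # (normalising by the number of runs is an assumption).
    adjacency = defaultdict(set)
    for (a, b), count in pair_counts.items():
        if count / n_runs >= cutoff:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Connected components of the pruned graph are the final groups.
    seen, final_groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        if len(component) >= min_members:
            final_groups.append(sorted(component))
    return final_groups


# Three toy runs with different linking lengths (galaxy ids are integers).
runs = [
    [{1, 2, 3}, {4, 5}],
    [{1, 2, 3, 4}, {5, 6}],
    [{1, 2, 3}, {4, 5, 6}],
]
result = stabilise_groups(runs, cutoff=0.5)
```

In this toy case galaxy 4 is linked to galaxies 1, 2 and 3 in only one run out of three, so those edges are cut, and `result` is `[[1, 2, 3], [4, 5, 6]]`.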

Example

First, let's import the package.

import pyFoF
from astropy.cosmology import FlatLambdaCDM

We will use the KIDS survey as an example. This file is saved as Kids_S_hemispec_no_dupes_updated.tbl and has the columns ra, dec, vel, and the WISE magnitudes, denoted as w1. A cosmology must be chosen before running the group finder. astropy provides a useful cosmology package, which pyFoF reads. Most of the time FlatLambdaCDM should be sufficient.

cosmo = FlatLambdaCDM(H0=70, Om0=0.3)

We have included a data_handling module to read in catalogs stored as FITS files or as standard IPAC tables. The minimum columns required are ra, dec, velocity, and magnitudes. The read_data function will create a data object whether or not these columns exist, but the program will not run without them.

from pyFoF.data_handling import read_data
INFILE = 'Kids_S_hemispec_no_dupes_updated.tbl'
data = read_data(INFILE)

We must create a "Survey" object, to which we pass the cosmology that was decided upon as well as the parameters of the Schechter function that the user wishes to use. If no Schechter parameters are given, then the default parameters suggested in Kochanek et al. (2001) are assumed: α = −1.02, M∗ = −24.2 mag, and Φ∗ = 0.42 × 10⁻² Mpc⁻³. The apparent-magnitude limit of the survey needs to be given. If this isn't set by the design of the survey, then either the faintest magnitude in the catalog can be used, or any other reasonable limit (such as the 3-sigma value of a Gaussian).
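For reference, the standard Schechter function expressed per unit absolute magnitude can be evaluated as in the sketch below. The function name `schechter_magnitude` is hypothetical and is not taken from pyFoF; how the package uses the function internally is not shown here. The parameter values in the usage line are the defaults quoted above.

```python
import math


def schechter_magnitude(abs_mag, alpha, m_star, phi_star):
    """Galaxy number density per unit absolute magnitude (Mpc^-3 mag^-1).

    Hypothetical helper for illustration; not part of pyFoF.
    """
    # Luminosity ratio L / L* implied by the magnitude difference.
    x = 10 ** (0.4 * (m_star - abs_mag))
    return 0.4 * math.log(10) * phi_star * x ** (alpha + 1) * math.exp(-x)


# Density at the characteristic magnitude M*, using the defaults quoted above.
density_at_m_star = schechter_magnitude(
    -24.2, alpha=-1.02, m_star=-24.2, phi_star=0.42e-2
)
```

Galaxies brighter than M∗ are suppressed exponentially, while alpha sets the slope at the faint end.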

from pyFoF.survey import Survey
KIDS = Survey(data, cosmo, apparent_mag_limit=17.6, alpha=-1.02, m_star=-24.2, phi_star=0.0108)

The survey object has a helper function to convert a column from redshift to cz (using the cosmology), in case the catalog provides redshift rather than velocity. Doing this will create a new column in the Survey object with the correct name.

KIDS.convert_z_into_cz('z_helio')

Once a survey has been created with the correct four columns, with the case-sensitive names ra, dec, vel, and mag, the program can be run:

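    from pyFoF.experiment import Experiment

    # Set the linking-length ranges, their maxima, the number of trials,
    # the final cutoff, and the survey to run on.
    run = Experiment(
        d0_initial=0.3, d0_final=0.8,
        v0_initial=100, v0_final=500,
        d_max=2., v_max=1000,
        n_trials=10, cutoff=0.5, survey=KIDS,
    )

    run.run()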

The parameters that must be passed are the initial and final linking lengths (d0 and v0), the maximum allowable values for the d0 and v0 linking lengths, the number of trials to run (note that these trials will be run in parallel), the final cutoff that is applied after averaging all the results, and the survey to run on.
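To make the idea of sampling a range of linking lengths concrete, the sketch below builds one (d0, v0) pair per trial. Evenly spaced values between the initial and final settings are an assumption made purely for illustration; the text above does not say how pyFoF spaces its trials, and the helper name `linking_length_grid` is hypothetical.

```python
def linking_length_grid(initial, final, n_trials):
    """Return n_trials evenly spaced values from initial to final.

    Hypothetical helper; even spacing is an assumption, not pyFoF's
    documented behaviour.
    """
    if n_trials == 1:
        return [initial]
    step = (final - initial) / (n_trials - 1)
    return [initial + i * step for i in range(n_trials)]


# One (d0, v0) pair per trial, using the values from the example above.
d0_values = linking_length_grid(0.3, 0.8, 10)
v0_values = linking_length_grid(100, 500, 10)
trial_settings = list(zip(d0_values, v0_values))
```

Each entry of `trial_settings` would correspond to one run of the original friends-of-friends algorithm, and the resulting catalogs are what get averaged.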

After this has been run, we can write the results as ASCII files. Two files will be produced: a galaxy catalog with the galaxy information, and a group catalog with the galaxy-group data. Importantly, the group catalog has a galaxy id column which can be used to find the associated members in the galaxy catalog.

run.write_all_catalogs(overwrite=True)

It's also possible to write out the edge data, which is very useful for visualizing how the updated algorithm works. The edge_data.txt file contains the id of one galaxy, the id of another galaxy, and the weight between them.

run.write_edge_data()
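A minimal sketch of reading such edge data back in is shown below. It assumes three whitespace-separated columns per line and integer galaxy ids; the exact layout of edge_data.txt is not specified above, so check the file before relying on this. The sample text and the helper `read_edges` are hypothetical.

```python
import io

# Hypothetical sample standing in for the contents of edge_data.txt.
sample = io.StringIO("101 102 0.9\n101 103 0.4\n102 103 0.7\n")


def read_edges(handle):
    """Parse lines of 'id_a id_b weight' into a list of tuples.

    Assumes whitespace-separated columns and integer ids.
    """
    edges = []
    for line in handle:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        edges.append((int(parts[0]), int(parts[1]), float(parts[2])))
    return edges


edges = read_edges(sample)

# Edges that would survive a 0.5 cut.
strong_edges = [(a, b) for a, b, weight in edges if weight >= 0.5]
```

With a real file, replace `sample` with `open('edge_data.txt')`.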