The code is based on the methodology presented in 'A test for partial correlation with censored astronomical data', Akritas & Siebert, MNRAS, 278, 919 (1996).
The idea is as follows. Suppose you have measurements of two variables, X and Y, which correlate well with each other. However, both of them also correlate with a third variable Z, which you have also measured. How can you be sure that the correlation you see between X and Y is not actually driven by Z?
One important astronomical example is when you are studying the correlation between luminosities at different bands—say X-rays and radio—for a sample of sources. The "hidden variable" Z in this case is the luminosity distance dL, which you used to convert from fluxes to luminosities.
This statistical test computes the p-value, Pnull, for the null hypothesis of no correlation between X and Y once the effect of Z is taken into account. If Pnull is high, you cannot reject the null hypothesis: the X-Y correlation you see may simply be caused by both variables depending on Z.
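To make the idea concrete, here is a minimal sketch for the simple case where every point is a detection. It uses the classical partial Kendall's tau and is not what cens_tau does internally (the Fortran code also handles upper limits and gives you Pnull). The function names and the mock data are made up for this illustration.

```python
import math
import random

def kendall_tau(a, b):
    """Kendall's tau-a for two equal-length sequences (no censoring)."""
    n = len(a)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (a[i] - a[j]) * (b[i] - b[j])
            # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

def partial_kendall_tau(x, y, z):
    """Classical partial Kendall's tau of X and Y with Z held fixed."""
    t_xy = kendall_tau(x, y)
    t_xz = kendall_tau(x, z)
    t_yz = kendall_tau(y, z)
    return (t_xy - t_xz * t_yz) / math.sqrt((1 - t_xz**2) * (1 - t_yz**2))

# Mock data (assumption): X and Y depend only on Z plus independent noise
random.seed(0)
z = [random.gauss(0, 1) for _ in range(200)]
x = [zi + random.gauss(0, 0.5) for zi in z]
y = [zi + random.gauss(0, 0.5) for zi in z]

tau_xy = kendall_tau(x, y)
tau_xy_z = partial_kendall_tau(x, y, z)
print("tau(X,Y)     =", round(tau_xy, 3))
print("tau(X,Y | Z) =", round(tau_xy_z, 3))
```

The raw X-Y correlation is strong, but it should drop markedly once Z is taken into account, which is the kind of situation this test is meant to catch.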
Make sure you have a Fortran (sorry) compiler such as gfortran or pgfortran. This code was originally written in 1995, so be understanding.
Compile it with the command
gfortran -O cens_tau.f -o cens_tau
or by running
./make.sh
1. Put your data in an ASCII file with the following structure (the header line with the column names shown below does not need to be in the file):
col1 col2 col3 col4 col5 col6
X censX Y censY Z censZ
- X: independent variable
- Y: dependent variable
- Z: test variable
- censX, censY, censZ: integer which is 1 if X/Y/Z is a detection or 0 if it is an upper limit
The following Python snippet can be useful. Suppose each of your variables is stored in a numpy array. To create an ASCII file with the appropriate structure to be processed by cens_tau, use the following commands:
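import numpy
# "censored tag" arrays if all your data points are detections (1 = detection, 0 = upper limit)
censX = numpy.ones_like(X, dtype=int)
censY = numpy.ones_like(Y, dtype=int)
censZ = numpy.ones_like(Z, dtype=int)
numpy.savetxt(fileout, numpy.transpose((X, censX, Y, censY, Z, censZ)), fmt='%10.4f %i %10.4f %i %10.4f %i')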
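If some of your points are upper limits, the censoring flags have to be built per point. Below is a minimal sketch of one way to do it with plain Python. The helper write_cens_tau_input, the boolean "is upper limit" masks and the numbers are all hypothetical, and I am assuming that for an upper limit you store the limit itself in the value column.

```python
import os
import tempfile

def write_cens_tau_input(path, x, y, z, x_upper=None, y_upper=None, z_upper=None):
    """Write rows 'X censX Y censY Z censZ' (1 = detection, 0 = upper limit)."""
    n = len(x)
    if not (len(y) == n and len(z) == n):
        raise ValueError("X, Y and Z must have the same length")

    def flags(upper):
        # No mask given: treat every point as a detection
        if upper is None:
            return [1] * n
        if len(upper) != n:
            raise ValueError("mask length must match the data")
        return [0 if u else 1 for u in upper]

    cx, cy, cz = flags(x_upper), flags(y_upper), flags(z_upper)
    with open(path, "w") as f:
        for row in zip(x, cx, y, cy, z, cz):
            f.write("%10.4f %i %10.4f %i %10.4f %i\n" % row)
    return n

# Made-up example: three sources, the second one is an upper limit in Y
x = [43.1, 42.7, 44.0]
y = [39.5, 38.9, 40.2]
z = [26.8, 26.1, 27.3]
y_upper = [False, True, False]

path = os.path.join(tempfile.mkdtemp(), "example.dat")
write_cens_tau_input(path, x, y, z, y_upper=y_upper)
with open(path) as f:
    lines = f.read().splitlines()
print("\n".join(lines))
```

The row format string is the same one used in the numpy.savetxt call, so the resulting file has the same layout.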
2. Run the test
./cens_tau
If you want to test this code with artificial data, first run gendata.py, which will generate a mock dataset in the file test01.dat where X and Y are both correlated with Z.
If you use this code in your work and it gets published, you are morally obliged to cite the original paper: 'A test for partial correlation with censored astronomical data', Akritas & Siebert, MNRAS, 278, 919 (1996).
I also ask you to cite Nemmen, R. et al. Science, 2012, 338, 1445 (bibtex citation info) as one of the examples of application of this test. I spent some time improving this code, so I would appreciate your citation of my paper as a token of gratitude. Thanks! 🙂