/ldsc

LD Score Regression (LDSC)

Primary LanguagePythonGNU General Public License v3.0GPL-3.0

LDSC (LD SCore) v1.0.0

ldsc is a command line tool for estimating heritability and genetic correlation from GWAS summary statistics. ldsc also computes LD Scores.

Getting Started

First, you will need to install python as well as the packages listed under the requirements header below. The easiest way to do this is with the Anaconda python distribution. All of the required packages come standard with Ananconda (Broad users: do use Anaconda).

In order to download ldsc, you should clone this repository via the command

git clone https://github.com/bulik/ldsc.git

Once you have installed ldsc as well as the required packages, typing

./ldsc.py -h
./munge_sumstats.py -h

will print a list of all command-line options. If these commands fail with an error, then something as gone wrong during the installation process.

Short tutorials describing the four basic functions of ldsc (estimating LD Scores, h2 and partitioned h2, genetic correlation, the LD Score regression intercept) can be found in the wiki. If you would like to run the tests, please see the wiki.

Updating LDSC

You can update to the newest version of ldsc using git. First, navigate to your ldsc/ directory (e.g., cd ldsc), then run

git pull

If ldsc is up to date, you will see

Already up-to-date.

otherwise, you will see git output similar to

remote: Counting objects: 3, done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (3/3), done.
From https://github.com/bulik/ldsc
   95f4db3..a6a6b18  master     -> origin/master
Updating 95f4db3..a6a6b18
Fast-forward
 README.md | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

which tells you which files were changed. If you have modified the ldsc source code, git pull may fail with an error such as error: Your local changes to the following files would be overwritten by merge:.

Where Can I Get LD Scores?

You can download European and East Asian LD Scores from 1000 Genomes here. These LD Scores are suitable for basic LD Score analyses (the LD Score regression intercept, heritability, genetic correlation, cross-sex genetic correlation). You can download partitioned LD Scores for partitioned heritability estimation here.

Support

Before contacting us, please try the following:

  1. The wiki has tutorials on estimating LD Score, heritability, genetic correlation and the LD Score regression intercept and partitioned heritability.
  2. Common issues are described in the FAQ
  3. The methods are described in the papers (citations below)

If that doesn't work, you can get in touch with us via the google group.

Issues with LD Hub? Email ld-hub@bristol.ac.uk

Citation

If you use the software or the LD Score regression intercept, please cite

Bulik-Sullivan, et al. LD Score Regression Distinguishes Confounding from Polygenicity in Genome-Wide Association Studies. Nature Genetics, 2015.

For genetic correlation, please also cite

Bulik-Sullivan, et al. An Atlas of Genetic Correlations across Human Diseases and Traits. bioRxiv doi: http://dx.doi.org/10.1101/014498

For partitioned heritability, please also cite

Finucane, HK, et al. Partitioning Heritability by Functional Category using GWAS Summary Statistics. bioRxiv doi: http://dx.doi.org/10.1101/014241

If you find the fact that LD Score regression approximates HE regression to be conceptually useful, please cite

Bulik-Sullivan, Brendan. Relationship between LD Score and Haseman-Elston, bioRxiv doi http://dx.doi.org/10.1101/018283

For LD Hub, please cite

Zheng, et al. LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis. Bioinformatics (2016) https://doi.org/10.1093/bioinformatics/btw613

Requirements

  1. Python (3 > version >= 2.7)
  2. argparse
  3. bitarray
  4. numpy
  5. pandas
  6. scipy

The python data science stack is still under constant development, with frequent breaking changes. We will attempt to keep ldsc compatible with the newest releases of numpy/scipy/pandas, and we therefore recommend that you make sure you are running the latest versions of these three packages. This is most easily accomplished using the Anaconda python distribution and the included package manager conda.

ldsc is not presently compatible with python 3.x.

License

This project is licensed under GNU GPL v3.

Authors

Brendan Bulik-Sullivan (Broad Institute of MIT and Harvard)

Hilary Finucane (MIT Department of Mathematics)