Repurposing large health insurance claims data to estimate genetic and environmental contributions in 560 phenotypes

This GitHub repository contains the code used for the analyses in our paper, "Repurposing large health insurance claims data to estimate genetic and environmental contributions in 560 phenotypes". The directory structure is as follows:

dataAnalysis - contains the R scripts used to fit the variance component models for every phenotype.
manuscriptAnalysis - contains the scripts used to process the output of the scripts in dataAnalysis, meta-analyze the resulting estimates, and generate the figures for the manuscript.
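
To illustrate what pooling estimates in a meta-analysis can look like, here is a minimal sketch of an inverse-variance weighted (fixed-effect) combination. It is not taken from this repository. The README does not say which meta-analysis method the manuscriptAnalysis scripts use, so the choice of a fixed-effect model, the function name `fixed_effect_meta`, and the input numbers are all assumptions made for illustration. It is written in Python for brevity, whereas the repository's model-fitting scripts are in R.

```python
import math


def fixed_effect_meta(estimates, std_errors):
    """Pool estimates with inverse-variance weights (fixed-effect model).

    Hypothetical helper for illustration only; not part of this repository.
    Returns the pooled estimate and its standard error.
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("estimates and std_errors must be non-empty and the same length")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")

    # Each estimate is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)

    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Made-up heritability estimates for one phenotype from two hypothetical strata.
h2_estimates = [0.30, 0.40]
h2_std_errors = [0.10, 0.10]

pooled, pooled_se = fixed_effect_meta(h2_estimates, h2_std_errors)
print(f"pooled estimate = {pooled:.3f}, standard error = {pooled_se:.3f}")
```

Because more precise estimates get larger weights, the pooled value leans toward the strata with smaller standard errors. With equal standard errors, as in the made-up inputs above, it reduces to a simple average.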

Note: It is not possible to run the scripts in manuscriptAnalysis because some of the data needed for the scripts contain personally identifiable information. All of the summary data for our analysis can be found in the CaTCH webapp.

More detailed analyses are available in our "Claims Analysis of Twin Correlation and Heritability (CaTCH)" web application: http://apps.chiragjpgroup.org/catch/
