Scripts for HLA analysis in FinnGen data by Courtney Smith et al. Data generated from this study are available at https://doi.org/10.5281/zenodo.12763469.
Directory Overview
-
Genetic_Correlations: Running LDSC genetic correlation
- Snakemake_gencor, filter_indeptraits.R, combine_gen_cor.sh, plot_gen_cor.R (identifies non-redundant trait pairs (rg < 0.95) among the HLA-associated traits from GWAS)
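
As an illustration of the rg < 0.95 non-redundancy filter mentioned above, here is a minimal Python sketch. It is not the code in filter_indeptraits.R: the function name, the greedy keep-first ordering, the treatment of missing pairs, and the toy trait names and rg values are all assumptions made for the example.

```python
def filter_redundant_traits(traits, rg, threshold=0.95):
    """Greedily keep traits so that no retained pair has rg >= threshold.

    traits: trait names in priority order (earlier traits win; an assumed rule).
    rg: dict mapping frozenset({trait_a, trait_b}) -> genetic correlation.
    Pairs missing from rg are treated as non-redundant (also an assumption).
    """
    kept = []
    for trait in traits:
        redundant = any(
            rg.get(frozenset((trait, other)), 0.0) >= threshold for other in kept
        )
        if not redundant:
            kept.append(trait)
    return kept


# Toy values, made up for illustration only (not results from the study).
example_rg = {
    frozenset(("trait_A", "trait_B")): 0.98,  # redundant pair
    frozenset(("trait_A", "trait_C")): 0.40,
    frozenset(("trait_B", "trait_C")): 0.38,
}
kept = filter_redundant_traits(["trait_A", "trait_B", "trait_C"], example_rg)
print(kept)
```

In this toy input, trait_B is dropped because its rg with the already-kept trait_A is at or above the threshold, while trait_C is retained.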
-
Enrichment_Analysis: Quantifying enrichment in HLA region
- prep_HLA_combine.R, prep_genome_hits.R, plot_enrich.R, combine_hitssumstats.R, manhattan_hitsacrosstraits.R (performs enrichment analysis and generates the corresponding Manhattan-like plot and barplots)
-
Pleiotropy_Matrix: Generation of matrix of disease hits and heatmap
- Snakefile_fullhits, make_matrix_fullhits.R, make_heatmap.R, make_heatmap_wspacing.R (filters summary stats to all HLA GWAS hits combined across all traits, builds the hits-by-traits matrix, and generates the heatmap for fig 3)
-
Haplotype_Regression: Identifying haplotypes, clustering into haplotype groups, performing haplotype regression analysis and adjacent analyses
- haplo_blockdefining.R (Generates the list of SNPs in each block)
- hapclusterandregress.R (Defines haplotype groups and runs regression on the SNP-ascertained HLA-associated traits)
- hapclusterandregress_alltraits.R (Repeats the above for all traits, not just the SNP-ascertained HLA-associated traits)
- hap_regress_plot.R (Makes the dendrogram plots and haplotype genotype heatmaps for fig 5)
- hap_results_updown.R (Analyzes haplotype disease trade-offs and generates figure S4)
- annotategenes.R (Annotates SNPs with genes for fig 5)
- regress_enrich.R (Haplotype Group Burden Analysis)
- comparecorrresults.R (Haplotype Regression Trait Pair Correlation Measure analyses for fig 6)
- clusterregadjallele.R (Regression analysis adjusting for relevant classical HLA alleles (VIF < 5) for each block, plus plots comparing the z-scores before and after adjustment)
- allele_regress.R (Allele regression analyses with two approaches: one allele at a time, and all alleles together (after removing collinear alleles); also plots a heatmap of the z-score results)
- compareregmethods.R (Compare different regression methods, generate data tables for supplement)
- regresults_traitadjtrait.R (Trait-adjusted-for-trait regression analysis)
- plot_clusterallelecor.R, check_cluster_correlations.R (Makes plot for correlations between alleles and haplotype groups)
- hapclusterandregress_permutations.R (Permutation analysis that reruns the regressions to estimate the expected number of false positives)
- hapclusterandregress_checks.R (Reruns the full pipeline of the original hapclusterandregress.R but with different SNP inputs/haplotypes)
- hap_regress_plot_checks.R (Reruns the full pipeline of the original hap_regress_plot.R but with the different SNP inputs/haplotypes generated by hapclusterandregress_checks.R)
- b2705_clusterreg.R (Repeats the association analysis on the same original haplotype groups, restricted to individuals negative for B2705)
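
The idea behind the permutation analysis listed above (hapclusterandregress_permutations.R) can be sketched as follows. This is a rough Python illustration under stated assumptions, not the repository's implementation: it substitutes a simple correlation-based z-score (r * sqrt(n - 1)) for the actual regression, and the function names, the |z| > 1.96 cutoff, the number of permutations, and the randomly generated toy data are all hypothetical.

```python
import random
import statistics


def pearson_z(x, y):
    """Approximate z-score for the correlation of x and y: r * sqrt(n - 1)."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no association signal
    return sxy / (sxx * syy) ** 0.5 * (n - 1) ** 0.5


def expected_false_positives(predictors, phenotype, n_perm=200, z_cut=1.96, seed=0):
    """Mean number of predictors with |z| > z_cut after shuffling the phenotype."""
    rng = random.Random(seed)
    shuffled = list(phenotype)
    total = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any real predictor-phenotype link
        total += sum(abs(pearson_z(col, shuffled)) > z_cut for col in predictors)
    return total / n_perm


# Toy data, randomly generated for illustration only.
toy_rng = random.Random(1)
n_people, n_groups = 100, 5
dosages = [[toy_rng.choice((0, 1, 2)) for _ in range(n_people)] for _ in range(n_groups)]
cases = [toy_rng.choice((0, 1)) for _ in range(n_people)]
mean_fp = expected_false_positives(dosages, cases)
print(mean_fp)
```

Because the phenotype is shuffled on every round, any association that passes the cutoff is a false positive by construction, so the average count per round approximates how many hits to expect by chance.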
-
Figure_Generation: Generation of figures to visualize results
- trait_categories.R (Makes barplot with breakdown of trait categories for the FinnGen traits/classification, and barplot of manual classification of HLA-associated traits by pathophys then by organ block)
- genes_hlahits_clean.R (Makes gene plot for fig 1)
- dist_hitstraits.R (Makes ridgeline plot showing distribution by trait group)
- R10_HLA_figures.R (Makes hits binned into genes barplot, makes plot with horizontal LD lines)