HLA_finngen

Scripts for HLA analysis in FinnGen data by Courtney Smith et al. Data generated from this study are available at https://doi.org/10.5281/zenodo.12763469.

Directory Overview

  • Genetic_Correlations: Running LDSC genetic correlation

    • Snakemake_gencor, filter_indeptraits.R, combine_gen_cor.sh, plot_gen_cor.R (Identifies non-redundant (rg < 0.95) trait pairs among the HLA-associated traits from GWAS)
  • Enrichment_Analysis: Quantifying enrichment in HLA region

    • prep_HLA_combine.R, prep_genome_hits.R, plot_enrich.R, combine_hitssumstats.R, manhattan_hitsacrosstraits.R (Performs the enrichment analysis and generates the corresponding Manhattan-like plot and barplots)
  • Pleiotropy_Matrix: Generation of matrix of disease hits and heatmap

    • Snakefile_fullhits, make_matrix_fullhits.R, make_heatmap.R, make_heatmap_wspacing.R (Filters the summary stats of every trait to the HLA GWAS hits combined across all traits, then generates the hits-by-traits matrix and heatmap for fig 3)
  • Haplotype_Regression: Identifying haplotypes, clustering into haplotype groups, performing haplotype regression analysis and adjacent analyses

    • haplo_blockdefining.R (Generates the list of SNPs in each block)
    • hapclusterandregress.R (Defines haplotype groups and runs regression on the SNP-ascertained HLA-associated traits)
    • hapclusterandregress_alltraits.R (Repeats the above for all traits, not just the SNP-ascertained HLA-associated traits)
    • hap_regress_plot.R (Makes the dendrogram plots and haplotype genotype heatmaps for fig 5)
    • hap_results_updown.R (Analyzes haplotype disease trade-offs and generates figure S4)
    • annotategenes.R (Annotates SNPs with genes for fig 5)
    • regress_enrich.R (Haplotype Group Burden Analysis)
    • comparecorrresults.R (Haplotype Regression Trait Pair Correlation Measure analyses for fig 6)
    • clusterregadjallele.R (Regression analysis adjusting for the relevant classical HLA alleles (VIF < 5) in each block, plus plots comparing the regression z-scores before and after adjustment)
    • allele_regress.R (Allele regression analyses with two approaches: one allele at a time, and all alleles together (after removing collinear alleles), plus a z-score heatmap of the results)
    • compareregmethods.R (Compare different regression methods, generate data tables for supplement)
    • regresults_traitadjtrait.R (Trait-adjusted-for-trait regression analysis)
    • plot_clusterallelecor.R, check_cluster_correlations.R (Makes plot for correlations between alleles and haplotype groups)
    • hapclusterandregress_permutations.R (Permutation analysis that reruns the regressions to estimate the expected number of false positives)
    • hapclusterandregress_checks.R (Reruns the full pipeline of the original hapclusterandregress.R with different SNP inputs/haplotypes)
    • hap_regress_plot_checks.R (Reruns the full pipeline of the original hap_regress_plot.R with the different SNP inputs/haplotypes generated by hapclusterandregress_checks.R)
    • b2705_clusterreg.R (Repeats the association analysis on the same original haplotype groups, restricted to individuals negative for B2705)
  • Figure_Generation: Generation of figures to visualize results

    • trait_categories.R (Makes a barplot with the breakdown of trait categories for the FinnGen traits/classification, and a barplot of the manual classification of HLA-associated traits by pathophysiology, then by organ block)
    • genes_hlahits_clean.R (Makes gene plot for fig 1)
    • dist_hitstraits.R (Makes ridgeline plot showing distribution by trait group)
    • R10_HLA_figures.R (Makes hits binned into genes barplot, makes plot with horizontal LD lines)
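
Example: filtering to non-redundant traits

The Genetic_Correlations step above keeps traits that are non-redundant at rg < 0.95. The sketch below shows one way such a filter could work. It is written in Python for illustration only and is not the logic of filter_indeptraits.R. The greedy strategy, the priority ordering of the input, the function name filter_nonredundant, and the trait names and rg values are all assumptions.

```python
# Illustrative sketch only; NOT the logic of filter_indeptraits.R.
# Assumption: traits are pruned greedily in a caller-supplied priority order.
from typing import Dict, List, Tuple

RG_THRESHOLD = 0.95  # redundancy cutoff quoted in this README


def filter_nonredundant(
    traits: List[str],
    rg: Dict[Tuple[str, str], float],
    threshold: float = RG_THRESHOLD,
) -> List[str]:
    """Keep each trait only if its rg with every already-kept trait is below threshold.

    `traits` is assumed to be ordered by priority (earlier traits win ties).
    Pairs missing from `rg` are treated as not redundant.
    """
    kept: List[str] = []
    for trait in traits:
        redundant = any(
            # Look the pair up in either orientation; default to 0.0 if absent.
            rg.get((trait, other), rg.get((other, trait), 0.0)) >= threshold
            for other in kept
        )
        if not redundant:
            kept.append(trait)
    return kept


if __name__ == "__main__":
    # Hypothetical trait names and rg values, for illustration only.
    example_rg = {
        ("trait_A", "trait_B"): 0.98,
        ("trait_A", "trait_C"): 0.40,
        ("trait_B", "trait_C"): 0.35,
    }
    print(filter_nonredundant(["trait_A", "trait_B", "trait_C"], example_rg))
```

With these made-up values, trait_B is dropped because its rg with trait_A is at or above the cutoff, while trait_C is retained. A greedy pass depends on input order, so a real pipeline would need an explicit rule for which trait of a redundant pair to keep.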