pangenome_pseudogene_null

Primary language: R. License: GNU General Public License v3.0 (GPL-3.0).

Code and source data for:

Douglas GM, Shapiro BJ. 2024. Pseudogenes act as a neutral reference for detecting selection in prokaryotic pangenomes. Nature Ecology and Evolution. doi:10.1038/s41559-023-02268-6

The published paper is behind a paywall, but you can read an earlier version of the manuscript as a bioRxiv preprint. This blog post is an accessible description of our work.

Please feel free to open an issue if you have any questions.

Repository structure:

  • display_source_data - Source data for each display item, included for convenience. Please note that this is only the final processed data used for plotting. Larger processed datafiles are found at this Zenodo repository.

  • scripts

    • analyses - R scripts to generate reported models and to run key statistical tests.

    • broad_analysis_processing - R scripts for analyzing and processing files for broad pangenome analysis.

    • display - R scripts for generating display items.

    • indepth_analysis_processing - R scripts for analyzing and processing files for in-depth pangenome analysis.

    • preprocessing - R and Python scripts for preprocessing raw and intermediate files.

    • sanity_checks - Quick scripts for running checks of specific results reported in the manuscript (i.e., regenerating results with independent code).

    • text_results - Small scripts for computing values reported in the text (intended to be run on datafiles distributed in the Zenodo repository).