Comprehensive catalogs for microbial genes and metagenome-assembled genomes of the swine lower respiratory tract microbiome
This repository contains the scripts and data used to characterize the respiratory microbiome in the manuscript "Comprehensive catalogs for microbial genes and metagenome-assembled genomes of the swine lower respiratory tract microbiome elucidate the relationship of microbial species with lung lesions".
Construction of the gene and MAG catalogs for the pig lower respiratory tract microbiome.
Requirements:
- fastp (tested v0.20.1)
- Bowtie2 (tested v2.3.5.1)
- SAMtools (tested v1.7)
- MEGAHIT (tested v1.2.9)
- prodigal (tested v2.6.3)
- CD-HIT (tested v4.8.1)
- Diamond (tested v2.0.12.150)
- BASTA (tested v1.3)
- blast (tested v2.12.0)
- MetaBAT2 (tested v2.15)
- Maxbin2 (tested v2.2.7)
- CONCOCT (tested v0.5.0)
- metaWRAP (tested v1.3.2)
- CheckM (tested v1.0.18)
- metaSPAdes (tested v3.13.0)
- VAMB (tested v3.0.2)
- dRep (tested v3.2.2)
- GTDB-Tk (tested v1.7.0)
Code used to calculate the abundance of genes and metagenome-assembled genomes.
Requirements:
- BWA MEM2 (tested v2.2.1)
- SAMtools (tested v1.7)
- FeatureCounts (tested v2.0.1)
- metaWRAP (tested v1.3.2)
Script to perform functional annotation.
Requirements:
- EggNOG mapper (tested v2.6.1)
- HMMER (tested v3.1b2)
- KOBAS (tested v3.0.3)
- Diamond (tested v2.0.12.150)
- blast (tested v2.12.0)
Associated data for statistical analysis and visualization.
gene_Freq_Abundance_Counts.py
: Calculates gene presence across the 745 tested samples.
blast_best.py
: Extracts the best BLAST hits from the VFDB alignment.
vfg.freq.absent.sh
: Determines the presence or absence of virulence factor genes in the Mycoplasma hyopneumoniae genomes.
gene_info_deal.sh
: Summary statistics of gene information.
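As an illustration of what a best-hit extraction step such as blast_best.py involves, the following is a minimal sketch and not the repository's actual script. It assumes the alignment was written in the standard 12-column BLAST tabular format (`-outfmt 6`) and that "best" means the highest bit score per query; the function name `best_hits` and the gene and VFDB identifiers in the example are hypothetical.

```python
import csv
import io

# Standard 12 columns of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle):
    """Return {query_id: row_dict}, keeping the highest-bitscore hit per query.

    Assumption: "best" is defined by bit score; ties keep the first hit seen.
    """
    best = {}
    for fields in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines (e.g. from -outfmt 7).
        if not fields or fields[0].startswith("#"):
            continue
        row = dict(zip(COLUMNS, fields))
        current = best.get(row["qseqid"])
        if current is None or float(row["bitscore"]) > float(current["bitscore"]):
            best[row["qseqid"]] = row
    return best


# Hypothetical example input: two hits for gene1, one hit for gene2.
example = io.StringIO(
    "gene1\tVFG000001\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t365\n"
    "gene1\tVFG000002\t80.0\t180\t36\t0\t1\t180\t5\t184\t1e-40\t150\n"
    "gene2\tVFG000003\t75.0\t120\t30\t0\t1\t120\t1\t120\t1e-20\t90.5\n"
)
hits = best_hits(example)
for query, row in hits.items():
    print(query, row["sseqid"], row["bitscore"])
```

To run the same logic on a real alignment file, pass an open file handle instead of the in-memory example, for instance `best_hits(open(path, newline=""))`, where `path` is whatever file the alignment step produced.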
R Scripts/
sequence_depth_hist.R
: Histogram of sequencing depth in the raw and clean data.
gene_accumulation.R
: Plots the gene accumulation curve of the PRGC90.
gene_range_stats.r
: Statistics of gene relative abundance in different ranges.
Items _proportion.R
: Proportion of shared items across all samples.
HL_SVLL_top20_species.R
: Plots the top 20 species by relative abundance in healthy lung samples and severe lung-lesion samples.
MAG_taxa_count.R
: Plots the taxonomic proportions of the MAGs.
sgb_usgb.R
: Number of SGBs and proportion of uSGBs in 18 phyla.
MAG_quality.R
: Plots the quality of the MAGs.
pan_core_gene_accumulation.R
: Accumulation curves of the pan-genes and core-genes.
pan_core_cog_pie.R
: COG annotation of the pan-genes and core-genes.
mds_ani.R
: Multidimensional scaling analysis based on average nucleotide identity.
pan_core_top20ko_bar.R
: The 20 KEGG pathways with the largest numbers of annotated genes in the pan-genes and core-genes.
function_clade_compare.R
: Comparison of the numbers of annotated functional genes in two Mycoplasma hyopneumoniae clades.
vfg_heatmap.R
: Distribution of virulence factor genes in 285 Mycoplasma hyopneumoniae genomes.
diversity_box.R
: Alpha and beta diversity of the lung microbiome in the F7 population.
Kingdom.R
: Average species composition of the virus, fungi, and archaea kingdoms and the proportions of the different kingdoms.
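For readers unfamiliar with the diversity measures behind a script such as diversity_box.R, the sketch below shows one common choice for each: the Shannon index for alpha diversity and the Bray-Curtis dissimilarity for beta diversity. This README does not state which indices the R script actually uses, so both are assumptions for illustration only, and the function names and abundance values are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero abundance."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    if len(a) != len(b):
        raise ValueError("both samples must cover the same taxa")
    denominator = sum(x + y for x, y in zip(a, b))
    if denominator == 0:
        raise ValueError("at least one sample must have non-zero abundance")
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


# Hypothetical abundance vectors for two samples over the same four taxa.
sample_1 = [10, 10, 10, 10]
sample_2 = [40, 0, 0, 0]

print(shannon_index(sample_1))          # evenly spread community
print(shannon_index(sample_2))          # single dominant taxon
print(bray_curtis(sample_1, sample_2))  # dissimilarity between the samples
```

An evenly distributed sample gives the maximum Shannon value for its number of taxa (ln 4 here), while a sample dominated by one taxon gives zero; Bray-Curtis ranges from 0 for identical samples to 1 for samples sharing no taxa.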