Finding Tn-Seq Essential genes (FiTnEss)
FiTnEss is an R package that uses transposon insertion sequencing (Tn-Seq) data to identify essential genes in a genome.
Original paper on bioRxiv: Defining the core essential genome of Pseudomonas aeruginosa
After installing the FiTnEss package, run the main function, FiTnEss_Run.
The function takes the following arguments:
- strain
- file_location: path and name of the tally file for the run
- permissive_file: path and name of the non-permissive TA site file generated in the genomic pre-processing step
- homologous_file: path and name of the homologous TA site file generated in the pre-processing step
- gene_file: path and name of the GFF3 gene annotation file. A GFF3 file can be downloaded from, for example, the Pseudomonas Genome Database
- save_location: path and name of the file in which to save the final results
- repeat_time: number of times to run the pipeline to obtain the best results (default: 3)
Packages <- c("dplyr","fBasics","goftest","openxlsx","scales","stats","tidyr")
lapply(Packages, library, character.only = TRUE)
repeat_time = 3)
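As a rough sketch of how a complete call could be assembled from the arguments listed above: the argument names come from that list, but the strain label, every path, and the file extensions below are hypothetical placeholders, not values from the package documentation. The call only runs if the input files exist and the package is installed.

```r
# Hypothetical inputs: the strain label and all paths are placeholders.
run_args <- list(
  strain          = "PA14",
  file_location   = "path/to/tally.txt",
  permissive_file = "path/to/non_permissive_TA.txt",
  homologous_file = "path/to/homologous_TA.txt",
  gene_file       = "path/to/annotation.gff3",
  save_location   = "path/to/results.xlsx",
  repeat_time     = 3
)

# Input files that must exist before the run (save_location is an output).
input_keys <- c("file_location", "permissive_file", "homologous_file", "gene_file")
missing_inputs <- input_keys[!file.exists(unlist(run_args[input_keys]))]

if (length(missing_inputs) > 0) {
  message("Skipping run; missing input files: ",
          paste(missing_inputs, collapse = ", "))
} else if (!requireNamespace("FiTnEss", quietly = TRUE)) {
  message("Skipping run; the FiTnEss package is not installed.")
} else {
  # Pass the named list as the arguments of FiTnEss_Run.
  do.call(FiTnEss::FiTnEss_Run, run_args)
}
```

Keeping the arguments in a named list makes it easy to check the inputs up front and to reuse the same settings across strains.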
| Locus.CIA | gtot | Nta | pvalue | padj | Ess_fwer | pfdr | Ess_fdr |
|---|---|---|---|---|---|---|---|
| PA14_00410 | 5 | 1 | 0.015989 | 1 | NE_fwer | 0.093033 | NE_fdr |
Each tab in the .xlsx file holds the results for one replicate. Each results table has 8 columns:
- Locus.CIA: gene index
- gtot: total reads for the gene
- Nta: number of TA sites in this gene
- pvalue: unadjusted p-value of being essential
- padj: FWER-adjusted p-value
- Ess_fwer: confident essential category
- pfdr: FDR-adjusted p-value
- Ess_fdr: candidate essential category
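To illustrate why the FWER-based call is the stricter ("confident") one and the FDR-based call the more inclusive ("candidate") one, here is a minimal sketch using base R's p.adjust on toy p-values. The Holm and Benjamini-Hochberg methods and the 0.05 cutoff are assumptions chosen for illustration; the adjustment FiTnEss applies internally may differ.

```r
# Toy unadjusted p-values (illustrative only, not FiTnEss output).
pvalue <- c(0.0001, 0.004, 0.015989, 0.03, 0.2, 0.6)

# FWER control (Holm) and FDR control (Benjamini-Hochberg); both are
# assumed methods for this sketch.
padj <- p.adjust(pvalue, method = "holm")
pfdr <- p.adjust(pvalue, method = "BH")

res <- data.frame(pvalue, padj, pfdr)

# Assumed significance cutoff of 0.05 for both criteria.
res$fwer_sig <- res$padj < 0.05
res$fdr_sig  <- res$pfdr < 0.05

print(res)
```

With these toy values, every gene that passes the FWER cutoff also passes the FDR cutoff, while some genes pass only the FDR cutoff.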