In silico identification of the clade-8 specific SNP from E. coli O157.
In a study published in 2008, 528 Escherichia coli O157 strains were divided into nine clades according to a panel of 48 SNP loci. O157 strains isolated from patients suffering from Hemolytic Uremic Syndrome (HUS) were frequently found to belong to clade-8 (Manning, Motiwala et al. 2008).
This correlation was later confirmed in several independent studies (Iyoda, Manning et al. 2014, Soderlund, Jernberg et al. 2014, Tarr, Shringi et al. 2018), although the nature of the phenomenon is not yet fully understood.
To facilitate the study of E. coli O157 strains, I developed this tool to identify the clade-8 specific SNP in whole-genome sequence (WGS) assemblies.
The program is based on a PCR assay detecting a SNP (C539A at ECs2357 in strain Sakai) that is unique to clade-8 strains (Iyoda, Manning et al. 2014).
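To make the SNP notation concrete, here is a minimal shell sketch. It assumes that C539A means the reference strain Sakai carries C at position 539 of ECs2357 and that clade-8 strains carry A there. The helper name allele_at is hypothetical, the input is assumed to be the ECs2357 sequence already extracted in the same orientation and numbering as in Sakai, and this is not necessarily how clade8.pl makes its call.

```shell
# Hypothetical helper: classify the base found at a 1-based position
# of a sequence given as a plain string (no FASTA header, no line breaks).
allele_at() {
    seq_str="$1"
    pos="$2"
    # cut -c extracts the single character at that position;
    # tr normalises lower-case bases to upper case.
    base=$(printf '%s\n' "$seq_str" | cut -c "$pos" | tr 'acgt' 'ACGT')
    case "$base" in
        A) echo "A (clade-8 allele)" ;;    # assumption: A marks clade-8
        C) echo "C (reference allele)" ;;  # assumption: C is the Sakai base
        *) echo "undetermined" ;;          # other base, or sequence too short
    esac
}
```

With the gene sequence held in a shell variable of your choosing, the call would look like allele_at "$ecs2357_seq" 539.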
Before you start, make sure the following program is fully functional on your system:
First, generate a list of the WGS assembly files to be analyzed, e.g.
ls *.fasta >list.fasta.txt
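The listing step can be made slightly more defensive. The sketch below is an optional variant, not part of the tool: build_list and check_list are hypothetical helper names. It writes the same list.fasta.txt, skips the case where no .fasta file is present, and flags any listed file whose first line does not start with ">", the usual start of a FASTA header.

```shell
# Hypothetical helper: write the names of all *.fasta files in the
# current directory to list.fasta.txt, one per line.
build_list() {
    : > list.fasta.txt
    for f in *.fasta; do
        # If nothing matches, the unexpanded pattern stays in $f; skip it.
        [ -f "$f" ] || continue
        printf '%s\n' "$f" >> list.fasta.txt
    done
}

# Hypothetical helper: return non-zero if any listed file is missing
# or does not begin with a '>' header line.
check_list() {
    status=0
    while IFS= read -r f; do
        first_line=""
        if [ -f "$f" ]; then
            IFS= read -r first_line < "$f" || true
        fi
        case "$first_line" in
            '>'*) ;;                                   # looks like FASTA
            *) echo "not FASTA or missing: $f" >&2
               status=1 ;;
        esac
    done < list.fasta.txt
    return "$status"
}
```

Running build_list followed by check_list in the directory that holds the assemblies gives the same list as the ls command, plus an early warning about files that are unlikely to be valid input.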
Then put clade8.pl and list.fasta.txt alongside the WGS assembly files, and run a command like:
perl clade8.pl list.fasta.txt
The program will then run and write the results to a report file, list.fasta.xls, in the same directory.
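The run can also be wrapped so that missing inputs or a missing report are noticed immediately. This is a minimal sketch under stated assumptions: report_name and run_clade8 are hypothetical helper names, and the rule that the report is named after the list file with .txt replaced by .xls is inferred from the single example above (list.fasta.txt giving list.fasta.xls).

```shell
# Assumption: the report name is the list name with a trailing
# ".txt" replaced by ".xls", as in list.fasta.txt -> list.fasta.xls.
report_name() {
    printf '%s\n' "${1%.txt}.xls"
}

# Hypothetical wrapper: check the inputs, run clade8.pl, then confirm
# that a non-empty report file was written.
run_clade8() {
    list="$1"
    [ -s "$list" ] || { echo "empty or missing list: $list" >&2; return 1; }
    [ -f clade8.pl ] || { echo "clade8.pl not found in $(pwd)" >&2; return 1; }
    perl clade8.pl "$list" || return 1
    report=$(report_name "$list")
    [ -s "$report" ] || { echo "no report written: $report" >&2; return 1; }
    echo "report: $report"
}
```

Calling run_clade8 list.fasta.txt from the directory that holds the assemblies is then equivalent to the perl command above, with a non-zero exit status if anything is missing.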