Esoteric Python scripts
Replace specific parts of a FASTA file with Ns to mask them in a BLAST (or BLAST-like) search
./mask_fasta.py input.fasta mask.csv output.fasta
The mask.csv file should be a comma-delimited file listing the regions to mask, one per line, formatted as: fastaID,start,end. The start and end positions are both inclusive.
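To illustrate the masking described above, here is a minimal sketch. It is not the code of mask_fasta.py. It assumes 1-based coordinates, no header row in the CSV, and that fastaID matches the first whitespace-delimited token of the FASTA header; none of these are stated above. The function names are hypothetical.

```python
import csv
import io


def read_mask(handle):
    """Map each FASTA ID to a list of (start, end) intervals read from CSV rows."""
    regions = {}
    for row in csv.reader(handle):
        if len(row) < 3:
            continue  # skip blank or incomplete rows
        seq_id, start, end = row[0].strip(), int(row[1]), int(row[2])
        regions.setdefault(seq_id, []).append((start, end))
    return regions


def mask_sequence(seq, intervals):
    """Replace each inclusive interval with N (ASSUMPTION: 1-based positions)."""
    chars = list(seq)
    for start, end in intervals:
        lo = max(start, 1) - 1       # convert to a 0-based index
        hi = min(end, len(chars))    # inclusive end, clipped to the sequence
        for i in range(lo, hi):
            chars[i] = "N"
    return "".join(chars)


def mask_fasta(fasta_handle, regions):
    """Yield (header, masked_sequence) for every record in a FASTA handle."""
    header, chunks = None, []
    for line in list(fasta_handle) + [">"]:  # sentinel flushes the last record
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                seq_id = header.split()[0] if header.split() else ""
                yield header, mask_sequence("".join(chunks), regions.get(seq_id, []))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)


# Small in-memory demonstration with made-up data
demo_fasta = io.StringIO(">seq1 example\nACGTACGTAC\n>seq2\nGGGGCCCC\n")
demo_mask = io.StringIO("seq1,3,5\nseq2,1,2\n")
masked = dict(mask_fasta(demo_fasta, read_mask(demo_mask)))
for name, seq in masked.items():
    print(f">{name}\n{seq}")
```

If the coordinates in your mask.csv are 0-based instead, the index conversion in mask_sequence would need to change accordingly.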
Parse the clustering output (.clstr) from CD-HIT into a tab-delimited table
./cdhit_cluster_parse.py cdhit.clstr
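As a rough idea of what such a conversion involves, the sketch below turns the usual CD-HIT .clstr layout (a ">Cluster N" line followed by numbered member lines) into tab-delimited rows. It is not the code of cdhit_cluster_parse.py. The chosen columns (cluster, sequence ID, length, representative flag) and the function name are assumptions, since the script's actual output columns are not listed here.

```python
import re

# Member line, e.g. "0\t2799aa, >seqA... *" or "1\t2214aa, >seqB... at 80.00%"
MEMBER_RE = re.compile(r"^(\d+)\s+(\d+)(aa|nt),\s+>(.+?)\.\.\.\s+(.*)$")


def parse_clstr(lines):
    """Return (cluster, seq_id, length, is_representative) tuples."""
    rows = []
    cluster = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            cluster = line.split()[1]  # the cluster number as a string
            continue
        match = MEMBER_RE.match(line)
        if match is None or cluster is None:
            continue  # ignore lines that do not look like cluster members
        _, length, _, seq_id, status = match.groups()
        # CD-HIT marks the representative sequence of a cluster with "*"
        rows.append((cluster, seq_id, int(length), status == "*"))
    return rows


# Made-up example input in .clstr layout
demo = [
    ">Cluster 0",
    "0\t2799aa, >seqA... *",
    "1\t2214aa, >seqB... at 80.00%",
    ">Cluster 1",
    "0\t1500nt, >seqC... *",
]
rows = parse_clstr(demo)
for cluster, seq_id, length, is_rep in rows:
    print("\t".join([cluster, seq_id, str(length), str(is_rep)]))
```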
Cluster BLAST matches of repeat sequences to putative CRISPR arrays
./crispr_repeat_cluster.py -h
Parse output from MinCED to get all spacers as a FASTA file, all repeats as a FASTA file, or a tab-delimited file with the positions of the CRISPRs
./minced_parser.py minced.out tab
./minced_parser.py minced.out repeats
./minced_parser.py minced.out spacers
Concatenate a set of alignments, for example to build a phylogenetic tree from core genes. If a genome is lacking a gene, its sequence for that gene is replaced with gap ('-') symbols.
./combine_alignments.py genome_list alignment_dir
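The gap-filling behaviour described above can be sketched as follows. This is not the code of combine_alignments.py. It works on alignments already loaded into dictionaries and assumes each sequence is keyed by exactly the genome name used in the genome list; how the real script maps alignment headers to genomes is not described here. The function name is hypothetical.

```python
def concatenate_alignments(genomes, alignments):
    """Concatenate per-gene alignments into one sequence per genome.

    genomes:    list of genome names (defines which genomes appear in the output)
    alignments: list of dicts, one per gene, mapping genome name -> aligned sequence
    """
    parts = {genome: [] for genome in genomes}
    for gene_alignment in alignments:
        lengths = {len(seq) for seq in gene_alignment.values()}
        if len(lengths) != 1:
            raise ValueError("all sequences in one alignment must have the same length")
        width = lengths.pop()
        for genome in genomes:
            # A genome missing from this alignment gets a run of gaps of equal width
            parts[genome].append(gene_alignment.get(genome, "-" * width))
    return {genome: "".join(chunks) for genome, chunks in parts.items()}


# Made-up example: g2 lacks the second gene
genomes = ["g1", "g2", "g3"]
alignments = [
    {"g1": "ATG-", "g2": "ATGC", "g3": "AT-C"},
    {"g1": "CC", "g3": "CG"},
]
combined = concatenate_alignments(genomes, alignments)
for genome in genomes:
    print(f">{genome}\n{combined[genome]}")
```

Padding with gaps keeps every genome's concatenated sequence the same length, which tree-building tools generally require of an alignment.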