Rank off targets
Closed this issue · 3 comments
bretonics commented
Get the match hits for each CRISPR and rank them by how many bp match. Rank not only by how many occurrences each target has throughout the genome, but also by which target has the fewest matching base pairs.
bretonics commented
Commit 4d3c31a adds output with:
- Number of significant hits in entire sequence
- Number of matching nucleotides per hit
Occurrence in format length:matches
Length = window length (CRISPR sequence size)
Matches = number of nucleotide matches in hit
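For reference, a minimal Python sketch of how a `length:matches` entry could be produced for one hit. This is not the code from the commit: `count_identities` and `format_occurrence` are hypothetical names, and it assumes an ungapped, position-by-position comparison between the CRISPR sequence and a genome window of the same length. The actual commit may take these numbers from an alignment tool instead.

```python
def count_identities(target: str, window: str) -> int:
    """Count positions where target and window carry the same nucleotide.

    Assumption: ungapped comparison of two equal-length sequences.
    """
    if len(target) != len(window):
        raise ValueError("target and window must have the same length")
    return sum(a == b for a, b in zip(target.upper(), window.upper()))


def format_occurrence(target: str, window: str) -> str:
    """Render one hit as 'length:matches' (hypothetical helper)."""
    return f"{len(target)}:{count_identities(target, window)}"


# A window identical to the 23 bp CRISPR sequence is a full-length hit.
crispr = "TGTGATCACGTACTATTATGCGG"
print(format_occurrence(crispr, crispr))  # 23:23
```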
bretonics commented
3591c4f and b7925f0 add support.
Need to switch the ranking priority: sort by identities first, then by number of occurrences.
| Name | Sequence | Strand | Reverse | Occurrences | Identities |
| --- | --- | --- | --- | --- | --- |
| CRISPR_3 | TGTGATCACGTACTATTATGCGG | plus | GGCGTATTATCATGCACTAGTGT | 3 | 23,8,8 |
| CRISPR_2 | AAAAATTTTCTCTATCTAACGGG | minus | GGGCAATCTATCTCTTTTAAAAA | 4 | 23,15,8,8 |
| CRISPR_1 | AAAAAATTTTCTCTATCTAACGG | minus | GGCAATCTATCTCTTTTAAAAAA | 4 | 23,16,8,8 |
| CRISPR_8 | AAAAAAAATTTTCCCTATCGGGG | minus | GGGGCTATCCCTTTTAAAAAAAA | 2 | 23,9 |
| CRISPR_9 | AAAAAAATTTTCCCTATCGGGGG | minus | GGGGGCTATCCCTTTTAAAAAAA | 2 | 23,9 |
| CRISPR_6 | CGAAAAAAAATTTTCCCTATCGG | minus | GGCTATCCCTTTTAAAAAAAAGC | 2 | 23,9 |
| CRISPR_7 | GAAAAAAAATTTTCCCTATCGGG | minus | GGGCTATCCCTTTTAAAAAAAAG | 2 | 23,9 |
| CRISPR_4 | AAAAATCCCATCGATCTAGCAGG | minus | GGACGATCTAGCTACCCTAAAAA | 8 | 23,9,7,7,7,7,7,7 |
| CRISPR_0 | ATGTAGCTAGCTAGCTAGTAGGG | plus | GGGATGATCGATCGATCGATGTA | 5 | 23,14,12,10,10 |
| CRISPR_5 | TCCCATCGATCTAGCAGGCCCGG | minus | GGCCCGGACGATCTAGCTACCCT | 7 | 23,15,9,7,7,7,7 |
Fewer base pair matches per hit (identities) == better CRISPR, followed by fewer occurrences.
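One possible way to express that ordering, as a rough Python sketch rather than the project's implementation. `Candidate`, `rank_key` and `rank` are hypothetical names. Two assumptions are made that the thread does not state: one full-length hit (23 here) is the on-target site and is ignored, and identity lists are compared worst hit first, so the highest off-target identity decides before the lower ones do.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Candidate(NamedTuple):
    name: str
    occurrences: int
    identities: Tuple[int, ...]


def rank_key(c: Candidate, window: int = 23) -> Tuple[Tuple[int, ...], int]:
    """Sort key: off-target identities (highest first), then occurrences."""
    off = sorted(c.identities, reverse=True)
    if off and off[0] == window:
        off = off[1:]  # assumption: one perfect hit is the on-target site
    return (tuple(off), c.occurrences)


def rank(candidates: Iterable[Candidate], window: int = 23) -> List[Candidate]:
    """Best candidate first: lowest identities, ties broken by occurrences."""
    return sorted(candidates, key=lambda c: rank_key(c, window))


rows = [
    Candidate("CRISPR_3", 3, (23, 8, 8)),
    Candidate("CRISPR_2", 4, (23, 15, 8, 8)),
    Candidate("CRISPR_8", 2, (23, 9)),
]
print([c.name for c in rank(rows)])  # ['CRISPR_3', 'CRISPR_8', 'CRISPR_2']
```

Under these assumptions the ten rows above would come out as CRISPR_3, CRISPR_8, CRISPR_9, CRISPR_6, CRISPR_7, CRISPR_4, CRISPR_0, CRISPR_2, CRISPR_5, CRISPR_1. A different comparison, such as the sum of off-target identities, would give a different order, so the key function is the part to adjust.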