Primary Language: Python

k-means-to-BLAST-

Uses k-means clustering to identify genomic sequences within ambiguously attributed FASTA-format data. The sequences in each resulting cluster are used to build NCBI BLAST requests, which return the sequence identities.
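The general idea can be illustrated with a minimal sketch. This is not the repository's code: the choice of dinucleotide frequencies as features, the farthest-point initialization, and every function name below (`parse_fasta`, `kmer_profile`, `kmeans`, `cluster_to_fasta`) are assumptions made for illustration. The sketch uses only the Python standard library and stops at building a FASTA-formatted query string per cluster.

```python
from itertools import product


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def kmer_profile(seq, k=2):
    """Return normalized k-mer frequencies (assumed feature choice)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip words containing N or other symbols
            counts[word] += 1
    total = sum(counts.values()) or 1
    return [counts[w] / total for w in kmers]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_to_fasta(records, labels, cluster):
    """Build a FASTA-formatted query string from one cluster's sequences."""
    return "\n".join(
        f">{header}\n{seq}"
        for (header, seq), lab in zip(records, labels)
        if lab == cluster
    )


# Toy input: two AT-rich and two GC-rich sequences of unknown origin.
example = """>seq_a
ATATATATATATATATATAT
>seq_b
GCGCGCGCGCGCGCGCGCGC
>seq_c
TATATATATATATATATATA
>seq_d
CGCGCGCGCGCGCGCGCGCG
"""
records = parse_fasta(example)
labels = kmeans([kmer_profile(seq) for _, seq in records], k=2)
query = cluster_to_fasta(records, labels, cluster=labels[0])
```

A string such as `query` could then be submitted to NCBI, for example as the `sequence` argument of Biopython's `Bio.Blast.NCBIWWW.qblast("blastn", "nt", query)`, assuming Biopython is installed and network access is available. That step is left out of the sketch because it depends on a live service.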

Notably, this tool was used to find, within a haloarchaeal genomic sequence, a complete gene that a haloarchaeal virus uses to pack its capsids with DNA and to distinguish self from non-self DNA.