all new nido/cov assemblies
List of the 990 accessions where there is possibly a new CoV/nido RdRp, according to sra_species_table.tsv:
s3://serratus-rayan/pro_new_cov_nido-assembly/all_new_cov_nido.sra
900 of them could be assembled. Here are the resulting files:
scaffolds.fasta (900): s3://serratus-rayan/pro_new_cov_nido-assembly/all_new_cov_nido.scaffolds.txt
gene_clusters.fasta (900): s3://serratus-rayan/pro_new_cov_nido-assembly/all_new_cov_nido.gc.txt
gene_clusters.checkv_filtered.fasta (898): s3://serratus-rayan/pro_new_cov_nido-assembly/all_new_cov_nido.gc_cv
An immediate takeaway is that most of the checkv_filtered assemblies are empty. I therefore recommend not using them and instead taking gene_clusters.fasta or, to be even more conservative, the whole scaffolds.fasta.
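To see which per-accession assemblies are actually empty before choosing between the three file types, something like the following could be used. This is only a sketch: it assumes the FASTA files have already been downloaded locally, and the helper names `count_fasta_records` and `split_empty` are made up for illustration.

```python
def count_fasta_records(path):
    """Count sequences in a FASTA file by counting header lines (starting with '>')."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def split_empty(paths):
    """Split local FASTA paths into (non_empty, empty) lists.

    A file counts as empty when it contains no '>' header line,
    which also covers zero-byte files.
    """
    non_empty, empty = [], []
    for path in paths:
        if count_fasta_records(path) > 0:
            non_empty.append(path)
        else:
            empty.append(path)
    return non_empty, empty
```

Calling `split_empty` on the local copies of the checkv_filtered files would give the list of accessions for which falling back to gene_clusters.fasta or scaffolds.fasta is needed.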
All contigs with a motifator hit:
s3://serratus-rayan/pro_new_cov_nido-assembly/all_new_cov_nido.scaffolds_motifator.whole_contigs_hits.fasta
hmmsearch results versus Pfam-A:
s3://serratus-rayan/pro_new_cov_nido-assembly/all_new_cov_nido.scaffolds_motifator.whole_contigs_hits.fasta.transeq.faa.*
(run with hmmsearch -A [.sto] --tblout [.tbl] --domtblout [.domtbl] -o [.hmmsearch_stdout] Pfam-A.hmm [contigs.fa])
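For anyone who wants to go through the --tblout file programmatically, here is a minimal parsing sketch. It relies on the standard HMMER per-target table layout (whitespace-separated columns, comment lines starting with `#`, free-text description last). The function names and the 1e-5 E-value cutoff are assumptions chosen for illustration, not something used in this analysis.

```python
def parse_tblout(lines):
    """Parse hmmsearch --tblout lines into a list of dicts.

    With hmmsearch, 'target' is the sequence (here a translated contig)
    and 'query' is the Pfam profile.
    """
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        # 18 fixed columns, then a free-text description that may contain spaces
        fields = line.split(None, 18)
        if len(fields) < 18:
            continue  # malformed or truncated line
        hits.append({
            "target": fields[0],
            "query": fields[2],
            "query_accession": fields[3],
            "evalue": float(fields[4]),  # full-sequence E-value
            "score": float(fields[5]),   # full-sequence bit score
        })
    return hits


def best_hit_per_target(hits, max_evalue=1e-5):
    """Keep the lowest E-value hit per target; max_evalue is an arbitrary cutoff."""
    best = {}
    for hit in hits:
        if hit["evalue"] > max_evalue:
            continue
        current = best.get(hit["target"])
        if current is None or hit["evalue"] < current["evalue"]:
            best[hit["target"]] = hit
    return best
```

Usage would look like `best_hit_per_target(parse_tblout(open(path)))`, where `path` points to a local copy of the .tbl file.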