Converting accession codes to fasta files and identifying loci
Hi! Do you have any scripts on how you obtained the fasta files from the accession codes in master_table_resistance.csv? And then how you identified the loci for each resistance gene within each fasta file?
Hello! First, let me apologize for the very late response. I have just returned from a leave of absence and am catching up on what I missed.
I have created a new directory in input_data called prep_fasta_files. This directory includes our code that creates the fasta files from input vcf files by comparing to the reference h37rv.fasta. The code directly extracts the region of interest in the isolate genome based on the parameters you provide, and outputs an alignment of all isolates for which you provided a vcf. You will have to create the vcf files yourself by downloading the read data for the accession codes and processing the data through a variant calling pipeline.
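To give a rough idea of the approach, here is a minimal sketch of extracting a region from the reference and applying one isolate's variants to it. This is not the code in prep_fasta_files: the function names (parse_vcf_snps, extract_region, build_alignment) are hypothetical, and it assumes only single-base substitutions are applied, with indels and all other records skipped so that every sequence keeps the same length.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical type for this sketch: (1-based position, REF base, ALT base)
Snp = Tuple[int, str, str]


def parse_vcf_snps(lines: Iterable[str]) -> Iterator[Snp]:
    """Yield single-base substitutions from VCF text lines; skip everything else."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # blank line or VCF header
        fields = line.rstrip("\n").split("\t")
        pos, ref, alt = int(fields[1]), fields[3].upper(), fields[4].upper()
        # Assumption: indels and multi-allelic records are ignored in this sketch.
        if len(ref) == 1 and len(alt) == 1 and alt in "ACGT":
            yield pos, ref, alt


def extract_region(reference: str, snps: Iterable[Snp], start: int, end: int) -> str:
    """Return reference[start..end] (1-based, inclusive) with the SNPs applied."""
    region = list(reference[start - 1:end].upper())
    for pos, ref, alt in snps:
        if start <= pos <= end:
            if reference[pos - 1].upper() != ref:
                raise ValueError(f"REF mismatch at position {pos}")
            region[pos - start] = alt
    return "".join(region)


def build_alignment(
    reference: str,
    isolate_vcfs: Dict[str, Iterable[str]],
    start: int,
    end: int,
) -> str:
    """Build a FASTA-formatted alignment of the region for all isolates."""
    records = [f">reference\n{reference[start - 1:end].upper()}"]
    for name, vcf_lines in isolate_vcfs.items():
        seq = extract_region(reference, parse_vcf_snps(vcf_lines), start, end)
        records.append(f">{name}\n{seq}")
    return "\n".join(records) + "\n"
```

In practice the reference string would be read from h37rv.fasta and the vcf lines from your own files, with start and end set to the coordinates of the resistance gene you are interested in.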
I hope this is helpful. Please reach out if you have further questions.