E-mail: heruiliao2-c@my.cityu.edu.hk
- Python >=3.6
- GraphAligner 1.0.11 (https://github.com/maickrau/GraphAligner)
Make sure these programs are installed and available in your PATH.
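One way to verify the prerequisites before running GraphV is a small check like the one below. This is only a sketch and is not part of GraphV: the helper name `check_prerequisites` is hypothetical, and it assumes the GraphAligner executable is called `GraphAligner`. It does not verify the GraphAligner version.

```python
import shutil
import sys


def check_prerequisites(tools=("GraphAligner",), min_python=(3, 6)):
    """Return a list of problems found; an empty list means nothing is missing."""
    problems = []
    # Compare only (major, minor) of the running interpreter.
    if sys.version_info[:2] < tuple(min_python):
        problems.append("Python >= %d.%d is required" % tuple(min_python))
    # shutil.which returns None when the executable is not found in PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append("%s was not found in PATH" % tool)
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```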
git clone https://github.com/liaoherui/GraphV.git
Next, download the genome graph database of the 8 RNA viruses. Run:
cd GraphV
sh download.sh
If download.sh fails to download the database, try the alternative script. Run:
cd GraphV
sh download_2.sh
If that also fails, please email the author to obtain the database.
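The fallback order described above (download.sh first, then download_2.sh) could be automated roughly as follows. This is a sketch under assumptions: the function name `download_database` is hypothetical, it must be run from inside the GraphV folder, and it assumes each script exits with a non-zero status when the download fails, which the scripts themselves may or may not do.

```python
import subprocess


def download_database(scripts=("download.sh", "download_2.sh"), run=subprocess.call):
    """Try each download script in order; return the first one that succeeds.

    Returns None if every script fails (in that case, contact the author).
    Assumption: a script signals failure through a non-zero exit status.
    """
    for script in scripts:
        if run(["sh", script]) == 0:
            return script
    return None


# Example (run from inside the GraphV folder):
#     used = download_database()
#     if used is None:
#         print("Both download scripts failed; please email the author.")
```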
Run `python GraphV.py -h` to check the usage.
A real SARS-Cov-2 dataset is included in the "Data" folder and can be used for testing.
To run the demo (by default, results are written to a folder named "GraphV"):
python GraphV.py -i Data/SRR10948550_801.fastq -v SCOV2
The table below shows the relationship between each virus name and its virus_type parameter:
Virus Name | virus_type parameter |
---|---|
SARS-Cov-2 | SCOV2 |
HIV | HIV |
HCV | HCV |
Ebolavirus | EBV |
Zika virus | ZKV |
Dengue virus | DGV |
Lassa virus | LSV |
Enterovirus | ETVA |
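For scripted runs, the table above can be kept as a lookup so that an unsupported virus name is caught before GraphV is launched. The sketch below is illustrative only: `VIRUS_TYPES` and `build_command` are hypothetical names, and the command uses only the `-i` and `-v` options shown in the demo.

```python
import shlex

# Virus name -> virus_type parameter, copied from the table above.
VIRUS_TYPES = {
    "SARS-Cov-2": "SCOV2",
    "HIV": "HIV",
    "HCV": "HCV",
    "Ebolavirus": "EBV",
    "Zika virus": "ZKV",
    "Dengue virus": "DGV",
    "Lassa virus": "LSV",
    "Enterovirus": "ETVA",
}


def build_command(fastq_path, virus_name):
    """Build the GraphV argument list for a FASTQ file and a virus name."""
    if virus_name not in VIRUS_TYPES:
        raise ValueError("Unsupported virus: %r" % virus_name)
    return ["python", "GraphV.py", "-i", fastq_path, "-v", VIRUS_TYPES[virus_name]]


if __name__ == "__main__":
    cmd = build_command("Data/SRR10948550_801.fastq", "SARS-Cov-2")
    # Print a shell-safe version of the command for inspection.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can be passed directly to `subprocess.run` from inside the GraphV folder.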
GraphV produces 5 output files:
1. `*.json` --- The alignment result file from GraphAligner.
2. `*_Most_possible_Strain_report.txt` --- The final report generated by GraphV.
3. `*_All_Cov.txt` --- The GraphV result file, sorted by alignment coverage in descending order.
4. `*_All_Cov_by_length.txt` --- The GraphV result file, sorted by alignment length in descending order.
5. `*_Unique_Cov.txt` --- The GraphV result file, sorted by unique coverage in descending order.
Note:
For files 3 and 4 (`*_All_Cov.txt` and `*_All_Cov_by_length.txt`), the columns are: strain name, alignment length, genome length, alignment coverage.
For file 5 (`*_Unique_Cov.txt`), the columns are: strain name, unique alignment length, genome length, unique coverage, strain name in database.
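The four-column layout described in the note above could be read into named records as sketched below. Several things here are assumptions rather than documented behavior: that the columns are tab-separated, that the lengths are integers and the coverage is a decimal number, and that lines which do not fit this pattern can be skipped. `CovRow` and `read_all_cov` are hypothetical names.

```python
import csv
from collections import namedtuple

# Column order follows the note: strain name, alignment length,
# genome length, alignment coverage.
CovRow = namedtuple("CovRow", "strain aligned_length genome_length coverage")


def read_all_cov(lines, delimiter="\t"):
    """Parse *_All_Cov.txt-style rows from an iterable of lines.

    Assumption: fields are tab-separated; pass another delimiter if not.
    """
    rows = []
    for fields in csv.reader(lines, delimiter=delimiter):
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        try:
            row = CovRow(fields[0], int(fields[1]), int(fields[2]), float(fields[3]))
        except ValueError:
            continue  # skip lines whose numeric columns do not parse
        rows.append(row)
    return rows


# Example (the file name is hypothetical):
#     with open("sample_All_Cov.txt") as handle:
#         for row in read_all_cov(handle)[:5]:
#             print(row.strain, row.coverage)
```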