tempus-challenge: A Python repository from ericproffitt

Tempus Challenge

The relevant data is read from test_vcf_data.txt.

For each variant, an ID is built in HGVS sequence variant nomenclature format, and the IDs are submitted to the VEP HGVS API in query blocks of 300.
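As a rough illustration of this step, the sketch below builds a genomic HGVS-style ID from the VCF columns and groups the IDs into blocks of 300. The helper names `hgvs_notation` and `batches` are hypothetical and not taken from tempus_challenge.py, and the indel handling is a simplified assumption about how VCF alleles map onto HGVS.

```python
from typing import Iterator, List


def hgvs_notation(chrom: str, pos: int, ref: str, alt: str) -> str:
    """Build a genomic HGVS-style ID for one VCF allele (hypothetical helper)."""
    if len(ref) == 1 and len(alt) == 1:
        # Single-base substitution, e.g. 1:g.100A>G
        return f"{chrom}:g.{pos}{ref}>{alt}"
    if len(ref) > len(alt) and ref.startswith(alt):
        # Deletion: VCF keeps the shared leading bases, HGVS names only the deleted range
        start, end = pos + len(alt), pos + len(ref) - 1
        span = str(start) if start == end else f"{start}_{end}"
        return f"{chrom}:g.{span}del"
    if len(alt) > len(ref) and alt.startswith(ref):
        # Insertion: named by the two positions flanking the inserted bases
        left = pos + len(ref) - 1
        return f"{chrom}:g.{left}_{left + 1}ins{alt[len(ref):]}"
    # Anything else is written as a deletion-insertion over the REF span
    end = pos + len(ref) - 1
    span = str(pos) if end == pos else f"{pos}_{end}"
    return f"{chrom}:g.{span}delins{alt}"


def batches(ids: List[str], size: int = 300) -> Iterator[List[str]]:
    """Yield consecutive blocks of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]
```

Each block would then go into one POST request (the Ensembl REST documentation describes a `hgvs_notations` list in the request body for this endpoint); the request itself is left out here to keep the sketch runnable offline.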

Data is retrieved from the API in JSON format, processed, and then written to the output file, variants.tsv, which contains the following fields:

chromosome
position
reference allele (humanG1Kv37)
alternative allele
depth of sequence coverage at the site of variation
number of reads supporting the variant
percentage of reads supporting the variant versus reads supporting the reference
gene of the variant
variant class
variant effect
minor allele frequency
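The last four fields come from the API response. The following is a minimal sketch of pulling them out of a single JSON record, assuming the record carries the keys `transcript_consequences`, `gene_symbol`, `variant_class`, `most_severe_consequence`, `colocated_variants` and `minor_allele_freq`. Which keys are present depends on the request options and the API version, so treat the key names and the function `annotate` as assumptions rather than the script's actual logic.

```python
from typing import Any, Dict

NA = "NA"


def annotate(record: Dict[str, Any]) -> Dict[str, str]:
    """Extract gene, class, effect and MAF from one record, defaulting to NA."""
    # Collect distinct gene symbols across all transcript consequences
    genes = sorted({
        tc["gene_symbol"]
        for tc in record.get("transcript_consequences", [])
        if "gene_symbol" in tc
    })
    # Take the first co-located variant that reports a minor allele frequency
    mafs = [
        cv["minor_allele_freq"]
        for cv in record.get("colocated_variants", [])
        if "minor_allele_freq" in cv
    ]
    return {
        "gene": ",".join(genes) if genes else NA,
        "variant_class": str(record.get("variant_class", NA)),
        "variant_effect": str(record.get("most_severe_consequence", NA)),
        "maf": str(mafs[0]) if mafs else NA,
    }


# Illustrative record with made-up values, not real API output
example = {
    "variant_class": "SNV",
    "most_severe_consequence": "missense_variant",
    "transcript_consequences": [{"gene_symbol": "GENE1"}, {"gene_symbol": "GENE1"}],
    "colocated_variants": [{"id": "rs0"}, {"minor_allele_freq": 0.12}],
}
print(annotate(example))
```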

Note that loci with multiple alternative alleles are broken up into separate rows.
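A sketch of how one multi-allelic locus could be split into one row per alternative allele is given below. The function `split_alleles` and its parameters are hypothetical, the comma-separated inputs mirror how VCF lists per-allele values, and the percentage is computed as variant reads over variant plus reference reads, which is only one possible reading of that field.

```python
from typing import Any, Dict, List


def split_alleles(chrom: str, pos: int, ref: str, alts: str,
                  depth: int, alt_counts: str, ref_count: int) -> List[Dict[str, Any]]:
    """Return one row per alternative allele of a single VCF locus."""
    rows = []
    # VCF lists alternative alleles and their read counts in the same order
    for alt, count in zip(alts.split(","), alt_counts.split(",")):
        supporting = int(count)
        total = supporting + ref_count
        # Assumed definition: variant reads as a share of variant + reference reads
        percent = round(100 * supporting / total, 2) if total else None
        rows.append({
            "chromosome": chrom,
            "position": pos,
            "reference": ref,
            "alternative": alt,
            "depth": depth,
            "variant_reads": supporting,
            "variant_percent": percent,
        })
    return rows
```

For example, `split_alleles("1", 100, "A", "G,T", 50, "10,30", 10)` yields two rows, one for G and one for T, that share the same chromosome, position, reference allele and depth.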

Unavailable data is denoted NA.
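One way to get this behaviour when writing the file is sketched below with the standard `csv` module. The column names in `FIELDS` and the function `write_tsv` are hypothetical; the actual header in variants.tsv may differ.

```python
import csv
from typing import Any, Dict, Iterable, TextIO

# Hypothetical column names, in the order of the field list above
FIELDS = [
    "chromosome", "position", "reference", "alternative", "depth",
    "variant_reads", "variant_percent", "gene", "variant_class",
    "variant_effect", "maf",
]


def write_tsv(rows: Iterable[Dict[str, Any]], handle: TextIO) -> None:
    """Write rows as tab-separated values, using NA for missing or None values."""
    writer = csv.DictWriter(
        handle,
        fieldnames=FIELDS,
        delimiter="\t",
        restval="NA",          # fills columns that are absent from a row
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        # Columns that are present but empty are also written as NA
        writer.writerow({key: "NA" if value is None else value
                         for key, value in row.items()})
```

Called as `write_tsv(rows, open("variants.tsv", "w", newline=""))`, this writes a header line followed by one line per row.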

CLI run command:

python3 tempus_challenge.py