Can PGxPOP handle unphased VCFs?
Closed issue · 1 comment
cqgd commented
Hi Greg, Adam,
Many thanks for releasing this tool and for providing a nice overview of CYP AF in UKB!
One question: does PGxPOP handle unphased VCFs? --phased being an optional argument seems to suggest the input can be either phased or unphased:
PGxPOP v1.0
Written by Adam Lavertu and Greg McInnes
with help from PharmGKB.
Copyright (C) 2020 Stanford University.
Distributed under the Mozilla Public License 2.0 open source license.
usage: PGxPOP.py [-h] [-f VCF] [-g GENE] [--phased] [--build BUILD] [--extra_variants] [-d] [-b] [-o OUTPUT]
CityDawg determines star allele haplotypes for samples in a VCF file and outputs predicted pharmacogenetic phenotypes.
optional arguments:
  -h, --help            show this help message and exit
  -f VCF, --vcf VCF     Input VCF
  -g GENE, --gene GENE  Gene to run. Select from list. Run all by default. CFTR, CYP2C9, CYP2D6, CYP4F2, IFNL3, TPMT, VKORC1, CYP2C19,
                        CYP3A5, DPYD, SLCO1B1, UGT1A1, CYP2B6, NUDT15
  --phased              Data is phased. Will try to determine phasing status from VCF by default.
(...)
The GitHub README.md, on the other hand, mentions only phased data input:
PGxPOP is a population-scale PGx allele caller designed to handle 100,000s of samples. Input is a phased VCF file, that has been indexed with tabix.
Many thanks,
Chris
alavertu commented
Hi Chris,
PGxPOP accepts unphased input and will describe the PGx variants it identifies at those loci, but it cannot give proper star allele calls, because those require phase information to determine haplotypes. We suggest phasing samples before running PGxPOP, using EAGLE or a similar tool.
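
To illustrate why phase matters here, the following is a minimal Python sketch, not PGxPOP's own code. It relies only on the VCF convention that phased genotypes separate alleles with `|` and unphased ones with `/`. The function names `is_phased` and `consistent_diplotypes` are hypothetical, and the sketch assumes diploid GT values. It enumerates every haplotype pair consistent with a sample's genotypes at a few sites, showing that unphased heterozygous calls leave more than one candidate.

```python
from itertools import product


def is_phased(gt: str) -> bool:
    """Return True when a diploid VCF GT value uses the phased separator '|'."""
    return "|" in gt and "/" not in gt


def consistent_diplotypes(genotypes):
    """Enumerate the haplotype pairs consistent with a list of diploid GT strings.

    A phased or homozygous site has a single orientation. An unphased
    heterozygous site can be oriented either way, so each such site may add
    further candidate haplotype pairs. (Hypothetical helper for illustration.)
    """
    options = []
    for gt in genotypes:
        a, b = gt.replace("|", "/").split("/")
        if is_phased(gt) or a == b:
            options.append([(a, b)])
        else:
            options.append([(a, b), (b, a)])

    diplotypes = set()
    for combo in product(*options):
        hap1 = tuple(first for first, _ in combo)
        hap2 = tuple(second for _, second in combo)
        # The two chromosome copies are unordered, so store them as a set.
        diplotypes.add(frozenset([hap1, hap2]))
    return diplotypes


# Two heterozygous sites, unphased: the variants may sit on the same
# chromosome copy or on opposite copies, so two haplotype pairs remain.
print(len(consistent_diplotypes(["0/1", "0/1"])))  # 2

# The same sites, phased: exactly one haplotype pair remains.
print(len(consistent_diplotypes(["0|1", "0|1"])))  # 1
```

Since a star allele is defined by the combination of variants on one haplotype, more than one surviving candidate means the call is ambiguous, which is why phasing the samples first is recommended.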