Phenotype data

Question

RafaFariasVarona opened this issue a year ago · 1 comment

Hello,
I'm writing to ask whether it is possible to use PRSice-2 without knowing the phenotype. In my case, I'd like to calculate COVID-19 polygenic risk scores (PRS). First, I obtained the base data (a GWAS associated with COVID-19) and the target data (1000G European samples). Then I followed the steps of your PRS tutorial (https://choishingwan.github.io/PRS-Tutorial/) and computed a PRS for each individual using PLINK. I also used PRSice-2, but I'm not sure about the results it outputs. PRSice-2 requires a phenotype, but I don't have that information available. I therefore decided to use the PRS values computed by PLINK as the phenotype, and I obtained different PRS values. I used the following command in PRSice-2:
Rscript PRSice.R \
    --prsice PRSice_linux \
    --base listasnp.txt \
    --target EUR.QC \
    --binary-target F \
    --pheno EUR.covid \
    --stat OR \
    --thread 3
(EUR.covid contains the PLINK values used as the phenotype.)

I noticed that the PRS values estimated by PRSice-2 differ a lot from those computed by PLINK. Additionally, I used "--no-regress" and obtained yet another set of scores. Which approach should I use: the PLINK scores as the phenotype, or the "--no-regress" option?

Thank you very much for your help.
Regards,
Rafa Farias

Answer 1 · 2023-05-03T15:53:00.000Z

This is not an appropriate use of our software. Based on your comment, I suggest you first read our paper (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7612115/) to better understand what a polygenic score is and what optimization is involved.
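
As a rough illustration of the kind of optimization referred to above, the toy sketch below computes a score at several p-value thresholds. It is not PRSice-2's implementation: the SNP IDs, odds ratios, p-values, and genotypes are made up, the helper names (`polygenic_score`, `scores_at`) are hypothetical, and the score is a plain weighted sum with no clumping or normalisation. It only shows that each threshold gives a different set of scores, so a phenotype is normally what decides which threshold fits best. Using another score as the phenotype would, under these assumptions, mostly select whichever threshold resembles that score.

```python
import math

# Hypothetical toy base data: (SNP id, odds ratio, GWAS p-value).
# These are NOT real COVID-19 results.
base = [
    ("rs1", 1.20, 1e-8),
    ("rs2", 0.85, 1e-4),
    ("rs3", 1.05, 0.03),
    ("rs4", 0.97, 0.40),
]

# Hypothetical effect-allele counts (0, 1 or 2) per individual,
# listed in the same SNP order as `base`.
genotypes = {
    "ind1": [2, 0, 1, 1],
    "ind2": [0, 2, 1, 0],
    "ind3": [1, 1, 2, 2],
}


def polygenic_score(dosages, p_threshold):
    """Sum of log(OR) * allele count over SNPs with p <= p_threshold."""
    return sum(
        math.log(odds) * dosage
        for (_, odds, p), dosage in zip(base, dosages)
        if p <= p_threshold
    )


def scores_at(p_threshold):
    """Score every individual using only SNPs that pass the threshold."""
    return {
        ind: polygenic_score(dosages, p_threshold)
        for ind, dosages in genotypes.items()
    }


# Each threshold keeps a different SNP set, so the scores change.
for threshold in (5e-8, 1e-3, 0.05, 1.0):
    print(threshold, scores_at(threshold))
```

Without a phenotype there is nothing in this sketch to prefer one threshold over another, which is why scores from different runs or tools need not agree.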