2707 caQTL in RASQUAL paper

Question

YaCui opened this issue 5 years ago · 12 comments

YaCui commented 5 years ago

Dear Natsuhiko,
Thanks so much for developing RASQUAL! Could you provide the 2707 caQTLs identified in the RASQUAL paper?

best,
Ya

Answer 1 · 2019-04-19T20:58:40.000Z

Hi,

Here is the link to the google drive: https://drive.google.com/open?id=0B-aFDIHv9Wy3M3kwS1hPM09TRlU

You can find the peak annotation (peaks.bed.gz) as well as the peak IDs at FDR 10% (pid.fdr10.txt).

I would, however, recommend to use the latest caQTL result with 100 British samples presented in our latest paper (https://www.nature.com/articles/s41588-018-0278-6?WT.feed_name=subjects_epigenetics).

Best regards,

Natsuhiko

Answer 2 · 2019-04-19T21:58:04.000Z

Great! Thanks for sharing!

best,
Ya

Answer 3 · 2019-05-26T03:34:27.000Z

Dear Natsuhiko,
I have a small question. How should I determine the values of -l and -m? Can I just use "-l 378 -m 62" in my analysis for all features?

Thanks,
Ya

Answer 4 · 2019-05-26T08:24:54.000Z

You need to count the appropriate numbers of SNPs for each feature yourself. It's relatively easy to count the number of tested SNPs (-l): it is the number of rows in the VCF that is fed to RASQUAL (you can just use the wc command on Linux). If you have enough memory and are not sure how to count the SNPs overlapping with multiple features, you could set the number of feature SNPs (-m) to the number of tested SNPs.
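As a rough illustration of the counting described above, the Python sketch below tallies both numbers for a single feature. It assumes the input is the tab-separated VCF rows that would be piped to RASQUAL, with the position in the second column, and it treats the -s/-e coordinates as inclusive intervals. The function name `count_snps` and the example rows are hypothetical and not part of RASQUAL.

```python
from typing import Iterable, Sequence, Tuple


def count_snps(
    vcf_rows: Iterable[str],
    starts: Sequence[int],
    ends: Sequence[int],
) -> Tuple[int, int]:
    """Return (tested SNPs, feature SNPs) for one feature.

    tested SNPs  : every non-header VCF row (a candidate value for -l)
    feature SNPs : rows whose position lies inside any [start, end]
                   interval of the feature (a candidate value for -m)
    """
    if len(starts) != len(ends):
        raise ValueError("starts and ends must have the same length")
    intervals = list(zip(starts, ends))

    n_tested = 0
    n_feature = 0
    for row in vcf_rows:
        # Skip blank lines and VCF header lines.
        if not row.strip() or row.startswith("#"):
            continue
        pos = int(row.split("\t")[1])  # VCF column 2 is the 1-based position
        n_tested += 1
        # Assumption: interval ends are inclusive.
        if any(start <= pos <= end for start, end in intervals):
            n_feature += 1
    return n_tested, n_feature


# Hypothetical rows; the coordinates come from the example command in this thread.
example_rows = [
    "11\t2316000\trs_a\tA\tG",
    "11\t2317000\trs_b\tC\tT",
    "11\t2320700\trs_c\tG\tA",
    "11\t2330000\trs_d\tT\tC",
]
starts = [2316875, 2320655, 2321750, 2321914, 2324112]
ends = [2319151, 2320937, 2321843, 2323290, 2324279]
print(count_snps(example_rows, starts, ends))
```

In practice the rows could be read from the output of the same tabix query that is piped into RASQUAL, so that the counts match the input exactly.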

Best regards,
Natsuhiko

Answer 5 · 2019-06-19T22:54:18.000Z

Dear Natsuhiko,
I am a little confused about the results of RASQUAL. I can get results like "rasqual_atac_1M.gz", but how can I get the q-values in "Q.val.txt.gz"? It seems that the q-values in "Q.val.txt.gz" are different from the "Log_10 Benjamini-Hochberg Q-value" in "rasqual_atac_1M.gz".

All files are from https://drive.google.com/drive/folders/0B-aFDIHv9Wy3M3kwS1hPM09TRlU.

Thanks,
Ya

Answer 6 · 2019-06-20T08:09:48.000Z

Sorry for the confusion. The file "rasqual_atac_1M.gz" is old and the 10th column is not the Q value. This is because we provide the Q values as a separate file.

Best regards,
Natsuhiko

Answer 7 · 2019-06-20T15:26:11.000Z

Hi Natsuhiko,
So how can I get the Q values file? I cannot produce it if I just run commands like the ones below:

cd $RASQUALDIR
tabix data/chr11.gz 11:2315000-2340000 | bin/rasqual -y data/Y.bin -k data/K.bin -n 24 -j 1 -l 378 -m 62 -s 2316875,2320655,2321750,2321914,2324112 -e 2319151,2320937,2321843,2323290,2324279 -t -f C11orf21 -z

Thanks,
Ya

Answer 8 · 2019-06-21T08:24:43.000Z

Sorry, but I don't understand your problem. I believe Q.val.txt.gz gives you the Q value for each peak in the rasqual_atac_1M.gz file.
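To make the relationship between the two files concrete, here is a minimal Python sketch of how per-peak Q values could be used to pick peaks at FDR 10%. It assumes that the peak IDs and Q values have already been read into two lists in the same order, and that the Q values are on the raw scale rather than log10. Both points are assumptions about the files, and `significant_peaks` is a hypothetical helper.

```python
from typing import List, Sequence


def significant_peaks(
    peak_ids: Sequence[str],
    q_values: Sequence[float],
    fdr: float = 0.10,
) -> List[str]:
    """Return the peak IDs whose Q value is at or below the FDR threshold.

    Assumption: peak_ids[i] and q_values[i] describe the same peak.
    """
    if len(peak_ids) != len(q_values):
        raise ValueError("peak_ids and q_values must have the same length")
    return [pid for pid, q in zip(peak_ids, q_values) if q <= fdr]


# Hypothetical peak IDs and Q values, for illustration only.
print(significant_peaks(["peak1", "peak2", "peak3"], [0.01, 0.50, 0.10]))
```

Whether a peak exactly at the threshold should be included is a convention; the sketch includes it.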

The example command on the GitHub page is for RNA-seq, not for the ATAC-seq data we provided on Google Drive.

Best regards,
Natsuhiko

Answer 9 · 2019-06-21T15:54:53.000Z

Hi Natsuhiko,
Got it. Thank you so much for your help.

Thanks,
Ya

Answer 10 · 2019-11-04T22:58:27.000Z

Hi Natsuhiko,

Regarding the caQTL result with 100 British samples (https://www.nature.com/articles/s41588-018-0278-6?WT.feed_name=subjects_epigenetics): I have your summary statistics with the probabilities, but I don't know what cutoff you use to define a caQTL or how many there are in total. I cannot find this in the paper. Thank you very much!
Paola

Answer 11 · 2019-11-05T08:58:10.000Z

Hi Paola,

The RASQUAL mapping result based on 24 LCLs (not 100 LCLs) is found here: https://drive.google.com/drive/folders/0B-aFDIHv9Wy3M3kwS1hPM09TRlU

The paper you cited is different. In the paper, we used 100 LCLs and performed caQTL mapping with a different approach to detect causal interactions in the genome. Because we used a Bayesian approach, we don't have "significant caQTLs" but just posterior probabilities.

Best regards,
Natsuhiko

Answer 12 · 2019-11-05T22:08:47.000Z

Thank you Natsuhiko!
Yes, I have been using the results from the 24 LCLs of the first study, but since in your comment above you said:
"I would, however, recommend to use the latest caQTL result with 100 British samples presented in our latest paper (https://www.nature.com/articles/s41588-018-0278-6?WT.feed_name=subjects_epigenetics)", I thought that you had also identified caQTLs there, maybe more than with 24 samples, so I thought I would use this new study. Anyway, I can just use the results from the 24 samples!
Thank you very much!!
Paola