Can results from Mustache loop calling be used in enrichment analysis?
Hi,
I was wondering whether there is a way to use precomputed loops (a BEDPE file) from Mustache in the enrichment analysis.
Thank you
Are you asking about using dcHiC and combining its results with Mustache?
If not, you should raise the question in the Mustache issue tracker instead.
In general, though, the loops can be used for enrichment analysis. You just need to find the genes that overlap the loop anchors (if you're doing a gene set enrichment) using bedtools pairtobed.
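To illustrate what that overlap step computes, here is a minimal pure-Python sketch. It is not a replacement for bedtools and is not part of dcHiC. It assumes half-open intervals, and the gene names and coordinates in the usage note are hypothetical placeholders.

```python
def overlaps(start_a, end_a, start_b, end_b):
    """True if two half-open intervals [start, end) share at least one base."""
    return start_a < end_b and start_b < end_a


def genes_at_loop_anchors(loops, genes):
    """Return the sorted names of genes overlapping either anchor of any loop.

    loops: iterable of (chrom1, start1, end1, chrom2, start2, end2)
    genes: iterable of (chrom, start, end, name)
    """
    hits = set()
    for chrom1, start1, end1, chrom2, start2, end2 in loops:
        for g_chrom, g_start, g_end, g_name in genes:
            on_first = g_chrom == chrom1 and overlaps(start1, end1, g_start, g_end)
            on_second = g_chrom == chrom2 and overlaps(start2, end2, g_start, g_end)
            if on_first or on_second:
                hits.add(g_name)
    return sorted(hits)
```

For example, calling genes_at_loop_anchors([("chr1", 1030000, 1035000, "chr1", 1305000, 1310000)], genes) with a hypothetical gene list returns the names of the genes touching either anchor, which can then be passed to a gene set enrichment tool. The nested loop is quadratic, so for genome-wide data bedtools is the more practical choice.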
Yes, this is in regard to using dcHiC and combining the result with Mustache.
For instance, I want to run
Rscript dchicf.r --file samples_p7.txt --pcatype enrich --genome hg38 --diffdir ctrl_vs_hgps_p7_50Kb --exclA T --region both --pcgroup pcQnm --interaction intra --pcscore F --compare F
but with Mustache loops and the differential compartments. How would I go about doing this?
By default, dcHiC expects a file of significant interactions generated from FitHiC. It creates a FithicResult.txt file, which should be under the DifferentialResult/<prefix_resolution>/fithic_run/ directory. This file looks like the following:
ES NPC
chr10_100000000_chr10_100000000 1 1
chr10_100000000_chr10_100100000 1 1
chr10_100000000_chr10_100400000 1 1
chr10_100000000_chr10_102200000 0 1
chr10_100000000_chr10_102300000 1 1
chr10_100000000_chr10_102400000 0 1
Here the columns are sample names (they should be the same as those given in your samples_p7.txt file) and the rows are the interacting regions, each identified by the chromosome and starting coordinate of its two anchors. Each entry gives the interaction category (1 = significant, 0 = not significant) within that sample.
So you can replace this file with one built from your Mustache loops (keep the name FithicResult.txt) and run dcHiC. It will do the job.
Sorry, I'm a little confused about the format. Suppose my two Mustache loop files look like this:
ctrl:
> chr1 1030000 1035000 chr1 1305000 1310000
> chr1 1120000 1125000 chr1 1240000 1245000
> chr1 1120000 1125000 chr1 1290000 1295000
hgps:
> chr1 985000 990000 chr1 1290000 1295000
> chr1 1120000 1125000 chr1 1285000 1290000
> chr1 1125000 1130000 chr1 1215000 1220000
> chr1 1745000 1750000 chr1 1900000 1905000
Should I join the starting coordinates from each file? And do the 1s and 0s then indicate whether that specific interaction occurs in each sample?
I also get this error when running:
(dchic) naveen@Naveens-MacBook-Pro dchic % ./enrich.sh
Finding gene enrichment for ctrl_p7 hgps_p7 samples
[1] "ctrl_p7" "hgps_p7"
[1] "hg38_50000_goldenpathData"
Error in file(file, "rt") : cannot open the connection
Calls: geneEnrichment -> read.table -> file
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'DifferentialResult/ctrl_vs_hgps_p7_50Kb/fdr_result/differential.intra_compartmentLoops.bedpe': No such file or directory
Execution halted
For your example, the file should look like the following:
ctrl hgps
chr1_985000_chr1_1290000 0 1
chr1_1030000_chr1_1305000 1 0
chr1_1120000_chr1_1240000 1 0
chr1_1120000_chr1_1285000 0 1
...
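The table above can be built from the two Mustache files with a short script. The following is a minimal sketch, not part of dcHiC or Mustache. The function names are hypothetical, and it assumes that the output is tab-separated with a header holding only the sample names; check both against a FithicResult.txt produced by your own dcHiC run. It uses the anchor start coordinates exactly as they appear in the loop files and does not rebin them to the dcHiC resolution.

```python
def read_loop_keys(bedpe_path):
    """Return the set of (chrom1, start1, chrom2, start2) tuples in a BEDPE file."""
    keys = set()
    with open(bedpe_path) as handle:
        for line in handle:
            fields = line.split()
            # Skip blank lines, comment lines, and a header row if one is present.
            if len(fields) < 6 or line.startswith("#") or not fields[1].isdigit():
                continue
            keys.add((fields[0], int(fields[1]), fields[3], int(fields[4])))
    return keys


def write_presence_matrix(loops_by_sample, out_path):
    """Write one row per loop with a 1/0 flag per sample.

    loops_by_sample: dict mapping sample name -> set returned by read_loop_keys.
    Assumption: tab-separated output, header row containing only sample names.
    """
    samples = list(loops_by_sample)
    all_keys = sorted(set().union(*loops_by_sample.values()))
    with open(out_path, "w") as out:
        out.write("\t".join(samples) + "\n")
        for key in all_keys:
            row_name = "_".join(str(part) for part in key)
            flags = ["1" if key in loops_by_sample[s] else "0" for s in samples]
            out.write("\t".join([row_name] + flags) + "\n")
```

A call such as write_presence_matrix({"ctrl": read_loop_keys("ctrl.bedpe"), "hgps": read_loop_keys("hgps.bedpe")}, "FithicResult.txt") would produce the rows shown above, where the two BEDPE file names are placeholders. The dictionary keys become the column names, so they should match the sample names in your samples file.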
Regarding the error, you should first run the --pcatype dloop step and then perform the enrichment. The dloop step creates the differential.intra_compartmentLoops.bedpe file that the enrichment step is failing to open.
I am closing this issue. Please open another thread if the issue persists.
Thanks.