giggle: Could not open human_hm_sort/3651_sort_peaks.narrowPeak.bed.gz.
To whom it may concern,
When I tried to build an index from the Cistrome histone modification peak files, giggle reported an error saying "Could not open XXX". The command was:
giggle index -i "human_hm_sort/*.gz" -o human_hm_index -f -s
The error message:
Could not open file 'human_hm_sort/3651_sort_peaks.narrowPeak.bed.gz'
giggle: Could not open human_hm_sort/3651_sort_peaks.narrowPeak.bed.gz.
I have already sorted these BED files using the sort_bed script.
I think this error message is too vague. How can I fix this problem?
Thanks for your kind advice.
Xiu
Hi,
I am sorry to trouble you again. This error has continued to confuse me over the past few days, although I have tried many measures to fix it. I believe GIGGLE's developers still have a responsibility to maintain it after its release. I hope to receive help from you.
Here are my commands:
/home/xxzhang/workplace/software/giggle/scripts/sort_bed "./named/[A-J]*" ./named_sort/ 30
time giggle index -i "./named_sort/*gz" -o ./named_sort_b -s -f
and this is the error:
Could not open file './named_sort/H3K27ac_H1_Embryonic_Stem_Cell_Embryo_.18.bed.gz'
giggle: Could not open ./named_sort/H3K27ac_H1_Embryonic_Stem_Cell_Embryo_.18.bed.gz.
I am using the Cistrome histone mark data in my project and I expect to get some enrichment results using GIGGLE.
However, I keep running into the same error: it says "Could not open file ..." with no further information.
I have read the previous GIGGLE issues and tried several things, but they all failed:
(1) I have checked that the Cistrome files to be indexed are all tab-delimited.
(2) I have set ulimit -c 100000.
(3) I also put all the files in one folder and ran the commands above.
(4) I tried indexing other files and met the same error.
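Checks like (1) can be scripted so that every input file is tested the same way. The following is a minimal sketch, not part of GIGGLE: `check_inputs` is a hypothetical helper name, and the three-field rule (chrom, start, end) is an assumption about what a usable BED record looks like. It reports files that are unreadable, fail `gzip -t`, or whose first line is not tab-delimited.

```shell
#!/usr/bin/env bash
# check_inputs DIR  (hypothetical helper, not a GIGGLE command)
# Prints one line per suspicious .gz file in DIR; returns non-zero if any were found.
check_inputs() {
    local dir="$1" f status=0
    for f in "$dir"/*.gz; do
        [ -e "$f" ] || continue                      # glob matched nothing
        if [ ! -r "$f" ]; then
            echo "unreadable: $f"; status=1; continue
        fi
        if ! gzip -t "$f" 2>/dev/null; then          # integrity test of the gzip stream
            echo "corrupt gzip: $f"; status=1; continue
        fi
        # Assumption: the first record should have at least 3 tab-separated fields.
        if ! gzip -dc "$f" 2>/dev/null | head -n 1 |
             awk -F'\t' 'NR == 1 { ok = (NF >= 3) } END { exit !ok }'; then
            echo "not tab-delimited BED: $f"; status=1
        fi
    done
    return "$status"
}
```

For example, `check_inputs ./named_sort` prints nothing when every file passes. A clean run does not rule out other causes of the error; it only excludes damaged or malformed inputs.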
I really do not know what to do next. Your advice is really important to me. This problem has puzzled me for nearly a month.
If this problem cannot be fixed, I may try other tools to solve my problem.
Thanks!
Xiu
Hi, everyone! I solved this problem successfully.
Maybe the reason is that GIGGLE requires more memory or other computational resources than the usual user limits allow.
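One user limit that could produce a "Could not open" error is the per-process cap on open file descriptors. That cause is an assumption and is not confirmed anywhere in this thread. Note that `ulimit -c` sets the core dump size, while `ulimit -n` sets the number of open files. The sketch below compares the number of input files against the soft limit; `count_gz` and `fits_open_file_limit` are hypothetical helper names, and the headroom of 16 descriptors is an arbitrary illustrative value.

```shell
#!/usr/bin/env bash
# count_gz DIR: number of .gz files directly inside DIR.
count_gz() {
    find "$1" -maxdepth 1 -type f -name '*.gz' | wc -l | tr -d ' '
}

# fits_open_file_limit DIR [LIMIT]
# Succeeds if the files in DIR, plus some headroom, fit under LIMIT.
# LIMIT defaults to the shell's soft open-file limit (ulimit -Sn).
fits_open_file_limit() {
    local n limit
    n=$(count_gz "$1")
    limit="${2:-$(ulimit -Sn)}"
    [ "$limit" = "unlimited" ] && return 0
    # Headroom of 16 is an assumption covering stdin/stdout/stderr and output files.
    [ $(( n + 16 )) -le "$limit" ]
}
```

A possible use is `fits_open_file_limit ./named_sort || echo "more input files than the open-file limit allows"`. If the check fails, the soft limit can be raised for the current shell with `ulimit -Sn <value>`, up to the hard limit shown by `ulimit -Hn`, or the input can be split into smaller batches.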
The way I solved it was to split the files into several parts and build an index for each part separately. I then combined the search results from each index into the final results.
The detailed commands are as follows:
(base) [xxzhang@cu08 human_histone_mark]$ mkdir H3K27me3
(base) [xxzhang@cu08 human_histone_mark]$ cp ./named_sort/H3K27me3* ./H3K27me3/
(base) [xxzhang@cu08 human_histone_mark]$ cd ./H3K27me3/
(base) [xxzhang@cu08 H3K27me3]$ mkdir named_H3K27me3_s1
(base) [xxzhang@cu08 H3K27me3]$ mkdir named_H3K27me3_s2
(base) [xxzhang@cu08 H3K27me3]$ mkdir named_H3K27me3_s3
(base) [xxzhang@cu08 H3K27me3]$ ls -Q ./ |head -500 |xargs -i mv ./{} ./named_H3K27me3_s1/
ls: write error: Broken pipe
(base) [xxzhang@cu08 H3K27me3]$ ls -Q ./ |head -500 |xargs -i mv ./{} ./named_H3K27me3_s2/
ls: write error: Broken pipe
(base) [xxzhang@cu08 H3K27me3]$ mv ./*.gz ./named_H3K27me3_s3/
(base) [xxzhang@cu08 H3K27me3]$ giggle index -i "./named_H3K27me3_s1/*" -o ./named_H3K27me3_s1_index -s -f
Indexed 5884451 intervals.
(base) [xxzhang@cu08 H3K27me3]$ giggle index -i "./named_H3K27me3_s2/*" -o ./named_H3K27me3_s2_index -s -f
Indexed 4270175 intervals.
(base) [xxzhang@cu08 H3K27me3]$ giggle index -i "./named_H3K27me3_s3/*" -o ./named_H3K27me3_s3_index -s -f
Indexed 4924467 intervals.
(base) [xxzhang@cu08 H3K27me3]$ cp ../Hs_repeat.bed.gz ./
(base) [xxzhang@cu08 H3K27me3]$ giggle search -i ./named_H3K27me3_s1_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.H3K27me3_s1.result
(base) [xxzhang@cu08 H3K27me3]$ giggle search -i ./named_H3K27me3_s2_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.H3K27me3_s2.result
(base) [xxzhang@cu08 H3K27me3]$ giggle search -i ./named_H3K27me3_s3_index/ -q Hs_repeat.bed.gz -s >Hs_repeat.bed.gz.giggle.H3K27me3_s3.result
(base) [xxzhang@cu08 H3K27me3]$ cat Hs_repeat.bed.gz.giggle.H3K27me3_s* >Hs_repeat.bed.gz.giggle.H3K27me3_all.result
(base) [xxzhang@cu08 H3K27me3]$ awk '$8>0' Hs_repeat.bed.gz.giggle.H3K27me3_all.result >repeat_positive.H3K27me3.result
This solution is rather complex, but it fixes the problem.
I hope this gives you some clues for your own problems.
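The manual steps above can be generalized into a small script. This is a sketch under stated assumptions, not the sharding script from the GIGGLE repository: `shard_files`, `run_shards`, and `merge_results` are hypothetical helper names, the `giggle` calls use only the flags that appear in the transcript above, and `merge_results` assumes each result file begins with a header line starting with `#`. Iterating over a shell glob instead of `ls | head | xargs` also avoids the "Broken pipe" messages.

```shell
#!/usr/bin/env bash
# shard_files SRC_DIR SHARD_SIZE PREFIX
# Move SRC_DIR/*.gz into PREFIX_1, PREFIX_2, ... with at most SHARD_SIZE files each.
shard_files() {
    local src="$1" size="$2" prefix="$3" i=0 shard f
    for f in "$src"/*.gz; do                 # glob is expanded once, before any move
        [ -e "$f" ] || continue
        shard="${prefix}_$(( i / size + 1 ))"
        mkdir -p "$shard"
        mv -- "$f" "$shard"/
        i=$(( i + 1 ))
    done
}

# run_shards PREFIX QUERY
# Index and search each shard directory PREFIX_1, PREFIX_2, ... in turn.
# Requires giggle on PATH; flags are copied from the commands in this thread.
run_shards() {
    local prefix="$1" query="$2" n=1
    while [ -d "${prefix}_${n}" ]; do
        giggle index -i "${prefix}_${n}/*" -o "${prefix}_${n}_index" -s -f || return 1
        giggle search -i "${prefix}_${n}_index" -q "$query" -s \
            > "${prefix}_${n}.result" || return 1
        n=$(( n + 1 ))
    done
}

# merge_results FILE...
# Concatenate result files, keeping a leading '#' header line only from the first file.
merge_results() {
    awk 'FNR == 1 && NR != 1 && /^#/ { next } { print }' "$@"
}
```

With these definitions, the workflow would be `shard_files ./named_sort 500 ./shard`, then `run_shards ./shard Hs_repeat.bed.gz`, then `merge_results ./shard_*.result > all.result`. The shard size of 500 mirrors the `head -500` used above and may need adjusting for other data sets.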
Hi Xiu, sorry we missed your issue. We ran into similar issues in the past for large indices and employed a similar sharding strategy. If you look under the sharding
directory in this repository, you will find a script that can be used to build and search a sharded giggle index (along with instructions for running).