ay-lab/FitHiChIP

'No child processes' in select, wrong formatted BED file and python3 error

moritzschaefer opened this issue · 3 comments

When I run FitHiChIP, everything runs smoothly up to a certain point, where a couple of (probably related) error messages occur. Here is the log:

===>>> using bias regression --- modeled fit_spline_coeff_Intercept
===>>> using bias regression --- modeled fit_spline_coeff_Logbias1
===>>> using bias regression --- modeled fit_spline_coeff_Logbias2

*** modeled the bias regression based probability and expected contact count ****

*** p-value estimation is complete for the bias regression ****

Warning messages:
1: In selectChildren(ac[!fin], -1) : error 'No child processes' in select
2: In selectChildren(ac[!fin], -1) : error 'No child processes' in select

******** FINISHED calling significant interactions
----- Extracted significant interactions ---- FDR threshold lower than: 0.01

---- Within function of plotting distance vs contact count ----
Input interaction file: /home/schamori/Hi-ChIP/data/fithichip/out/FitHiChIP_Peak2Peak_b2500_L20000_U2000000/Coverage_Bias/FitHiC_BiasCorr/WT.interactions_FitHiC_Q0.01.bed
Output plot file: /home/schamori/Hi-ChIP/data/fithichip/out/FitHiChIP_Peak2Peak_b2500_L20000_U2000000/Coverage_Bias/FitHiC_BiasCorr/WT.interactions_FitHiC_Q0.01_Dist_CC.png
Error in InpLoopData[, 5] - InpLoopData[, 2] :
non-numeric argument to binary operator
Execution halted
generated WashU epigenome browser compatible significant interactions
********** Merged filtering option is true ************
******** applying merge filtering on the FitHiChIP significant interactions ******
File "./src/CombineNearbyInteraction.py", line 148
print '****** Merge filtering of adjacent loops is enabled '
^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print('* Merge filtering of adjacent loops is enabled *****')?
----- Applied merged filtering (connected component model) on the adjacent loops of FitHiChIP
cat: /home/schamori/Hi-ChIP/data/fithichip/out/FitHiChIP_Peak2Peak_b2500_L20000_U2000000/Coverage_Bias/FitHiC_BiasCorr/Merge_Nearby_Interactions/WT.interactions_FitHiC_Q0.01_MergeNearContacts.bed: No such file or directory
SORRY !!!!!!!! FitHiChIP could not find any statistically significant interactions after applying merge filtering on the generated set of loops !!
Option 1: use significant loops without merge filtering
Option 2: If the number of significant loops (without merge filtering) is also very low, Check the input parameters, or check if the number of input nonzero co

Now I'm a little confused about whether these error messages are independent of one another or caused by a common issue. The only error that is straightforward for me to understand is that your scripts are not compatible with python3; for the others, I would highly appreciate a hint towards a strategy for fixing them.

I use the most recent master branch.

Thank you for your time

The reason for the second error is that, in the generated BED file, the following header row appears somewhere in the middle of the file (not in the first row):

chr1    s1      e1      chr2    s2      e2      cc      Coverage1       isPeak1 Bias1   Mapp1   GCContent1      RESites1        Coverage2       isPeak2 Bias2   Mapp2   GCContent2      RESites2        p       exp_cc_Bias     p_Bias  dbinom_Bias     P-Value_Bias    Q-Value_Bias
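For anyone who wants to check whether their own output is affected, here is a minimal sketch that reports where a header-like row sits in a BED-style file. The file name and the three toy rows are made up for illustration; only the first header fields (chr1, s1) are taken from the row above.

```shell
#!/bin/sh
# Sketch: report the line number(s) where a header-like row appears.
# The file name and data rows below are hypothetical toy input.
set -eu

workdir=$(mktemp -d)
bed_file="$workdir/example.interactions.bed"

# Toy file with the header row misplaced on line 2.
printf '1\t10000\t12500\t1\t50000\t52500\t5\n'  > "$bed_file"
printf 'chr1\ts1\te1\tchr2\ts2\te2\tcc\n'      >> "$bed_file"
printf 'X\t2500\t5000\tX\t30000\t32500\t8\n'   >> "$bed_file"

# A header row starts with "chr1", whitespace, then the literal "s1".
header_lines=$(grep -n '^chr1[[:space:]]\{1,\}s1[[:space:]]' "$bed_file" | cut -d: -f1)
echo "header found on line(s): $header_lines"
```

Any reported line number other than 1 means the header ended up among the data rows, which would explain the non-numeric argument error when R subtracts column 2 from column 5.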

I had a look in the code and observed the following:

  • The generated "Interactions.bed" files already have this header, but it's at the right position (first row)
  (base) schamori@mhs-cclab-srv001:~/Hi-ChIP/data/fithichip/out/FitHiChIP_Peak2Peak_b2500_L20000_U2000000/Coverage_Bias$ head -n 1 ../../FitHiChIP_ALL2ALL_b2500_L20000_U2000000/Coverage_Bias/Interactions.bed
chr1    s1      e1      chr2    s2      e2      cc      Coverage1       isPeak1 Bias1   Mapp1   GCContent1      RESites1        Coverage2       isPeak2 Bias2 Mapp2    GCContent2      RESites2
(base) schamori@mhs-cclab-srv001:~/Hi-ChIP/data/fithichip/out/FitHiChIP_Peak2Peak_b2500_L20000_U2000000/Coverage_Bias$ head -n 1 Interactions.bed
chr1    s1      e1      chr2    s2      e2      cc      Coverage1       isPeak1 Bias1   Mapp1   GCContent1      RESites1        Coverage2       isPeak2 Bias2 Mapp2    GCContent2      RESites2
  • The generated file "Interactions.sortedGenDist.bed" still has the header in the right row (first row)
  • The script src/FitHiC_SigInt.r generates the file .interactions_FitHiC.bed, where the header row ends up in the wrong position (apparently sorted by the first column).
  • The script src/FitHiC_SigInt.r takes a command-line argument 'headerInp' (which is indeed provided in the main FitHiChIP shell script) and should correctly read the Interactions.sortedGenDist.bed file, treating the first row as a header row.
  • In
    system(paste('sort -k1,1 -k2,2n -k5,5n', paste0('-k',opt$cccol,',',opt$cccol,'nr'), temp_outfile, '>', opt$OutFile))
    , sort is called on the whole file, so the header row gets sorted into the data like any other row

I think the issue here might be that I am working with 'chr'-less chromosome names (e.g. 1, 2, ..., X, Y). With the other chromosome annotation (chr1, chr2, chr3), this error wouldn't show up, since the header row would accidentally end up in the first position after sorting.
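This hypothesis can be checked on toy data. The sketch below runs the first two sort keys from the command above on two tiny made-up files, one with 'chr'-prefixed names and one without, and prints the line on which the header lands. The file names and rows are hypothetical, and the locale is pinned to C so the result is reproducible; under other locales the header may land on a different line of the 'chr'-less file.

```shell
#!/bin/sh
# Sketch: where does the header row land after a plain sort?
# Toy data and file names are hypothetical.
set -eu
export LC_ALL=C   # fixed collation order for reproducible results

workdir=$(mktemp -d)

# Header plus two data rows, with 'chr'-prefixed chromosome names.
printf 'chr1\ts1\te1\nchr2\t5000\t7500\nchr1\t10000\t12500\n' > "$workdir/prefixed.bed"
# Same layout, but with 'chr'-less chromosome names.
printf 'chr1\ts1\te1\n2\t5000\t7500\n1\t10000\t12500\n' > "$workdir/bare.bed"

# Sort on chromosome, then numerically on start, and print the header's line number.
header_line() {
  sort -k1,1 -k2,2n "$1" | grep -n '^chr1[[:space:]]s1' | cut -d: -f1
}

prefixed_pos=$(header_line "$workdir/prefixed.bed")
bare_pos=$(header_line "$workdir/bare.bed")

echo "chr-prefixed names: header on line $prefixed_pos"
echo "chr-less names:     header on line $bare_pos"
```

In the prefixed file the header ties with the chr1 data rows on the first key, and its non-numeric start field "s1" compares as 0 under the numeric key, so it stays on line 1. In the 'chr'-less file the header no longer ties with any data row, and in the C locale it sorts after them.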

In any case, the solution would be to sort the file while ignoring the header row, as shown, for example, in this Stack Overflow thread:

https://stackoverflow.com/questions/14562423/is-there-a-way-to-ignore-header-lines-in-a-unix-sort
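A minimal sketch of one way to do this in plain shell: write the first line through unchanged and sort only the remaining lines. The file names, the toy rows, and the value 7 for the contact-count column (standing in for opt$cccol) are assumptions for illustration, and this is not necessarily how the fix in the PR is implemented.

```shell
#!/bin/sh
# Sketch: sort an interaction file while keeping the header as the first line.
# File names, toy rows and cccol=7 are hypothetical stand-ins.
set -eu
export LC_ALL=C   # fixed collation order for reproducible results

workdir=$(mktemp -d)
in_file="$workdir/temp_outfile.bed"
out_file="$workdir/sorted.bed"

# Toy input: header first, then unsorted rows with 'chr'-less names.
printf 'chr1\ts1\te1\tchr2\ts2\te2\tcc\n'       > "$in_file"
printf '2\t5000\t7500\t2\t40000\t42500\t3\n'   >> "$in_file"
printf 'X\t2500\t5000\tX\t30000\t32500\t8\n'   >> "$in_file"
printf '1\t10000\t12500\t1\t50000\t52500\t5\n' >> "$in_file"

cccol=7   # assumed contact-count column

{
  # Emit the header line untouched ...
  head -n 1 "$in_file"
  # ... and sort only the data rows with the same keys as the original call.
  tail -n +2 "$in_file" | sort -k1,1 -k2,2n -k5,5n -k"${cccol},${cccol}nr"
} > "$out_file"

cat "$out_file"
```

The same pattern could be placed inside the system(paste(...)) call in the R script, since it only relies on head, tail and sort.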

I've posted a potential solution in PR #58 and am testing it at this moment.

Hi @moritzschaefer
Thanks for using our package and for all your suggestions. I have now updated the merge filtering routine to support python3 (version >= 3.4) and also resolved the indentation issue. Please give it a try and let me know if you have any questions.

Thank you for having a look into it!

Is there a specific reason you didn't use the corresponding PRs (e.g. #57)? Should I close them?