harmancilab/EpiSAFARI

get_valleys core dump


Hello. I'm getting a core dump when running the get_valleys function. This is somewhat related to the core dump I was getting with spline fitting, in that it fails with larger files. The files don't all fail on the same chromosome; some get a little further than others before failing. I've been working with my IT department to run it on our HPC cluster. They've determined that it fails at line 1390 in src/epsfr_episafari_utils.cpp when cur_tot_signal / 200 turns out to be larger than the size of the log_factorials array (1000*1000).
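
For context, my rough picture of the failing code path (pieced together from what IT reported; the actual source at line 1390 may differ, and the local variable names below are just placeholders):

double* log_factorials = buffer_log_factorials(1000*1000);   // buffer holds 1,000,000 entries
long log_fact_i = (long)(cur_tot_signal / 200);               // index grows with read depth
double log_fact = log_factorials[log_fact_i];                 // reads past the buffer when log_fact_i >= 1000*1000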

I'm running it with:

max_trough_sig=1000
min_summit_sig=5
min_summit2trough_frac=1.2
max_summit2trough_dist=1000
min_summit2trough_dist=0
min_multimapp=1.2
sparse_data=0
pval_type=0

EpiSAFARI -get_valleys -signal_dir ${PROC_READS_DIR} -mmapp_dir ${mmap_dir} -genome_dir ${seq_dir} \
	-max_signal_at_trough ${max_trough_sig} -min_signal_at_summit ${min_summit_sig} \
	-f_min ${min_summit2trough_frac} -l_min ${min_summit2trough_dist} -l_max ${max_summit2trough_dist} \
	-max_mmap ${min_multimapp} -max_qval 0.01 -sparse_profile ${sparse_data} -pval_type ${pval_type}

Hi, yes, you have done the hard part for this! Thanks very much. Your data are very high depth compared to the data we ran for the manuscript. It makes sense that the code segfaults at different points because it goes over the array boundary and may or may not make an illegal access. There is a call around the line you pointed out. It goes something like this:

double* log_factorials = buffer_log_factorials(1000*1000);

This call stores precomputed factorials to make the p-value computations go faster. You can increase that value to something like

double* log_factorials = buffer_log_factorials(20*1000*1000);

I assumed a scaling of 20 above. Assuming 1000*1000 factorials are enough for the 30-40 million reads in a typical ChIP-seq dataset, you could, in principle, find the scaling factor based on your library size.
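
For reference, the buffer just holds precomputed log(k!) values, conceptually something like the sketch below (illustrative only, not the actual helper in the code; the 30-million-reads-per-1000*1000-entries ratio is the rough assumption from above):

#include <cmath>

// Illustrative sketch of a log-factorial buffer -- not the actual EpiSAFARI helper.
double* buffer_log_factorials_sketch(long n_entries)
{
	double* log_facts = new double[n_entries];
	log_facts[0] = 0.0;                                    // log(0!) = 0
	for (long k = 1; k < n_entries; k++)
	{
		log_facts[k] = log_facts[k - 1] + log((double)k); // log(k!) = log((k-1)!) + log(k)
	}
	return log_facts;
}

// Rough sizing from library size, assuming 1000*1000 entries per ~30 million reads:
long n_reads = 500l * 1000 * 1000;                                     // your library size
long scale = (n_reads + 30l * 1000 * 1000 - 1) / (30l * 1000 * 1000);  // round up
double* log_factorials = buffer_log_factorials_sketch(scale * 1000l * 1000);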

We clearly need to make sure the code outputs an error rather than reading uncharted memory.
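In the meantime, a guard along these lines would turn the crash into a clear error (a sketch only; n_log_factorials is a placeholder for whatever count was passed to buffer_log_factorials):

#include <cstdio>
#include <cstdlib>

// Sketch: fail loudly instead of reading past the end of the log_factorials buffer.
// n_log_factorials is assumed to be the count that was passed to buffer_log_factorials().
void check_log_factorial_index(long log_fact_i, long n_log_factorials)
{
	if (log_fact_i >= n_log_factorials)
	{
		fprintf(stderr, "log_factorials buffer too small: index %ld >= size %ld; "
			"increase the value passed to buffer_log_factorials().\n",
			log_fact_i, n_log_factorials);
		exit(1);
	}
}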

Hope this helps.

That did the trick! For reference, scaling by 20 handled files as large as 500M reads. I've got one file that's 550M reads and a pooled set that's 1500M; those two failed, so I'll just bump the scaling up again.

Thanks for your rapid and helpful responses! I realise my data are bizarrely large.