LohseLab/gIMble

Simulating individual blocks is prohibitively slow

Opened this issue · 2 comments

Hi,

I have found that simulating windows that each contain a single block is very slow. The command below gets through only 100 of 200_000_000 simulations in one hour.

gimble simulate -o sims -s DIV -p 50 -e 44 -a 1 -b 1 -r 100 -w 2_000_000 -n 1 -l 100 -u 2.9e-9 -k 10,10,10,10 -m DIV -A 100_000 -B 100_000 -C 100_000 -T 200_000 --rec_rate 0.6

If I instead simulate the same number of blocks in only 10 windows per replicate (each with 200_000 blocks), then the simulation finishes in less than two hours.

gimble simulate -o sims -s DIV -p 50 -e 44 -a 1 -b 1 -r 100 -w 10 -n 200_000 -l 100 -u 2.9e-9 -k 10,10,10,10 -m DIV -A 100_000 -B 100_000 -C 100_000 -T 200_000 --rec_rate 0.6
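To make the comparison concrete, here is a rough sketch of the arithmetic behind the two commands. It assumes that -w is the number of windows per replicate, -n the number of blocks per window and -r the number of replicates, and that the observed rate of 100 windows per hour would stay constant; the function name is made up for illustration.

```python
def totals(windows, blocks_per_window, replicates):
    """Return (window_count, block_count) summed over all replicates."""
    window_count = windows * replicates
    block_count = window_count * blocks_per_window
    return window_count, block_count


# First command: -w 2_000_000 -n 1 -r 100
single_block = totals(windows=2_000_000, blocks_per_window=1, replicates=100)
# Second command: -w 10 -n 200_000 -r 100
many_blocks = totals(windows=10, blocks_per_window=200_000, replicates=100)

print(single_block)  # (200000000, 200000000)
print(many_blocks)   # (1000, 200000000)

# Assumption: the observed throughput of 100 windows per hour stays constant.
observed_rate = 100
projected_hours = single_block[0] / observed_rate
print(projected_hours)  # 2000000.0
```

Both runs simulate 200_000_000 blocks in total, but the first produces 200_000_000 windows while the second produces only 1_000.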

I am using 50 parallel processes in the above sims, but it is just as slow if I use only one process.

Cheers,

Alex

Thanks @A-J-F-Mackintosh!
I will do some profiling to find out what is going on.

KLohse commented

Thanks @A-J-F-Mackintosh! My guess is that, when simulating few blocks but very many windows, most of the time is spent writing the window-wise summaries into the zarr store, where each window has its own bSFS tally.
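One way to check a guess like this is to profile a run with many single-block windows and compare the cumulative time spent simulating with the time spent writing. The sketch below uses Python's built-in cProfile on made-up stand-in functions; simulate_window, write_window_tally and run are hypothetical and are not gIMble functions, and a plain dict stands in for the store.

```python
import cProfile
import io
import pstats


def simulate_window(n_blocks):
    # Hypothetical stand-in for simulating the blocks of one window.
    return [b % 7 for b in range(n_blocks)]


def write_window_tally(store, window_id, blocks):
    # Hypothetical stand-in for writing one window's tally to a store.
    tally = {}
    for b in blocks:
        tally[b] = tally.get(b, 0) + 1
    store[window_id] = tally


def run(n_windows, n_blocks):
    store = {}  # a plain dict stands in for the real store
    for w in range(n_windows):
        write_window_tally(store, w, simulate_window(n_blocks))
    return store


profiler = cProfile.Profile()
profiler.enable()
store = run(n_windows=2_000, n_blocks=1)
profiler.disable()

# Sort by cumulative time so the most expensive call paths come first.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

In the real code, if the function that writes each window's tally dominated the cumulative time column, that would be consistent with the guess above.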