Runtime estimation
Hi,
do you know the estimated run time for AS discovery on a human RNA-Seq experiment? I have been running ASpli on just a few human alignment files with around 50 million reads each (40 cores, 384 GB RAM) for more than 10 hours. Is this expected, or might something be wrong?
Thank you!
Hi Olga,
It is not possible to estimate time and memory requirements in advance.
How deep is your sequencing? And how many samples are you trying to process?
Thanks and sorry for the inconvenience!
Estefi
Hi,
I have two case samples and two control samples. Each has around 50 million paired-end reads, and the read length is 76 bp. The AsDiscover function has been running for more than a day. I was just wondering whether I should simply wait or whether I should check my input files.
Thank you!
Hi,
I am still not sure where the issue was: my first run on the human genome never finished. But when I split the human GTF annotation file (version 98) by chromosome and ran ASpli with the same input files, chromosome by chromosome, it took only several hours. Could it be that processing the whole GTF file at once takes a lot of resources (time, memory)?
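For anyone who wants to try the same workaround, here is a minimal sketch of the splitting step in Python. It is not part of ASpli; the function name `split_gtf_by_chromosome` and the output file naming are made up for illustration. It assumes a standard tab-separated GTF whose first column is the chromosome name and whose comment lines start with `#`.

```python
import os


def split_gtf_by_chromosome(gtf_path, out_dir):
    """Write one GTF file per chromosome and return {chromosome: output path}.

    Hypothetical helper, not an ASpli function. Comment lines seen before a
    chromosome's first record are copied to the top of that chromosome's file.
    """
    os.makedirs(out_dir, exist_ok=True)
    handles = {}   # chromosome name -> open output file
    header = []    # comment lines collected so far
    try:
        with open(gtf_path) as src:
            for line in src:
                if line.startswith("#"):
                    header.append(line)
                    continue
                if not line.strip():
                    continue  # skip blank lines
                chrom = line.split("\t", 1)[0]
                if chrom not in handles:
                    out_path = os.path.join(out_dir, f"{chrom}.gtf")
                    handles[chrom] = open(out_path, "w")
                    handles[chrom].writelines(header)
                handles[chrom].write(line)
    finally:
        for handle in handles.values():
            handle.close()
    return {chrom: handle.name for chrom, handle in handles.items()}
```

Each resulting file can then be used as the annotation for a separate ASpli run. Note that the sketch keeps one output file open per sequence name, so an annotation with a very large number of scaffolds may need a different approach.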
Yes, it could be.
It depends on a combination of genome size, coverage, and the number of samples / replicates.
For example, I used to analyze 36 rice samples on a machine with 64 GB of RAM.
We are about to update ASpli with a version that improves the processing of BAM files.
I'll let you know as soon as it is available.
Thanks and sorry for the inconvenience,
Hi,
thank you!
Hi Olga!
Can you check our updated version of the package?
http://bioconductor.org/packages/devel/bioc/html/ASpli.html
It should overcome the memory problem when loading BAM files!
Please take a look and keep in touch. I'll be glad to help you!
thanks
Estefi
Thank you!