nerettilab/RepEnrich2

Computing time


Hi,

I am currently using RepEnrich2 and everything is fine, except that I don't know how much computing time to expect.
I have 9 FASTQ files running on a cluster. Each file holds 20 GB of data, and I am currently at 183 hours of CPU time for each sample. Is that OK, or did something go wrong?

Hi, it is now more than 1000 hours of CPU time. Is that normal?

Hi there,

Sorry for the delayed reply. We've never observed runtimes this long. How many hours does an individual 20 GB file take to run? We've typically run RNA-seq files of 2-7 GB, and we don't usually see the runtime go much over 24 h for a single sample.

Hi,

No problem.
All the samples are running in parallel on the cluster, so I don't know the running time for a single one, since they are still running. It is more than 4000 hours of CPU time now. But new result files are still being created every day, so it seems to be working.
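For a rough sense of scale, the CPU-time figure can be broken down per sample. The sketch below assumes the 4000 hours is the aggregate across all 9 parallel jobs (the thread does not say whether it is a total or a per-sample figure), and `cores_per_job` is a hypothetical value, since the number of cores per job is not given here.

```python
def per_sample_cpu_hours(total_cpu_hours: float, n_samples: int) -> float:
    """Split an aggregate CPU-time figure evenly across samples."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return total_cpu_hours / n_samples


def wall_clock_hours(cpu_hours: float, cores_per_job: int) -> float:
    """Estimate elapsed time, assuming all cores are kept fully busy."""
    if cores_per_job <= 0:
        raise ValueError("cores_per_job must be positive")
    return cpu_hours / cores_per_job


# Assumption: 4000 CPU-hours is the total over the 9 samples.
cpu_per_sample = per_sample_cpu_hours(4000, 9)

# Hypothetical: 8 cores per job (not stated in the thread).
elapsed = wall_clock_hours(cpu_per_sample, cores_per_job=8)

print(f"{cpu_per_sample:.1f} CPU-hours per sample")
print(f"{elapsed:.1f} wall-clock hours per sample at 8 cores")
```

If the 4000 hours is instead a per-sample figure, the first step is skipped and only the division by core count applies. Either way, CPU time is not directly comparable to an elapsed runtime such as "24 h for a single sample" without knowing the core count.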

That does seem much higher than I would expect for runtime, based on the samples we've run and their relative sizes. The aligner might be the bottleneck with 20 GB files (although we would need to test this more ourselves to be sure).
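One way to check where the time goes without waiting for a full 20 GB run is to time the pipeline on a small slice of a file. The sketch below is not part of RepEnrich2; `head_fastq` is a hypothetical helper that copies the first `n_reads` records of a FASTQ file (4 lines per record, which assumes sequences are not wrapped across lines). Taking the head of a file is not a random sample, and extrapolating from it assumes runtime scales roughly linearly with read count, which has not been verified here.

```python
import gzip
from itertools import islice


def head_fastq(src: str, dst: str, n_reads: int) -> int:
    """Copy the first n_reads FASTQ records from src to dst.

    Assumes 4 lines per record. Reads gzip input if src ends in ".gz";
    the output is always written as plain text.
    Returns the number of complete records written.
    """
    opener = gzip.open if src.endswith(".gz") else open
    lines_written = 0
    with opener(src, "rt") as fin, open(dst, "w") as fout:
        for line in islice(fin, 4 * n_reads):
            fout.write(line)
            lines_written += 1
    return lines_written // 4
```

For example, `head_fastq("sample.fastq", "sample_subset.fastq", 1_000_000)` (file names are placeholders) would produce a subset that can be aligned and timed separately from the rest of the pipeline.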

We're actually going to be working on integrating RepEnrich2 with the STAR aligner sometime soon, which should increase the speed greatly, but unfortunately I'm not sure whether there is much that can be done for the time being.

OK, thanks for the information. I look forward to this integration.
The run should be done soon. I will tell you the final time and try to identify what I did wrong.
Thanks for your time and your kindness.