Recommendations on running RiboDetector twice on the fastq files?

Question

resol341 opened this issue 2 years ago · 2 comments

Hi,

First, thank you for such a wonderful tool. It really helps a lot with our data. I apologize that this is a question rather than an actual issue.

We are working with a dataset in which the ribo-depletion essentially did not work (please see "before.html" in the attached zip file). After running RiboDetector, the "Per sequence GC content" improved greatly (please see "after.html" in the attached zip file). However, the top overrepresented sequences still BLAST to rRNA, albeit with reduced counts. So my question is: would it be worthwhile or recommended to run RiboDetector a second time on the fastq files? Would that further reduce the rRNA reads, or would it introduce some sort of bias?

Our experiment was a simple bulk RNA-seq of mouse cells with a low cell count and low RNA input.
I ran RiboDetector with the following settings:

ribodetector \
  -t 10 \
  -l 150 \
  -i ./${seq_name}_R1_001_val_1.fq.gz ./${seq_name}_R2_001_val_2.fq.gz \
  -m 40 \
  -e norrna \
  --chunk_size 256 \
  -o nonrrna/${seq_name}_R1_001.fastq.gz nonrrna/${seq_name}_R2_001.fastq.gz

Thank you again.

Best!
fastqc.zip

Answer 1 · 2023-03-17T22:15:41.000Z

Thank you for your interest in RiboDetector (RD). Your command line looks good to me. Running RD a second time on its own output will not further reduce the rRNA count. How many rRNA reads do you estimate remain in the RD output? You could try a smaller -l parameter, e.g. 100.
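One rough way to put a number on the remaining rRNA reads is to count how many reads in the RD output still contain one of the overrepresented sequences reported by FastQC. The sketch below is only an illustration: the function name `count_reads_with_seq`, the file path, and the query sequence are placeholders rather than anything from RiboDetector or FastQC, and it assumes a standard gzipped FASTQ with four lines per record. Because it uses an exact substring match, it will miss reads with mismatches or in reverse-complement orientation, so treat the result as a lower bound.

```shell
#!/usr/bin/env bash
# Rough lower-bound estimate of reads that still contain a given sequence.
# Hypothetical helper: name, arguments and output format are assumptions
# made for this example, not part of any tool mentioned in the thread.

count_reads_with_seq() {
    local fastq_gz="$1"   # gzipped FASTQ, assumed 4 lines per record
    local query="$2"      # sequence copied from FastQC "Overrepresented sequences"

    # Look only at the sequence line of each record (line 2 of every 4)
    # and print "<matching reads> <total reads>".
    gzip -dc "$fastq_gz" | awk -v q="$query" '
        NR % 4 == 2 { total++; if (index($0, q) > 0) hits++ }
        END { printf "%d %d\n", hits, total }'
}
```

For example, `count_reads_with_seq nonrrna/sample_R1_001.fastq.gz <overrepresented sequence>` prints the number of matching reads followed by the total number of reads, from which a percentage can be worked out. Running it on the input and on the RD output gives a before/after comparison for the same sequence.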

Answer 2 · 2023-03-17T22:29:40.000Z

Hi, thank you for your reply. I have tried a few tools since submitting this question. It appears that HTStream, supplied with an rRNA sequence .fa file, removed all of the rRNA contamination. Since our data is from a single species (mouse), that was an easier solution for us. I understand RiboDetector might be geared more toward metatranscriptomic/metagenomic data. Thank you for your time.