read length should be not smaller than 50, but ribo-seq reads are ~30 bp
huguanjing opened this issue · 4 comments
-l LEN, --len LEN Sequencing read length, should be not smaller than 50.
Is this correct? Ribo-seq reads are ~ 30 bp
To be clear, my question is whether RiboDetector can be used to detect and remove rRNA from ribo-seq samples? Thanks!
Thank you for pointing this out. You can ignore the help message. It is only a suggestion, and you can still use RiboDetector for reads shorter than 50 or 40 bp. Yes, you can use RiboDetector for Ribo-seq reads; however, the accuracy will be slightly lower when the reads are short (I think this will be the same for other methods and tools). I will update the help message in the new release. Thank you!
May I add a related question? After quality filtering with Trimmomatic, the uniform read length (e.g. 150 bp in paired-end mode) changes to a length distribution (e.g. 36 to 150 bases, depending on the settings). How should the -l LEN parameter be used in this case? What will happen to reads shorter than the -l value?
You can check the mean length by using seqkit stats, then use the mean length as LEN for the -l parameter. If a read is longer than LEN, only the first LEN bases will be used to capture the sequence features for classification. If a read is shorter than or equal to LEN, the whole read will be used. In either case, the output files will give you the whole read, so you don't need to worry about the variable length of your input reads.
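For anyone who wants to see the idea spelled out, here is a minimal Python sketch of the two steps described above: computing a mean read length from a FASTQ file, and taking only the first LEN bases of longer reads. This is not RiboDetector's actual code. The function names (read_sequences, mean_read_length, classifier_window) are made up for illustration, the FASTQ is assumed to use plain 4-line records, and rounding the mean to the nearest integer is my own assumption.

```python
import io
from typing import Iterable, Iterator, TextIO


def read_sequences(handle: TextIO) -> Iterator[str]:
    """Yield the sequence line of each record, assuming 4-line FASTQ records."""
    for i, line in enumerate(handle):
        if i % 4 == 1:  # second line of every 4-line record is the sequence
            yield line.strip()


def mean_read_length(seqs: Iterable[str]) -> int:
    """Return the mean read length, rounded to the nearest integer (assumption)."""
    lengths = [len(s) for s in seqs]
    if not lengths:
        raise ValueError("no reads found")
    return round(sum(lengths) / len(lengths))


def classifier_window(seq: str, length: int) -> str:
    """Illustrate the described behaviour: keep at most the first `length` bases.

    Reads shorter than or equal to `length` are returned unchanged.
    """
    return seq[:length]


if __name__ == "__main__":
    # Tiny in-memory FASTQ with reads of 36, 100 and 150 bases (toy data).
    records = []
    for name, n in (("r1", 36), ("r2", 100), ("r3", 150)):
        records.append(f"@{name}\n{'A' * n}\n+\n{'I' * n}\n")
    fastq = io.StringIO("".join(records))

    seqs = list(read_sequences(fastq))
    length = mean_read_length(seqs)  # candidate value for -l
    for s in seqs:
        print(len(s), "->", len(classifier_window(s, length)))
```

In practice seqkit stats already reports the average length, so the sketch is only meant to make the truncation rule concrete: the window affects what is used for classification, while the reads written to the output stay complete.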