StevenWingett/FastQ-Screen

More info about rRNA and mitochondrial indexes

Closed this issue · 2 comments

Thanks for creating this useful tool. Would it be possible to provide more information about the rRNA and mitochondrial default indexes? For example, which species are included?

Secondly, to evaluate the 'rRNA' alignment rate properly, it would be really helpful to know which sequences are included. In particular, does this database include sequences outside of the genome assemblies? For example, the 'Rn45s' annotated at chr17:39842997-39848829 in mm10 seems to be only a partial 45S compared to the full pre-rRNA 45S (NR_046233.2), yet it is the best match when I BLAST NR_046233.2 against mm10. Previously, I have also observed a decreased unmapped rate for RNA-seq (ribo-depletion, so we were expecting higher rRNA) when I added NR_046233.2 to the mm10 reference.

Would appreciate your thoughts on accurately quantifying rRNA and mitochondrial reads in a library. Thanks again!

To follow up, I was getting unmapped 45S reads when I ran fastq_screen with the default 'rRNA' database. I was able to rectify this issue by making a new index from FASTA files from the SILVA rRNA database (https://www.arb-silva.de/), concatenating the SSU and LSU datasets. More info about the default indexes would help the user decide whether they need to create their own custom indexes. Thanks again for this tool!
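For anyone wanting to try the same thing, here is a minimal sketch of the concatenation step. It assumes the SILVA SSU and LSU FASTA files have already been downloaded and decompressed; the file names are placeholders. It also assumes the sequences use the RNA alphabet (U rather than T), so it converts U to T before the file is handed to an aligner index builder. Set `rna_to_dna=False` if that assumption does not hold for your files.

```python
def concatenate_fastas(input_paths, output_path, rna_to_dna=True):
    """Concatenate FASTA files into one, optionally converting U to T.

    Header lines (starting with '>') are copied unchanged; only sequence
    lines are translated. Returns the number of records written.
    """
    to_dna = str.maketrans("Uu", "Tt")
    n_records = 0
    with open(output_path, "w") as out:
        for path in input_paths:
            with open(path) as handle:
                for line in handle:
                    line = line.rstrip("\r\n")
                    if line.startswith(">"):
                        n_records += 1
                    elif rna_to_dna:
                        line = line.translate(to_dna)
                    # Always end with a newline so records from different
                    # files never run together.
                    out.write(line + "\n")
    return n_records


# Hypothetical usage (file names are placeholders, not real SILVA release names):
# concatenate_fastas(["silva_ssu.fasta", "silva_lsu.fasta"], "silva_rrna.fasta")
```

The combined FASTA can then be indexed with bowtie2-build and added to the FastQ Screen configuration as a custom database.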

Hi,

We put together the genome databases to get people started with using the FASTQ Screen software. You should be able to extract the original FASTA files from the Bowtie2 index files. I believe you need to use the bowtie2-inspect command to do that. The FASTA headers may then reveal the source of the genome.
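As a rough sketch of that step: `bowtie2-inspect -n` prints only the reference sequence names stored in an index, which is usually enough to see what went into it, while running it without `-n` writes the full FASTA to standard output. The index basename in the usage comment is a placeholder, not the actual path in the downloaded databases.

```python
import shutil
import subprocess


def inspect_names_command(index_basename):
    """Build the bowtie2-inspect command that lists reference names only."""
    return ["bowtie2-inspect", "-n", index_basename]


def list_reference_names(index_basename):
    """Return the reference sequence names stored in a Bowtie2 index."""
    if shutil.which("bowtie2-inspect") is None:
        raise RuntimeError("bowtie2-inspect was not found on PATH")
    result = subprocess.run(
        inspect_names_command(index_basename),
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


# Hypothetical usage (the basename is the index path without the .1.bt2 suffix):
# for name in list_reference_names("FastQ_Screen_Genomes/rRNA/my_rRNA_index"):
#     print(name)
```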

We put together the databases over time to help us with QC of our own data, a purpose for which they work fine. If your research needs complete clarity and an audit of these databases, then I suggest you obtain the desired files and build new index files from scratch. I think this is the simplest and most transparent thing to do. Building Bowtie2 index files is very simple and requires the user to run the bowtie2-build command.
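A minimal sketch of building a custom index, assuming bowtie2-build is on the PATH. The FASTA file name, index basename, and database label are placeholders. The `conf_line` helper reflects my understanding of the tab-separated `DATABASE` entries in fastq_screen.conf; please check it against the example configuration file shipped with FastQ Screen.

```python
import subprocess


def build_command(fasta_paths, index_basename, threads=1):
    """Build the bowtie2-build command line.

    bowtie2-build accepts a comma-separated list of FASTA files followed by
    the basename to use for the output index files.
    """
    return [
        "bowtie2-build",
        "--threads", str(threads),
        ",".join(fasta_paths),
        index_basename,
    ]


def build_index(fasta_paths, index_basename, threads=1):
    """Run bowtie2-build and raise if it exits with an error."""
    subprocess.run(build_command(fasta_paths, index_basename, threads), check=True)


def conf_line(name, index_basename):
    """Format a database entry for fastq_screen.conf (assumed tab-separated)."""
    return f"DATABASE\t{name}\t{index_basename}"


# Hypothetical usage:
# build_index(["custom_rrna.fasta"], "indexes/custom_rRNA", threads=4)
# print(conf_line("Custom_rRNA", "indexes/custom_rRNA"))
```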

I hope that helps.

All the best,
Steven