nf-core/rnaseq

Significantly different versions of STAR in star_rsem (2.7.6a) and star (2.6.1d)

chewgl opened this issue · 3 comments

Great work on the latest version, the increased modularity is much appreciated.

I did notice that the new star_rsem aligner option uses a newer version of STAR (2.7.6a) than the default (2.6.1d). 2.6.1d was previously specified to maintain iGenomes compatibility, and 2.7.6a would break that. In addition, indices built with one of these versions cannot be used with the other.

Are there long-term plans for resolving this? Perhaps a note about it could be added to the documentation.

Best regards,
Guo-Liang

Hi Guo-Liang,

Yes, the plan is either to update the STAR indices so that compatible versions are available for STAR v2.7+, or to switch to some other solution. It is currently unclear how we will proceed, but there are already ideas and discussion around this.

Hi @chewgl! Well spotted! Yes, as @apeltzer mentioned, this is a tricky situation because the AWS iGenomes files require an older version of STAR, as specified in the module file:

conda (params.enable_conda ? "bioconda::star=2.6.1d" : null)

which is why I chose to have the standard --aligner star_salmon route through the pipeline use that version, mainly so that backward compatibility isn't broken.

However, for the --aligner star_rsem route, given that the indices are not on AWS iGenomes and will need to be re-generated anyway with all of the RSEM-specific files, I built a multi-tool Biocontainer with the latest version of STAR, as specified here:

conda (params.enable_conda ? "bioconda::rsem=1.3.3 bioconda::star=2.7.6a" : null)

As you mentioned, this is slightly annoying, but it was a toss-up between using a much older version of STAR and getting everyone to re-build the indices if they are using --aligner star_rsem.
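
For anyone who wants to guard against mixing the two versions in their own scripts, here is a rough Python sketch of the kind of check I have in mind. It is not part of the pipeline. The function names are made up for illustration, and the rule it encodes (an index and a STAR binary only match if both are older than 2.7.0 or both are 2.7.0 or newer) is a simplifying assumption based on the 2.6.1d vs 2.7.6a case in this thread; STAR's actual compatibility rules may be finer-grained.

```python
import re

# Assumption for illustration only: indices built before STAR 2.7.0 cannot be
# used by 2.7.x, and vice versa. Real STAR rules may be more detailed.
INDEX_FORMAT_CUTOFF = (2, 7, 0)


def parse_star_version(version):
    """Turn a STAR version string such as '2.7.6a' into the tuple (2, 7, 6)."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognised STAR version: {version!r}")
    return tuple(int(part) for part in match.groups())


def same_index_generation(index_version, star_version):
    """Return True if both versions fall on the same side of the cutoff."""
    index_is_new = parse_star_version(index_version) >= INDEX_FORMAT_CUTOFF
    star_is_new = parse_star_version(star_version) >= INDEX_FORMAT_CUTOFF
    return index_is_new == star_is_new


if __name__ == "__main__":
    # iGenomes-style index (built with 2.6.1d) used by each aligner route.
    for star in ("2.6.1d", "2.7.6a"):
        ok = same_index_generation("2.6.1d", star)
        print(f"index 2.6.1d with STAR {star}: {'ok' if ok else 're-build index'}")
```

Under that assumption, the check passes for the 2.6.1d index with STAR 2.6.1d and asks for a re-build with STAR 2.7.6a, which is the situation described above for --aligner star_rsem.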

@chewgl I have added a note to the main README summarising this discrepancy: 58b702f