xryanglab/RiboCode

Do multi-mapped reads affect the RiboCode result?

Hilarialy opened this issue · 1 comments

Sorry to bother you.
I followed your workflow and ran STAR with the arguments --quantMode TranscriptomeSAM and --outFilterMultimapNmax 1. This is the command I used:

STAR --outFilterType BySJout --runThreadN 10 --outFilterMismatchNmax 2 \
--genomeDir /reference/GRCm38/STAR \
--readFilesIn /my/rmrRNA/${sample}_trim_norrna.fq  \
--outFileNamePrefix ${sample} \
--outSAMtype BAM SortedByCoordinate \
--quantMode TranscriptomeSAM GeneCounts \
--outFilterMultimapNmax 1 --outFilterMatchNmin 16 --alignEndsType EndToEnd

The output Aligned.toTranscriptome.out.bam file still contains multi-mapped records (NH:i > 1), because STAR writes all records to this toTranscriptome BAM file. Do these multi-mapped records affect RiboCode's ability to find uORFs?

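With --outFilterMultimapNmax 1, STAR excludes reads that align to multiple regions of the genome, not of the transcriptome. A read that maps to a single genomic location is still reported once for every annotated isoform overlapping that location, so it is expected to see "NH:i > 1" in the transcriptome BAM file for genes that have multiple overlapping isoforms. It has no influence on the RiboCode result.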
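
For anyone who wants to check this on their own data, here is a minimal sketch that tallies records by their NH tag. The function name count_nh is made up for illustration, and treating a record without an NH tag as unique is an assumption of this sketch, not something STAR or RiboCode defines. It reads plain SAM text on stdin, so it needs only awk.

```shell
# Tally SAM records read from stdin by multi-mapping status.
# NH:i:<n> is the standard SAM optional tag giving the number of
# reported alignments for the read.
count_nh() {
    awk -F'\t' '
        /^@/ { next }                      # skip SAM header lines
        {
            nh = 1                         # assumption: no NH tag means unique
            for (i = 12; i <= NF; i++)     # optional tags start at column 12
                if ($i ~ /^NH:i:/) nh = substr($i, 6) + 0
            if (nh > 1) multi++; else unique++
        }
        END { printf "unique=%d multi=%d\n", unique, multi }
    '
}
```

Assuming samtools is installed and the file name follows the --outFileNamePrefix used above, it could be run as: samtools view ${sample}Aligned.toTranscriptome.out.bam | count_nh. Running the same function on the genome BAM should report multi=0 when --outFilterMultimapNmax 1 was used, while the transcriptome BAM can show a non-zero count for genes with overlapping isoforms.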