mhalushka/miRge3.0

visualization.HTML output is empty

Closed this issue · 6 comments

I installed miRge3.0 using conda and am running it with the following command, taken from run.log:

/home/ubuntu/anaconda3/envs/mirge3/bin/miRge3.0 -s 1.fastq.gz,2.fastq.gz,3.fastq.gz -lib /home/ubuntu/mirge3_lib -on human -db mirbase -o mirge_out/ -gff -nmir -trf -tcf -cpu 12 -ai -a illumina -g TTAGGC...TGGAATTCTCGGGTGCCAAGGAACTCCAGT

The file mirge3_visualization.html is created, but all of the tabs are empty. The file index_data.js is created with all of the relevant data for plotting. I tried plotting the readlength chart manually in R, but the categories (read length) and data (counts/frequency) do not have equal lengths and cannot be plotted. Furthermore, annotation.report.html is generated and the summary table appears with no issues.

I'm wondering if this is an issue on my end (too few samples/low quality samples) or something else.

Hi @TomRivas,

I suspect that the adapter sequence might be the issue here. Can you tell me how many Unique miRNAs are reported (from annotation.report.csv), and can you do a grep of let-7a sequence and share the output (command below)?

zgrep "TGAGGTAGTAGGTTGTATAGTT" 1.fastq.gz | head -20

I presume that you have renamed your files as 1.fastq.gz for the purpose of reporting the issue. If not, we recommend your file names to start with a character, eg: a1.fastq.gz, a2.fastq.gz and so on.

Thank you,
Arun.

> I presume that you have renamed your files as 1.fastq.gz for the purpose of reporting the issue. If not, we recommend your file names to start with a character, eg: a1.fastq.gz, a2.fastq.gz and so on.

Yes that's correct! Maybe not the best anonymized sample name.

> Can you tell me how many Unique miRNAs are reported (from annotation.report.csv)

Unique miRNAs:

Sample 1 - 75
Sample 2 - 44
Sample 3 - 10

We do have ~98% of trimmed reads not mapping to miRNAs, tRNA fragments, snoRNAs, rRNAs, other ncRNAs, and mRNAs, so I wouldn't be surprised if it's a sample quality issue (and it's something we're actively troubleshooting).

> can you do a grep of let-7a sequence and share the output

TGAGGTAGTAGGTTGTATAGTTTGGAATTCTCGGGTGCCAAGGAACTCCA
TGAGGTAGTAGGTTGTATAGTTTGGAATTCTCGGGTGCCAAGGAACTCCA
TGAGGTAGTAGGTTGTATAGTTTGGAATTCTCGGGTGCCAAGGAACTCCA
TGAGGTAGTAGGTTGTATAGTTGGAATTCTCGGGTGCCAAGGAACTCCAG
CTGAGGTAGTAGGTTGTATAGTTTGGAATTCTCGGGTGCCAAGGAACTCC
TGAGGTAGTAGGTTGTATAGTTTGGAATTCTCGGGTGCCAAGGAACTCCA
TGAGGTAGTAGGTTGTATAGTTTGGAATTCTCGGGTGCCAAGGAACTCCA
TGAGGTAGTAGGTTGTATAGTTTGGAATTCTCGGGTGCCAAGGAACTCCA
TGAGGTAGTAGGTTGTATAGTTTGGAATTCTCGGGTGCCAAGGAACTCCA
TGAGGTAGTAGGTTGTATAGTTTGGAATTCTCGGGTGCCAAGGAACTCCA
TGAGGTAGTAGGTTGTATAGTTTGGAATTCTCGGGTGCCAAGGAACTCCA
TGAGGTAGTAGGTTGTATAGTTTGGAATTCTCGGGTGCCAAGGAACTCCA
CTGAGGTAGTAGGTTGTATAGTTTGGAATTCTCGGGTGCCAAGGAACTCC
CTGAGGTAGTAGGTTGTATAGTTACGTGGTGGAATTCTCGGGTGCCAAGG
TGAGGTAGTAGGTTGTATAGTTTGGAATTCTCGGGTGCCAAGGAACTCCA
TGAGGTAGTAGGTTGTATAGTTTGGAATTCTCGGGTGCCAAGGAACTCCA
TGAGGTAGTAGGTTGTATAGTTTGGAATTCTCGGGTGCCAAGGAACTCCA
TGAGGTAGTAGGTTGTATAGTTGGAATTCTCGGGTGCCAAGGAACTCCAG
TGAGGTAGTAGGTTGTATAGTTATGGAATTCTCGGGTGCCAAGGAACTCC
TGAGGTAGTAGGTTGTATAGTTTGGAATTCTCGGGTGCCAAGGAACTCCA

Hi @TomRivas,

Yes, this is the adapter issue. You should just use -a illumina and skip the -g part. (Passing -g as TTAGGC...TGGAATTCTCGGGTGCCAAGGAACTCCAGT specifies adapter trimming at both ends.) If you look at the let-7a reads in your sequences, you can see that there is no 5' adapter sequence, only a 3' adapter sequence, which in this specific case is the Illumina adapter. For example, TGAGGTAGTAGGTTGTATAGTTTGGAATTCTCGGGTGCCAAGGAACTCCA is let-7a followed directly by the adapter.

let-7a sequence: TGAGGTAGTAGGTTGTATAGTT
illumina adapter: TGGAATTCTCGGGTGCCAAGGAACTCCA
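
To make the read structure concrete, here is a minimal Python sketch that splits one of the reads from the grep output above into the miRNA part and the 3' adapter part. It is not part of miRge3.0; the helper name `split_read` is hypothetical, and it uses only the two sequences quoted in this thread.

```python
LET7A = "TGAGGTAGTAGGTTGTATAGTT"
ADAPTER_3P = "TGGAATTCTCGGGTGCCAAGGAACTCCA"


def split_read(read, adapter=ADAPTER_3P):
    """Cut a read at the first exact occurrence of the 3' adapter.

    Returns (insert, adapter_part). If the adapter is not found,
    the whole read is returned as the insert.
    """
    idx = read.find(adapter)
    if idx == -1:
        return read, ""
    return read[:idx], read[idx:]


# First read from the zgrep output in this thread.
read = "TGAGGTAGTAGGTTGTATAGTTTGGAATTCTCGGGTGCCAAGGAACTCCA"
insert, adapter_part = split_read(read)

print(read.startswith(LET7A))   # the read begins with let-7a, so no 5' adapter
print(insert == LET7A)          # everything before the adapter is let-7a
print(adapter_part == ADAPTER_3P)
```

Because the read starts directly with the let-7a sequence, only 3' trimming is needed, which matches the advice to drop the -g option.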

This command should yield more unique miRNAs and fewer unmapped reads:
miRge3.0 -s 1.fastq.gz,2.fastq.gz,3.fastq.gz -lib /home/ubuntu/mirge3_lib -on human -db mirbase -o mirge_out/ -gff -nmir -trf -tcf -cpu 12 -ai -a illumina

This should also fix the visualization html files issue. Let me know if this helps.

Thank you,
Arun.

Hi @arunhpatil, thanks so much for your help! I've tried your suggested changes to the miRge3.0 command and still have an empty report. The number of unique miRNAs is unchanged, and the number of reads corresponding to other RNAs is negligibly different. Furthermore, the arrays used to generate the Read Length Distribution plot in index_data.js still have different lengths (35 categories and 34 data values); see the example below.

```
    title: {
        text: 'Sample.1: Read Length Distribution'
    },
    chart: {
        marginRight: 80,
        zoomType: 'xy'
    },
    credits: {
        enabled: false
    },
    xAxis: {
        categories: [16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0, 30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0, 40.0, 41.0, 42.0, 43.0, 44.0, 45.0, 46.0, 47.0, 48.0, 49.0, 50.0]
    },
    yAxis: {
        allowDecimals: false,
        title: {
            text: 'Frequency'
        }
    },
    series: [{
        name: 'Read length',
        pointWidth: 10,
        type: 'column',
        colorByPoint: false,
        data: [58487, 65243, 76967, 87066, 74050, 73397, 81951, 92195, 80926, 64658, 56068, 49967, 43080, 39553, 42164, 33331, 24669, 22769, 16921, 15081, 15774, 11566, 9473, 9677, 5603, 3537, 3005, 1286, 659, 484, 398, 424, 0, 82331],
    }]
});
```
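
For anyone who wants to verify the length mismatch without counting by hand, here is a small Python sketch that counts the entries in the `categories` and `data` arrays of the snippet above. The function name `array_length` is hypothetical, and the regex assumes flat arrays with no nested brackets, as in this snippet; it is not a general JavaScript parser.

```python
import re


def array_length(js_text, key):
    """Count comma-separated entries in the first `key: [...]` array found."""
    match = re.search(rf"\b{re.escape(key)}\s*:\s*\[([^\]]*)\]", js_text)
    if match is None:
        raise ValueError(f"no array named {key!r} found")
    # Ignore empty pieces, e.g. after a trailing comma.
    return len([item for item in match.group(1).split(",") if item.strip()])


# The two arrays copied from the index_data.js excerpt in this thread.
snippet = """
    xAxis: {
        categories: [16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0, 30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0, 40.0, 41.0, 42.0, 43.0, 44.0, 45.0, 46.0, 47.0, 48.0, 49.0, 50.0]
    },
    series: [{
        data: [58487, 65243, 76967, 87066, 74050, 73397, 81951, 92195, 80926, 64658, 56068, 49967, 43080, 39553, 42164, 33331, 24669, 22769, 16921, 15081, 15774, 11566, 9473, 9677, 5603, 3537, 3005, 1286, 659, 484, 398, 424, 0, 82331],
    }]
"""

print(array_length(snippet, "categories"))  # 35
print(array_length(snippet, "data"))        # 34
```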

Hi @TomRivas,

Is the visualization HTML file still not functional? That difference in the array lengths in the JS doesn't affect the visualization. I am attaching a sample output here. Unless your data is from exosomes, you should get a good number of unique miRNA counts. If you are interested, I can inspect one file (or a subset). My email is arun26feb at gmail dot com.

visual.zip

Hi @arunhpatil, I figured out my issue. I was opening the .html file on the cloud machine through my terminal, which creates a temporary local copy that does not have access to the .js file. If I download the .html and .js files together, everything works fine.
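
In case it helps others who hit the same empty tabs, here is a rough Python sketch that lists the local scripts an HTML file references but that are missing next to it. It assumes the report loads its data through ordinary `<script src="...">` tags with relative paths, which I have not checked against the miRge3.0 source; the names `ScriptSrcCollector` and `missing_local_scripts` are made up for this example.

```python
from html.parser import HTMLParser
from pathlib import Path


class ScriptSrcCollector(HTMLParser):
    """Collect the src attribute of every <script> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def missing_local_scripts(html_path):
    """Return relative script paths that do not exist beside the HTML file."""
    html_path = Path(html_path)
    parser = ScriptSrcCollector()
    parser.feed(html_path.read_text(encoding="utf-8", errors="replace"))
    missing = []
    for src in parser.sources:
        if "://" in src or src.startswith("//"):
            continue  # remote script (e.g. a CDN), not a local file
        if not (html_path.parent / src).exists():
            missing.append(src)
    return missing


# Example usage (the path is an assumption; adjust to your output folder):
# print(missing_local_scripts("mirge_out/mirge3_visualization.html"))
```

An empty list means every locally referenced script sits beside the HTML file; anything listed needs to be downloaded together with it.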