Quantifying the contribution of each input nucleotide by layer-wise relevance propagation

The contribution of each input feature to the prediction was calculated with DeepExplain's epsilon-LRP method. The feature importance plots are based on the EIF3a binding site datasets (Figure 4). The higher the score at a position, the higher the probability that the center nucleotide is an EIF3a reader binding site when the given nucleotide is present at that position. As shown in the figure, positions around the predicted m6A sites received considerably higher scores than other positions, which means those positions are more important in determining whether the center nucleotide is an m6A reader substrate site. Additionally, prediction of modification sites would benefit from including sequence more than 50 bp upstream or downstream of the predicted site, since these regions contain positions with high importance scores.
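To make the epsilon-LRP rule concrete, the following is a minimal NumPy sketch on a toy network. It is not the model or the DeepExplain call used in this work: the bias-free two-layer ReLU network, the random weights, the example sequence, and the helper names `one_hot` and `elrp_dense` are all assumptions chosen for illustration. Each layer redistributes the relevance of its outputs to its inputs in proportion to their contribution to the pre-activation, with a small epsilon added to the denominator for numerical stability.

```python
import numpy as np

def one_hot(seq, alphabet="ACGU"):
    """Encode an RNA sequence as a (length, 4) one-hot matrix."""
    idx = {nt: i for i, nt in enumerate(alphabet)}
    x = np.zeros((len(seq), len(alphabet)))
    for pos, nt in enumerate(seq):
        x[pos, idx[nt]] = 1.0
    return x

def elrp_dense(x, weights, eps=1e-6):
    """Epsilon-LRP for a bias-free ReLU MLP with a linear output layer.

    x: 1-D input vector; weights: list of (n_in, n_out) matrices.
    Returns (network output, relevance of each input feature).
    """
    # Forward pass, keeping the input of every layer.
    activations = [x]
    for i, w in enumerate(weights):
        z = activations[-1] @ w
        is_last = i == len(weights) - 1
        activations.append(z if is_last else np.maximum(z, 0.0))

    # Backward pass: start from the output and redistribute relevance.
    relevance = activations[-1].copy()
    for w, a in zip(reversed(weights), reversed(activations[:-1])):
        z = a @ w
        z = z + eps * np.where(z >= 0, 1.0, -1.0)  # epsilon stabilizer
        s = relevance / z
        relevance = a * (w @ s)  # R_i = a_i * sum_j w_ij * R_j / z_j
    return activations[-1], relevance

# Toy example (hypothetical sequence and random weights).
rng = np.random.default_rng(0)
seq = "GGACUAGGACU"
x = one_hot(seq)
weights = [rng.normal(size=(x.size, 8)), rng.normal(size=(8, 1))]
out, rel = elrp_dense(x.ravel(), weights)
scores = rel.reshape(x.shape)  # one score per (position, nucleotide)
```

In this sketch only the nucleotide actually present at a position receives a non-zero score, and the scores sum approximately to the network output. Per-nucleotide importance profiles such as those described above would be obtained by aggregating such scores over many sequences.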

Specifically, a site is less likely to be an m6A modification site when adenosine is present within the 100 bp downstream of it, since the majority of positions within this window received importance scores below 0. In comparison, the presence of cytosine within 50 bp upstream or downstream of the predicted site tends to increase the chance that the center nucleotide is modified. No specific patterns were found for guanine and thymine, as their importance plots show a sine-like shape.

The results showed that the site is more likely to be an EIF3a reader binding site if cytosine is present 34, 59, 11, 58, 27, 49, or 72 bp upstream, or 21, 27, 24, 25, or 116 bp downstream, of the modification site. In addition, the probability of the modification site being an EIF3a substrate site decreases if guanosine is found 21, 71, 33, 32, 31, or 22 bp upstream of the center site, or if uridine is found 54 bp upstream or 53 bp downstream of the center site. The top 20 screened nucleotides that decrease the chance of the site being an EIF3a modification site include: adenosines 39, 27, 47, 61, 10, 12, 23, 170, 157, 51, 14, 226, and 52 bp upstream of the center nucleotide; cytosines 93 bp upstream and 185 bp downstream of the center nucleotide; guanosines 92 and 97 bp downstream of the center site; and uridines 56 and 63 bp upstream of the modification site.
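A ranking of this kind can be derived from a matrix of importance scores with a few lines of NumPy. The sketch below is illustrative only: the function name `rank_contributions`, the nucleotide ordering `ACGU`, and the small hand-made score matrix are assumptions, not the screening procedure or data used here. Offsets are reported relative to the center site, with negative values meaning upstream.

```python
import numpy as np

def rank_contributions(scores, center, alphabet="ACGU", k=3):
    """Return the k most positive and k most negative contributions.

    scores: (length, 4) matrix of importance scores.
    center: row index of the candidate site; negative offsets are upstream.
    Each entry is a tuple (offset_bp, nucleotide, score).
    """
    flat = scores.ravel()
    order = np.argsort(flat)  # ascending: most negative first

    def describe(i):
        pos, nt = divmod(int(i), scores.shape[1])
        return (pos - center, alphabet[nt], float(flat[i]))

    top_negative = [describe(i) for i in order[:k]]
    top_positive = [describe(i) for i in order[::-1][:k]]
    return top_positive, top_negative

# Hypothetical 5-position score matrix with the candidate site at index 2.
scores = np.zeros((5, 4))
scores[0, 1] = 0.9   # C, 2 bp upstream
scores[3, 1] = 0.4   # C, 1 bp downstream
scores[4, 0] = -0.7  # A, 2 bp downstream
scores[1, 2] = -0.3  # G, 1 bp upstream

top_positive, top_negative = rank_contributions(scores, center=2, k=2)
```

Here `top_positive` lists the (offset, nucleotide) pairs that raise the predicted probability most, and `top_negative` those that lower it most.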


Figure 5 Feature importance scores in EIF3A full transcript prediction. We extracted both the 50 bp and the 250 bp upstream and downstream of the site to rank the contribution of each nucleotide in determining the binding site. At each position, a higher score indicates a larger contribution toward a binding site.