The secondary structure of ribonucleic acid (RNA) is more stable and more accessible in the cell than its tertiary structure, making it essential for functional prediction. Although deep learning has shown promising results in this field, current methods suffer from either a post-processing step that generalizes poorly or a pre-processing step with high complexity. In this work, we present RFold, a simple yet effective method that predicts RNA secondary structure in an end-to-end manner. RFold introduces novel Row-Col Softmax and Row-Col Argmax functions that replace the complicated post-processing step while guaranteeing that the output is valid. Moreover, RFold adopts attention maps as informative representations instead of designing hand-crafted features in a pre-processing step. Extensive experiments demonstrate that RFold achieves competitive performance and about eight times faster inference than the state-of-the-art method.
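To make the Row-Col idea concrete, here is a minimal sketch in plain Python. It rests on assumptions that the text above does not spell out: that Row-Col Softmax is the average of a row-wise and a column-wise softmax, and that Row-Col Argmax keeps an entry only when it is the maximum of both its row and its column. The function names and the toy score matrix are hypothetical and are not taken from the RFold code; the sketch also leaves out any further constraints the actual method may apply.

```python
import math


def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def row_softmax(m):
    """Apply softmax independently to each row."""
    out = []
    for row in m:
        peak = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(x - peak) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def row_col_softmax(m):
    """Assumed form: average of row-wise and column-wise softmax."""
    by_row = row_softmax(m)
    by_col = transpose(row_softmax(transpose(m)))
    return [[0.5 * (a + b) for a, b in zip(r, c)] for r, c in zip(by_row, by_col)]


def row_col_argmax(m):
    """Assumed form: 1 where an entry is the argmax of both its row and its column."""
    n_rows, n_cols = len(m), len(m[0])
    row_best = [max(range(n_cols), key=lambda j: m[i][j]) for i in range(n_rows)]
    col_best = [max(range(n_rows), key=lambda i: m[i][j]) for j in range(n_cols)]
    return [
        [1 if row_best[i] == j and col_best[j] == i else 0 for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Hypothetical symmetric pairing scores for a 4-nucleotide toy sequence.
scores = [
    [0.0, 0.1, 0.2, 0.9],
    [0.1, 0.0, 0.8, 0.3],
    [0.2, 0.8, 0.0, 0.1],
    [0.9, 0.3, 0.1, 0.0],
]

pairs = row_col_argmax(scores)
for row in pairs:
    print(row)
# [0, 0, 0, 1]
# [0, 0, 1, 0]
# [0, 1, 0, 0]
# [1, 0, 0, 0]
```

Under these assumptions, every row and every column of the output contains at most one 1, so each position pairs with at most one other position without a separate post-processing pass. The softmax variant is a differentiable counterpart that could stand in for the hard argmax during training.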
We show the overall RFold framework.
We comprehensively evaluate RFold on the RNAStralign and ArchiveII datasets.
We provide a Colab demo for reproducing the results and testing your own RNA sequences:
If you find our repository or paper useful, please cite the following paper:
@article{tan2022rfold,
title={RFold: Towards Simple yet Effective RNA Secondary Structure Prediction},
author={Tan, Cheng and Gao, Zhangyang and Li, Stan Z},
journal={arXiv preprint arXiv:2212.14041},
year={2022}
}
If you have any questions about this work, please feel free to contact me by email:
- Cheng Tan: tancheng@westlake.edu.cn