RFold: Towards Simple yet Effective RNA Secondary Structure Prediction

Introduction

The secondary structure of ribonucleic acid (RNA) is more stable and accessible in the cell than its tertiary structure, making it essential for functional prediction. Though deep learning has shown promising results in this field, current methods suffer from either a post-processing step that generalizes poorly or a pre-processing step with high complexity. In this work, we present RFold, a simple yet effective end-to-end method for RNA secondary structure prediction. RFold introduces novel Row-Col Softmax and Row-Col Argmax functions that replace the complicated post-processing step while guaranteeing that the output is valid. Moreover, RFold adopts attention maps as informative representations instead of hand-crafted features designed in the pre-processing step. Extensive experiments demonstrate that RFold achieves competitive performance with inference about eight times faster than the state-of-the-art method.
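To make the Row-Col idea concrete, here is a minimal, dependency-free sketch. It is not the repository's implementation: the function names, the averaging form of the softmax, and the toy score matrix are all assumptions made for illustration. The sketch keeps a pair (i, j) only when j is the best partner in row i and i is the best partner in column j, so every base ends up in at most one pair. Further constraints on which bases may pair are left out.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def row_col_softmax(scores):
    """Average of a row-wise and a column-wise softmax (assumed form)."""
    n = len(scores)
    rows = [softmax(row) for row in scores]
    # cols[j][i] is the softmax of column j evaluated at row i.
    cols = [softmax([scores[i][j] for i in range(n)]) for j in range(n)]
    return [[0.5 * (rows[i][j] + cols[j][i]) for j in range(n)] for i in range(n)]


def row_col_argmax(scores):
    """Keep entry (i, j) only if it is the maximum of both row i and column j."""
    n = len(scores)
    row_best = [max(range(n), key=lambda j: scores[i][j]) for i in range(n)]
    col_best = [max(range(n), key=lambda i: scores[i][j]) for j in range(n)]
    return [
        [int(row_best[i] == j and col_best[j] == i) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical symmetric pairing scores for a 4-base toy sequence.
scores = [
    [0.0, 0.2, 0.1, 0.9],
    [0.2, 0.0, 0.8, 0.3],
    [0.1, 0.8, 0.0, 0.4],
    [0.9, 0.3, 0.4, 0.0],
]

soft = row_col_softmax(scores)    # relaxed scores, e.g. for training
contact = row_col_argmax(scores)  # hard 0/1 contact map for inference

for row in contact:
    print(row)
```

Because each row and each column has a single argmax, the resulting contact map cannot assign two partners to one base, and a symmetric score matrix yields a symmetric map. No separate repair step is needed in this sketch.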

Model overview

We show the overall RFold framework.

Benchmarking

We comprehensively evaluate results on the RNAStralign and ArchiveII datasets.

Colab demo

We provide a Colab demo for reproducing the results and testing your own RNA sequences:

Open In Colab

Citation

If you find our repository and paper useful, please cite the following paper:

@article{tan2022rfold,
  title={RFold: Towards Simple yet Effective RNA Secondary Structure Prediction},
  author={Tan, Cheng and Gao, Zhangyang and Li, Stan Z},
  journal={arXiv preprint arXiv:2212.14041},
  year={2022}
}

Feedback

If you have any questions about this work, please feel free to contact me by email: