PaddleHelix

Bio-computing platform featuring large-scale representation learning and multi-task deep learning (the “螺旋桨” bio-computing toolkit)

Primary language: Jupyter Notebook | License: Apache-2.0


PaddleHelix is a machine-learning-based bio-computing framework that aims to facilitate development in the following areas:

  • Vaccine design
  • Drug discovery
  • Precision medicine

Features

  • High Efficiency: We provide LinearRNA, a highly efficient toolkit for RNA structure prediction and analysis. LinearFold and LinearPartition achieve O(n) complexity in RNA folding prediction, making them hundreds of times faster than traditional folding techniques.

  • Large-scale Representation Learning and Transfer Learning: Self-supervised learning of molecule representations offers the prospect of a breakthrough in tasks with limited annotation, including drug profiling, drug-target interaction, protein-protein interaction, RNA-RNA interaction, protein folding, RNA folding, and molecule design. PaddleHelix implements a variety of representation learning algorithms and state-of-the-art large-scale pre-trained models so that developers can quickly start from "the shoulders of giants".

  • Easy-to-use APIs: PaddleHelix provides frequently used structures and pre-trained models. You can use these components to build your own models and systems.
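To give a sense of what the "traditional folding techniques" mentioned under High Efficiency look like, here is a minimal sketch of a classical cubic-time dynamic program (Nussinov-style base-pair maximization). It is not LinearFold, LinearPartition, or any PaddleHelix API. The function name `max_base_pairs`, the `min_loop` parameter, and the allowed pair set are assumptions chosen for illustration only. The three nested loops are what make this kind of baseline O(n^3) in the sequence length.

```python
# Illustrative baseline only: Nussinov-style base-pair maximization.
# This is NOT the LinearFold algorithm and does not use PaddleHelix.

# Watson-Crick pairs plus the G-U wobble pair (an assumption for this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Return the maximum number of nested base pairs in `seq`.

    `min_loop` is the minimum number of unpaired bases required between
    two paired positions (hypothetical parameter for this sketch).
    """
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the best pair count for the subsequence seq[i..j].
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):        # O(n) subsequence lengths
        for i in range(n - span):              # O(n) start positions
            j = i + span
            score = best[i][j - 1]             # case 1: position j is unpaired
            for k in range(i, j - min_loop):   # O(n) partners for position j
                if (seq[k], seq[j]) in PAIRS:  # case 2: j pairs with k
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(max_base_pairs("GGGAAAUCC"))
```

Doubling the sequence length makes this baseline do roughly eight times the work, which is why a linear-time method matters for long RNA sequences.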

Installation

The installation prerequisites and guide can be found here.
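After following the installation guide, a quick sanity check can confirm that the packages are visible to your Python interpreter. The sketch below assumes the import names are `paddle` (PaddlePaddle) and `pahelix` (PaddleHelix); treat both names as assumptions and confirm them against the installation guide. The helpers `is_importable` and `report` are hypothetical names used only for this example.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def report(modules=("paddle", "pahelix")):
    """Map each module name (assumed import names) to whether it was found."""
    return {name: is_importable(name) for name in modules}


if __name__ == "__main__":
    for name, found in report().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If either package is reported as missing, revisit the prerequisites in the installation guide before running the tutorials.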


Documentation

Tutorials

  • We provide abundant tutorials to help you navigate the repository and start quickly.
  • PaddleHelix is based on PaddlePaddle, a high-performance Parallelized Deep Learning Platform.

Examples

The API reference

  • Detailed API reference of PaddleHelix can be found here.

Guide for developers

  • If you need help modifying the source code of PaddleHelix, please see our Guide for developers.