This repository contains the code for the paper “Random forest classification for predicting lifespan-extending chemical compounds” (https://doi.org/10.1038/s41598-021-93070-6).
Ageing is a major risk factor for many conditions, including cancer, neurodegenerative and cardiovascular diseases. Interventions that can prevent, delay or protect against age-related disorders are a growing research area. The DrugAge database (https://genomics.senescence.info/drugs/) lists compounds that have been found to increase the lifespan of model organisms such as the worm Caenorhabditis elegans (C. elegans). Random forest classifiers were built to predict whether a chemical entry will “increase” or “not increase” the lifespan of C. elegans. This repository contains the random forest models built with chemical descriptors and Extended Connectivity Fingerprints (ECFPs).
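To illustrate the general idea of classifying compounds from ECFPs, here is a minimal sketch. It is not the code from the paper. The SMILES strings and their labels are arbitrary placeholders rather than DrugAge data, and the helper name `ecfp_bits`, the fingerprint radius (2), the bit length (1024) and the forest settings are all assumptions chosen for illustration.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.ensemble import RandomForestClassifier

# Placeholder molecules with ARBITRARY labels (1 = "increase", 0 = "not increase").
# These are NOT taken from DrugAge; they only make the example runnable.
smiles = ["CCO", "CCCO", "CCCCO", "c1ccccc1", "Cc1ccccc1", "CCc1ccccc1"]
labels = np.array([0, 0, 0, 1, 1, 1])


def ecfp_bits(smi, radius=2, n_bits=1024):
    """Hypothetical helper: encode a SMILES string as an ECFP-style bit vector."""
    mol = Chem.MolFromSmiles(smi)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smi}")
    # Morgan fingerprints with radius 2 are commonly treated as ECFP4 equivalents.
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)
    arr = np.zeros(n_bits, dtype=np.uint8)
    arr[list(fp.GetOnBits())] = 1
    return arr


# Feature matrix: one row per compound, one column per fingerprint bit.
X = np.vstack([ecfp_bits(s) for s in smiles])

# Assumed hyperparameters; the paper's settings may differ.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, labels)

preds = model.predict(X)        # predicted class per compound
proba = model.predict_proba(X)  # class probabilities per compound
```

Predictions here are made on the training molecules only to show the calls; any real evaluation would need held-out data.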
Python packages required:
- pandas
- numpy
- scikit-learn (imported as sklearn)
- RDKit
- matplotlib
- seaborn
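
One way to check that these packages are available before running anything is a short script like the one below. It is only a sketch: the function name `missing_packages` is hypothetical, and the list uses the import names (`sklearn` for scikit-learn, `rdkit` for RDKit), which can differ from the names used at install time.

```python
import importlib.util

# Import names for the packages listed above (an assumption about how they are imported).
REQUIRED = ["pandas", "numpy", "sklearn", "rdkit", "matplotlib", "seaborn"]


def missing_packages(names):
    """Return the names from `names` that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are available.")
```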