Machine learning workflows for analyzing high-throughput protein data

Protein_ML

This repository serves as a template for machine learning on high-throughput expression and solubility data.

The machine learning models are implemented in Python as IPython notebooks. Run the notebooks in the following order:

  1. create_feature_matrix.ipynb
  2. classification_workflow.ipynb
  3. retrospective_analysis.ipynb
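
The notebooks can also be run non-interactively in the order listed above. The following is a minimal sketch, not part of this repository: it assumes Jupyter and `nbconvert` are installed, that the three notebooks sit in the same directory, and that each one can execute without manual input. The names `build_command` and `run_workflow` are hypothetical helpers introduced only for illustration.

```python
import subprocess

# Notebook order as listed in this README.
NOTEBOOKS = [
    "create_feature_matrix.ipynb",
    "classification_workflow.ipynb",
    "retrospective_analysis.ipynb",
]


def build_command(notebook):
    """Return the nbconvert command that executes one notebook in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        notebook,
    ]


def run_workflow(notebook_dir=".", dry_run=False):
    """Execute the notebooks in order, stopping at the first failure.

    With dry_run=True the commands are only built and returned, not run.
    """
    commands = [build_command(nb) for nb in NOTEBOOKS]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if a notebook fails,
            # so later notebooks never run on missing or stale inputs.
            subprocess.run(cmd, cwd=notebook_dir, check=True)
    return commands
```

Calling `run_workflow(dry_run=True)` returns the commands without executing anything, which is a quick way to check the order before a long run. Note that `--inplace` overwrites each notebook with its executed version; drop that flag if the original files should stay untouched.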

A reduced workflow for solubility data is implemented in the solubility subdirectory.

All information presented in this Git repository is strictly for academic purposes.