ziweiwu/ML-Mass-Spectrometry

Classifying prostate cancer, ovarian cancer, and control samples using 10,000 attributes from mass-spectrometric data in 900 subjects


ML-Mass-Spectrometry

This project studies a mass spectrometry dataset with 10,000 features and 901 samples (https://archive.ics.uci.edu/ml/datasets/Arcene).

The dataset was used as part of a feature selection competition in 2003 (http://clopinet.com/challenges/).

Goal

Find machine learning models that work well on high-dimensional datasets where p >> n (far more features than samples)
Perform feature selection to better understand how mass spectrometry data provides insight into cancer prediction
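
The two goals above can be sketched together as a single scikit-learn pipeline. This is only an illustration under stated assumptions, not the notebook's actual method: the data is synthetic (generated with make_classification as a stand-in for the real spectra), and the choice of univariate F-test selection, k=50, and a regularized logistic regression is hypothetical. Putting the selector inside the pipeline means it is refit on each training fold, which avoids leaking test-fold information when p >> n.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data (assumption): 3 classes and
# far more features (p=2000) than samples (n=150).
X, y = make_classification(
    n_samples=150,
    n_features=2000,
    n_informative=20,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)

# Scale, keep the 50 features with the highest ANOVA F-score, then classify.
# k=50 and C=0.1 are illustrative values, not tuned settings.
pipeline = make_pipeline(
    StandardScaler(),
    SelectKBest(score_func=f_classif, k=50),
    LogisticRegression(C=0.1, max_iter=2000),
)

# Feature selection runs inside each fold because it is part of the pipeline.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(pipeline, X, y, cv=cv)
print(f"mean CV accuracy: {scores.mean():.3f}")

# Fit on all data to inspect which columns the selector kept.
pipeline.fit(X, y)
selected = pipeline.named_steps["selectkbest"].get_support(indices=True)
print(f"{selected.size} features kept, first few columns: {selected[:10]}")
```

With real spectra, the selected column indices would map back to mass-to-charge positions, which is where the interpretive value of feature selection would come from.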

Resources

Multiclass ROC: http://scikit-learn.org/stable/auto_examples/model_selection/plot_roc.html#multiclass-settings
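
For a three-class problem such as prostate cancer, ovarian cancer, and control, the one-vs-rest approach described in that resource might look like the sketch below. The data is again synthetic, the integer labels 0, 1, 2 are placeholders for the three classes, and the classifier is an arbitrary choice for illustration. Each class is scored against the other two, and the per-class AUCs are averaged into a macro AUC.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

# Synthetic 3-class data (assumption); labels 0, 1, 2 are placeholders.
X, y = make_classification(
    n_samples=600,
    n_features=40,
    n_informative=10,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

clf = LogisticRegression(max_iter=2000).fit(X_train, y_train)
proba = clf.predict_proba(X_test)  # one probability column per class

# One-vs-rest: turn the labels into one binary indicator column per class.
classes = clf.classes_
y_test_bin = label_binarize(y_test, classes=classes)

per_class_auc = {}
for i, label in enumerate(classes):
    # ROC curve for "class i versus everything else".
    fpr, tpr, _ = roc_curve(y_test_bin[:, i], proba[:, i])
    per_class_auc[int(label)] = auc(fpr, tpr)
    print(f"class {label}: AUC = {per_class_auc[int(label)]:.3f}")

# Macro-averaged one-vs-rest AUC computed directly by scikit-learn.
macro_auc = roc_auc_score(y_test, proba, multi_class="ovr", average="macro")
print(f"macro OvR AUC = {macro_auc:.3f}")
```

The fpr and tpr arrays from each loop iteration are what would be passed to a plotting library to draw one ROC curve per class.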