ml4sp_metastasis
Downloading the data
Get the GSM samples from http://hcmdb.i-sanger.com/.
Use GEOparse to download the data into Python. (This also downloads a text file for each sample into your root directory.)
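A minimal sketch of the download step, assuming the GSM accession IDs have already been collected from the site above. The helper name `download_samples` and the `destdir` default are illustrative choices, not part of this repo; `GEOparse.get_GEO` is the library call that fetches a record and saves its text file to `destdir`.

```python
def download_samples(gsm_ids, destdir="./"):
    """Download each GSM record with GEOparse and return {gsm_id: GSM object}.

    `download_samples` is a hypothetical helper for illustration.
    """
    # Reject anything that does not look like a GSM accession before
    # touching the network.
    for gsm_id in gsm_ids:
        if not (gsm_id.startswith("GSM") and gsm_id[3:].isdigit()):
            raise ValueError(f"not a GSM accession: {gsm_id!r}")

    # Imported here so the module can be loaded without GEOparse installed.
    import GEOparse

    samples = {}
    for gsm_id in gsm_ids:
        # get_GEO downloads the record's text file into destdir and parses it.
        samples[gsm_id] = GEOparse.get_GEO(geo=gsm_id, destdir=destdir)
    return samples
```

Each returned GSM object exposes its measurements as a pandas data frame in `.table` and its annotations in `.metadata`.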
Getting mRNA data
We look at all samples that share the same ID_REF row names and add their values to our data frame.
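One way this step might look in pandas is sketched below. It assumes each sample table has an `ID_REF` column and a `VALUE` column (the usual layout of a GSM table, but worth checking for your samples) and that `ID_REF` values are unique within a sample. The function name `build_expression_frame` is hypothetical.

```python
import pandas as pd


def build_expression_frame(sample_tables):
    """Combine per-sample tables into one frame: rows = ID_REF, columns = samples.

    sample_tables: dict mapping a sample ID to a DataFrame with
    'ID_REF' and 'VALUE' columns (assumed layout).
    """
    columns = {}
    for sample_id, table in sample_tables.items():
        # Index each sample's values by its ID_REF row names.
        columns[sample_id] = table.set_index("ID_REF")["VALUE"]
    # join="inner" keeps only the ID_REF rows present in every sample.
    return pd.concat(columns, axis=1, join="inner")


# Small synthetic tables standing in for real GSM tables.
demo_tables = {
    "sample_a": pd.DataFrame({"ID_REF": ["p1", "p2", "p3"], "VALUE": [1.0, 2.0, 3.0]}),
    "sample_b": pd.DataFrame({"ID_REF": ["p2", "p3", "p4"], "VALUE": [20.0, 30.0, 40.0]}),
}
demo_frame = build_expression_frame(demo_tables)
```

The inner join drops any ID_REF missing from at least one sample; an outer join would keep them and fill the gaps with NaN instead.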
Dimensionality reduction using PCA
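A minimal sketch of the PCA step with scikit-learn, assuming the expression data is arranged with one row per sample and one column per ID_REF (transpose first if the ID_REF values are the rows). Standardizing the features and keeping two components are illustrative choices, and the random matrix below only stands in for real data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def project_samples(expression, n_components=2):
    """Project samples onto their leading principal components.

    expression: array-like of shape (n_samples, n_features).
    Returns the projected coordinates and the explained variance ratios.
    """
    # Put every feature on a comparable scale before PCA.
    scaled = StandardScaler().fit_transform(expression)
    pca = PCA(n_components=n_components)
    coords = pca.fit_transform(scaled)
    return coords, pca.explained_variance_ratio_


# Synthetic stand-in: 30 samples, 100 features.
rng = np.random.default_rng(0)
demo_expression = rng.normal(size=(30, 100))
demo_coords, demo_ratio = project_samples(demo_expression)
```

The two columns of `demo_coords` can be plotted against each other to see whether sample groups separate.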
Task 1: Identifying normal, metastatic tumor, and primary tumors
- [ ] Random forest
- [ ] Logistic regression
- [ ] Gaussian
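The candidate models above could be compared with a sketch like the following. It reads "gaussian" as Gaussian naive Bayes, which is an assumption (a Gaussian process classifier would be another reading). The data is synthetic, generated with three classes to mirror the normal, primary tumor, and metastatic tumor labels; the hyperparameters and 5-fold cross-validation are illustrative defaults rather than tuned choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB


def compare_classifiers(X, y, cv=5):
    """Return the mean cross-validated accuracy of each candidate model."""
    models = {
        "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
        "logistic regression": LogisticRegression(max_iter=1000),
        # Assumption: "gaussian" is interpreted as Gaussian naive Bayes.
        "gaussian naive bayes": GaussianNB(),
    }
    return {
        name: float(cross_val_score(model, X, y, cv=cv).mean())
        for name, model in models.items()
    }


# Synthetic three-class data standing in for the real samples.
X_demo, y_demo = make_classification(
    n_samples=150,
    n_features=20,
    n_informative=5,
    n_classes=3,
    random_state=0,
)
demo_scores = compare_classifiers(X_demo, y_demo)
```

With real data, `X` would hold the expression values (or their PCA projection) with one row per sample, and `y` the sample type labels.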