Multiple-myeloma_prognosis: A MATLAB repository from TeamSundar

Autoencoder and NCA based neural network model to estimate survival prognosis in multiple myeloma using arrayCGH data

Vidhi Malik 1, Shayoni Dutta 2, Navaneethan Radhakrishnan 1, Yogesh Kalakoti 1, Ritu Gupta 3,* and Durai Sundar 1,4,*

1 Department of Biochemical Engineering & Biotechnology, Indian Institute of Technology (IIT) Delhi, New Delhi - 110016, India.

2 Certara UK Ltd, Quantitative systems pharmacology division of SimCyp, Level 2-Acero, 1 Concourse Way, Sheffield S1 2BJ, United Kingdom.

3 Laboratory Oncology Unit, Dr. B.R.A.IRCH, All India Institute of Medical Sciences (AIIMS), Ansari Nagar, New Delhi, 110029, India.

4 Yardi School of Artificial Intelligence, Indian Institute of Technology (IIT) Delhi, New Delhi – 110016, India.

About The Project

Multiple myeloma (MM) is a malignancy of plasma cells, which are found in the bone marrow and help fight infections by synthesizing immunoglobulins. In MM, clonally proliferating abnormal plasma cells outgrow normal plasma cells and synthesize abnormal proteins. With advances in clinical research, the disease has become highly manageable but remains incurable. Medical practitioners consider various clinical factors when predicting prognosis and choosing treatment regimens for patients. This project attempts to develop a tool that helps predict the survival and prognosis of MM patients, with the aim of supporting clinicians in designing suitable treatment regimens.

Built With

MATLAB R2020a: academic license

Usage

To use the proposed neural-network-based survival prediction model for multiple myeloma patients, run the following commands:

cd TeamSundar/Multiple-myeloma_prognosis/NCA-Neuralnet

load mm_NCANN.mat
newoutput = mm_NCANN(newinput);

The pipeline requires two input files:

Clinical features file: it should have seven columns in the format specified below:

aCGH ID_1	Age	Gender	OS_Time (days)	Chemotherapy Regimen	ISS Staging
253058713873_1	73	1	52	1	2
253058713873_3	75	0	175	4	2
253058713877_1	54	1	182	2	3
253058713877_2	58	1	203	7	3

Please refer to the following table for the codes used in the gender, chemotherapy regimen, ISS staging and response columns:

Gender	Chemotherapy Regimen	Staging (International Staging System)
0 (Male)	1 : lenalidomide-dexamethasone (RD)	1 (ISS 1)
1 (Female)	2 : thalidomide-dexamethasone (TD)	2 (ISS 2)
	3 : bortezomib-dexamethasone (VD)	3 (ISS 3)
	4 : melphalan-prednisone-thalidomide (MPT)
	5 : bortezomib-thalidomide-dexamethasone (VTD)
	6 : bortezomib-lenalidomide-dexamethasone (VRD)
	7 : bortezomib-cyclophosphamide-dexamethasone (VCD)
	8 : cyclophosphamide-thalidomide-dexamethasone (CTD)
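
As an illustration of how the codes in the legend map back to readable values, here is a minimal Python sketch. It is not part of the repository: `decode_clinical_row` is a hypothetical helper, and it assumes a tab-separated data row with the six columns shown in the example rows above, in that order.

```python
# Code tables transcribed from the legend above.
GENDER = {0: "Male", 1: "Female"}
REGIMEN = {
    1: "RD", 2: "TD", 3: "VD", 4: "MPT",
    5: "VTD", 6: "VRD", 7: "VCD", 8: "CTD",
}
ISS_STAGE = {1: "ISS 1", 2: "ISS 2", 3: "ISS 3"}


def decode_clinical_row(line):
    """Decode one tab-separated data row of the clinical features file.

    Hypothetical helper: assumes the first six columns are sample ID, age,
    gender, OS time in days, chemotherapy regimen and ISS stage.
    """
    fields = line.rstrip("\n").split("\t")[:6]
    sample_id, age, gender, os_days, regimen, iss = fields
    return {
        "sample_id": sample_id,
        "age": int(age),
        "gender": GENDER[int(gender)],
        "os_days": int(os_days),
        "regimen": REGIMEN[int(regimen)],
        "iss_stage": ISS_STAGE[int(iss)],
    }


# First example row from the clinical features table.
row = decode_clinical_row("253058713873_1\t73\t1\t52\t1\t2")
print(row)
```

An unknown code raises a `KeyError`, which can serve as a quick sanity check on a clinical file before it is passed to the model.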

CNV file: the required CNV input file should be in the format specified in the table below:

Sample	Gene1	Gene2	..	GeneN
Sample1
Sample2
..
SampleN

The neighbourhood component analysis (NCA) algorithm was used to reduce the dimensionality of the input dataset. This yielded a gene signature of 211 genes that was able to classify patients into three classes based on the progression and death events of each participant. The input file should contain CNV values for these 211 genes. It can be formatted using the script ./Input_Data/Input_prep.py
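
To illustrate what restricting a CNV table to a gene signature involves, here is a minimal Python sketch. It is not the repository's Input_prep.py, whose behaviour is not described here. `select_signature_columns` is a hypothetical helper, the file is assumed to be tab-separated with a `Sample` column first, and the gene names and values are placeholders rather than the real 211-gene signature.

```python
import csv
import io


def select_signature_columns(cnv_text, signature_genes):
    """Keep only the signature genes, in signature order, for each sample.

    Raises ValueError if any signature gene is missing from the header.
    """
    reader = csv.reader(io.StringIO(cnv_text), delimiter="\t")
    header = next(reader)
    # Map each gene name to its column index (column 0 holds the sample ID).
    position = {gene: i for i, gene in enumerate(header[1:], start=1)}
    missing = [g for g in signature_genes if g not in position]
    if missing:
        raise ValueError(f"CNV file lacks signature genes: {missing}")
    rows = {}
    for record in reader:
        if not record:
            continue  # skip blank lines
        rows[record[0]] = [float(record[position[g]]) for g in signature_genes]
    return rows


# Toy data: gene names and CNV values are placeholders, not real data.
toy_cnv = (
    "Sample\tGeneA\tGeneB\tGeneC\n"
    "Sample1\t0.10\t-0.25\t0.40\n"
    "Sample2\t0.00\t0.30\t-0.15\n"
)
selected = select_signature_columns(toy_cnv, ["GeneC", "GeneA"])
print(selected)
```

Failing on missing genes, rather than silently filling them in, keeps the column order and count aligned with what a model trained on a fixed signature would expect.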

The model classifies each patient into one of three classes based on the predicted progression and death events:

Class 1: 11 (dead with relapse, i.e., progression event: 1 and death event: 1)
Class 2: 10 (alive with relapse, i.e., progression event: 1 and death event: 0)
Class 3: 0 (alive with no relapse, i.e., progression event: 0 and death event: 0)
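
The class codes above can be read as a pair of event flags. The following Python sketch makes that mapping explicit. It assumes you already have a label code (11, 10 or 0) for a patient; how the raw output of `mm_NCANN` is converted to such a code is not specified here, and `describe_class` is a hypothetical helper.

```python
# Label code -> (progression event, death event), as listed above.
CLASS_EVENTS = {
    "11": (1, 1),
    "10": (1, 0),
    "0": (0, 0),
}


def describe_class(label):
    """Return a readable description for a class label code."""
    progression, death = CLASS_EVENTS[str(label)]
    status = "dead" if death else "alive"
    relapse = "with relapse" if progression else "with no relapse"
    return f"{status} {relapse}"


print(describe_class(10))
```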

The MATLAB live script for the proposed NCA-neural-network-based model is located at ./NCA-Neuralnet/ArrayCGH_NCA_Neural_net_92_7percent_accuracy_final_model.mlx

The MATLAB live script for the autoencoder-based prediction models, DNN1 and DNN2, is located at ./DNN1_and_DNN2/ArrayCGH_DNN1_52_6_andDNN2_68_4percent_SVM_41_2_RUS_33percent.mlx

License

Distributed under the MIT License. See LICENSE for more information.