Multi-Omics-Analysis-and-Pathological-Feature-Selection-

This study uses feature selection techniques to predict the survival of colon cancer patients from gene expression and other omics profiles. The aim is to evaluate the performance of four sparse feature selection methods in estimating survival while taking competing risks into account.
The data comprised miRNA expression levels related to 1871 genes, mRNA expression of 20502 genes, 219 histological specimens, and survival data from 209 colon cancer patients in the Cancer Genome Atlas Colon Adenocarcinoma (TCGA-COAD) dataset. Four penalized methods, namely the least absolute shrinkage and selection operator (LASSO), smoothly clipped absolute deviation (SCAD), smooth integration of counting and absolute deviation (SICA), and minimax concave penalty (MCP), together with a deep convolutional neural network (CNN) applied to whole slide images (WSIs), were used for simultaneous variable selection and estimation under an additive hazards model.

Features extracted via the additive models from miRNA expression, mRNA expression, and WSIs, together with patient labels derived from survival, served as input to machine learning models that predict survival status. These results were used to identify high-risk and low-risk groups. The miRNAs hsa-miR-149-3p and hsa-miR-5698 and the genes DCAF7, RNF157, MORN4, and GPRC5 could be considered potential colon adenocarcinoma (COAD) biomarkers. The deep learning model (DLM) achieved an accuracy of 98.79% using multi-omics data ranging from genomics to pathomics.

Colon cancer biomarkers based on mRNA, miRNA, and histological features could be considered cancer detection markers. Their joint use can be fruitful for both early diagnosis and prognosis of cancer.

Keywords: Pathomics, Survival analysis, Deep learning, Histology slides, Additive hazards model, Feature selection, Colon cancer.
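As an illustration of how the four sparse penalties named above differ, the following is a minimal sketch of their standard textbook forms for a single coefficient. It is not taken from this repository: the function names and the default values of the shape parameters (a = 3.7 for SCAD, gamma = 3.0 for MCP, a = 1.0 for SICA) are assumptions chosen for the example, and the study's actual tuning may differ.

```python
# Standard penalty functions for one coefficient `beta` with tuning parameter `lam`.
# Function names and default shape parameters are illustrative assumptions.

def lasso_penalty(beta, lam):
    """LASSO: lam * |beta|, grows linearly without bound."""
    return lam * abs(beta)

def scad_penalty(beta, lam, a=3.7):
    """SCAD (requires a > 2): linear near zero, quadratic transition, then constant."""
    t = abs(beta)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t ** 2 - lam ** 2) / (2 * (a - 1))
    return lam ** 2 * (a + 1) / 2

def mcp_penalty(beta, lam, gamma=3.0):
    """MCP (requires gamma > 1): concave up to gamma * lam, then constant."""
    t = abs(beta)
    if t <= gamma * lam:
        return lam * t - t ** 2 / (2 * gamma)
    return gamma * lam ** 2 / 2

def sica_penalty(beta, lam, a=1.0):
    """SICA (requires a > 0): lam * (a + 1)|beta| / (a + |beta|)."""
    t = abs(beta)
    return lam * (a + 1) * t / (a + t)

if __name__ == "__main__":
    # Compare how each penalty treats a small and a large coefficient.
    for beta in (0.3, 10.0):
        print(beta,
              lasso_penalty(beta, 0.5),
              scad_penalty(beta, 0.5),
              mcp_penalty(beta, 0.5),
              sica_penalty(beta, 0.5))
```

For small coefficients all four penalties behave much like LASSO, which is what drives coefficients to exactly zero. For large coefficients SCAD and MCP level off at a constant and SICA stays bounded, so strong predictors are shrunk less than under LASSO.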