/geneFS


Metagenomic-gene-abundance

This framework combines feature selection and visualization for gene family abundance data.

Advances in machine learning, and in deep learning in particular, have achieved great success in numerous fields. In personalized medicine, frameworks built on learning algorithms help scientists explore novel data sources such as metagenomic data and develop methodologies to improve human healthcare. Challenges in processing this data type include its very high dimensionality and the complexity of diseases. Metagenomic data that include gene families often have millions of features, which further increases processing complexity and computation time. In this study, we propose a method that combines feature selection using perceptron weight-based filters with synthetic image generation, leveraging deep-learning advances to predict various diseases from gene family abundance data. An experiment was conducted on gene family datasets for five diseases: liver cirrhosis, obesity, inflammatory bowel diseases, type 2 diabetes, and colorectal cancer. The proposed method not only provides a visualization of gene family abundance data but also achieves promising performance.
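The two ideas in the abstract, a perceptron weight-based filter followed by synthetic image generation, can be illustrated with a minimal sketch. This is not the repository's implementation: the function names (`train_perceptron`, `select_top_k`, `to_square_image`), the toy data, the hyperparameters, and the row-by-row pixel layout are all assumptions made for illustration, and the code is kept dependency-free so it runs anywhere.

```python
import math
import random


def train_perceptron(X, y, epochs=20, lr=0.1, seed=0):
    """Train a single-layer perceptron on 0/1 labels; return (weights, bias)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            activation = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            pred = 1 if activation > 0 else 0
            err = y[i] - pred
            if err != 0:
                # Classic perceptron update, applied only on mistakes.
                w = [wj + lr * err * xj for wj, xj in zip(w, X[i])]
                b += lr * err
    return w, b


def select_top_k(weights, k):
    """Return indices of the k features with the largest |weight|, in original order."""
    ranked = sorted(range(len(weights)), key=lambda j: abs(weights[j]), reverse=True)
    return sorted(ranked[:k])


def to_square_image(values):
    """Fill a square grid row by row with the values, zero-padding the tail (assumed layout)."""
    side = math.ceil(math.sqrt(len(values)))
    padded = list(values) + [0.0] * (side * side - len(values))
    return [padded[r * side:(r + 1) * side] for r in range(side)]


# Toy data (hypothetical): 6 samples x 5 gene-family abundances, labels 1 = patient, 0 = control.
X = [
    [0.1, 0.9, 0.2, 0.1, 0.3],
    [0.2, 0.8, 0.1, 0.2, 0.2],
    [0.1, 0.7, 0.3, 0.1, 0.3],
    [0.2, 0.1, 0.2, 0.1, 0.2],
    [0.1, 0.2, 0.1, 0.2, 0.3],
    [0.2, 0.0, 0.3, 0.1, 0.2],
]
y = [1, 1, 1, 0, 0, 0]

weights, _ = train_perceptron(X, y)
selected = select_top_k(weights, k=4)                  # keep 4 of the 5 features
image = to_square_image([X[0][j] for j in selected])   # 2x2 grid for the first sample
```

In a real pipeline the resulting grids would be rendered as images and passed to a convolutional network; with millions of gene-family features, a vectorized library would replace the pure-Python loops shown here.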

If this work helps you, please cite it as:

Hai Thanh Nguyen, Tai Tan Phan, Tinh Cong Dao, Thao Minh Nguyen Phan, Phuc Vinh Dang Ta, Cham Ngoc Thi Nguyen, Ngoc Huynh Pham, Hiep Xuan Huynh. Gene Family Abundance Visualization based on Feature Selection Combined Deep Learning to Improve Disease Diagnosis. Journal of Engineering and Technological Sciences. ISSN: 2337-5779. Vol. 53 Issue 1, pp 99-115. 2021. DOI: https://doi.org/10.5614/j.eng.technol.sci.2021.53.1.9

BibTeX entry:

@article{nguyen_gene_2021,
	title = {Gene Family Abundance Visualization based on Feature Selection Combined Deep Learning to Improve Disease Diagnosis},
	volume = {53},
	rights = {Copyright (c)},
	issn = {2338-5502},
	url = {http://journals.itb.ac.id/index.php/jets/article/view/13387},
	doi = {10.5614/j.eng.technol.sci.2021.53.1.9},
	pages = {210109--210109},
	number = {1},
	journaltitle = {J. Eng. Technol. Sci.},
	author = {Nguyen, Hai Thanh and Phan, Tai Tan and Dao, Tinh Cong and Ta, Phuc Vinh Dang and Nguyen, Cham Ngoc Thi and Pham, Ngoc Huynh and Huynh, Hiep Xuan},
	urldate = {2021-02-28},
	date = {2021-01-30},
	langid = {english},
	note = {Number: 1},
	keywords = {personalized medicine},
}

Thank you