/RiboSeqProject


Primary language: R. License: GNU Affero General Public License v3.0 (AGPL-3.0).

Characterizing Public Ribosome Profiling Libraries

Code Repository for BioMedical Data Science Immersion Scheme (BMDSIS) project under NTU Masters in Biomedical Data Science

Synopsis

First and foremost, I would like to thank Dr. Kaibo and Prof. Lee from LKCMedicine for the opportunity to contribute to this project. Briefly, I characterized 8 publicly available ribosome profiling datasets and assessed each one on 5 metrics: read length distribution, contamination level, mapping quality, read count quantification, and 3-nucleotide periodicity. These metrics are essential for confirming that a ribosome sequencing dataset is of high quality before any novel open reading frame discovery is attempted.

The mapping rates, raw read length distributions, contaminant breakdowns, and read assignments can be found in the Appendix directory. The contaminant sequences used in this study are in the Contamination directory, and their sources are listed in the metadata file. The src directory contains the scripts used in this study; I have included every script I used, whether or not its output appears in the final report. I will write a short synopsis of the findings and link it here soon for those who are interested.
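As a rough illustration of two of the metrics listed above (read length distribution and 3-nucleotide periodicity), here is a minimal Python sketch. It is not taken from the scripts in src (this repository's primary language is R), and the function names, the toy read lengths, the P-site positions, and the `cds_start` value are all hypothetical, chosen only to show the idea.

```python
from collections import Counter


def read_length_distribution(read_lengths):
    """Return the fraction of reads observed at each read length."""
    counts = Counter(read_lengths)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {length: n / total for length, n in sorted(counts.items())}


def frame_fractions(psite_positions, cds_start):
    """Return the fraction of P-sites in frames 0, 1 and 2 relative to cds_start.

    Strong 3-nucleotide periodicity shows up as one frame holding most P-sites.
    """
    frames = Counter((pos - cds_start) % 3 for pos in psite_positions)
    total = sum(frames.values())
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [frames[f] / total for f in range(3)]


# Toy inputs (made up, not from any of the 8 datasets).
toy_lengths = [28, 29, 29, 30, 29, 28, 31, 29]
toy_psites = [100, 103, 106, 107, 109, 112, 113, 115]

length_dist = read_length_distribution(toy_lengths)
periodicity = frame_fractions(toy_psites, cds_start=100)

print(length_dist)   # {28: 0.25, 29: 0.5, 30: 0.125, 31: 0.125}
print(periodicity)   # [0.75, 0.25, 0.0]
```

In this toy case, half of the reads are 29 nt long and 75% of the P-sites fall in frame 0, which is the kind of pattern one would hope to see in a good library. Real analyses would derive read lengths and P-site positions from aligned reads rather than from hand-written lists.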