Primary Language: Python

BigData - Identifying Inserts in Human Chromosome 20 using Spark Big Data Analysis Methodology

Set NR_OF_FILES to the number of BAM files the analysis should process

Enter your plotly credentials so that a graph of unmapped-read frequencies by position can be uploaded to your account

Run inserts.py

The resulting k-mers and their frequencies will be listed in "k-mers.txt"
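
For readers unfamiliar with k-mer counting, here is a minimal sketch of the idea in plain Python. It is not the code in inserts.py and does not use Spark. The function names, the toy reads, and the tab-separated output layout are all assumptions made for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring across an iterable of read sequences."""
    counts = Counter()
    for read in reads:
        # Slide a window of width k over the read; reads shorter than k yield nothing.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def format_kmers(counts):
    """Render counts as 'kmer<TAB>frequency' lines (assumed layout), most frequent first."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return "\n".join(f"{kmer}\t{freq}" for kmer, freq in ordered)


# Toy sequences standing in for unmapped reads (illustrative data, not from the project).
example_reads = ["ACGTAC", "CGTA"]
example_counts = count_kmers(example_reads, k=3)
```

Writing the string returned by `format_kmers(example_counts)` to a file would give a listing similar in spirit to "k-mers.txt", though the actual file format produced by inserts.py may differ.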