Machine-Learning-Course-Project-NJIT

This is a simulated dataset of single nucleotide polymorphism (SNP) genotype data containing 29623 SNPs (the total number of features). Among these SNPs, 15 are causal, meaning that they and their neighboring SNPs discriminate between cases and controls, while the remainder are noise. The training set contains 4000 cases and 4000 controls. Your task is to predict the labels of 2000 test individuals, whose true labels are known only to the instructor and TA.
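
Because only a handful of SNPs carry signal, one plausible first step is to score each SNP against the labels and keep the top-ranked ones before training a classifier. The sketch below illustrates that idea with a Pearson chi-square score; it is not the project's prescribed method. It assumes genotypes are coded 0/1/2 and labels 0/1 (the description above does not state the encoding), it runs on a small toy matrix generated in the script rather than on the real data, and the names `chi2_score`, `rank_snps`, and `make_toy_data` are hypothetical.

```python
import random


def chi2_score(column, labels):
    """Pearson chi-square statistic for the 2 x 3 table (label x genotype)."""
    counts = [[0, 0, 0], [0, 0, 0]]
    for genotype, label in zip(column, labels):
        counts[label][genotype] += 1
    n = len(labels)
    row_totals = [sum(r) for r in counts]
    col_totals = [counts[0][g] + counts[1][g] for g in range(3)]
    score = 0.0
    for label in range(2):
        for g in range(3):
            expected = row_totals[label] * col_totals[g] / n
            if expected > 0:  # skip genotypes that never occur
                score += (counts[label][g] - expected) ** 2 / expected
    return score


def rank_snps(X, labels, k):
    """Return the indices of the k SNPs with the largest chi-square score."""
    n_snps = len(X[0])
    scores = [chi2_score([row[j] for row in X], labels) for j in range(n_snps)]
    return sorted(range(n_snps), key=lambda j: scores[j], reverse=True)[:k]


def make_toy_data(n_per_class=200, n_snps=50, causal=(3, 17), seed=0):
    """Toy stand-in for the real data: noise SNPs plus a few planted causal ones."""
    rng = random.Random(seed)
    X, labels = [], []
    for label in (0, 1):  # assumption: 0 = control, 1 = case
        for _ in range(n_per_class):
            row = [rng.choice((0, 1, 2)) for _ in range(n_snps)]
            for j in causal:
                # Cases skew toward genotype 2, controls toward genotype 0.
                weights = (1, 2, 7) if label == 1 else (7, 2, 1)
                row[j] = rng.choices((0, 1, 2), weights=weights)[0]
            X.append(row)
            labels.append(label)
    return X, labels


X, y = make_toy_data()
top = rank_snps(X, y, k=2)
print(sorted(top))  # the planted SNPs (3 and 17) should rank first
```

On the real data the same ranking would run over all 29623 columns, and the selected columns would then feed whichever classifier is chosen. Since neighbors of causal SNPs are also informative, keeping more than 15 features may be reasonable; the toy data above does not model that neighborhood effect.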

Project description: https://web.njit.edu/~usman/courses/cs675_fall16/project.txt
Course page: https://web.njit.edu/~usman/courses/cs675_fall16/index.html