Code Notes:
random_peptides.py was used to generate random peptide libraries of various lengths
protparamanalysis.py was used to compute amino acid counts and chemical property values for each peptide in a library
PCA_topval_5mil.Rmd and sorting.py were used in extreme-value-based library pruning
PCAofkmeans.Rmd and performing_kmeans_clustering.Rmd were used in k-means-based pruning
PCA_1C.Rmd was used for the gridded pruning
knngraph_updated.Rmd was used for the KNN-based pruning
remove_mult_aa.Rmd and get_top_aa.Rmd were used in approach 2a
AA_freq.Rmd and PCA_AA_FREQS.Rmd were used in approach 2b
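
Illustrative sketch (not taken from the scripts above): the first two steps, generating a random peptide library and counting amino acids per peptide, could look roughly like the following. The function names, the uniform sampling over the 20 standard residues, and the seed parameter are all assumptions made for illustration; random_peptides.py and protparamanalysis.py may work differently, and the chemical property calculations are not shown here.

```python
import random
from collections import Counter

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def random_peptide(length, rng):
    """Return one peptide of the given length, residues drawn uniformly (assumption)."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def random_library(n_peptides, length, seed=0):
    """Return a list of random peptides; hypothetical helper, seeded for reproducibility."""
    rng = random.Random(seed)
    return [random_peptide(length, rng) for _ in range(n_peptides)]


def amino_acid_counts(peptide):
    """Count each of the 20 residues in a peptide, reporting zeros for absent ones."""
    counts = Counter(peptide)
    return {aa: counts.get(aa, 0) for aa in AMINO_ACIDS}


# Build a small library and print the per-peptide residue counts.
library = random_library(n_peptides=5, length=8, seed=42)
for pep in library:
    print(pep, amino_acid_counts(pep))
```

Fixing the seed makes a generated library reproducible, which helps when comparing pruning approaches on the same input.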