fpr_final_project

Final project for EDLD 610 Functional Programming in R

Primary language: R

The app is available at https://kdestasio.shinyapps.io/fpr_final_project/

This repo is a collaboration between Brendan Cullen (brendanhcullen) and Krista DeStasio.

The dashboard is the final project for an R functional programming class taught by Daniel Anderson. We use the Kaggle Pokemon dataset to demonstrate how different visualizations of k-means clustering can help determine how well various clustering solutions fit the data.

Clustering algorithms group observations based on their similarity or dissimilarity (e.g., distance in Euclidean space). K-means clustering is an unsupervised learning approach that partitions the observations in a dataset into k clusters, choosing the assignment that makes each cluster as compact as possible (i.e., that minimizes the within-cluster sum of squares). It is best suited for data in which there are a priori reasons to select a given number of clusters, though it can also be useful as a way to explore a dataset visually.
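
As a rough illustration of the idea above, the following minimal sketch fits a k-means solution with base R's `kmeans()`. It uses the built-in `iris` measurements as stand-in data rather than the Pokemon dataset, and the choice of three clusters, the seed, and the object names (`features`, `fit`) are assumptions made for the example, not taken from the app's code.

```r
# Stand-in data: the built-in iris measurements (NOT the Pokemon dataset).
# Scaling puts all variables on a comparable footing for Euclidean distance.
features <- scale(iris[, 1:4])

# k-means starts from random centers, so fix the seed and try several starts.
set.seed(42)
fit <- kmeans(features, centers = 3, nstart = 25)

fit$size       # number of observations in each cluster
fit$withinss   # within-cluster sum of squares, one value per cluster
fit$betweenss  # between-cluster sum of squares for the whole solution
```

Refitting with different values of `centers` and comparing these quantities is one simple way to explore how well alternative clustering solutions fit the data.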

Visualizations in this project include a cluster plot, a silhouette plot, and a scatterplot, as well as a table that reports, for each clustering solution, the cluster size (number of observations), the number of observations that may be incorrectly included in a cluster (negative silhouette width), cluster density (within-cluster sum of squares), and cluster separation (between-cluster sum of squares).
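
The quantities in that table could be computed along the following lines. This is a sketch under stated assumptions, not the app's actual code: it uses `silhouette()` from the `cluster` package (which may differ from what the dashboard uses), the built-in `iris` data as a stand-in for the Pokemon dataset, and hypothetical names such as `summary_table`.

```r
library(cluster)  # provides silhouette(); an assumed choice for this sketch

# Stand-in data and an example three-cluster solution (both assumptions).
features <- scale(iris[, 1:4])
set.seed(42)
fit <- kmeans(features, centers = 3, nstart = 25)

# Silhouette widths: a negative width suggests an observation sits closer
# to a neighboring cluster than to the cluster it was assigned to.
sil <- silhouette(fit$cluster, dist(features))
cluster_id <- factor(sil[, "cluster"], levels = seq_along(fit$size))

# One row per cluster: size, count of negative silhouettes, within-cluster SS.
summary_table <- data.frame(
  cluster = seq_along(fit$size),
  size = fit$size,
  negative_silhouette = as.vector(tapply(sil[, "sil_width"] < 0, cluster_id, sum)),
  within_ss = fit$withinss
)

summary_table
fit$betweenss  # between-cluster SS is a single value for the whole solution
```

Note that the within-cluster sum of squares is reported per cluster, whereas the between-cluster sum of squares describes the solution as a whole.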