
Data Analysis Notebook

This notebook demonstrates lossless and lossy visualization techniques and classification methods. It analyzes scientific data using hot-swappable datasets.

Supported datasets are numeric tabular .csv files with a 'class' column; examples are provided in the datasets folder. For testing purposes we focus on fisher_iris.csv, but others are included.
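As an illustration of the expected dataset format, here is a minimal sketch of a loader that uses only the Python standard library. It is not the notebook's own loading code. The function name `load_dataset` and the sample column names are hypothetical; the only requirements taken from the description above are numeric feature columns and a 'class' column.

```python
import csv
import io

def load_dataset(source):
    """Read a CSV with numeric feature columns and a 'class' label column.

    `source` is any open text file object.
    Returns (feature_names, rows, labels).
    """
    reader = csv.DictReader(source)
    if reader.fieldnames is None or "class" not in reader.fieldnames:
        raise ValueError("dataset must have a 'class' column")
    feature_names = [name for name in reader.fieldnames if name != "class"]
    rows, labels = [], []
    for record in reader:
        # float() raises ValueError on non-numeric cells,
        # which enforces the numeric-features requirement.
        rows.append([float(record[name]) for name in feature_names])
        labels.append(record["class"])
    return feature_names, rows, labels

# Hypothetical two-row sample; the real files live in the datasets folder.
sample = io.StringIO(
    "sepal_length,sepal_width,class\n"
    "5.1,3.5,setosa\n"
    "7.0,3.2,versicolor\n"
)
feature_names, rows, labels = load_dataset(sample)
```

Because the loader depends only on the 'class' column being present, swapping in a different dataset is a matter of passing a different file object, for example `open(path, newline="")`.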

The notebook is most easily viewed with a Jupyter viewer; web-based viewers are also an option.

Notebook

data_analysis.ipynb

  • Pairplot

Lossless Visualizations:

  • Parallel coordinates
    • Parallel hulls
  • Andrews curves
  • Star plot
  • GLC-Linear
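
To show why a technique in this list can be called lossless, here is a minimal sketch of how an Andrews curve is computed for a single row. Each feature value becomes one coefficient of a finite Fourier series, so no feature is discarded. The function name `andrews_curve` and the sample row are hypothetical, and the notebook itself may use a plotting library rather than code like this.

```python
import math

def andrews_curve(row, t):
    """Evaluate the Andrews curve of one data row at angle t (radians).

    f(t) = x1/sqrt(2) + x2*sin(t) + x3*cos(t) + x4*sin(2t) + x5*cos(2t) + ...
    """
    value = row[0] / math.sqrt(2)
    for i, x in enumerate(row[1:], start=1):
        harmonic = (i + 1) // 2  # 1, 1, 2, 2, 3, 3, ...
        if i % 2 == 1:
            value += x * math.sin(harmonic * t)
        else:
            value += x * math.cos(harmonic * t)
    return value

# Hypothetical four-feature row, sampled at 5 evenly spaced angles in [-pi, pi].
row = [1.0, 2.0, 3.0, 4.0]
angles = [-math.pi + k * (2 * math.pi / 4) for k in range(5)]
curve = [andrews_curve(row, t) for t in angles]
```

Plotting `curve` against `angles` for every row, colored by class, gives the usual Andrews curves figure.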

Lossy Visualizations:

  • Radviz
  • t-SNE
  • PCA
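
By contrast, the lossy techniques above map many feature values onto a 2-D point, so distinct rows can coincide. The following minimal sketch computes a Radviz projection with the standard library only. The function name `radviz_points` is hypothetical, and placing an all-zero row at the centre is an assumption made here for the sketch, not necessarily what the notebook or any plotting library does.

```python
import math

def radviz_points(rows):
    """Project each numeric row to a 2-D point inside the unit circle."""
    n_features = len(rows[0])
    # One anchor per feature, evenly spaced on the unit circle.
    anchors = [
        (math.cos(2 * math.pi * j / n_features),
         math.sin(2 * math.pi * j / n_features))
        for j in range(n_features)
    ]
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    points = []
    for row in rows:
        # Min-max scale each feature to [0, 1]; constant columns get weight 0.
        weights = [
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, lows, highs)
        ]
        total = sum(weights)
        if total == 0:
            # Assumption: a row with no pull from any anchor sits at the centre.
            points.append((0.0, 0.0))
            continue
        px = sum(w * a[0] for w, a in zip(weights, anchors)) / total
        py = sum(w * a[1] for w, a in zip(weights, anchors)) / total
        points.append((px, py))
    return points

# Hypothetical rows: the first and last differ in every feature.
rows = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
points = radviz_points(rows)
```

In this example the first and last rows both land at (approximately) the centre even though they differ in every feature, which is the information loss that separates Radviz from the lossless methods.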

Classification Methods:

  • Associative Rules (no reduction, single pass only)
    • Parallel coordinates interval visualization
  • LDA
  • Decision Tree with feature importance
  • Support Vector Machine
    • optimal parameter search
  • Gaussian Naive Bayes
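
As a concrete example of one classifier from this list, here is a minimal standard-library sketch of Gaussian Naive Bayes. It is an illustration of the technique, not the notebook's implementation, which may use a library instead. The names `fit_gaussian_nb` and `predict`, the toy data, and the small variance floor of 1e-9 are all assumptions made for this sketch.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(rows, labels):
    """Estimate a prior plus per-feature mean and variance for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        prior = len(members) / len(rows)
        columns = list(zip(*members))
        means = [sum(col) / len(col) for col in columns]
        # Assumption: a tiny floor keeps variances strictly positive.
        variances = [
            sum((x - m) ** 2 for x in col) / len(col) + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (prior, means, variances)
    return model

def predict(model, row):
    """Return the class with the highest log posterior for one row."""
    def log_posterior(params):
        prior, means, variances = params
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            # Log of the Gaussian density, treating features as independent.
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return score
    return max(model, key=lambda label: log_posterior(model[label]))

# Hypothetical toy data with two well-separated classes.
train_rows = [[1.0, 1.1], [1.2, 0.9], [5.0, 5.2], [5.1, 4.8]]
train_labels = ["a", "a", "b", "b"]
model = fit_gaussian_nb(train_rows, train_labels)
near_a = predict(model, [1.1, 1.0])
near_b = predict(model, [5.0, 5.0])
```

Working in log space avoids underflow when many per-feature densities are multiplied together.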

License

This repository and all of its contents are freely available for personal and commercial use under the MIT License; see the LICENSE file for full details.