
scikit-hyper: hyperspectral data analysis and machine learning

Primary language: Python. License: BSD 3-Clause "New" or "Revised" License (BSD-3-Clause).

scikit-hyper

Python versions: 3.5, 3.6

Machine learning for hyperspectral data in Python

  • Simple tools for exploratory analysis of hyperspectral data
  • Built on numpy, scipy, matplotlib and scikit-learn
  • Simple to use, syntax similar to scikit-learn

Contents

  1. Installation
  2. Features
  3. Examples
  4. Documentation
  5. License

Installation

This package is under active development and is not yet ready for general release. The first general release will be v0.1.0.

To install using pip:

pip install scikit-hyper

The following packages are required:

  • numpy
  • scipy
  • scikit-learn
  • matplotlib
  • seaborn
  • PyQt5
  • pyqtgraph

Features

The following features have currently been implemented:

  • Classification

    • Naive Bayes
    • K-nearest neighbors
    • Support vector machines
  • Clustering

    • KMeans
  • Decomposition

    • Principal component analysis
    • Independent component analysis
    • Non-negative matrix factorization
  • Hyperspectral viewer

    • A lightweight PyQt GUI for displaying and interacting with hyperspectral data
  • Tools

    • Spectral smoothing
    • Spectral normalization
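
The README does not show the API of the smoothing and normalization tools, so the following is only a minimal NumPy sketch of what such operations typically do, not the package's own implementation. The function names `smooth_spectra` and `normalize_spectra`, the moving-average filter, and the min-max scaling are all assumptions chosen for illustration.

```python
import numpy as np


def smooth_spectra(data, window=5):
    """Moving-average smoothing along the last (spectral) axis (hypothetical helper)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    kernel = np.ones(window) / window
    pad = window // 2
    # Pad only the spectral axis by edge replication so the output keeps its length
    pad_width = [(0, 0)] * (data.ndim - 1) + [(pad, pad)]
    padded = np.pad(data, pad_width, mode="edge")
    return np.apply_along_axis(
        lambda s: np.convolve(s, kernel, mode="valid"), -1, padded
    )


def normalize_spectra(data):
    """Rescale each spectrum to the range [0, 1] (hypothetical helper)."""
    lo = data.min(axis=-1, keepdims=True)
    hi = data.max(axis=-1, keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero for flat spectra
    return (data - lo) / span


rng = np.random.default_rng(0)
cube = rng.random((20, 20, 64))  # assumed layout: (x, y, spectral channels)
smoothed = smooth_spectra(cube, window=5)
normalized = normalize_spectra(smoothed)
```

Both helpers act on the last axis only, so the same code should work for 3-d and 4-d arrays as long as the spectral axis comes last.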

Examples

Hyperspectral denoising

import numpy as np
from skhyper.process import Process
from skhyper.decomposition import PCA

# Generating a random 4-d dataset and creating a Process instance
test_data = np.random.rand(200, 200, 10, 1024)
X = Process(test_data, scale=True)

# The object X provides several attributes and methods for exploring the data,
# e.g.
# X.mean_spectrum and X.mean_image (mean spectrum/image of the entire dataset)
# X.spectrum[:, :, :, :] and X.image[:, :, :, :] (spectrum/image in a chosen region)
# X.view()  (opens a hyperspectral viewer with X loaded)
# X.flatten() (a 'flattened' 2-d version of the data)
# For the full list, see the documentation

# To denoise the dataset using PCA:
# fit_transform() fits the PCA model to the data and transforms it in one step
# All the usual scikit-learn parameters are available
mdl = PCA()
mdl.fit_transform(X)

# The scree plot can be accessed by:
mdl.plot_statistics()

# Choosing the number of components to keep, we project back 
# into the original space:
Xd = mdl.inverse_transform(n_components=200)

# Xd is another instance of Process, which contains the new
# denoised hyperspectral data
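
To make the idea behind PCA denoising concrete, here is a minimal sketch that uses only NumPy on a synthetic cube. It is an illustration of the general technique (flatten, decompose, keep the leading components, project back) and is not taken from the package's source. The data layout, the variable names, and the use of an SVD in place of the package's `PCA` class are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic cube: 3 underlying spectra mixed per pixel, plus Gaussian noise
n_x, n_y, n_channels, n_sources = 30, 30, 100, 3
sources = rng.random((n_sources, n_channels))
weights = rng.random((n_x * n_y, n_sources))
clean = weights @ sources
noisy = clean + 0.05 * rng.standard_normal(clean.shape)
cube = noisy.reshape(n_x, n_y, n_channels)

# Flatten: every pixel becomes one row, every spectral channel one column
flat = cube.reshape(-1, cube.shape[-1])

# PCA via an SVD of the mean-centred data
mean = flat.mean(axis=0)
u, s, vt = np.linalg.svd(flat - mean, full_matrices=False)

# Explained variance ratio per component (the quantity shown in a scree plot)
explained = s**2 / np.sum(s**2)

# Keep the k leading components and project back into the original space
k = 3
denoised_flat = (u[:, :k] * s[:k]) @ vt[:k] + mean
denoised = denoised_flat.reshape(cube.shape)
```

Because the noise is spread over all components while the signal is concentrated in the first few, discarding the trailing components removes most of the noise and little of the signal. In practice the number of components to keep is usually chosen by inspecting the scree plot.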

Hyperspectral clustering

import numpy as np
from skhyper.process import Process
from skhyper.cluster import KMeans

# Generating a random 3-d dataset and creating a Process instance
test_data = np.random.rand(200, 200, 1024)
X = Process(test_data, scale=True)

# Again, all the usual scikit-learn parameters are available
mdl = KMeans(n_clusters=4)
mdl.fit(X)

# The outputs are:
# mdl.labels_  (a 2-d/3-d label image containing n_clusters distinct labels)
# mdl.image_components_  (a list of n_clusters image arrays)
# mdl.spec_components_  (a list of n_clusters spectral arrays)
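
For readers who want to see how outputs of this kind can be derived, the sketch below does the equivalent steps with plain scikit-learn and NumPy. It is one plausible interpretation, not the package's implementation: in particular, treating the image components as per-cluster masks and the spectral components as per-cluster mean spectra is an assumption, as are the variable names.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
cube = rng.random((40, 40, 32))  # assumed layout: (x, y, spectral channels)
n_clusters = 4

# Cluster the pixels by their spectra: one row per pixel
flat = cube.reshape(-1, cube.shape[-1])
km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(flat)

# Label image: one cluster index per pixel
label_image = km.labels_.reshape(cube.shape[:-1])

# Per-cluster spatial masks and per-cluster mean spectra
image_components = [(label_image == c).astype(float) for c in range(n_clusters)]
spec_components = [flat[km.labels_ == c].mean(axis=0) for c in range(n_clusters)]
```

Reshaping the flat label vector back to the spatial shape is what turns a generic clustering result into an image that can be displayed alongside the original data.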

Documentation

The docs are hosted here.

License

scikit-hyper is licensed under the OSI approved BSD 3-Clause License.