scikit-hyper

Machine learning for hyperspectral data in Python

Simple tools for exploratory analysis of hyperspectral data
Built on numpy, scipy, matplotlib and scikit-learn
Simple to use, syntax similar to scikit-learn

Installation
Features
Examples
Documentation
License

Installation

This package is under active development and is not yet ready for general release. The first general release will be v0.1.0.

To install using pip:

pip install scikit-hyper

The following packages are required:

numpy
scipy
scikit-learn
matplotlib
seaborn
PyQt5
pyqtgraph
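
To confirm that the required packages are available before using scikit-hyper, a short check like the one below can help. It is only a sketch built on the standard library: the `REQUIRED` mapping and the `missing_packages` helper are hypothetical names introduced here and are not part of scikit-hyper. The one detail worth noting is that scikit-learn is imported as `sklearn`.

```python
from importlib.util import find_spec

# Map each distribution name from the list above to its import name.
# scikit-learn is imported as "sklearn"; the others use the same name.
REQUIRED = {
    "numpy": "numpy",
    "scipy": "scipy",
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "PyQt5": "PyQt5",
    "pyqtgraph": "pyqtgraph",
}


def missing_packages(required=REQUIRED):
    """Return the distribution names whose import name cannot be located."""
    return [dist for dist, module in required.items() if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```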

Features

The following features are currently implemented:

Classification
- Naive Bayes
- K-nearest neighbors
- Support vector machines
Clustering
- KMeans
Decomposition
- Principal component analysis
- Independent component analysis
- Non-negative matrix factorization
Hyperspectral viewer
- A lightweight PyQt GUI for displaying and interacting with hyperspectral data
Tools
- Spectral smoothing
- Spectral normalization
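
To illustrate what spectral smoothing and normalization typically involve, here is a minimal sketch using NumPy and SciPy directly. It does not use scikit-hyper's own tools, whose API is not shown here. The function names `smooth_spectra` and `normalize_spectra`, the choice of a Savitzky-Golay filter, and min-max scaling are all assumptions made for illustration.

```python
import numpy as np
from scipy.signal import savgol_filter


def smooth_spectra(data, window_length=11, polyorder=3):
    """Savitzky-Golay smoothing along the last (spectral) axis."""
    return savgol_filter(data, window_length=window_length,
                         polyorder=polyorder, axis=-1)


def normalize_spectra(data):
    """Min-max scale every spectrum to [0, 1] along the last axis."""
    lo = data.min(axis=-1, keepdims=True)
    hi = data.max(axis=-1, keepdims=True)
    # Guard against division by zero for completely flat spectra
    span = np.where(hi > lo, hi - lo, 1.0)
    return (data - lo) / span


rng = np.random.default_rng(0)
cube = rng.random((20, 20, 256))  # (x, y, spectral points)

smoothed = smooth_spectra(cube)
normalized = normalize_spectra(smoothed)
print(smoothed.shape, normalized.min(), normalized.max())
```

Both helpers operate on the last axis only, so the same code applies to 3-d and 4-d datasets as long as the spectral dimension comes last.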

Examples

Hyperspectral denoising

import numpy as np
from skhyper.process import Process
from skhyper.decomposition import PCA

# Generating a random 4-d dataset and creating a Process instance
test_data = np.random.rand(200, 200, 10, 1024)
X = Process(test_data, scale=True)

# The object X exposes several attributes for exploring the data, e.g.
# X.mean_image and X.mean_spectrum (mean image/spectrum of the entire dataset)
# X.image[:, :, :, :] and X.spectrum[:, :, :, :] (image/spectrum in a chosen region)
# X.view()  (opens a hyperspectral viewer with X loaded)
# X.flatten() (a 'flattened' 2-d version of the data)
# For all features, see the documentation

# To denoise the dataset using PCA, first fit the model to the data
# and transform it in one step with fit_transform().
# All the usual scikit-learn parameters are available
mdl = PCA()
mdl.fit_transform(X)

# The scree plot can be accessed by:
mdl.plot_statistics()

# After choosing the number of components to keep, project back
# into the original space:
Xd = mdl.inverse_transform(n_components=200)

# Xd is another instance of Process, which contains the new
# denoised hyperspectral data
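
For readers who want to see the idea behind PCA denoising without the `Process` wrapper, the sketch below uses scikit-learn's `PCA` directly on a small synthetic cube. It is an illustration under stated assumptions, not a description of how scikit-hyper implements it: the synthetic data (three mixed spectral signatures plus Gaussian noise) and all variable names are made up for this example.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic cube: 3 spectral signatures mixed per pixel, plus Gaussian noise
n_x, n_y, n_spec, n_sources = 20, 20, 64, 3
signatures = rng.random((n_sources, n_spec))
abundances = rng.random((n_x, n_y, n_sources))
clean = abundances @ signatures  # shape (n_x, n_y, n_spec)
noisy = clean + 0.05 * rng.standard_normal(clean.shape)

# Flatten the spatial axes so that each row is one pixel's spectrum
flat = noisy.reshape(-1, n_spec)

# Keep only the leading components, then project back to spectral space
pca = PCA(n_components=n_sources)
scores = pca.fit_transform(flat)
denoised = pca.inverse_transform(scores).reshape(noisy.shape)

# Root-mean-square error against the noise-free cube, before and after
err_noisy = np.sqrt(np.mean((noisy - clean) ** 2))
err_denoised = np.sqrt(np.mean((denoised - clean) ** 2))
print(f"RMSE noisy: {err_noisy:.4f}, RMSE denoised: {err_denoised:.4f}")
```

Discarding the trailing components removes most of the noise because the signal lives in only a few directions, while the noise is spread evenly across all of them.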

Hyperspectral clustering

import numpy as np
from skhyper.process import Process
from skhyper.cluster import KMeans

# Generating a random 3-d dataset and creating a Process instance
test_data = np.random.rand(200, 200, 1024)
X = Process(test_data, scale=True)

# Again, all the usual scikit-learn parameters are available
mdl = KMeans(n_clusters=4)
mdl.fit(X)

# The outputs are:
# mdl.labels_  (a 2-d/3-d image in which each pixel holds one of n_clusters labels)
# mdl.image_components_  (a list of n_clusters image arrays)
# mdl.spec_components_  (a list of n_clusters spectral arrays)
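
The sketch below shows one way outputs like these can be derived with scikit-learn's `KMeans` alone. It is a guess at the general idea rather than scikit-hyper's implementation: treating each image component as a per-cluster mask and each spectral component as a per-cluster mean spectrum is an assumption, and the variable names are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
cube = rng.random((30, 30, 64))  # (x, y, spectral points)
n_clusters = 4

# Flatten the spatial axes so that each row is one pixel's spectrum
flat = cube.reshape(-1, cube.shape[-1])
km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(flat)

# Reshape the per-pixel labels back into an image
label_image = km.labels_.reshape(cube.shape[:-1])

# Assumed interpretation: one binary mask and one mean spectrum per cluster
image_components = [(label_image == k).astype(float) for k in range(n_clusters)]
spec_components = [flat[km.labels_ == k].mean(axis=0) for k in range(n_clusters)]

print(label_image.shape, len(image_components), spec_components[0].shape)
```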

Documentation

The docs are hosted here.

License

scikit-hyper is licensed under the OSI-approved BSD 3-Clause License.

tensorstrings/scikit-hyper