Machine learning for hyperspectral data in Python
- Simple tools for exploratory analysis of hyperspectral data
- Built on numpy, scipy, matplotlib and scikit-learn
- Simple to use, syntax similar to scikit-learn
This package is currently being developed and is not yet ready for general release. The first general release will be v0.1.0.
To install using pip:

pip install scikit-hyper
The following packages are required:
- numpy
- scipy
- scikit-learn
- matplotlib
- seaborn
- PyQt5
- pyqtgraph
The following features have currently been implemented:
- Classification
    - Naive Bayes
    - K-nearest neighbors
    - Support vector machines
- Clustering
    - KMeans
- Decomposition
    - Principal component analysis
    - Independent component analysis
    - Non-negative matrix factorization
- Hyperspectral viewer
    - A lightweight PyQt GUI that displays the hyperspectral data and allows interaction with it
- Tools (see the conceptual sketch after this list)
    - Spectral smoothing
    - Spectral normalization
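As a rough illustration of what the Tools routines do, the sketch below smooths and normalizes a single toy spectrum using scipy and numpy directly (both are listed dependencies). This is a conceptual sketch only, not scikit-hyper's own smoothing/normalization API, and the filter settings are arbitrary.

import numpy as np
from scipy.signal import savgol_filter

# A noisy toy spectrum with 1024 spectral points
spectrum = np.random.rand(1024)

# Spectral smoothing: Savitzky-Golay filter (window length and polynomial
# order are arbitrary choices for this illustration)
smoothed = savgol_filter(spectrum, window_length=11, polyorder=3)

# Spectral normalization: scale the smoothed spectrum to unit area
normalized = smoothed / np.sum(smoothed)

The first example below denoises a random 4-d hyperspectral dataset with PCA: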
import numpy as np
from skhyper.process import Process
from skhyper.decomposition import PCA
# Generating a random 4-d dataset and creating a Process instance
test_data = np.random.rand(200, 200, 10, 1024)
X = Process(test_data, scale=True)
# The object X contains several useful features to explore the data
# e.g.
# X.mean_spectrum and X.mean_image (mean image/spectrum of the entire dataset)
# X.spectrum[:, :, :, :] and X.image[:, :, :, :] (image/spectrum in chosen region)
# X.view() (opens a hyperspectral viewer with X loaded)
# X.flatten() (a 'flattened' 2-d version of the data)
# for all features, see the documentation
# To denoise the dataset using PCA:
# First we fit the PCA model to the data using fit_transform()
# All the usual scikit-learn parameters are available
mdl = PCA()
mdl.fit_transform(X)
# The scree plot can be accessed by:
mdl.plot_statistics()
# Choosing the number of components to keep, we project back
# into the original space:
Xd = mdl.inverse_transform(n_components=200)
# Xd is another instance of Process, which contains the new
# denoised hyperspectral data
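For intuition, the denoising step above is conceptually similar to running scikit-learn's PCA on the flattened data and projecting back with a reduced number of components. The sketch below is only an assumption about what the operation amounts to, not a description of scikit-hyper's internals, and it uses a smaller array to keep memory use modest.

import numpy as np
from sklearn.decomposition import PCA as SklearnPCA

# Smaller random 4-d dataset (spatial x, spatial y, depth, spectral points)
small_data = np.random.rand(40, 40, 4, 1024)

# Flatten the non-spectral dimensions so each row is one spectrum
flattened = small_data.reshape(-1, 1024)

# Keep a reduced number of components and project back into the original space
sk_mdl = SklearnPCA(n_components=50)
scores = sk_mdl.fit_transform(flattened)
denoised = sk_mdl.inverse_transform(scores).reshape(small_data.shape)

The second example clusters a random 3-d dataset with KMeans: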
import numpy as np
from skhyper.process import Process
from skhyper.cluster import KMeans
# Generating a random 3-d dataset and creating a Process instance
test_data = np.random.rand(200, 200, 1024)
X = Process(test_data, scale=True)
# Again, all the usual scikit-learn parameters are available
mdl = KMeans(n_clusters=4)
mdl.fit(X)
# The outputs are:
# mdl.labels_ (a 2d/3d image with n_clusters number of labels)
# mdl.image_components_ (a list of n_clusters number of image arrays)
# mdl.spec_components_ (a list of n_clusters number of spectral arrays)
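Because the 3-d dataset above has two spatial dimensions, mdl.labels_ can be shown as a label image with matplotlib (a listed dependency). This is a usage sketch that continues from the clustering example and assumes labels_ is a 2-d integer array for 3-d input, as described above.

import matplotlib.pyplot as plt

# Display the cluster label assigned to each pixel
plt.imshow(mdl.labels_, cmap='viridis')
plt.colorbar(label='Cluster label')
plt.title('KMeans cluster labels')
plt.show()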
The docs are hosted here.
scikit-hyper is licensed under the OSI approved BSD 3-Clause License.