Vectorization of persistence barcode with applications in pattern classification of porous structures (Computers & Graphics 2020)

This is an implement of the algorithm in the paper titled "Vectorization of persistence barcode with applications in pattern classification of porous structures"

requirements

Operating System

Window (The authors used Windows 7 64-bit)

Matlab

Matlab version: MATLAB R2017b or later.
Include 'drtoolbox' provided in the directory.

Python

Python version: Python 3.6 Visit the website https://www.python.org/downloads/release/python-362/ to install Python 3.6.2 by downloading "Windows x86-64 executable installer" and installing it.

Python Packages

Required Python packages include: ripser, cython, numpy, matplotlib, scikit-tda, tqdm, seaborn, pandas, scikit-learn

Run the script ./install.sh to automatically install the Python packages.

Code and files

Computing Barcodes

python files:

BC_compute.py: To compute barcodes of point clouds.

Computing Feature Vectors for Vectorizing Barcodes

haarBasisM.m

PersBettiSur.m

haarDecomposition2DFunc.m

Dimensionality Reduction

Directory:

drtoolbox

Files:

pca.m

A_Isomap.m

connect_components.m

find_NN_supervised.m

out_of_sample_new.m

Supervised_Isomap.m

Classification

python file:

classifiers.py

main function

main_porous.m
main_dynam2D
main_timeseries

Instructions

Set path

Set the folder to the folder where the code is located.

data1

This data set is about porous classification. The results are shown in Table 2 and Section 6.

Data:

The original training and test data, i.e., 3D point clouds, are given in the format of txt file in the directory "Sample_training" and "Sample_test".

The barcodes of training and test data are given in the directory of "BC_training" and "BC_test".

Run "main_porous.m" in MATLAB, and then input "Y" or "y" to use the dimensionality reduction, or "N" or "n" not to use it.

The key parameters are printed on the screen, such as bound value, and reduced dimension.

Output:

The classification accuracy on four classifiers with the corresponding hyper-parameter, namely, Logistic Regression, KNN, Kernel SVM (SVC), and Linear SVM (LinearSVC).

data3

This data set is about the determination of the parameters of the 2D dynamical system proposed by Adams et al. (2017).

Data:

The original training and test data, i.e., 2D point clouds, are given in the format of txt file in the directory "training_data" and "test_data".

The barcodes of training and test data are given in the directory of "BC_training" and "BC_test".

Run "main_dynam2D.m" in MATLAB, and then input "Y" or "y" to use the dimensionality reduction, or "N" or "n" not to use it.

The key parameters are printed on the screen, such as bound value, and reduced dimension.

Output:

The classification accuracy on four classifiers with the corresponding hyper-parameter, namely, Logistic Regression, KNN, Kernel SVM (SVC), and Linear SVM (LinearSVC).

data4

This data set is about time-series for multi-source signals provided in UCR archive. Four of the data sets in our experiments are offered, including BirdChicken, Earthquakes, ECG200, and FreezerRegularTrain.

There are four sub-directories, namely, BirdChicken, Earthquakes, ECG200, and FreezerRegularTrain.

In each directory, the original training and test data, i.e., high-dimensional point clouds, are given in the format of txt file in the directory "TRAIN" and "TEST".

The barcodes of training and test data are given in the directory of "BC_train" and "BC_test", respectively.

Run "main_timeseries.m"
Input the name of data set, for exaple "BirdChicken".
Input "Y" or "y" to use the dimensionality reduction, or "N" or "n" not to use it.

Output:

The classification accuracy on four classifiers with the corresponding hyper-parameter, namely, Logistic Regression, KNN, Kernel SVM (SVC), and Linear SVM (LinearSVC).

Notes

For fast computing, we set the discretization parameter "sam_num" to be 8, which is 10 in our experiments. The accuracy does not lose much.
Note that the results of data3 and data4 are shown in Figure 2(a) in the paper.
Note that we use Matlab to call the command line to execute the python file for computing the barcodes of point clouds.

If, unfortunately, some errors occur in this process, execute the commands given by the notes in Matlab .m file in the command console.

For example, when running "main_porous.m", if it throws an error when computing barcodes stored in "BC_training, pause and parse the command "python BC_compute.py ./data1/Sample_training ./data1/BC_training" on the command console. Then rerun the "main_porous.m".

For convenience, we provide both original data and barcodes stored in "BC_train" and "BC_test" in each data set.

ZJUCAGD/persistence-rank-function

Vectorization of persistence barcode with applications in pattern classification of porous structures (Computers & Graphics 2020)

requirements

Operating System

Matlab

Python

Python Packages

Code and files

Computing Barcodes

Computing Feature Vectors for Vectorizing Barcodes

Dimensionality Reduction

Classification

main function

Instructions

Set path

data1

data3

data4

Notes