Extra distance Rcpp pointers for the kohonen package.
Enables generation of self-organizing maps (SOM) with additional distance measures provided as the dist.fcts argument to the som() and supersom() functions from the kohonen package. Run get_kernel_info() to list the available distance measures. The distance pointers are called by their name (e.g. dist.fcts = 'dice').
You can install the package with devtools:
devtools::install_github('PiotrTymoszuk/somKernels')
The package is available under a GPL-3 license.
The package maintainer is Piotr Tymoszuk.
somKernels uses tools provided by Rcpp. Many thanks to the developers, maintainers and contributors.
We'll start by loading the packages needed to test the functionality of the distance pointers:
library(kohonen)
library(somKernels)
The available distances include most of the distances covered by the R package philentropy. The list of distances is returned by get_kernel_info():
> get_kernel_info()
[1] "BrayCurtis" "cosine" "dice" "jaccard" "smc" "sumofsquares"
[7] "euclidean" "manhattan" "tanimoto" "canberra" "chebyshev" "soergel"
[13] "kulczynski_d" "lorentzian" "intersection" "wavehedges" "czekanowski" "motyka"
[19] "ruzicka" "inner_product" "harmonic_mean" "hassebrook" "fidelity" "bhattacharyya"
[25] "hellinger" "matusita" "squared_chord" "squared_euclidean" "pearson" "neyman"
[31] "squared_chi" "prob_symm" "divergence" "clark" "additive_symm" "jeffreys"
[37] "k_divergence" "topsoe" "jensen_difference" "taneja"
Using the distance pointers is simple: you just pass the pointer name to the dist.fcts argument of kohonen's som() or supersom() function. We'll check them out with the popular iris data set. The data set will be clustered with SOMs trained with the Canberra and Ruzicka distances:
test_data <-
  iris[c('Sepal.Length', 'Sepal.Width', 'Petal.Length', 'Petal.Width')]

## generating a SOM grid object required by `som()` and `supersom()`
som_grid <- somgrid(xdim = 5,
                    ydim = 4,
                    topo = 'hexagonal',
                    neighbourhood.fct = 'gaussian',
                    toroidal = TRUE)

## fitting SOMs with Canberra and Ruzicka distances
canberra_som <- som(as.matrix(test_data),
                    grid = som_grid,
                    rlen = 1000,
                    dist.fcts = 'canberra')

ruzicka_som <- som(as.matrix(test_data),
                   grid = som_grid,
                   rlen = 1000,
                   dist.fcts = 'ruzicka')
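To get a feeling for what such a distance measures, here is a minimal base R sketch of the textbook Canberra distance between two iris observations, checked against stats::dist(). The helper canberra_dist() is hypothetical and written only for this illustration; whether the 'canberra' pointer in somKernels uses exactly this formula (for example, how it treats zero denominators) is an assumption.

```r
## Textbook Canberra distance between two numeric vectors:
## sum over features of |x - y| / (|x| + |y|).
## `canberra_dist()` is a hypothetical helper, not a somKernels function.
canberra_dist <- function(x, y) {
  num <- abs(x - y)
  den <- abs(x) + abs(y)
  keep <- den > 0  # skip 0/0 terms
  sum(num[keep] / den[keep])
}

iris_mtx <-
  as.matrix(iris[c('Sepal.Length', 'Sepal.Width', 'Petal.Length', 'Petal.Width')])

## distance between observation 1 (setosa) and observation 51 (versicolor)
manual <- canberra_dist(iris_mtx[1, ], iris_mtx[51, ])

## the same distance computed by base R
reference <- as.numeric(dist(iris_mtx[c(1, 51), ], method = 'canberra'))

## TRUE if both values agree within numeric tolerance
all.equal(manual, reference)
```

Because each feature's difference is scaled by the feature's magnitude, small-valued variables such as Petal.Width contribute as much to the Canberra distance as large-valued ones.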
Let's check whether the distances worked by examining plots of the training process, i.e. the mean distances to the SOM nodes at subsequent iterations. A decrease in mean distance with progressing iterations followed by a plateau indicates algorithm convergence:
plot(canberra_som, type = 'changes')
plot(ruzicka_som, type = 'changes')
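The visual check can be complemented with a rough numeric summary. The sketch below compares the mean of a training curve over its last iterations with the mean over its first iterations; a ratio well below 1 suggests that the mean distance decreased during training. The helper change_ratio() and the simulated curve are hypothetical and serve only as an illustration. To apply it to a fitted map, I am assuming that the per-iteration values shown by the plot are stored in the changes element of the kohonen object, e.g. change_ratio(canberra_som$changes).

```r
## Ratio of the mean over the last `frac` of iterations to the mean
## over the first `frac` of iterations of a training curve.
## `change_ratio()` is a hypothetical helper, not part of kohonen or somKernels.
change_ratio <- function(changes, frac = 0.1) {
  changes <- as.numeric(changes)
  n <- length(changes)
  k <- max(1, floor(n * frac))  # number of iterations in each window
  mean(tail(changes, k)) / mean(head(changes, k))
}

## simulated training curve: exponential decay towards a plateau
sim_changes <- 0.01 + 0.05 * exp(-seq_len(1000) / 100)

## values well below 1 indicate a decrease of the mean distance
change_ratio(sim_changes)
```

This summary does not replace the plot: a ratio below 1 shows that the distance dropped, but only the curve itself shows whether it has reached a plateau.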