Self organizing maps in 3d

Question

Self organizing maps in 3d

Closed this issue 6 years ago · 6 comments

Hi, your script of the self organizing maps looks really amazing! I tried extending it to 3D and visualize it with vtkplotter, this is what i get:

which looks meaningful, yet I'm not completely sure if that's what one would expect to see (I might have messed up the indeces..). Does it make sense to you?
If so, may I include your script as an the example usage of the vtkplotter package?

# -----------------------------------------------------------------------------
# Copyright 2019 (C) Nicolas P. Rougier
# Released under a BSD two-clauses license
#
# References: Kohonen, Teuvo. Self-Organization and Associative Memory.
#             Springer, Berlin, 1984.
# https://github.com/rougier/ML-Recipes/blob/master/recipes/ANN/som.py
# -----------------------------------------------------------------------------
import numpy as np
import scipy.spatial

np.random.seed(0)


class SOM:
    """ Self Organizing Map """

    def __init__(self, shape, distance):
        """ Initialize som """

        self.codebook = np.random.uniform(0, 1, shape)
        self.labels = np.random.uniform(0, 1, len(self.codebook))
        self.distance = distance / distance.max()

    def learn(self, samples, n_epoch=10000, sigma=(0.25, 0.01), lrate=(0.5, 0.01)):
        """ Learn samples """

        t = np.linspace(0, 1, n_epoch)
        lrate = lrate[0] * (lrate[1] / lrate[0]) ** t
        sigma = sigma[0] * (sigma[1] / sigma[0]) ** t
        I = np.random.randint(0, len(samples), n_epoch)
        samples = samples[I]

        for i in range(n_epoch):
            # Get random sample
            data = samples[i]

            # Get index of nearest node (minimum distance)
            winner = np.argmin(((self.codebook - data) ** 2).sum(axis=-1))

            # Gaussian centered on winner
            G = np.exp(-self.distance[winner] ** 2 / sigma[i] ** 2)

            # Move nodes towards sample according to Gaussian
            self.codebook -= lrate[i] * G[..., np.newaxis] * (self.codebook - data)


def sample_spherical(npoints, ndim=3):
    vec = np.random.randn(ndim, npoints)
    vec /= np.linalg.norm(vec, axis=0)
    return vec.T


# -----------------------------------------------------------------------------
if __name__ == "__main__":

    from vtkplotter import Points, Line, show

    n = 16
    ls = np.linspace(0, 1, n)
    X, Y, Z = np.meshgrid(ls, ls, ls)
    P = np.c_[X.ravel(), Y.ravel(), Z.ravel()]
    D = scipy.spatial.distance.cdist(P, P)

    som = SOM((len(P), 3), D)

    samples = sample_spherical(2500)
    # samples = P # this looks reasonable

    som.learn(samples, n_epoch=10000, sigma=(0.5, 0.01), lrate=(0.5, 0.01))

    # Draw samples
    actors = [Points(samples, c="lb", alpha=0.2)]

    # Draw network
    x, y, z = [som.codebook[:, i].reshape(n, n, n) for i in range(3)]

    for k in [0, 8, 15]:

        for i in range(n):
            ptjs = []
            for j in range(n):
                ptjs.append((x[i, j, k], y[i, j, k], z[i, j, k]))
            actors.append(Line(ptjs))  # line through a serie of 3d points

        for j in range(n):
            ptjs = []
            for i in range(n):
                ptjs.append((x[i, j, k], y[i, j, k], z[i, j, k]))
            actors.append(Line(ptjs))

    show(actors, axes=8)

Answer 1 · 2019-03-13T05:30:28.000Z

Nice ! But the result is a bit weird I would say. It's like you have 3 disconnected SOMs. You can have a look at https://www.labri.fr/perso/nrougier/coding/article/movies/sphere.ogg that shows what to expect at the end (it was made offline with gnuplot a long time ago).

And of course, you can include the script in vtkplotter (once fixed 😄)

Answer 2 · 2019-03-13T13:57:05.000Z

Thanks! the video looks cool, maybe I know why i'm getting somthing different: i'm assuming a set of interconnected neurons as a 3d lattice and i'm only looping on 3 specific layers... let's see if i can reproduce what you had in the video.. :)

Answer 3 · 2019-03-13T17:41:23.000Z

..that's it!! there was nothing wrong in the above one, just one dimension too much in the distance matrix :)

cheers,
Marco

# -----------------------------------------------------------------------------
# Copyright 2019 (C) Nicolas P. Rougier
# Released under a BSD two-clauses license
#
# References: Kohonen, Teuvo. Self-Organization and Associative Memory.
#             Springer, Berlin, 1984.
# https://github.com/rougier/ML-Recipes/blob/master/recipes/ANN/som.py
# -----------------------------------------------------------------------------
import numpy as np
import scipy.spatial


class SOM:
    """ Self Organizing Map """

    def __init__(self, shape, distance):
        """ Initialize som """

        self.codebook = np.random.uniform(0, 1, shape)
        self.labels = np.random.uniform(0, 1, len(self.codebook))
        self.distance = distance / distance.max()

    def learn(self, samples, n_epoch=10000, sigma=(0.25,0.01), lrate=(0.5,0.01)):
        """ Learn samples """

        t = np.linspace(0, 1, n_epoch)
        lrate = lrate[0] * (lrate[1] / lrate[0]) ** t
        sigma = sigma[0] * (sigma[1] / sigma[0]) ** t
        I = np.random.randint(0, len(samples), n_epoch)
        samples = samples[I]

        for i in range(n_epoch):
            # Get random sample
            data = samples[i]

            # Get index of nearest node (minimum distance)
            winner = np.argmin(((self.codebook - data) ** 2).sum(axis=-1))

            # Gaussian centered on winner
            G = np.exp(-self.distance[winner] ** 2 / sigma[i] ** 2)

            # Move nodes towards sample according to Gaussian
            self.codebook -= lrate[i] * G[..., np.newaxis] * (self.codebook-data)

def sample_spherical(npoints, ndim=3):
    vec = np.random.randn(ndim, npoints)
    vec /= np.linalg.norm(vec, axis=0)
    return vec.T

# -----------------------------------------------------------------------------
if __name__ == "__main__":
    
    from vtkplotter import Points, Line, show, collection

    n = 25
    X, Y = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n))
    P = np.c_[X.ravel(), Y.ravel()]
    D = scipy.spatial.distance.cdist(P, P)

    som = SOM((len(P), 3), D)

    samples = sample_spherical(2500)
    som.learn(samples, 20000, sigma=(0.50, 0.01), lrate=(0.50, 0.01))

    # Draw samples
    Points(samples, c="lb", alpha=0.2)

    # Draw network
    x, y, z = [som.codebook[:, i].reshape(n, n) for i in range(3)]
    for i in range(n):
        ptjs = [(x[i,j], y[i,j], z[i,j]) for j in range(n)]
        Line(ptjs) # create a line through a serie of 3d points

    for j in range(n):
        ptis = [(x[i,j], y[i,j], z[i,j]) for i in range(n)]
        Line(ptis)

    show(collection(), axes=8)

Answer 4 · 2019-03-13T20:46:57.000Z

Much better! Note that for a nicer output, you could use meshes instead of lines.

Answer 5 · 2019-03-26T12:31:29.000Z

..indeed with meshes it looks nicer:
https://github.com/marcomusy/vtkplotter/tree/master/examples/other
:)

Answer 6 · 2019-04-30T20:26:30.000Z

Hi again @rougier , I felt the urge of making a fancy visualization of the maze solver:

https://github.com/marcomusy/vtkplotter/tree/master/examples/other

I hope you don't mind!

I'm still amazed of how well this MDP thing can do the job!!
Best wishes
Marco