MiniSom is a minimalistic, NumPy-based implementation of Self-Organizing Maps (SOM). A SOM is a type of artificial neural network that converts complex, nonlinear statistical relationships between high-dimensional data items into simple geometric relationships on a low-dimensional display.
Just use pip:
pip install minisom
or download MiniSom to a directory of your choice and use the setup script:
python setup.py install
To use MiniSom, organize your data as a NumPy matrix in which each row corresponds to an observation, or as a list of lists like the following:
data = [[ 0.80, 0.55, 0.22, 0.03],
[ 0.82, 0.50, 0.23, 0.03],
[ 0.80, 0.54, 0.22, 0.03],
[ 0.80, 0.53, 0.26, 0.03],
[ 0.79, 0.56, 0.22, 0.03],
[ 0.75, 0.60, 0.25, 0.03],
[ 0.77, 0.59, 0.22, 0.03]]
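As a quick sanity check before training, the list of lists can be converted to a NumPy array to confirm that every row has the same number of features. This is only a sketch: the variable names `X`, `n_samples`, and `n_features` are illustrative, and the data is a shortened copy of the rows above.

```python
import numpy as np

# A shortened copy of the sample data: one row per observation.
data = [[0.80, 0.55, 0.22, 0.03],
        [0.82, 0.50, 0.23, 0.03],
        [0.80, 0.54, 0.22, 0.03]]

# np.asarray produces a 2-D array only if all rows have equal length.
X = np.asarray(data, dtype=float)
n_samples, n_features = X.shape
print(n_samples, n_features)
```

The number of features (4 here) is the value passed as the third argument to MiniSom in the example that follows.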
Then you can run MiniSom just as follows:
from minisom import MiniSom
som = MiniSom(6, 6, 4, sigma=0.3, learning_rate=0.5)  # initialization of 6x6 SOM
print("Training...")
som.train_random(data, 100)  # trains the SOM with 100 iterations
print("...ready!")
MiniSom implements two types of training: random training (method train_random), where the model is trained by picking random samples from your data, and batch training (method train_batch), where the samples are picked in the order in which they are stored.
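The difference between the two modes comes down to which sample index is visited at each iteration. The sketch below is not MiniSom's code; `random_order` and `sequential_order` are hypothetical helpers, and wrapping around to the first sample when the iterations outnumber the samples is an assumption.

```python
import numpy as np

def random_order(n_samples, n_iterations, seed=0):
    """Indices drawn uniformly at random, as in random training."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, n_samples, size=n_iterations)

def sequential_order(n_samples, n_iterations):
    """Indices in storage order, wrapping around (assumed) at the end."""
    return np.arange(n_iterations) % n_samples

rand_idx = random_order(7, 10)
seq_idx = sequential_order(7, 10)
print(rand_idx)
print(seq_idx)  # [0 1 2 3 4 5 6 0 1 2]
```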
The weights of the network are randomly initialized by default. Two additional methods are provided to initialize the weights in a data-driven fashion: random_weights_init and pca_weights_init.
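To give an idea of what data-driven initialization can look like, here is a minimal NumPy sketch that seeds every neuron with a randomly chosen sample. It assumes this is roughly what a sample-based initializer does; `init_from_samples` is a hypothetical helper, not part of MiniSom, and the PCA-based variant is not shown.

```python
import numpy as np

def init_from_samples(data, rows, cols, seed=0):
    """Return a (rows, cols, n_features) weight grid copied from random samples."""
    X = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    # One sample index per neuron; indexing with a 2-D index array
    # yields one feature vector per grid cell.
    idx = rng.integers(0, len(X), size=(rows, cols))
    return X[idx]

data = [[0.80, 0.55, 0.22, 0.03],
        [0.82, 0.50, 0.23, 0.03],
        [0.75, 0.60, 0.25, 0.03]]
weights = init_from_samples(data, 6, 6)
print(weights.shape)  # (6, 6, 4)
```

Starting from real samples places the initial weights inside the region the data occupies, which is the usual motivation for this kind of initialization.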
After training you will be able to:
- Compute the coordinate assigned to an observation x on the map with the method winner(x).
- Compute the average distance map of the weights on the map with the method distance_map().
- Compute the number of times that each neuron has been considered the winner for the observations of a new data set with the method activation_response(data).
- Compute the quantization error with the method quantization_error(data).
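Three of these quantities can be illustrated from scratch with NumPy on a toy weight grid. This is a conceptual sketch, not MiniSom's implementation: it assumes Euclidean distance, takes the quantization error to be the mean distance between each sample and its winning neuron's weights, and uses standalone functions that receive the weights explicitly.

```python
import numpy as np

def winner(weights, x):
    """(row, col) of the neuron whose weight vector is closest to x."""
    dists = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(dists), dists.shape)

def activation_response(weights, data):
    """Count how many samples each neuron wins."""
    counts = np.zeros(weights.shape[:2], dtype=int)
    for x in data:
        counts[winner(weights, x)] += 1
    return counts

def quantization_error(weights, data):
    """Mean distance between each sample and its winner's weights."""
    return np.mean([np.linalg.norm(x - weights[winner(weights, x)])
                    for x in data])

# Toy 2x2 map with 2 features per neuron (values chosen by hand).
weights = np.array([[[0.0, 0.0], [0.0, 1.0]],
                    [[1.0, 0.0], [1.0, 1.0]]])
data = np.array([[0.1, 0.0], [0.9, 1.0], [1.0, 0.9]])

first = winner(weights, data[0])
counts = activation_response(weights, data)
qe = quantization_error(weights, data)
print(first, counts.tolist(), qe)
```

In this toy case the first sample lands on neuron (0, 0) and the other two on neuron (1, 1), each at distance 0.1 from its winner.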
The data can be quantized by assigning a codebook vector (the weight vector of the winning neuron) to each sample in data. This kind of vector quantization is implemented by the method quantization, which can be called as follows:
qnt = som.quantization(data)
In this example, qnt[i] is the quantized version of data[i].
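Conceptually, quantization is a nearest-neighbor lookup in the set of weight vectors. The vectorized NumPy sketch below shows the idea under the assumption of Euclidean distance; `quantize` is a hypothetical standalone function, not the library's method.

```python
import numpy as np

def quantize(weights, data):
    """Replace each sample with the weight vector of its closest neuron."""
    X = np.asarray(data, dtype=float)
    # Flatten the grid into a codebook of shape (n_neurons, n_features).
    codebook = weights.reshape(-1, weights.shape[-1])
    # Pairwise distances, shape (n_samples, n_neurons), via broadcasting.
    d = np.linalg.norm(X[:, None, :] - codebook[None, :, :], axis=-1)
    return codebook[np.argmin(d, axis=1)]

weights = np.array([[[0.0, 0.0], [0.0, 1.0]],
                    [[1.0, 0.0], [1.0, 1.0]]])
data = [[0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
qnt = quantize(weights, data)
print(qnt.tolist())  # [[0.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
```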
A model can be saved using pickle as follows
import pickle
som = MiniSom(7, 7, 4)
# ...train the som here
# saving the SOM in the file som.p
with open('som.p', 'wb') as outfile:
    pickle.dump(som, outfile)
and can be loaded as follows
with open('som.p', 'rb') as infile:
    som = pickle.load(infile)
Note that if a lambda function is used to define the decay factor, MiniSom will not be picklable anymore.
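The reason is that pickle stores functions by their qualified name rather than by their code, and a lambda has no name that can be looked up again. The standard-library sketch below demonstrates this; the decay formula and the names `named_decay`, `lambda_decay`, and `is_picklable` are illustrative assumptions, not MiniSom's API.

```python
import pickle

def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

# Hypothetical decay function, defined with a name at module level.
def named_decay(learning_rate, t, max_iter):
    return learning_rate / (1 + t / (max_iter / 2))

# The same formula as a lambda: pickle cannot look it up by name.
lambda_decay = lambda learning_rate, t, max_iter: (
    learning_rate / (1 + t / (max_iter / 2)))

print(is_picklable(lambda_decay))
print(is_picklable(named_decay))
```

A function defined with def at module level can normally be pickled, so replacing the lambda with a named function is usually enough to keep the model savable.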
The code that produces the following figures is in this notebook: https://github.com/JustGlowing/minisom/blob/master/examples/examples.ipynb
- Iris flower dataset.
For each observation we have a marker placed on the position of the winning neuron on the map. Each type of marker represents a class of the iris data. The average distance map of the weights is used as background.
Each neuron is associated with one of the labels in the dataset with a specific degree.
- Color quantization
- Images clustering
The graph above represents each image with the handwritten digit it contains. The position corresponds to the position of the winning neuron for the image. Here we also have a version of this graph that shows the original images:
- Natural language processing
In this example each poem is associated with a cell in the map. The color represents the author. Check out the notebook in the examples for more details: https://github.com/JustGlowing/minisom/blob/master/examples/PoemsAnalysis.ipynb
Video tutorials made by the GeoEngineerings School show how to use MiniSom to build a fraud detection system.
The following works and courses use MiniSom:
- Katsutoshi Masai, Kai Kunze, Yuta Sugiura, Maki Sugimoto. Mapping Natural Facial Expressions Using Unsupervised Learning and Optical Sensors on Smart Eyewear. Proceedings of the 2018 ACM International Joint Conference and 2018 International Symposium on Pervasive and Ubiquitous Computing and Wearable Computers, 2018 ACM.
- Ili Ko, Desmond Chambers, Enda Barrett. A Lightweight DDoS Attack Mitigation System within the ISP Domain Utilising Self-organizing Map. Proceedings of the Future Technologies, 2018 Springer.
- T. M. Nam et al. Self-organizing map-based approaches in DDoS flooding detection using SDN. 2018 International Conference on Information Networking (ICOIN), 2018.
- Li Yuan. Implementation of Self-Organizing Maps with Python. Master Thesis, University of Rhode Island, 2018.
- Ying Xie, Linh Le, Yiyun Zhou, Vijay V. Raghavan. Deep Learning for Natural Language Processing. Elsevier Handbook of Statistics, 2018.
- Vincent Fortuin, Matthias Hüser, Francesco Locatello, Heiko Strathmann, and Gunnar Rätsch. Deep Self-Organization: Interpretable Discrete Representation Learning on Time Series. 2018.
- Birgitta Dresp-Langley, John Mwangi Wandeto, Henry Okola Nyongesa. Using the quantization error from Self‐Organizing Map (SOM) output for fast detection of critical variations in image time series. ISTE OpenScience, 2018.
- John M. Wandeto, Henry O. Nyongesa, Birgitta Dresp-Langley. Detection of Structural Change in Geographic Regions of Interest by Self Organized Mapping: Las Vegas City and Lake Mead across the Years. 2018.
- Denis Mayr Lima Martins, Gottfried Vossen, Fernando Buarque de Lima Neto. Learning database queries via intelligent semiotic machines. IEEE Latin American Conference on Computational Intelligence (LA-CCI), 2017.
- Udemy online course. Deep Learning A-Z™: Hands-On Artificial Neural Networks
- Fredrik Broch Elgaaen, Nicholas Mowatt Larssen. Data mining i banksektoren - Prediksjonsmodellering og analyse av kunder som sier opp boliglån. University of Oslo, May 2017.
- Óscar Clavería González, Enric Monte Moreno, Salvador Torra Porras. A self-organizing map analysis of survey-based agents' expectations before impending shocks for model selection: The case of the 2008 financial crisis. International Economics Volume 146, Pages 40–58. August 2016.
- Sameen Mansha, Faisal Kamiran, Asim Karim, Aizaz Anwar. A Self-Organizing Map for Identifying Influential Communities in Speech-based Networks. Proceeding CIKM '16 Proceedings of the 25th ACM International on Conference on Information and Knowledge Management Pages 1965-1968. 2016.
- Sameen Mansha, Zaheer Babar, Faisal Kamiran, Asim Karim. Neural Network Based Association Rule Mining from Uncertain Data. Neural Information Processing Volume 9950 of the series Lecture Notes in Computer Science pp 129-136. 2016.
- Makiyama, Vitor Hirota, M. Jordan Raddick, and Rafael DC Santos. Text Mining Applied to SQL Queries: A Case Study for the SDSS SkyServer. 2nd Annual International Symposium on Information Management and Big Data. 2015.
- Remi Domingues. Machine Learning for Unsupervised Fraud Detection. Royal Institute of Technology School of Computer Science and Communication KTH CSC. 2015.
- Ivana Kajić, Guido Schillaci, Saša Bodiroža, Verena V. Hafner. Learning hand-eye coordination for a humanoid robot using SOMs. Proceedings of the 2014 ACM/IEEE international conference on Human-robot interaction Pages 192-193.
MiniSom has been tested under Python 3.6.2.
MiniSom by Giuseppe Vettigli is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/.