FlowSOM algorithm in Python, using self-organizing maps and a minimum spanning tree for visualization and interpretation of cytometry data

FlowSOM

This repository contains a Python implementation of the FlowSOM algorithm for clustering and visualizing mass cytometry data sets.

For more details about the algorithm, see the FlowSOM Algorithm section below.

Installation

Install with pip:

pip install FlowSom

Or download this repository to a directory of your choice and then run:

pip install -r requirements.txt

How to use it

Read Files
In order to use FlowSOM you need your data saved as a .csv file or a .fcs file.
file = r'flowmetry.fcs'
Or
file = 'flowmetry.csv'
Import Package
Then you need to import the package.
If you installed the package via pip, run
from flowsom import flowsom
If you downloaded the repository instead, run
from flowsom import *
Play Around
Then you can run FlowSOM as follows:
from sklearn.cluster import AgglomerativeClustering  # clustering class passed to meta_clustering

fsom = flowsom(file)  # read the data
fsom.som_mapping(50, 50, 31, sigma=2.5,
                 learning_rate=0.1, batch_size=100)  # train the SOM for 100 iterations
fsom.meta_clustering(AgglomerativeClustering, min_n=40,
                     max_n=45,
                     iter_n=3)  # run the meta-clustering, searching cluster counts between 40 and 45

Use the trained output

After training, you can:

  • Get the weights of the SOM via fsom.map_som
  • Get the best number of clusters via fsom.bestk
  • Get the prediction dataframes via fsom.df and fsom.tf_df
  • Visualize the final clustering outcome via fsom.vis

Examples

The demo code can be found here.

The distance map of a SOM trained on sample flow cytometry data:

[Figure: Flow example]

The visualization after meta-clustering, using a Minimal Spanning Tree (MST):

[Figure: MST example]
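
As a rough illustration of what a SOM distance map shows, the sketch below computes, for each node of a rectangular grid, the mean Euclidean distance between its weight vector and those of its direct neighbours, then scales the result to [0, 1]. The function name `distance_map`, the 4-neighbour choice and the normalisation are assumptions made for this example; they are not necessarily what the package or MiniSom does internally.

```python
import math


def distance_map(weights):
    """Mean distance from each node to its 4-connected neighbours, scaled to [0, 1].

    `weights[r][c]` is the weight vector (a list of floats) of the node at
    grid position (r, c). Hypothetical helper, for illustration only.
    """
    rows, cols = len(weights), len(weights[0])
    dmap = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(math.dist(weights[r][c], weights[nr][nc]))
            dmap[r][c] = sum(dists) / len(dists) if dists else 0.0
    # Scale by the largest value so that the map lies in [0, 1].
    peak = max(max(row) for row in dmap)
    if peak > 0:
        dmap = [[value / peak for value in row] for row in dmap]
    return dmap


# Toy 2 x 3 grid: the right-hand column lies far from the other two columns.
toy_weights = [
    [[0.0, 0.0], [0.0, 1.0], [5.0, 1.0]],
    [[0.0, 0.0], [0.0, 1.0], [5.0, 1.0]],
]
dmap = distance_map(toy_weights)
```

High values mark nodes that sit on a boundary between dissimilar regions of the map, while low values mark nodes surrounded by similar neighbours.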

FlowSOM Algorithm

FlowSOM analyzes flow or mass cytometry data using a Self-Organizing Map (SOM). Using two-level clustering and star charts, FlowSOM gives a clear overview of how all markers behave across all cells and helps detect subsets that might otherwise be missed.

The algorithm consists of four steps:

  • reading the data
  • building a Self-Organizing Map
  • building a minimal spanning tree
  • computing a meta-clustering

Self-Organizing Map

A SOM is a type of unsupervised artificial neural network that converts complex, nonlinear statistical relationships between high-dimensional data items into simple geometric relationships on a low-dimensional display.
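
To make the idea concrete, here is a minimal, dependency-free sketch of online SOM training on a rectangular grid. It is not the implementation used by this package (which builds on MiniSom); the function names `train_som` and `best_matching_unit`, the linear decay schedule and the Gaussian neighbourhood are assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the (row, col) of the node whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(data, rows, cols, n_iter=1000, sigma=1.0, learning_rate=0.5, seed=0):
    """Train a small SOM; returns weights[r][c] as a list of floats."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise every node from a randomly chosen sample.
    weights = [[list(rng.choice(data)) for _ in range(cols)] for _ in range(rows)]
    for t in range(n_iter):
        # Linearly shrink the learning rate and the neighbourhood radius.
        decay = 1.0 - t / n_iter
        lr = learning_rate * decay
        sig = max(sigma * decay, 1e-3)
        x = rng.choice(data)
        br, bc = best_matching_unit(weights, x)
        for r in range(rows):
            for c in range(cols):
                # Gaussian neighbourhood measured on the grid, not in data space.
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2 * sig * sig))
                w = weights[r][c]
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


# Toy data: two well-separated 2-D blobs standing in for two cell populations.
data_rng = random.Random(1)
blob_a = [[data_rng.gauss(0.0, 0.3), data_rng.gauss(0.0, 0.3)] for _ in range(50)]
blob_b = [[data_rng.gauss(5.0, 0.3), data_rng.gauss(5.0, 0.3)] for _ in range(50)]
weights = train_som(blob_a + blob_b, rows=3, cols=3)
node_a = best_matching_unit(weights, [0.0, 0.0])
node_b = best_matching_unit(weights, [5.0, 5.0])
```

After training, points from the two blobs map to different grid nodes, which is the property the later steps (minimum spanning tree and meta-clustering) rely on.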

Minimum Spanning Tree

A minimum spanning tree (MST) or minimum weight spanning tree is a subset of the edges of a connected, edge-weighted undirected graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight.
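
The definition above can be sketched with Prim's algorithm. In the FlowSOM setting the vertices would be SOM nodes and the edge weights the distances between their weight vectors; the example below assumes a complete graph with Euclidean edge weights, and the function name `minimum_spanning_tree` is hypothetical rather than part of this package.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete graph over `points`.

    Edge weights are Euclidean distances. Returns a list of
    (from_index, to_index, distance) edges.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known edge linking each vertex to the tree
    best_from = [-1] * n         # tree vertex at the other end of that edge
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Add the vertex outside the tree that is cheapest to connect.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u, best_dist[u]))
        # Relax the connection cost of the remaining vertices through u.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 3.0)]
edges = minimum_spanning_tree(points)
total_weight = sum(w for _, _, w in edges)  # 1 + 1 + 3 = 5
```

A tree over n vertices always has n - 1 edges, so the four points here are joined by three edges.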

Meta-clustering

The meta-clustering technique applied to the SOM is hierarchical consensus meta-clustering, which clusters the weights of the trained SOM into different groups.
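
The general idea of consensus clustering can be sketched as follows: cluster many random subsamples of the node weights, record how often each pair of nodes ends up in the same cluster, and then run a final hierarchical clustering on that co-assignment information. This is a simplified, dependency-free illustration and not the package's implementation; the helper names, the average-linkage base clusterer, the 80% subsampling fraction and the number of resamples are all assumptions.

```python
import math
import random


def agglomerative(dist, k):
    """Average-linkage agglomerative clustering on a full distance matrix."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))  # merge the closest pair
    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def consensus_clustering(points, k, n_resamples=20, fraction=0.8, seed=0):
    """Cluster `points` into k groups by consensus over random subsamples."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]  # times i and j shared a cluster
    sampled = [[0] * n for _ in range(n)]   # times i and j were drawn together
    for _ in range(n_resamples):
        idx = rng.sample(range(n), max(k, int(fraction * n)))
        sub = [[math.dist(points[i], points[j]) for j in idx] for i in idx]
        labels = agglomerative(sub, k)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    # Consensus distance: 1 minus the fraction of runs in which i and j co-clustered.
    cons_dist = [
        [1.0 - (together[i][j] / sampled[i][j] if sampled[i][j] else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    return agglomerative(cons_dist, k)


# Toy "SOM node weights": two tight groups of three nodes each.
node_weights = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]]
meta_labels = consensus_clustering(node_weights, k=2)
```

Because the final step works on co-assignment frequencies rather than raw distances, groupings that only appear in a few subsamples have little influence on the result.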

Acknowledgements

FlowSOM is built on FlowCytometryTools, MiniSom and Consensus Clustering.
