InSilicoFlow

Repository accompanying the paper "Flow cytometric single-cell identification of populations in synthetic bacterial communities".


Overview

This repository contains the code and data used to automatically identify the taxonomy of single bacterial cells based on flow cytometry. If you find the research and/or data useful, please consider citing:

Rubbens, P., Props, R., Boon, N., Waegeman, W. (2017). Flow Cytometric Single-Cell Identification of Populations in Synthetic Bacterial Communities. PLOS ONE 12(1):e0169754.

Dependencies

InSilicoFlow depends on the following packages:

insilico.py

Script used to perform the first part of the analysis explained in the paper. It mainly performs the following steps:

  • Choose two or more bacterial populations of interest.
  • Aggregate data coming from replicates.
  • Aggregate data coming from the populations of interest (creating the so-called in silico community).
  • Use 70% of the community to train a classifier (e.g., LDA or Random Forests).
  • Use the remaining 30% to evaluate the performance of the classifier (expressed as, for example, accuracy or AUC).
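
The steps above can be sketched as follows. This is a minimal, dependency-free illustration and not the code in insilico.py: the measurements are simulated, the helper names (`simulate_replicate`, `fit_centroids`, `predict`) and the population names are hypothetical, and a simple nearest-centroid classifier stands in for LDA or Random Forests.

```python
import random


def simulate_replicate(center, n_cells, rng):
    """Hypothetical stand-in for one replicate: n_cells rows of Gaussian features."""
    return [[rng.gauss(mu, 1.0) for mu in center] for _ in range(n_cells)]


def fit_centroids(X, y):
    """Nearest-centroid 'training': compute the mean feature vector per label."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, X):
    """Assign each cell to the label of the closest centroid."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda label: sq_dist(centroids[label], row)) for row in X]


rng = random.Random(0)

# Step 1: choose the populations of interest (simulated cluster centers, an assumption).
populations = {"species_A": [0.0, 0.0], "species_B": [6.0, 6.0]}

# Steps 2 and 3: pool replicates per population, then pool populations
# into one labeled in silico community.
X, y = [], []
for name, center in populations.items():
    for _ in range(3):  # three simulated replicates per population
        cells = simulate_replicate(center, 200, rng)
        X.extend(cells)
        y.extend([name] * len(cells))

# Step 4: shuffle and use 70% of the community for training.
order = list(range(len(X)))
rng.shuffle(order)
n_train = len(order) * 7 // 10
train_idx, test_idx = order[:n_train], order[n_train:]
centroids = fit_centroids([X[i] for i in train_idx], [y[i] for i in train_idx])

# Step 5: evaluate on the held-out 30%.
predicted = predict(centroids, [X[i] for i in test_idx])
accuracy = sum(p == y[i] for p, i in zip(predicted, test_idx)) / len(test_idx)
print(f"accuracy on held-out 30%: {accuracy:.3f}")
```

In practice the simulated rows would be replaced by the measured flow cytometry channels read from the .csv files, and the stand-in classifier by the model of choice.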

invitro.py

Script used to retrieve the composition of a synthetic bacterial community. It mainly performs the following steps:

  • Create an in silico community representing the synthetic community of interest.
  • Train a classifier on the full in silico community.
  • Evaluate performance by retrieving the community composition for various in silico/in vitro communities.
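
A minimal sketch of these steps is given below. It is not the code in invitro.py: the data are simulated, all names are hypothetical, a nearest-centroid classifier stands in for the actual model, and it assumes that the community composition is estimated as the fraction of cells assigned to each population.

```python
import random
from collections import Counter


def simulate_cells(center, n_cells, rng):
    """Hypothetical stand-in for measured cells: Gaussian features around a center."""
    return [[rng.gauss(mu, 1.0) for mu in center] for _ in range(n_cells)]


def fit_centroids(X, y):
    """Nearest-centroid 'training': compute the mean feature vector per label."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, X):
    """Assign each cell to the label of the closest centroid."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda label: sq_dist(centroids[label], row)) for row in X]


rng = random.Random(1)
centers = {"species_A": [0.0, 0.0], "species_B": [6.0, 6.0]}  # assumed, for illustration

# Create the in silico community and train on all of it (no hold-out here).
X, y = [], []
for name, center in centers.items():
    cells = simulate_cells(center, 500, rng)
    X.extend(cells)
    y.extend([name] * len(cells))
centroids = fit_centroids(X, y)

# Mock community with a known 75% / 25% composition.
true_counts = {"species_A": 300, "species_B": 100}
mixture = []
for name, n_cells in true_counts.items():
    mixture.extend(simulate_cells(centers[name], n_cells, rng))

# Retrieve the composition as the relative abundance of predicted labels.
label_counts = Counter(predict(centroids, mixture))
composition = {name: label_counts[name] / len(mixture) for name in centers}
print(composition)
```

Comparing `composition` with the known mixing ratios gives a simple measure of how well the composition is retrieved.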

Data availability

Our data can be found in .csv format in this repository. Additionally, our data is made available in .fcs format on FlowRepository, using the identifiers FR-FCM-ZZSH (axenic cultures) and FR-FCM-ZZSG. It has been preprocessed following the robust digital gating strategy by Prest et al. (2013).

fcstocsv.py

To start from the raw .fcs files, the data first needs to be converted to .csv format; fcstocsv.py takes care of that. It makes use of fcsparser.
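
For orientation, a conversion of this kind could look like the sketch below. It is not the contents of fcstocsv.py: the function names `csv_path_for` and `fcs_to_csv` are hypothetical, and the sketch assumes that `fcsparser.parse` returns a metadata dictionary together with a pandas DataFrame of events, which is its documented default behaviour.

```python
from pathlib import Path


def csv_path_for(fcs_path):
    """Return the output path: same folder and file name, with a .csv extension."""
    return Path(fcs_path).with_suffix(".csv")


def fcs_to_csv(fcs_path):
    """Read one .fcs file with fcsparser and write its events to a .csv file."""
    # Third-party dependency, imported here so csv_path_for works without it.
    import fcsparser

    # Assumption: parse returns (metadata dict, pandas DataFrame of events).
    meta, data = fcsparser.parse(str(fcs_path))
    out_path = csv_path_for(fcs_path)
    data.to_csv(out_path, index=False)
    return out_path
```

Calling `fcs_to_csv` on each file in a folder of .fcs files (for example via `Path(folder).glob("*.fcs")`) would convert a whole data set.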