CellProfiling-research

Instructions

Preperations

conda create --name tf-env python=3.8.6 tensorflow-gpu=2.2.0 scikit-learn pandas tqdm progressbar2 matplotlib openpyxl

To create an enviroment named tf-env, need to be executed once.

If we want to be able to run a jupyter notebook server we will need to run conda install --name tf-env jupyterlab

Execute a batch job

There are some batch files currently in our work folder

sbatch-downloader for downloading plates
sbatch-run-diagnose-all for running the main project for the big plates folder
sbatch-run-diagnose-fewfor running the main project for the smaller folder of plates named "few"
sbatch-notebook for running a jupyter notebook server

In the sbatch files we can redefine various things like scripts' parameter, email notifications, etc.

In order to execute a job we need to do the following thing sbatch <sbatch-file-name>

Extras:

We can define a dependacy between excuting jobs like sbatch --dependency=afterok:<other_job_id> <sbatch-file-name> a possible usage for that is running with the big plate folder only after it was successfuly executed with the small folder
At our lab we have a few "golden tickets" which will give us priority for nodes. We can use it wisely if we want to ran with less chance to be preempted. sbatch --qos=assafzar <sbatch-file-name>

Scripts

/code/learning/main.py

Usages

Usage 1: without parameters, will run over the default big directory
Usage 2: main.py <working_folder> will run over the given directory

Examples

Usage 1: main.py will run over with the default big plate folder - option will be removed in the future
Usage 2: main.py C:\plates will run over C:\plates

Note: Currently the directory must contain the "csvs" folder from the downloader script

Output

Generates output in the given directory in a folder named "results"

/code/plateDownloader.py

Usages

Usage 1: plateDownloader.py <working_folder> -n <plate_amount>
Usage 2: plateDownloader.py <working_folder> -l <plate_number1> <plate_number2> ...
Usage 3: plateDownloader.py <working_folder> -n <plate_amount> -l <plate_number1> <plate_number2> ...

Examples

Usage 1: plateDownloader.py C:\plates -n 5 will download 5 random plates from the available plates at the ftp server to C:\plates
Usage 2: plateDownloader.py C:\plates -l 26569 26572 25732 will download plates: 26569, 26572 and 25732 to C:\plates
Usage 3: plateDownloader.py C:\plates -n 2 -l 26569 26575 26574 26576 will download 2 random plates from the given plates' number to C:\plates

Notes:

Will download files that not already exists in the given folders
With the "n" parameter, selects n plates randomly from the given list or from the ftp server

Output

Generates three folders in the given directory

tars: contain the raw tar.gz files of the plates
extracted: the relevant files from the compressed files
csvs: the relevant details for each plate in a csv file per plate, contain all the details for the cells from this plate.

/code/extractStatistics.py

Usages

Usage: extractStatistics.py <csv_folder> <output_folder>

Examples

Usage: extractStatistics.py C:\plates\csvs C:\plates\stats will run over the plates' csv files in C:\plates\csvs and create their statistics in C:\plates\stats

Notes:

None

Output

Currently generates:

Excel file with each plates' statistics.
Plot for every feature in each plate - will be replaced in the future

leorrose/CellProfiling-research

CellProfiling-research

Instructions

Preperations

Execute a batch job

Scripts

/code/learning/main.py

Usages

Examples

Output

/code/plateDownloader.py

Usages

Examples

Notes:

Output

/code/extractStatistics.py

Usages

Examples

Notes:

Output