This repo is based on the original CAMII pipeline and is where future development will take place. As of version 1.1, no major functionality has been added, but significant code refactoring has improved speed by more than 10x.
Future plans include incorporation of hyperspectral images for diversity-optimized picking, image-to-taxonomy models, and end-to-end segmentation and classification.
The program runs on Python 3 and is tested with 3.10. The required third-party packages are:
- NumPy, SciPy, pandas, Polars, scikit-learn, scikit-image, scikit-bio, Matplotlib, seaborn
- opencv-python, Pillow
- python-tsp
- PyYAML, tqdm
- rich
All packages can be installed with pip:
pip3 install numpy scipy pandas polars scikit-learn scikit-image scikit-bio matplotlib seaborn opencv-python pillow python-tsp pyyaml tqdm rich
In the current setup, the camera in our robot system takes images of rectangular culture plates under two light conditions: red light from the bottom and white light along the upper edge of the plate. Images are output in .bmp format. Put these image pairs in one directory.
Make sure that:
- File names end with .bmp and the string before the first _ is the plate name or barcode.
- There are exactly two images for each plate, and the image under red light comes before the image under white light when sorted by file name. This is likely already the case, since this is the order in which the robot takes pictures.
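The two naming rules above can be checked before conversion. The following is a minimal sketch, not part of the repo: the function name `pair_bmp_images` and the example file names are hypothetical, and it assumes only what the rules state (barcode before the first `_`, two images per plate, red before white when sorted).

```python
from collections import defaultdict
from pathlib import Path


def pair_bmp_images(file_names):
    """Group .bmp file names by barcode and return {barcode: (red, white)}."""
    groups = defaultdict(list)
    for name in file_names:
        if not name.lower().endswith(".bmp"):
            continue  # ignore anything that is not a .bmp image
        # The barcode is the part of the file name before the first "_".
        barcode = Path(name).name.split("_", 1)[0]
        groups[barcode].append(name)

    pairs = {}
    for barcode, names in groups.items():
        if len(names) != 2:
            raise ValueError(f"{barcode}: expected 2 images, found {len(names)}")
        # Sorted by file name, the red-light image is expected to come first.
        red, white = sorted(names)
        pairs[barcode] = (red, white)
    return pairs


# Hypothetical file names, for illustration only.
example = ["P01_0002.bmp", "P01_0001.bmp", "P02_0001.bmp", "P02_0002.bmp"]
pairs = pair_bmp_images(example)
print(pairs)
```

In practice the list of names could come from something like `sorted(p.name for p in Path(input_dir).iterdir())`.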
To proceed, convert the .bmp images into .png format with
./data_transform.py process_bmp -i <input_dir> -o <output_dir>
Images in the output directory will come in groups of 3:
- <barcode>_gs_red.png, the picture taken with red light in grayscale.
- <barcode>_rgb_red.png, the picture taken with red light.
- <barcode>_rgb_white.png, the picture taken with white light.
We only convert the red light images to grayscale and this is what we use for colony detection.
Prepare a csv file with these columns for each plate:
barcode group num_picks_group num_picks_plate
This is useful for picking a given number of colonies from a group of plates while limiting the number of colonies picked from each individual plate.
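As an illustration of the plate metadata file, here is a minimal sketch that writes the four columns with Python's standard `csv` module. The barcodes, group names, and pick counts are made up, and the comma delimiter is an assumption. The `check_feasible` helper is also hypothetical (it is not part of the pipeline) and assumes that `num_picks_group` is repeated on every row of a group: it flags groups whose per-plate limits cannot add up to the group target.

```python
import csv
import io
from collections import defaultdict

COLUMNS = ["barcode", "group", "num_picks_group", "num_picks_plate"]

# Hypothetical plates: two groups, values chosen only for illustration.
rows = [
    {"barcode": "P01", "group": "A", "num_picks_group": 96, "num_picks_plate": 60},
    {"barcode": "P02", "group": "A", "num_picks_group": 96, "num_picks_plate": 60},
    {"barcode": "P03", "group": "B", "num_picks_group": 48, "num_picks_plate": 48},
]


def check_feasible(rows):
    """Return groups whose per-plate limits cannot add up to the group target."""
    plate_total = defaultdict(int)
    group_target = {}
    for row in rows:
        plate_total[row["group"]] += int(row["num_picks_plate"])
        group_target[row["group"]] = int(row["num_picks_group"])
    return [g for g, target in group_target.items() if plate_total[g] < target]


# An in-memory buffer keeps the sketch self-contained; to write a real file,
# use open("plate_metadata.csv", "w", newline="") instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
print("infeasible groups:", check_feasible(rows))
```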
A .yaml file specifying arguments for the pipeline.
Based on a reference panel of CAMII pictures, we calculate calibration parameters to account for non-uniform illumination and other artifacts. In the current implementation, we simply do this by dividing the average value of each pixel across the reference panel by the average over the entire image. Input images in the subsequent steps are then divided by these calibration parameters, so that pixels that typically have extreme values are brought closer to the mean, i.e., the background is removed.
./calc_calib_params.py -i <input_dir_with_reference_bmp_pairs> -o <output_dir> -c <config_file>
Pre-computed calibration parameters are provided in this repo at ./test_data/parameters/calib_parameter.npz.
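The calibration idea described above can be sketched with NumPy on synthetic data. This is an illustration under stated assumptions, not the code in calc_calib_params.py: the function names, the `eps` guard, and the reading of "average over the entire image" as the mean of the per-pixel average are all assumptions, and the layout of the provided .npz file is not modeled here.

```python
import numpy as np


def calc_calib(reference_stack):
    """Per-pixel mean across reference images, divided by the global mean."""
    pixel_mean = reference_stack.mean(axis=0)  # shape (H, W)
    return pixel_mean / pixel_mean.mean()      # values scatter around 1.0


def apply_calib(image, calib, eps=1e-6):
    """Divide an image by the calibration map, guarding against zeros."""
    return image / np.maximum(calib, eps)


# Synthetic reference panel: 5 images sharing a left-to-right illumination
# gradient plus a little noise (made-up data, for illustration only).
rng = np.random.default_rng(0)
gradient = np.linspace(0.5, 1.5, 8)[None, :] * np.ones((4, 1))
reference_stack = 100.0 * gradient[None] + rng.normal(0, 1, size=(5, 4, 8))

calib = calc_calib(reference_stack)
flat = apply_calib(100.0 * gradient, calib)

# After calibration the illumination gradient should be largely flattened,
# so the corrected image varies far less than the raw one.
print(flat.std() < (100.0 * gradient).std())
```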
The robot might not pick exactly where we want, due to lens distortion and other systematic errors. We can fit a linear model to correct for this. In the current implementation, the corrected picking coordinates (x' and y') are fitted by:
x' = x - (ax2 * x^2 + ax1 * x + axy * y + bx)
y' = y - (ay2 * y^2 + ay1 * y + ayx * x + by)
where x and y on the right of the equal sign are the picking coordinates output by the last step, and ax2, ax1, axy, bx, ay2, ay1, ayx, by are fitted parameters.
Fitting such a model would require actually experimenting with the robot, but a .json file with pre-computed model parameters is provided in ./test_data/parameters/correction_params.json.
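Applying the two correction equations to a single coordinate can be sketched as follows. The function name `correct_coords` is hypothetical, the parameter values are made up, and the dictionary keys simply reuse the symbol names from the equations; the provided .json file may be organized differently.

```python
def correct_coords(x, y, p):
    """Apply the correction model above to one picking coordinate."""
    x_new = x - (p["ax2"] * x**2 + p["ax1"] * x + p["axy"] * y + p["bx"])
    y_new = y - (p["ay2"] * y**2 + p["ay1"] * y + p["ayx"] * x + p["by"])
    return x_new, y_new


# Hypothetical parameter values; the provided .json file may use other keys.
params = {
    "ax2": 0.0, "ax1": 0.01, "axy": 0.0, "bx": 2.0,
    "ay2": 0.0, "ay1": 0.0, "ayx": 0.02, "by": -1.0,
}
print(correct_coords(100.0, 50.0, params))
```

With all parameters set to zero the function returns the input unchanged, which is a quick sanity check when trying new parameter files.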
Microbial colonies are detected by the canonical pipeline of image processing: background subtraction, thresholding, contour detection, and contour filtering.
./detect_colonies.py -i <input_dir_with_png_pairs> -o <output_dir> -b <calibration_parameter_npz> -c <config_file>
When the input path is a .png image, colony detection is performed for this single image, and the following outputs are generated in the output directory:
- <barcode>_annot.json, colony segmentation in COCO format.
- <barcode>_image_contour.jpg, segmentation contours overlaid on the white light image.
- <barcode>_image_gray_contour.jpg, segmentation contours overlaid on the red light image in grayscale.
- <barcode>_metadata.csv, metadata for each contour (i.e., putative colony).
When the input path is a directory, colony detection is performed for all .png images in the directory, and the same list of output files is generated for each image.
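For readers unfamiliar with COCO annotations, the sketch below builds a tiny COCO-style document in memory and extracts the center of each bounding box. The top-level keys and the `[x, y, width, height]` box layout follow the general COCO convention; the values are made up, the helper `bbox_centers` is hypothetical, and which fields the pipeline's `<barcode>_annot.json` actually carries is an assumption.

```python
import json

# A minimal COCO-style document with one image and two annotations.
# The values are made up; real files from the pipeline may carry more fields.
coco_text = json.dumps({
    "images": [{"id": 1, "file_name": "P01_rgb_white.png", "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "colony"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [10, 20, 8, 6], "area": 40.0},
        {"id": 2, "image_id": 1, "category_id": 1, "bbox": [100, 50, 12, 12], "area": 110.0},
    ],
})


def bbox_centers(coco):
    """Return {annotation id: (x_center, y_center)} from COCO [x, y, w, h] boxes."""
    centers = {}
    for ann in coco["annotations"]:
        x, y, w, h = ann["bbox"]
        centers[ann["id"]] = (x + w / 2, y + h / 2)
    return centers


coco = json.loads(coco_text)
centers = bbox_centers(coco)
print(centers)
```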
A subset of all detected colonies will be selected for picking, under the constraints set in the plate metadata. We start by selecting num_picks_group colonies (in the metadata) from each group using the farthest point algorithm. This algorithm randomly chooses num_picks_group colonies and iteratively refines this set until convergence by replacing a colony in the current set with a colony that is farthest away from the current set. During replacement, the number of colonies selected from each plate is tracked so that it does not exceed the num_picks_plate limit (in the metadata) for that plate.
./select_colonies.py init -p <directory_with_png_images> -i <directory_with_segmentations> -o <output_dir> -m <path_to_metadata> -c <path_to_config>
The following outputs are generated in the output directory for each plate:
- <barcode>_annot_init.json, colony segmentation (after initial selection) in COCO format.
- <barcode>_metadata_init.json, metadata for each selected contour (i.e., putative colony).
- <barcode>_gray_contour_init.jpg, segmentation contours (after initial selection) overlaid on the red light image in grayscale.
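To give a feel for farthest point selection under a per-plate limit, here is a minimal sketch of the simpler greedy variant: start from one random colony and repeatedly add the eligible colony that is farthest from everything already selected. This is not the iterative refinement used by select_colonies.py; the function name, its arguments, and the choice of 2D points as the space in which distance is measured are all assumptions made for illustration.

```python
import math
import random


def farthest_point_select(points, plates, num_picks_group, num_picks_plate, seed=0):
    """Greedy max-min selection of colony indices with a per-plate cap.

    points: list of coordinate or feature tuples, one per colony.
    plates: plate barcode of each colony (same order as points).
    num_picks_plate: dict mapping barcode to the cap for that plate (>= 1).
    """
    rng = random.Random(seed)
    selected = [rng.randrange(len(points))]  # random starting colony
    per_plate = {plates[selected[0]]: 1}
    # Distance from every colony to its nearest selected colony.
    nearest = [math.dist(p, points[selected[0]]) for p in points]

    while len(selected) < num_picks_group:
        candidates = [
            i for i in range(len(points))
            if i not in selected
            and per_plate.get(plates[i], 0) < num_picks_plate[plates[i]]
        ]
        if not candidates:
            break  # caps exhausted before reaching the group target
        best = max(candidates, key=lambda i: nearest[i])  # farthest from the set
        selected.append(best)
        per_plate[plates[best]] = per_plate.get(plates[best], 0) + 1
        nearest = [min(d, math.dist(p, points[best])) for d, p in zip(nearest, points)]
    return selected


# Hypothetical group of two plates with three colonies each.
points = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 5), (0.5, 0.5)]
plates = ["P01", "P01", "P01", "P02", "P02", "P02"]
picks = farthest_point_select(points, plates, 4, {"P01": 2, "P02": 2})
print(picks)
```

The per-plate counter plays the role of num_picks_plate: a colony is only eligible while its plate is below its cap, so the group target may go unmet if the caps are too tight.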
In this step you need to manually remove unwanted colonies selected in the first step. I suggest a quick online tool such as makesense.ai or Darwin V7 Lab, since both can import and export COCO annotations.
After manual tweaking, export the segmentation in COCO format into the same output directory and name it <barcode>_annot_init_post.json.
After a few colonies are labeled as bad colonies, the constraints set in the metadata are no longer satisfied. In this step we run a simpler farthest point algorithm to make up for the lost colonies.
./select_colonies.py post -p <directory_with_png_images> -i <input_dir> -m <path_to_metadata> -s init
The following outputs are generated in the output directory for each plate:
- <barcode>_annot_final.json, colony segmentation (after post selection) in COCO format.
- <barcode>_metadata_final.json, metadata for each selected contour (i.e., putative colony).
- <barcode>_gray_contour_final.jpg, segmentation contours (after post selection) overlaid on the red light image in grayscale.
If you want, you can go back to the graphical user interface to exclude more bad colonies. If you do this, store the modified segmentation annotation as <barcode>_annot_final_post.json in the output directory and run the post selection step again, making sure to specify -s final. The output from the last step will be overwritten.
After you are satisfied with the colony selection on each plate, finalize the selection by generating a few visualizations and solving a Travelling Salesman Problem (TSP) to find the pick order that minimizes robot movement.
./select_colonies.py final -p <directory_with_png_images> -i <input_dir_with_results_from_last_step> -o <output_dir> -m <path_to_metadata> -t [heuristic|exact]
The following outputs are generated in the output directory for each plate:
- <barcode>_gray_contour.jpg, segmentation contours overlaid on the red light image in grayscale.
- <barcode>_metadata.json, metadata for each selected contour (i.e., putative colony).
- <barcode>_picking.json, picking coordinates of selected colonies in CSV format; the first column is the x coordinate and the second column is the y coordinate.
- <barcode>_rgb_red_contour.jpg, segmentation contours overlaid on the red light image.
- <barcode>_rgb_white_contour.jpg, segmentation contours overlaid on the white light image.
Note that exact TSP optimization may take impractically long if you have hundreds of colonies on a plate.
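To illustrate what a heuristic pick order looks like, here is a minimal nearest-neighbor sketch in plain Python. It is not the solver used by select_colonies.py (python-tsp is the listed dependency) and it does not guarantee an optimal route; the function names and coordinates are hypothetical. It only shows why a heuristic stays fast: each step looks at the remaining points once, whereas a brute-force exact search would have to consider a number of orderings that grows factorially with the number of colonies.

```python
import math


def nearest_neighbor_order(coords, start=0):
    """Visit the closest unvisited point next; returns a list of indices."""
    if not coords:
        return []
    unvisited = set(range(len(coords))) - {start}
    order = [start]
    while unvisited:
        last = coords[order[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, coords[i]))
        unvisited.remove(nxt)
        order.append(nxt)
    return order


def path_length(coords, order):
    """Total distance travelled when visiting coords in the given order."""
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(order, order[1:]))


# Hypothetical picking coordinates (x, y) on one plate.
coords = [(0, 0), (10, 0), (1, 0), (9, 1), (2, 1)]
order = nearest_neighbor_order(coords)
file_order = list(range(len(coords)))

print("heuristic order:", order, round(path_length(coords, order), 2))
print("file order:     ", file_order, round(path_length(coords, file_order), 2))
```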
./correct_coords.py -i <input_dir_with_picking_json_from_last_step> -p <correction_parameter_json>
Coordinates in all *_picking.json files in the input directory are corrected, and the corrected coordinates are stored as <barcode>_Coordinates.csv, a file name required by the colony picking robot.