Dataset and benchmark for few-shot activity prediction with the Cell Painting dataset.
Clone this repository to your HOME folder.
To install dependencies, simply run:
cd FSL_CP
conda env create -f environment.yaml
conda activate fslcp
Note: It is advised to run mamba instead of conda to save ~20 mins of your life.
All results are available in the result folder. Inside it, the notebook subfolder is where all of the graphs come from.
Click on the hyperlink to download the images, the CSV files, and the weights of the multitask model.
Since the images are fairly big (~300 GB), you can also download them via curl or wget. The URL is quoted so that the shell does not interpret the ? character:
curl "https://irods-web.zdv.uni-mainz.de/irods-rest/rest/fileContents/zdv/home/sonha/fsl_cp_images.zip?ticket=l2P9J6ConqQOLNF" --output fsl_cp_images.zip
wget "https://irods-web.zdv.uni-mainz.de/irods-rest/rest/fileContents/zdv/home/sonha/fsl_cp_images.zip?ticket=l2P9J6ConqQOLNF" -O fsl_cp_images.zip
In addition, we supply a small sample of the dataset. It is useful for those who are curious what the dataset looks like, but cannot be used to run the scripts.
- Create a data folder and place the downloaded output folder (CSV files) into it.
- Place the downloaded weights folder.
- Create an empty logs folder.
- Place the images folder anywhere you like.
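The folder-creation steps above can be sketched in Python as follows. This is only a minimal sketch: the helper name prepare_folders is hypothetical, and it assumes that data and logs sit directly under the repository root. It does not move the downloaded output, weights, or images folders; those still need to be placed by hand.

```python
from pathlib import Path


def prepare_folders(repo_root):
    """Create the data and logs folders under repo_root if they are missing.

    Assumption: both folders live directly under the repository root.
    """
    root = Path(repo_root)
    data_dir = root / "data"
    logs_dir = root / "logs"
    # exist_ok=True makes the call safe to repeat on an existing setup.
    data_dir.mkdir(parents=True, exist_ok=True)
    logs_dir.mkdir(parents=True, exist_ok=True)
    return data_dir, logs_dir
```

For example, prepare_folders(Path.home() / "FSL_CP") would apply this to a clone in your HOME folder.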
The folder hierarchy should look like this.
The code for all models is in the fsl_cp folder. To run a model, simply run:
python fsl_cp/desired_file.py
You might need to change some of the flags to get it to run properly on your system. To retrieve a list of flags, run:
python fsl_cp/desired_file.py -h
Run the script:
python fsl_cp/generate_cnn_embeddings.py -p path/to/image/folder
The base dataset class supports concatenating features from different CSV files. If you generate new embeddings from the data, please save them to a CSV file (like norm_CP_feature_df.csv), and make sure that:
- The first 3 columns are 'INCHIKEY', 'CPD_SMILES', 'SAMPLE_KEY'. The rest of the columns are embeddings.
- The 'SAMPLE_KEY' column is in the same order as in norm_CP_feature_df.csv.
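The two requirements above can be checked before using a new embedding file. The following is a minimal sketch, not part of the repository: the function names are hypothetical, it uses only the standard csv module, and it reads all 'SAMPLE_KEY' values into memory.

```python
import csv

REQUIRED_COLUMNS = ["INCHIKEY", "CPD_SMILES", "SAMPLE_KEY"]


def read_header_and_keys(csv_path):
    """Return (header, SAMPLE_KEY values); keys is None if the column is missing."""
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        if "SAMPLE_KEY" not in header:
            return header, None
        key_index = header.index("SAMPLE_KEY")
        # Skip blank lines so a trailing newline does not break the check.
        keys = [row[key_index] for row in reader if row]
    return header, keys


def check_embedding_csv(new_csv, reference_csv):
    """Return a list of problems found in new_csv (empty if none were found)."""
    problems = []
    header, keys = read_header_and_keys(new_csv)
    _, reference_keys = read_header_and_keys(reference_csv)

    if header[:3] != REQUIRED_COLUMNS:
        problems.append(
            f"first 3 columns are {header[:3]}, expected {REQUIRED_COLUMNS}"
        )
    if len(header) <= 3:
        problems.append("no embedding columns found after the first 3 columns")
    if keys is None or reference_keys is None:
        problems.append("SAMPLE_KEY column is missing from one of the files")
    elif keys != reference_keys:
        problems.append("SAMPLE_KEY order differs from the reference file")
    return problems
```

Calling check_embedding_csv("my_embeddings.csv", "norm_CP_feature_df.csv") with your own paths (my_embeddings.csv is a placeholder name) returns an empty list when both requirements hold.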
There are tutorial notebooks available in the notebook folder.