Inconsistent CSV format required for populate metadata on Plate vs Project
will-moore opened this issue · 1 comments
Looking at the examples added to the README in #28, it seems that the CSV format has different rules for Screen compared with Project.
The Screen example has columns named `Well` and `Plate`, and the column types are `well` and `plate`. Omitting the `# header` line has no effect on the `well` and `plate` column types.
# header well,plate,s,d,l,d
Well,Plate,Drug,Concentration,Cell_Count,Percent_Mitotic
A1,plate01,DMSO,10.1,10,25.4
A2,plate01,DMSO,0.1,1000,2.54
A3,plate01,DMSO,5.5,550,4
B1,plate01,DrugX,12.3,50,44.43
This creates an OMERO.table with additional `Well Name` and `Plate Name` columns, and the original `Well` and `Plate` columns now contain IDs instead of names.
Well | Plate | Drug | Concentration | Cell Count | Percent Mitotic | Well Name | Plate Name |
---|---|---|---|---|---|---|---|
9154 | 3855 | DMSO | 10.1 | 10 | 25.4 | a1 | plate01 |
9155 | 3855 | DMSO | 0.1 | 1000 | 2.54 | a2 | plate01 |
If I use a column named `Plate Name` instead of `Plate` in the CSV, the script fails:
File "/Users/willadmin/Virtual/omero/lib/python2.7/site-packages/omero_metadata/populate.py", line 1241, in post_process
plate = columns_by_name['plate'].values[i] # FIXME
and it also fails if I use `Well Name` instead of `Well`.
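To make the pairing between the `# header` type codes and the column names easier to see, here is a minimal sketch that reads the Screen CSV above and lists each column with its declared type. The function name `read_column_types` is hypothetical and this is not how `populate.py` parses the file; it only models the layout shown in the example.

```python
import csv
import io


def read_column_types(csv_text):
    """Pair each column name with the type declared on the '# header' line.

    Returns a list of (name, type) tuples. The type is None when there is
    no '# header' line; how the script infers types in that case is not
    modelled here.
    """
    lines = csv_text.strip().splitlines()
    declared = None
    if lines and lines[0].startswith("# header"):
        # Everything after '# header' is a comma-separated list of types.
        declared = [t.strip() for t in lines[0][len("# header"):].split(",")]
        lines = lines[1:]
    if not lines:
        raise ValueError("CSV has no row of column names")
    names = next(csv.reader(io.StringIO(lines[0])))
    if declared is None:
        declared = [None] * len(names)
    if len(declared) != len(names):
        raise ValueError(
            "expected %d types, got %d" % (len(names), len(declared))
        )
    return list(zip(names, declared))


SCREEN_CSV = """\
# header well,plate,s,d,l,d
Well,Plate,Drug,Concentration,Cell_Count,Percent_Mitotic
A1,plate01,DMSO,10.1,10,25.4
"""

for name, col_type in read_column_types(SCREEN_CSV):
    print(name, col_type)
```

Running this on the Screen example shows `Well` and `Plate` paired with the special `well` and `plate` types, while the remaining columns carry the single-letter codes from the header line.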
When the target is a Project, we need columns named `Dataset Name` and `Image Name` in the CSV, and these columns are of type String.
# header s,s,d,l,s
Image Name,Dataset Name,Bounding_Box,Channel_Index,Channel_Name
img-01.png,dataset01,0.0469,1,DAPI
img-02.png,dataset01,0.142,2,GFP
img-03.png,dataset01,0.093,3,TRITC
img-04.png,dataset01,0.429,4,Cy5
This creates an OMERO.table with an additional `Image` ID column:
Image Name | Dataset Name | Bounding_Box | Channel_Index | Channel_Name | Image |
---|---|---|---|---|---|
img-01.png | dataset01 | 0.0469 | 1 | DAPI | 36638 |
img-02.png | dataset01 | 0.142 | 2 | GFP | 36639 |
If I rename the first column to `Image`, I get a failure with
ValueError: invalid literal for long() with base 10: 'img-01.png'
I'm just documenting this to clarify the results of my investigations and record the current behaviour. It seems inconsistent for Plate that we use `Well` and `Plate` columns that are really `Well Name` and `Plate Name`. The Project behaviour is closer to what I would expect.
Also, if the `project.csv` has an existing column named `Image`, the script fails, even if that column holds the correct Image ID or some other number or string.
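The column-name rules observed in this issue could be captured in a small pre-flight check along these lines. This is only a sketch of the behaviour reported here, not code from omero-metadata; `OBSERVED_RULES` and `check_columns` are hypothetical names, and the rules cover only the cases I actually tried.

```python
# Column-name rules as observed in this issue, not taken from the
# omero-metadata source. Names below are hypothetical.
OBSERVED_RULES = {
    # Screen: the script looks up 'Well' and 'Plate' by name.
    "Screen": {"required": {"Well", "Plate"}, "rejected": set()},
    # Project: names go in '... Name' columns; an 'Image' column fails.
    "Project": {
        "required": {"Image Name", "Dataset Name"},
        "rejected": {"Image"},
    },
}


def check_columns(target, column_names):
    """Return a list of problems for the given target type and CSV columns."""
    rules = OBSERVED_RULES[target]
    names = set(column_names)
    problems = []
    for missing in sorted(rules["required"] - names):
        problems.append("missing required column %r" % missing)
    for clash in sorted(rules["rejected"] & names):
        problems.append("column %r is reported to make the script fail" % clash)
    return problems


# The Screen example from this issue passes the check.
print(check_columns("Screen", ["Well", "Plate", "Drug", "Concentration"]))
# Renaming 'Image Name' to 'Image' in project.csv is flagged twice.
print(check_columns("Project", ["Image", "Dataset Name", "Bounding_Box"]))
```

A check like this would at least turn the `KeyError` and `ValueError` failures above into readable messages until the two formats are made consistent.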