IDR/idr-metadata

idr0010-doil-dnadamage S-BIAD885

will-moore opened this issue · 29 comments

Imported 1 ome.zarr plate into OMERO: http://localhost:1080/webclient/?show=plate-202

Time taken for the import: 1 hour.

Started conversion of full dataset on pilot-zarr2-dev.

Installed minio client on pilot-zarr2-dev same as at #643 (comment)

I see that currently 62 / 148 plates have been converted so far (in ~22 hours) so will need another day...
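
As a rough check on that estimate, the remaining conversion time can be extrapolated from the progress so far. This is only a minimal sketch, assuming the per-plate conversion rate stays constant; the function name is illustrative and not part of any IDR tooling.

```python
def remaining_hours(done: int, total: int, elapsed_hours: float) -> float:
    """Extrapolate the time left, assuming a constant per-plate rate."""
    if done <= 0:
        raise ValueError("need at least one converted plate to estimate a rate")
    hours_per_plate = elapsed_hours / done
    return (total - done) * hours_per_plate


# Figures quoted above: 62 of 148 plates converted in about 22 hours.
eta = remaining_hours(done=62, total=148, elapsed_hours=22)
print(f"about {eta:.1f} hours remaining")
```

With these figures the estimate comes out at a little over 30 hours, so somewhat more than one further day if plates are of similar size.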

Started to copy some data over. Can copy the rest once done, but this allows me to start import etc...

$ cd /ngff
$ /home/wmoore/mc cp -r idr0010/ uk1s3/idr0010/zarr

https://hms-dbmi.github.io/vizarr/?source=https://uk1s3.embassy.ebi.ac.uk/idr0010/zarr/100-27.ome.zarr

The copy above is showing some errors...

mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/20/0/2/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/J/30/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/21/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/K/8/0/0/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/21/0/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/21/0/0/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/0/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/K/8/0/1/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/K/8/0/1/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/21/0/1/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/0/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/0/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/K/8/0/1/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/21/0/2/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/22/0/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/1/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/K/8/0/2/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/K/8/0/2/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/24/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/1/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/1/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/K/8/0/2/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/24/0/0/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/24/0/2/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/24/0/2/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/2/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/1/0/2/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/14/0/2/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/24/0/2/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/2/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/15/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/6/0/0/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/6/0/0/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/6/0/1/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/135-21.ome.zarr/J/22/0/0/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/14/0/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/16/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/6/0/1/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/135-21.ome.zarr/J/25/0/2/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/136-19.ome.zarr/I/14/0/1/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/17/0/1/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/6/0/1/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/6/0/2/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/19/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/136-19.ome.zarr/J/27/0/0/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/19/0/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/136-19.ome.zarr/J/6/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/6/0/2/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/7/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/136-19.ome.zarr/K/18/0/0/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/22/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/22/0/1/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/7/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/137-11.ome.zarr/B/25/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/137-11.ome.zarr/B/26/0/1/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/23/0/1/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/7/0/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/137-11.ome.zarr/B/30/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/24/0/1/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/7/0/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/7/0/0/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/137-11.ome.zarr/B/5/0/1/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/26/0/2/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/31/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/7/0/0/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/137-11.ome.zarr/B/6/0/2/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/137-11.ome.zarr/C/10/0/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/6/0/0/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/7/0/0/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/7/0/1/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/137-11.ome.zarr/C/13/0/0/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/6/0/2/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/7/0/1/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/137-11.ome.zarr/C/18/0/.zgroup`. Object does not exist

But these objects do exist. e.g:

$ cat /data/ngff/idr0010/137-11.ome.zarr/C/18/0/.zgroup
{
  "zarr_format" : 2
}

However, they are not uploaded so cause 404 and other errors - plate won't display at https://hms-dbmi.github.io/vizarr/?source=https://uk1s3.embassy.ebi.ac.uk/idr0010/zarr/120.ome.zarr

This was fixed by re-running the copy for the affected row, which completed with no errors:

/home/wmoore/mc cp -r idr0010/120.ome.zarr/A/ uk1s3/idr0010/zarr/120.ome.zarr/A/

Repeated for others above.

Weird... at least it worked on the second copy attempt.
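
Since the failures cluster in a handful of plate rows, the rows to re-copy could be pulled out of the `mc` output rather than read by eye. A minimal sketch, assuming the error lines have exactly the format shown above and that plates sit directly under `/data/ngff/idr0010/`; the function name and the regular expression are illustrative only.

```python
import re

# Matches the plate directory and the row letter in a failed path, e.g.
# /data/ngff/idr0010/120.ome.zarr/A/25/.zattrs -> ("120.ome.zarr", "A")
FAILED = re.compile(r"Failed to copy `/data/ngff/idr0010/([^/]+\.ome\.zarr)/([A-Z]+)/")


def rows_to_retry(log_text: str) -> list[tuple[str, str]]:
    """Return the sorted, de-duplicated (plate, row) pairs that had copy errors."""
    return sorted(set(FAILED.findall(log_text)))


log = """\
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/21/0/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/K/8/0/1/.zarray`. Object does not exist
"""

for plate, row in rows_to_retry(log):
    # Same shape as the manual retry command used above.
    print(f"mc cp -r idr0010/{plate}/{row}/ uk1s3/idr0010/zarr/{plate}/{row}/")
```

Note that a failure on a plate-level file (directly under the `.ome.zarr` directory) would not match this pattern and would need handling separately.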

We get one plate that isn't in the released study: 153-29.ome.zarr. See https://idr.openmicroscopy.org/webclient/?show=screen-1351

ls idr0010/
...
153-29.ome.zarr

Repeat the copy:

cd /ngff
/home/wmoore/mc cp -r idr0010/ uk1s3/idr0010/zarr

Last plate to be processed: https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/idr0010/zarr/99.ome.zarr

One more plate than in IDR (the extra one is 153-29.ome.zarr):

$ ./mc ls uk1s3/idr0010/zarr | wc
    149     745    7330

@dominikl tried following #656 with idr0010 plates yesterday and got the same error that I got previously after updating symlinks:

#652 (comment)

NB: Plate import there takes 20 minutes.

Restarted import of all idr0010 plates just now to test again... (idr0125-pilot, http://localhost:1080/webclient/?show=screen-3202)

Try again: NB: Using ZarrReader updated on idr0125-pilot at #643 (comment)

Import a single plate without chunks...

2023-05-18 21:01:48,615 1640361    [l.Client-0] INFO   ormats.importer.cli.LoggingImportMonitor - OBJECTS_RETURNED Step: 5 of 5  Logfile: 50477341
2023-05-18 21:01:53,416 1645162    [l.Client-2] INFO   ormats.importer.cli.LoggingImportMonitor - IMPORT_DONE Imported file: /ngff/idr0010/1-23.ome.zarr/OME/METADATA.ome.xml
Other imported objects:
Fileset:5287122

==> Summary
2705 files uploaded, 1 fileset, 1 plate created, 384 images imported, 0 errors in 0:26:38.197

Symlinks...

$ python idr-utils/scripts/managed_repo_symlinks.py Fileset:5287122 /idr0010/zarr --report

Fileset: 5287122 /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2023-05/18/20-35-15.516/
Render Image 14834193
fs_contents ['1-23.ome.zarr']
Link from /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2023-05/18/20-35-15.516/1-23.ome.zarr to /idr0010/zarr/1-23.ome.zarr

Plate looks good! 👍
http://localhost:1080/webclient/?show=plate-10519

Fileset swap...

$ python idr-utils/scripts/swap_filesets.py Plate:4501 Plate:10519 /tmp/idr0010_1-23_filesetswap.sql --report
UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-3/2023-05/18/20-35-15.516/1-23.ome.zarr/OME' where image in (select id from Image where fileset = 5287122);

$ PGPASSWORD=**** psql -U omero -d idr -h 192.168.10.102 -f /tmp/idr0010_1-23_filesetswap.sql 
UPDATE 384

We are seeing a different Well ordering issue on this Plate.
After the Fileset swap, the Wells are now appearing as follows (the plate has 32 columns and 12 rows).
The Wells are taken in the order A1 -> A32, then B1 -> B32 etc., but instead of filling the plate row by row, the sequence runs down the first column, then up the second column, down the third column etc.
Here's a sample of which Wells are being displayed in the updated Plate:

 A1, A24, A25, B16, B17...
 A2, A23, A26, B15, B18...
 A3, A22, A27, B14, B19...
 A4, A21, A28, B13, B20...
......................................
A11, A14,  B3,  B6, B27...
A12, A13,  B4,  B5, B28...
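
The pattern in the sample above can be reproduced with a few lines of Python. This is only a sketch of the observed symptom (row-major Well names poured into the grid in a serpentine, column-by-column order); it is not taken from ZarrReader or OMERO, and the function name is illustrative.

```python
from string import ascii_uppercase

ROWS, COLS = 12, 32  # plate dimensions as described above


def serpentine_layout(rows: int = ROWS, cols: int = COLS) -> list[list[str]]:
    """Place row-major Well names down column 1, up column 2, down column 3, ..."""
    names = [f"{ascii_uppercase[r]}{c + 1}" for r in range(rows) for c in range(cols)]
    grid = [[""] * cols for _ in range(rows)]
    for i, name in enumerate(names):
        col, offset = divmod(i, rows)
        # Even columns are filled top to bottom, odd columns bottom to top.
        row = offset if col % 2 == 0 else rows - 1 - offset
        grid[row][col] = name
    return grid


grid = serpentine_layout()
for row in (grid[0], grid[1], grid[-2], grid[-1]):
    print(", ".join(row[:5]))
```

The first five entries of the top two and bottom two rows match the sample listed above, which supports the description of the ordering.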

cc @dgault

Checked that https://hms-dbmi.github.io/vizarr/?source=https://uk1s3.embassy.ebi.ac.uk/idr0010/zarr/99.ome.zarr looks good compared with https://idr.openmicroscopy.org/webclient/?show=plate-5936, so the NGFF looks valid, even if we still have issues importing / viewing in OMERO.

So, good to go ahead with zip and upload to BioStudies without waiting on ZarrReader issues above....

On pilot-zarr2-dev...

cd /data/ngff/idr0010
for i in */; do sudo zip -r "${i%/}.zip" "$i"; done

Last time I checked zip creation after running zip command above, it was mostly complete...

[wmoore@pilot-zarr2-dev idr0010]$ ls
100-27.ome.zarr      113-34.ome.zarr.zip  127-44.ome.zarr      141-27.ome.zarr.zip  155-45.ome.zarr      2-60.ome.zarr.zip   40-37.ome.zarr      73-40.ome.zarr
100-27.ome.zarr.zip  114-58.ome.zarr      127-44.ome.zarr.zip  142-29.ome.zarr      155-45.ome.zarr.zip  26-41.ome.zarr      40-37.ome.zarr.zip  74-26.ome.zarr
101-24.ome.zarr      114-58.ome.zarr.zip  128-35.ome.zarr      142-29.ome.zarr.zip  156-42.ome.zarr      26-41.ome.zarr.zip  41-31.ome.zarr      75-33.ome.zarr
101-24.ome.zarr.zip  115-11.ome.zarr      128-35.ome.zarr.zip  143-29.ome.zarr      156-42.ome.zarr.zip  27-37.ome.zarr      41-31.ome.zarr.zip  76-45.ome.zarr
102.ome.zarr         115-11.ome.zarr.zip  129-58.ome.zarr      143-29.ome.zarr.zip  157-46.ome.zarr      27-37.ome.zarr.zip  42-44.ome.zarr      77-20.ome.zarr
102.ome.zarr.zip     116-25.ome.zarr      129-58.ome.zarr.zip  14-35.ome.zarr       157-46.ome.zarr.zip  28-43.ome.zarr      42-44.ome.zarr.zip  78-31.ome.zarr
10-34.ome.zarr       116-25.ome.zarr.zip  130-16.ome.zarr      14-35.ome.zarr.zip   158-10.ome.zarr      28-43.ome.zarr.zip  43-04.ome.zarr      79-39.ome.zarr
10-34.ome.zarr.zip   117-12.ome.zarr      130-16.ome.zarr.zip  144-21.ome.zarr      158-10.ome.zarr.zip  29-30.ome.zarr      4-36.ome.zarr       80-29.ome.zarr
103.ome.zarr         117-12.ome.zarr.zip  131-20.ome.zarr      144-21.ome.zarr.zip  159-13.ome.zarr      29-30.ome.zarr.zip  44-13.ome.zarr      8-10.ome.zarr
103.ome.zarr.zip     118-33.ome.zarr      131-20.ome.zarr.zip  145-20.ome.zarr      159-13.ome.zarr.zip  30-44.ome.zarr      45-31.ome.zarr      81-44.ome.zarr
104-13.ome.zarr      118-33.ome.zarr.zip  132-19.ome.zarr      145-20.ome.zarr.zip  16-45.ome.zarr       30-44.ome.zarr.zip  46-33.ome.zarr      82-14.ome.zarr
104-13.ome.zarr.zip  119-43.ome.zarr      132-19.ome.zarr.zip  146-35.ome.zarr      16-45.ome.zarr.zip   3-11.ome.zarr       47-35.ome.zarr      83-48.ome.zarr
105-12.ome.zarr      119-43.ome.zarr.zip  133-29.ome.zarr      146-35.ome.zarr.zip  17-43.ome.zarr       3-11.ome.zarr.zip   48-29.ome.zarr      84-33.ome.zarr
105-12.ome.zarr.zip  120.ome.zarr         133-29.ome.zarr.zip  147-09.ome.zarr      17-43.ome.zarr.zip   31-48.ome.zarr      49-06.ome.zarr      85.ome.zarr
106-13.ome.zarr      120.ome.zarr.zip     13-3.ome.zarr        147-09.ome.zarr.zip  18-18.ome.zarr       31-48.ome.zarr.zip  5-12.ome.zarr       86-31.ome.zarr
106-13.ome.zarr.zip  121-11.ome.zarr      13-3.ome.zarr.zip    148-48.ome.zarr      18-18.ome.zarr.zip   32-42.ome.zarr      5-12.ome.zarr.zip   87-48.ome.zarr
107.ome.zarr         121-11.ome.zarr.zip  134-34.ome.zarr      148-48.ome.zarr.zip  19-29.ome.zarr       32-42.ome.zarr.zip  60-30.ome.zarr      88-40.ome.zarr
107.ome.zarr.zip     12-23.ome.zarr       134-34.ome.zarr.zip  149-21.ome.zarr      19-29.ome.zarr.zip   33-46.ome.zarr      61-43.ome.zarr      89-16.ome.zarr
108-24.ome.zarr      12-23.ome.zarr.zip   135-21.ome.zarr      149-21.ome.zarr.zip  20-29.ome.zarr       33-46.ome.zarr.zip  6-14.ome.zarr       90-27.ome.zarr
108-24.ome.zarr.zip  122-42.ome.zarr      135-21.ome.zarr.zip  150-15.ome.zarr      20-29.ome.zarr.zip   34-30.ome.zarr      62-01.ome.zarr      91-29.ome.zarr
109-36.ome.zarr      122-42.ome.zarr.zip  136-19.ome.zarr      150-15.ome.zarr.zip  21-58.ome.zarr       34-30.ome.zarr.zip  63-27.ome.zarr      9-12.ome.zarr
109-36.ome.zarr.zip  123-18.ome.zarr      136-19.ome.zarr.zip  15-11.ome.zarr       21-58.ome.zarr.zip   35-18.ome.zarr      64-41.ome.zarr      92-44.ome.zarr
110-35.ome.zarr      123-18.ome.zarr.zip  137-11.ome.zarr      15-11.ome.zarr.zip   22-21.ome.zarr       35-18.ome.zarr.zip  65-12.ome.zarr      93-40.ome.zarr
110-35.ome.zarr.zip  1-23.ome.zarr        137-11.ome.zarr.zip  151-48.ome.zarr      22-21.ome.zarr.zip   36-28.ome.zarr      66-37.ome.zarr      94-05.ome.zarr
11-08.ome.zarr       1-23.ome.zarr.zip    138-27.ome.zarr      151-48.ome.zarr.zip  23-15.ome.zarr       36-28.ome.zarr.zip  67-44.ome.zarr      95-48.ome.zarr
11-08.ome.zarr.zip   124-44.ome.zarr      138-27.ome.zarr.zip  152-14.ome.zarr      23-15.ome.zarr.zip   37-34.ome.zarr      68-30.ome.zarr      96-14.ome.zarr
111-07.ome.zarr      124-44.ome.zarr.zip  139-29.ome.zarr      152-14.ome.zarr.zip  24-27.ome.zarr       37-34.ome.zarr.zip  69-49.ome.zarr      97.ome.zarr
111-07.ome.zarr.zip  125-27.ome.zarr      139-29.ome.zarr.zip  153-29.ome.zarr      24-27.ome.zarr.zip   38-21.ome.zarr      70-12.ome.zarr      98-23.ome.zarr
112-02.ome.zarr      125-27.ome.zarr.zip  140-35.ome.zarr      153-29.ome.zarr.zip  25-28.ome.zarr       38-21.ome.zarr.zip  71-19.ome.zarr      99.ome.zarr
112-02.ome.zarr.zip  126-12.ome.zarr      140-35.ome.zarr.zip  154-43.ome.zarr      25-28.ome.zarr.zip   39-44.ome.zarr      7-14.ome.zarr
113-34.ome.zarr      126-12.ome.zarr.zip  141-27.ome.zarr      154-43.ome.zarr.zip  2-60.ome.zarr        39-44.ome.zarr.zip  72-33.ome.zarr
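
To see which plates still lack a zip without scanning the listing by eye, something like the following could be used. A minimal sketch, assuming the names come from a directory listing such as the one above (for example via `os.listdir`); the function name is illustrative.

```python
def missing_zips(names: list[str]) -> list[str]:
    """Return the .ome.zarr entries that have no matching .ome.zarr.zip entry."""
    present = set(names)
    return sorted(
        n for n in present
        if n.endswith(".ome.zarr") and f"{n}.zip" not in present
    )


# A few entries copied from the listing above.
listing = (
    "40-37.ome.zarr 40-37.ome.zarr.zip 43-04.ome.zarr "
    "4-36.ome.zarr 5-12.ome.zarr 5-12.ome.zarr.zip"
).split()
print(missing_zips(listing))
```

This only checks that a zip file exists, not that it is complete, so a zip that was still being written would be counted as present.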

But now it seems all of the idr0010 data has been moved or deleted?!

ssh pilot-zarr2-dev

[wmoore@pilot-zarr2-dev ~]$ cd /data/ngff/idr0010
-bash: cd: /data/ngff/idr0010: No such file or directory

$ ls /data/ngff/
idr0013  memo

Sorry, accidentally deleted the zarrs on pilot-zarr2-dev... thought they had already been submitted to BioStudies. I'll start the conversion again.

Running

for i in `ls /data/idr-metadata/idr0010-doil-dnadamage/screenA/plates`; do echo $i; ~/bioformats2raw-0.7.0-SNAPSHOT/bin/bioformats2raw --memo-directory ../memo /data/idr-metadata/idr0010-doil-dnadamage/screenA/plates/$i ${i%.*}.ome.zarr; done

now (in /data/ngff/idr0010)

dgault commented

The ZarrReader PR ome/ZarrReader#53 has been updated to try and improve the reordering behaviour to hopefully solve the issue seen in #641 (comment)

Just checking the sizes of zarr.zip on BioStudies, it looks like Plate 5-12.ome.zarr.zip is 465629 bytes, about 10x smaller than other plates.
Checking on IDR, that plate fails to render (so I guess we don't try to fix that now)?

Checking counts of plates on https://www.ebi.ac.uk/biostudies/submissions/files?path=%2Fuser%2Fidr0010 I see that there are 149 zips there but only 148 at https://idr.openmicroscopy.org/webclient/?show=screen-1351

Using this JS Code on the biostudies page to find the difference ...

// Run in the browser console on the BioStudies files page.
let url = "https://idr.openmicroscopy.org/webclient/api/plates/?id=1351";
let idr_plates = await fetch(url).then(rsp => rsp.json());
let idr_names = idr_plates.plates.map(p => p.name);
// Collect plate names from the file table, stripping the zip extension.
let names = [];
document.querySelectorAll("div [role='row'] .ag-cell[col-id='name']").forEach(div => {
  names.push(div.innerHTML.trim().replace(".ome.zarr.zip", ""));
});
// Log names that are on BioStudies but not in the IDR screen.
names.forEach(n => { if (!idr_names.includes(n)) { console.log(n); } });

Returns:

153-29

This Plate doesn't appear in IDR - maybe it got removed from the submission for some reason?
I assume we can simply delete this from the biostudies submission page.

Specifically on #641 (comment). A few comments:

  • the plate loaded on IDR is stuck in an Import in progress state

  • trying to reimport the 5-12 plate in a testing environment fails during the second phase of the server-side import with

    2023-08-21 10:32:51,177 19008      [2-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - IMPORT_STARTED Logfile: 46134605
    2023-08-21 10:33:48,535 76366      [l.Client-0] INFO   ormats.importer.cli.LoggingImportMonitor - METADATA_IMPORTED Step: 1 of 5  Logfile: 46134605
    2023-08-21 10:33:55,721 83552      [l.Client-1] ERROR     ome.formats.importer.cli.ErrorHandler - INTERNAL_EXCEPTION: /uod/idr/metadata/idr0010-doil-dnadamage/screenA/plates/5-12.pattern
    java.lang.RuntimeException: Failure response on import!
    Category: ::omero::grid::ImportRequest
    Name: import-file-exception
    Parameters: {filename=demo_2/Blitz-0-Ice.ThreadPool.Server-26273/2023-08/21/10-32-40.039/metadata/idr0010-doil-dnadamage/screenA/plates/5-12.pattern, stacktrace=loci.formats.FormatException: Invalid tile size: x=0, y=0, w=696, h=520
      at loci.formats.FormatTools.checkTileSize(FormatTools.java:1025)
      at loci.formats.FormatTools.checkPlaneParameters(FormatTools.java:1001)
      at loci.formats.in.MinimalTiffReader.openBytes(MinimalTiffReader.java:289)
      at loci.formats.in.MetamorphReader.openBytes(MetamorphReader.java:286)
      at loci.formats.ImageReader.openBytes(ImageReader.java:467)
      at loci.formats.ReaderWrapper.openBytes(ReaderWrapper.java:348)
      at loci.formats.DimensionSwapper.openBytes(DimensionSwapper.java:249)
      at loci.formats.FileStitcher.openBytes(FileStitcher.java:493)
      at loci.formats.in.FilePatternReader.openBytes(FilePatternReader.java:144)
      at loci.formats.ImageReader.openBytes(ImageReader.java:467)
      at loci.formats.ChannelFiller.openBytes(ChannelFiller.java:156)
      at loci.formats.ChannelSeparator.openBytes(ChannelSeparator.java:229)
      at loci.formats.ReaderWrapper.openBytes(ReaderWrapper.java:348)
      at loci.formats.ReaderWrapper.openBytes(ReaderWrapper.java:348)
      at loci.formats.MinMaxCalculator.openBytes(MinMaxCalculator.java:269)
      at ome.services.blitz.repo.ManagedImportRequestI.parseDataByPlane(ManagedImportRequestI.java:872)
    ...
    
  • looking at the underlying binary files, there is indeed a dimension mismatch between the files for each channel:

    [sbesson@test114-omeroreadwrite 5-12]$ tiffinfo 0005-12\ 53BP1.stk  | grep Width
    TIFFReadDirectory: Warning, Unknown field with tag 317 (0x13d) encountered.
    TIFFReadDirectory: Warning, Unknown field with tag 33628 (0x835c) encountered.
    TIFFReadDirectory: Warning, Unknown field with tag 33629 (0x835d) encountered.
    TIFFReadDirectory: Warning, Unknown field with tag 33630 (0x835e) encountered.
    TIFFReadDirectory: Warning, Unknown field with tag 33631 (0x835f) encountered.
    TIFFFetchNormalTag: Warning, ASCII value for tag "ImageDescription" contains null byte in value; value incorrectly truncated during reading due to implementation limitations.
      Image Width: 695 Image Length: 520
    [sbesson@test114-omeroreadwrite 5-12]$ tiffinfo 0005-12\ Dapi.stk | grep Width
    TIFFReadDirectory: Warning, Unknown field with tag 317 (0x13d) encountered.
    TIFFReadDirectory: Warning, Unknown field with tag 33628 (0x835c) encountered.
    TIFFReadDirectory: Warning, Unknown field with tag 33629 (0x835d) encountered.
    TIFFReadDirectory: Warning, Unknown field with tag 33630 (0x835e) encountered.
    TIFFReadDirectory: Warning, Unknown field with tag 33631 (0x835f) encountered.
    TIFFFetchNormalTag: Warning, ASCII value for tag "ImageDescription" contains null byte in value; value incorrectly truncated during reading due to implementation limitations.
      Image Width: 696 Image Length: 520
    
  • note that all other plates in this study have fields of view of 696 x 520, so the problematic file is 0005-12 53BP1.stk
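
The comparison above could be generalised to flag any file whose dimensions differ from the rest. A minimal sketch, assuming the (width, height) pairs have already been read from the files (here they are hard-coded from the `tiffinfo` output above); the function name and the third entry are hypothetical, standing in for any other file in the study.

```python
from collections import Counter


def odd_sized_files(sizes: dict[str, tuple[int, int]]) -> dict[str, tuple[int, int]]:
    """Return the files whose (width, height) differs from the most common size."""
    if not sizes:
        return {}
    expected, _ = Counter(sizes.values()).most_common(1)[0]
    return {name: wh for name, wh in sizes.items() if wh != expected}


sizes = {
    "0005-12 53BP1.stk": (695, 520),
    "0005-12 Dapi.stk": (696, 520),
    # Hypothetical placeholder for a file from another plate (696 x 520).
    "other-plate.stk": (696, 520),
}
print(odd_sized_files(sizes))
```

Using the most common size as the reference only makes sense when most files are correct, which is what the last bullet above reports for this study.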

It might be worth looking into the history of this submission to see if a workaround and/or an alternative binary file could be found with the correct dimensions in which case a reimport might be in scope.

Otherwise, a cleanup solution would be to de-annotate and delete this broken plate from production IDR in an upcoming release. /cc @francesw @will-moore

I looked in /uod/idr/filesets/idr0010-doil-dnadamage/20150501-original/Restored\ GW\ screen/5-12/ and compared the tiffinfo of 0005-12\ 53BP1.stk with that of the corresponding file in screen 3-11, but found that the latter probably can't be used as a drop-in replacement. The biggest difference is in the TIFF tags (I don't actually know what these do)!

I don't have access to any of the historical discussion of this study. Was this on Trello (before Redmine)?

I think it is probably best to delete plate 5-12 from IDR (and remove its annotations from the table: https://idr.openmicroscopy.org/webclient/omero_table/14209182/?query=Plate-5894)

Updating ZarrReader on idr0125-pilot to test plate layout (currently looks as on #641 (comment))

sudo -u omero-server -s
cd
wget https://merge-ci.openmicroscopy.org/jenkins/job/BIOFORMATS-build/lastBuild/default/artifact/bio-formats-build/ZarrReader/target/OMEZarrReader-0.3.2-SNAPSHOT-jar-with-dependencies.jar
rm OMEZarrReader.jar
mv OMEZarrReader-0.3.2-SNAPSHOT-jar-with-dependencies.jar OMEZarrReader.jar

rm /opt/omero/server/OMERO.server/lib/client/OMEZarrReader.jar
cp OMEZarrReader.jar /opt/omero/server/OMERO.server/lib/client/
rm /opt/omero/server/OMERO.server/lib/server/OMEZarrReader.jar
cp OMEZarrReader.jar /opt/omero/server/OMERO.server/lib/server/

sudo service omero-server restart

This has no effect on the Images that are shown for each Well in the plate above.
Original Plate can be seen at https://hms-dbmi.github.io/vizarr/?source=https://uk1s3.embassy.ebi.ac.uk/idr0010/zarr/1-23.ome.zarr

Testing delete of Plate 5-12 and cleanup of annotations on idr0138-pilot, similar to idr0004 #637 (comment)

This first command took a very long time. Left running overnight!

omero metadata populate --context deletemap --report --wait 300 --batch 100 --localcfg '{"ns":["openmicroscopy.org/mapr/organism", "openmicroscopy.org/mapr/antibody", "openmicroscopy.org/mapr/gene", "openmicroscopy.org/mapr/cell_line", "openmicroscopy.org/mapr/phenotype", "openmicroscopy.org/mapr/sirna", "openmicroscopy.org/mapr/compound", "openmicroscopy.org/mapr/protein"], "typesToIgnore":["Annotation"]}' --cfg idr0010-screenA-bulkmap-config.yml Screen:1351

omero metadata populate --context deletemap --report --wait 300 --batch 100 --cfg idr0010-screenA-bulkmap-config.yml Screen:1351

python /uod/idr/metadata/idr-utils/scripts/annotate/clean_orphaned_maps.py
INFO:omero.util.Resources:Starting
INFO:omero.util.Resources:Starting
INFO:omero.util.Resources:Halted
INFO:root:Found 0 orphaned Organism maps
INFO:root:Found 0 orphaned Antibody maps
INFO:root:Found 634 orphaned Gene maps
INFO:root:Deleting 500 maps
INFO:root:Deleting 134 maps
INFO:root:Found 0 orphaned Cell Line maps
INFO:root:Found 2 orphaned Phenotype maps
INFO:root:Deleting 2 maps
INFO:root:Found 2 orphaned siRNA maps
INFO:root:Deleting 2 maps
INFO:root:Found 0 orphaned Compound maps
INFO:root:Found 0 orphaned Protein maps
INFO:root:Found 0 orphaned Notebook maps
INFO:root:Found 0 orphaned Study Info maps
INFO:root:Found 0 orphaned Study Components maps
INFO:omero.util.Resources:Halted

omero metadata deletebulkanns Screen:1351

Then we delete Plate...

$ omero delete Plate:5894 --report
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
omero.cmd.Delete2 Plate:5894 ok
Steps: 6
Elapsed time: 4.917 secs.
Flags: []
Deleted objects
  Detector:11795
  DetectorSettings:12295
  ImagingEnvironment:1368964-1369347
  Instrument:12245
  Laser:1545
  Objective:12045
  ObjectiveSettings:11795
  CommentAnnotation:2581005
  FilesetAnnotationLink:22345
  Channel:10014077-10014844
  Image:3086564-3086947
  LogicalChannel:42039,42040
  OriginalFile:14186393-14186396
  Pixels:3086564-3086947
  PlaneInfo:8256047-8256430
  Thumbnail:2895206-2895589
  Fileset:22545
  FilesetEntry:13995483-13995485
  FilesetJobLink:105221-105225
  IndexingJob:112189
  JobOriginalFileLink:35059
  MetadataImportJob:112186
  PixelDataJob:112187
  ThumbnailGenerationJob:112188
  UploadJob:112185
  Plate:5894
  ScreenPlateLink:6094
  Well:1291863-1292246
  WellSample:2890313-2890696

TODO (still testing on idr0138-pilot):

  • delete all rows from 5-12 from Table (annotation.csv)
  • re-annotate all the other plates

I'm wondering if this is really the best course of action, since we lose a lot of study results by deleting them from the OMERO.table. Keeping them would leave open the possibility of fixing the images in future (if a user wants to work with the data).

Decision in IDR meeting on Monday was not to extend the NGFF work to include unrelated cleanup work.
We will simply leave Plate 5-12 as it is.

There is still a ZarrReader issue outstanding for idr0010, but the data is ready to be submitted to BioStudies...

Deleted 5-12.ome.zarr.zip (invalid) and 153-29.ome.zarr.zip since this Plate isn't published in IDR.
Updated the idr0015_files.tsv accordingly.

Testing mkngff workflow for ALL 147 Plates on idr-testing:omeroreadwrite. idr0010.csv at IDR/idr-utils@631808b

(venv3) bash-4.2$ for r in $(cat $IDRID.csv); do
>   biapath=$(echo $r | cut -d',' -f2)
>   uuid=$(echo $biapath | cut -d'/' -f2)
>   fsid=$(echo $r | cut -d',' -f3)
>   omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"
> done
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2016-10/14 // 04-18-15.445 for fileset 22563
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-10/14/04-18-15.445
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-10/14/04-18-15.445_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-10/14/04-18-15.445_mkngff/0046b0d0-f20b-4482-84b1-4b2b154865fd.zarr -> /bia-integrator-data/S-BIAD885/0046b0d0-f20b-4482-84b1-4b2b154865fd/0046b0d0-f20b-4482-84b1-4b2b154865fd.zarr
...

Got 142 (out of 147) SQL scripts generated so far, then cancelled the run because I need to restart the server; the SECRET will be invalid in all these scripts.

In nearly 22 hours, 142 filesets were processed: roughly 6.5 filesets an hour, or about 9 minutes each.

On idr-testing, as omero-server user, exporting last bunch with

idr0010/67-44.ome.zarr,S-BIAD885/f5170a3f-aec7-4229-ab84-d19f592588cd,22554
idr0010/29-30.ome.zarr,S-BIAD885/f54fa4a2-851f-4a62-87c3-e401f0edfb4f,22523
idr0010/150-15.ome.zarr,S-BIAD885/f5511c21-e4d0-41a9-a419-396c8daa180c,20855
idr0010/43-04.ome.zarr,S-BIAD885/f67c947b-e88d-4124-905e-60cd570868d9,22539
idr0010/66-37.ome.zarr,S-BIAD885/f98d6f98-7434-41a7-a104-209536635967,22553
idr0010/77-20.ome.zarr,S-BIAD885/fc4f84a3-87f2-42b9-84c7-dba54604c57c,22564
(venv3) bash-4.2$ for r in $(cat $IDRID.csv); do
>   biapath=$(echo $r | cut -d',' -f2)
>   uuid=$(echo $biapath | cut -d'/' -f2)
>   fsid=$(echo $r | cut -d',' -f3)
>   omero mkngff sql $fsid "/bia-integrator-data/$biapath/$uuid.zarr" >> "$IDRID/$fsid.sql"
> done
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-10/14/02-08-12.416 for fileset: 22554
...
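
For reference, the way each CSV row is split in the shell loops above can be sketched in Python. This assumes the three-column layout shown in the rows above (plate path, BioStudies path, Fileset ID); the class and function names are illustrative and not part of `omero mkngff`.

```python
import csv
import io
from typing import NamedTuple


class NgffRow(NamedTuple):
    plate: str       # e.g. idr0010/67-44.ome.zarr
    zarr_path: str   # path passed to `omero mkngff sql`
    fileset_id: int


def parse_rows(csv_text: str) -> list[NgffRow]:
    """Split each row the same way as the `cut` calls in the shell loop."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text)):
        if not record:
            continue  # skip blank lines
        plate, biapath, fsid = record
        uuid = biapath.split("/")[1]  # same as `cut -d'/' -f2`
        rows.append(NgffRow(plate, f"/bia-integrator-data/{biapath}/{uuid}.zarr", int(fsid)))
    return rows


text = """\
idr0010/67-44.ome.zarr,S-BIAD885/f5170a3f-aec7-4229-ab84-d19f592588cd,22554
idr0010/29-30.ome.zarr,S-BIAD885/f54fa4a2-851f-4a62-87c3-e401f0edfb4f,22523
"""
for row in parse_rows(text):
    print(row.fileset_id, row.zarr_path)
```

Parsing the rows up front like this would also make it straightforward to skip Filesets whose SQL script has already been generated before a restart.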