idr0026-weigelin-immunotherapy S-BIAD860 #648

Open
will-moore opened this issue Feb 22, 2023 · 37 comments

Comments

@will-moore
Member

idr0026-weigelin-immunotherapy

@will-moore will-moore moved this to test convert in NGFF conversion Feb 22, 2023
@dominikl dominikl moved this from test convert to re-import test image in NGFF conversion Feb 27, 2023
@dominikl
Member

dominikl commented Mar 6, 2023

Conversion time: 9min
Import time: 72h

@dominikl dominikl moved this from re-import test image to convert all data to NGFF in NGFF conversion Mar 6, 2023
@will-moore
Member Author

Trying to estimate how much space is needed for this conversion.

First image is uint16 (2 bytes), 507 x 507 x 21 x 71 x 4 = approx 3 GB.

Image sizes vary across the study, but about 111 .pattern images (see IDR/idr-utils#56) need converting.

Maybe 300 GB or more needed (maybe up to 500 GB)?
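
The estimate above can be reproduced with a few lines of arithmetic. This is a minimal sketch: the dimensions and the count of 111 images come from the comment, while treating every image as the same size as the first is an assumption (the comment notes that sizes vary), so the total is only a rough guide to the uncompressed pixel data.

```python
# Rough storage estimate for the uncompressed pixel data.
# Assumption: all 111 images match the first image's dimensions,
# which the comment above says is not strictly true.

BYTES_PER_PIXEL = 2  # uint16
SIZE_X, SIZE_Y, SIZE_Z, SIZE_T, SIZE_C = 507, 507, 21, 71, 4
IMAGE_COUNT = 111


def image_bytes() -> int:
    """Uncompressed size of one image in bytes."""
    return SIZE_X * SIZE_Y * SIZE_Z * SIZE_T * SIZE_C * BYTES_PER_PIXEL


def study_bytes() -> int:
    """Uncompressed size of the whole study under the same-size assumption."""
    return image_bytes() * IMAGE_COUNT


print(f"one image: {image_bytes() / 1e9:.2f} GB")
print(f"{IMAGE_COUNT} images: {study_bytes() / 1e9:.0f} GB")
```

Under that assumption the study comes out at roughly 340 GB, which sits inside the 300 to 500 GB range guessed above.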

@will-moore will-moore self-assigned this Jun 20, 2023
@will-moore
Member Author

will-moore commented Jun 20, 2023

Looks like all the pattern files we need to convert are under:

$ ls /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/ | wc
    111     111    3970

Corresponds to image count from IDR/idr-utils#56

$ screen -S idr0026_bf2raw

$ conda activate bioformats2raw

$ cd /data
$ sudo chown wmoore ./idr0026
$ cd idr0026

$ for i in `ls /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/`; do echo $i; /home/wmoore/bioformats2raw-0.6.0-24/bin/bioformats2raw --memo-directory ../memo /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/$i ${i%.*}.ome.zarr; done
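
The `${i%.*}` expansion in the loop strips the final extension from each pattern file name before `.ome.zarr` is appended. A minimal Python sketch of the same name mapping follows; the helper name `zarr_name` is hypothetical, and the file name is one that appears later in this thread.

```python
from pathlib import PurePosixPath


def zarr_name(pattern_file: str) -> str:
    """Mimic the shell's ${i%.*}.ome.zarr: drop the last extension, add .ome.zarr."""
    return PurePosixPath(pattern_file).stem + ".ome.zarr"


print(zarr_name("3.49.6-3.140922_11-33-57.00.pattern"))
```

Only the last suffix is removed, so the dots inside the acquisition name are preserved.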

@will-moore will-moore moved this from convert all data to NGFF to upload data to s3 in NGFF conversion Jun 21, 2023
@will-moore
Member Author

$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3 mb s3://idr0026
make_bucket: idr0026
$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3api put-bucket-policy --bucket idr0026 --policy file://policy.json
$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3api put-bucket-cors --bucket idr0026  --cors-configuration file://cors.json
$ /home/wmoore/mc cp -r idr0026/ uk1s3/idr0026/zarr
...3.140926_14-52-18.03.ome.zarr/OME/METADATA.ome.xml: 282.79 GiB / 282.79 GiB ━━━━━━━━━━━━━

@will-moore
Member Author

will-moore commented Jun 22, 2023

Checking on s3...

E.g. https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/idr0026/zarr/3.50.6-3.140922_11-36-07.00.ome.zarr/0/

This image has only a single entry under omero:channels in its .zattrs, so it appears as single-channel in vizarr, even though the Zarr array data has 4 channels and looks OK in the validator.

Image

cc @sbesson

@will-moore will-moore added the bug label Jun 22, 2023
@will-moore will-moore removed their assignment Jun 22, 2023
@will-moore
Member Author

However, the OME.xml looks OK, e.g. https://uk1s3.embassy.ebi.ac.uk/idr0026/zarr/3.50.6-3.140922_11-36-07.00.ome.zarr/OME/METADATA.ome.xml
This has 4 channels, so maybe the .zattrs omero.channel info is not so critical when we import to OMERO. But it's still wrong!

<Pixels BigEndian="true" DimensionOrder="XYZCT" ID="Pixels:0" Interleaved="false" SignificantBits="16" SizeC="4" SizeT="71" SizeX="507" SizeY="507" SizeZ="21" Type="uint16">
<Channel ID="Channel:0:0" Name="FD6_GREEN" SamplesPerPixel="1">
<LightPath/>
</Channel>
<Channel ID="Channel:0:1" Name="FD5_BLUE" SamplesPerPixel="1">
<LightPath/>
</Channel>
<Channel ID="Channel:0:2" Name="BD8_RED" SamplesPerPixel="1">
<LightPath/>
</Channel>
<Channel ID="Channel:0:3" Name="BD7_RED" SamplesPerPixel="1">
<LightPath/>
</Channel>
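
The mismatch described above can be checked mechanically by comparing the `Channel` elements in the OME-XML with the entries under `omero.channels` in `.zattrs`. The following is a hedged sketch and not part of any OME tool: the function names are hypothetical, and the inline XML and JSON are trimmed stand-ins for the real files (channel names copied from the XML above, the single "Channel 0" label from the faulty `.zattrs` shown later in this thread).

```python
import json
import xml.etree.ElementTree as ET

# Trimmed stand-ins for METADATA.ome.xml and 0/.zattrs (assumed shapes).
SAMPLE_XML = """<Pixels SizeC="4">
  <Channel ID="Channel:0:0" Name="FD6_GREEN"/>
  <Channel ID="Channel:0:1" Name="FD5_BLUE"/>
  <Channel ID="Channel:0:2" Name="BD8_RED"/>
  <Channel ID="Channel:0:3" Name="BD7_RED"/>
</Pixels>"""
SAMPLE_ZATTRS = '{"omero": {"channels": [{"label": "Channel 0"}]}}'


def xml_channel_names(ome_xml: str) -> list:
    """Return the Name of every Channel element, ignoring XML namespaces."""
    root = ET.fromstring(ome_xml)
    return [
        el.get("Name", "")
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "Channel"
    ]


def zattrs_channel_labels(zattrs_text: str) -> list:
    """Return the label of every entry under omero.channels."""
    omero = json.loads(zattrs_text).get("omero", {})
    return [ch.get("label", "") for ch in omero.get("channels", [])]


def channels_consistent(ome_xml: str, zattrs_text: str) -> bool:
    """True when both metadata sources report the same number of channels."""
    return len(xml_channel_names(ome_xml)) == len(zattrs_channel_labels(zattrs_text))


print(xml_channel_names(SAMPLE_XML))
print(zattrs_channel_labels(SAMPLE_ZATTRS))
print(channels_consistent(SAMPLE_XML, SAMPLE_ZATTRS))
```

Running a check like this over every converted fileset before upload would flag the single-channel `.zattrs` without opening each image in a viewer.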

@will-moore will-moore moved this from upload data to s3 to create new Fileset to replace original Fileset in NGFF conversion Jun 23, 2023
@will-moore will-moore moved this from create new Fileset to replace original Fileset to upload some data to s3 and test in NGFF conversion Jun 26, 2023
@will-moore
Member Author

On pilot-idr0125...

sudo mkdir /idr0026 && sudo /opt/goofys --endpoint https://uk1s3.embassy.ebi.ac.uk/ -o allow_other idr0026 /idr0026


# copy metadata-only images....
screen -S idr0010_aws_sync
aws s3 sync --no-sign-request --exclude '*' --include "*/.z*" --include "*.xml" --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3://idr0026/zarr .

# import all images into Dataset

for dir in *; do
  omero import -d 15352 --transfer=ln_s --depth=100 --name=${dir/.ome.zarr/} --skip=all $dir --file /tmp/$dir.log  --errs /tmp/$dir.err;
done


$ python idr-utils/scripts/managed_repo_symlinks.py Dataset:15352 /idr0026/zarr
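
The `aws s3 sync` call above copies only metadata by excluding everything and then re-including `.z*` files and XML files. As far as I understand the AWS CLI, filters are evaluated in order and later filters take precedence; the sketch below models that behaviour with Python's `fnmatch` as an approximation. The function name and the sample keys are hypothetical.

```python
from fnmatch import fnmatchcase

# (action, pattern) pairs in the order they appear on the command line.
FILTERS = [
    ("exclude", "*"),
    ("include", "*/.z*"),
    ("include", "*.xml"),
]


def is_synced(key: str, filters=FILTERS) -> bool:
    """Apply filters in order; the last matching filter decides the outcome."""
    keep = True  # keys matched by no filter are included by default
    for action, pattern in filters:
        if fnmatchcase(key, pattern):
            keep = action == "include"
    return keep


for key in (
    "a.ome.zarr/0/.zarray",             # array metadata: kept
    "a.ome.zarr/OME/METADATA.ome.xml",  # OME-XML: kept
    "a.ome.zarr/0/0/0/0/0/0",           # pixel chunk: skipped
):
    print(key, is_synced(key))
```

The effect is a local tree that has the Zarr layout and metadata but no chunks, which is enough for a metadata-only import.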

These look good in OMERO, compared to existing IDR

Image

@sbesson sbesson self-assigned this Jun 28, 2023
@sbesson
Member

sbesson commented Jun 28, 2023

@will-moore not 100% sure of what went wrong on your conversion but using the converter library shipping with the current IDR version of Bio-Formats, I get

$ /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw -p 3.49.6-3.140922_11-33-57.00.pattern 3.49.6-3.140922_11-33-57.00.zarr
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp6732331320596727672/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
..>
[0/0]  99% │███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉│ 3239/3240 (0:02:38 / 0:00:00) 
[0/0] 100% │████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████│ 3240/3240 (0:02:38 / 0:00:00) 
[0/1] 100% │████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████│ 3240/3240 (0:00:04 / 0:00:00) 

and the omero key contains metadata about all four channels as expected

$ cat 3.49.6-3.140922_11-33-57.00.zarr/0/.zattrs 
{
  "multiscales" : [ {
    "metadata" : {
      "method" : "loci.common.image.SimpleImageScaler",
      "version" : "Bio-Formats 0.6.10"
    },
    "axes" : [ {
      "name" : "t",
      "type" : "time"
    }, {
      "name" : "c",
      "type" : "channel"
    }, {
      "name" : "z",
      "type" : "space"
    }, {
      "name" : "y",
      "type" : "space"
    }, {
      "name" : "x",
      "type" : "space"
    } ],
    "name" : "11-33-57_PMT - PMT [BD2_GREEN] [00]_Time Time0000.tif",
    "datasets" : [ {
      "path" : "0",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 1.0, 1.0 ],
        "type" : "scale"
      } ]
    }, {
      "path" : "1",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 2.0, 2.0 ],
        "type" : "scale"
      } ]
    } ],
    "version" : "0.4"
  } ],
  "omero" : {
    "channels" : [ {
      "color" : "FF0000",
      "coefficient" : 1,
      "active" : true,
      "label" : "BD2_GREEN",
      "window" : {
        "min" : 372.0,
        "max" : 15788.0,
        "start" : 372.0,
        "end" : 15788.0
      },
      "family" : "linear",
      "inverted" : false
    }, {
      "color" : "00FF00",
      "coefficient" : 1,
      "active" : true,
      "label" : "BD8_DEEPR",
      "window" : {
        "min" : 373.0,
        "max" : 16188.0,
        "start" : 373.0,
        "end" : 16188.0
      },
      "family" : "linear",
      "inverted" : false
    }, {
      "color" : "0000FF",
      "coefficient" : 1,
      "active" : true,
      "label" : "BD7_RED",
      "window" : {
        "min" : 759.0,
        "max" : 8978.0,
        "start" : 759.0,
        "end" : 8978.0
      },
      "family" : "linear",
      "inverted" : false
    }, {
      "color" : "FF0000",
      "coefficient" : 1,
      "active" : false,
      "label" : "FD6_FDRED",
      "window" : {
        "min" : 237.0,
        "max" : 12339.0,
        "start" : 237.0,
        "end" : 12339.0
      },
      "family" : "linear",
      "inverted" : false
    } ],
    "rdefs" : {
      "defaultT" : 0,
      "model" : "color",
      "defaultZ" : 13
    }
  }
}

Sounds like the best way forward would be to redo the whole conversion?

@will-moore
Member Author

@sbesson Looking at #648 (comment), it looks like I used the same version: bioformats2raw-0.6.0-24.
But I'll delete and try again...

@will-moore
Member Author

Testing... (and failing!)...

(bioformats2raw) [wmoore@pilot-zarr1-dev test]$ pwd
/data/idr0026/test
$ ~/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.65.9-6.141023_15-45-09.03.pattern 3.65.9-6.141023_15-45-09.03.ome.zarr

cat 3.65.9-6.141023_15-45-09.03.ome.zarr/0/.zattrs

{
  "multiscales" : [ {
    "metadata" : {
      "method" : "loci.common.image.SimpleImageScaler",
      "version" : "Bio-Formats 0.6.10"
    },
    "axes" : [ {
      "name" : "t",
      "type" : "time"
    }, {
      "name" : "c",
      "type" : "channel"
    }, {
      "name" : "z",
      "type" : "space"
    }, {
      "name" : "y",
      "type" : "space"
    }, {
      "name" : "x",
      "type" : "space"
    } ],
    "name" : "15-45-09_PMT - PMT [BD7_RED] [03]_Time Time0074.tif",
    "datasets" : [ {
      "path" : "0",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 1.0, 1.0 ],
        "type" : "scale"
      } ]
    }, {
      "path" : "1",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 2.0, 2.0 ],
        "type" : "scale"
      } ]
    } ],
    "version" : "0.4"
  } ],
  "omero" : {
    "channels" : [ {
      "color" : "808080",
      "coefficient" : 1,
      "active" : true,
      "label" : "Channel 0",
      "window" : {
        "min" : 332.0,
        "max" : 15788.0,
        "start" : 332.0,
        "end" : 15788.0
      },
      "family" : "linear",
      "inverted" : false
    } ],
    "rdefs" : {
      "defaultT" : 0,
      "model" : "greyscale",
      "defaultZ" : 0
    }
  }
}

@will-moore
Member Author

Even using the same lib as @sbesson gives me the same result?!

(bioformats2raw) [wmoore@pilot-zarr1-dev test]$ /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern 3.49.6-3.140922_11-33-57.01.ome.zarr

$ cat 3.49.6-3.140922_11-33-57.01.ome.zarr/0/.zattrs
{
  "multiscales" : [ {
    "metadata" : {
      "method" : "loci.common.image.SimpleImageScaler",
      "version" : "Bio-Formats 0.6.10"
    },
    "axes" : [ {
      "name" : "t",
      "type" : "time"
    }, {
      "name" : "c",
      "type" : "channel"
    }, {
      "name" : "z",
      "type" : "space"
    }, {
      "name" : "y",
      "type" : "space"
    }, {
      "name" : "x",
      "type" : "space"
    } ],
    "name" : "11-33-57_PMT - PMT [FD6_FDRED] [01]_Time Time0028.tif",
    "datasets" : [ {
      "path" : "0",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 1.0, 1.0 ],
        "type" : "scale"
      } ]
    }, {
      "path" : "1",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 2.0, 2.0 ],
        "type" : "scale"
      } ]
    } ],
    "version" : "0.4"
  } ],
  "omero" : {
    "channels" : [ {
      "color" : "808080",
      "coefficient" : 1,
      "active" : true,
      "label" : "Channel 0",
      "window" : {
        "min" : 376.0,
        "max" : 15788.0,
        "start" : 376.0,
        "end" : 15788.0
      },
      "family" : "linear",
      "inverted" : false
    } ],
    "rdefs" : {
      "defaultT" : 0,
      "model" : "greyscale",
      "defaultZ" : 0
    }
  }

@sbesson
Member

sbesson commented Jun 29, 2023

I suspect there's something wrong with your environment and particularly the bioformats2raw Conda environment that you're using. Can you try deactivating Conda completely and simply running /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern 3.49.6-3.140922_11-33-57.01.ome.zarr ?

@will-moore
Member Author

That didn't work either!

I wanted to try on a different machine completely...

$ ssh pilot-zarr2-dev
$ cd /data
$ sudo mkdir idr0026
$ sudo chown wmoore idr0026
$ /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw -p /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern 3.49.6-3.140922_11-33-57.01.ome.zarr
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp632426155291925201/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Exception in thread "main" picocli.CommandLine$ExecutionException: Error while calling command (com.glencoesoftware.bioformats2raw.Converter@6ce139a4): java.io.FileNotFoundException: /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern (No such file or directory)

$ cd /uod/idr/metadata/
$ ls
idr0010-doil-dnadamage  idr0054-segura-tonsilhyperion

Do I need to clone all of https://github.com/IDR/idr-metadata here?

@will-moore
Member Author

Just confirming...

(base) [wmoore@pilot-zarr1-dev idr0026]$ mkdir test3
(base) [wmoore@pilot-zarr1-dev idr0026]$ cd test3 
(base) [wmoore@pilot-zarr1-dev test3]$ /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern 3.49.6-3.140922_11-33-57.01.ome.zarr
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp7313185331649622917/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
2023-06-29 10:14:29,998 [main] WARN  loci.formats.in.BaseTiffReader - unknown creation date format: 2014-09-22 11:34:50

$ cat 3.49.6-3.140922_11-33-57.01.ome.zarr/0/.zattrs 
{
  "multiscales" : [ {
    "metadata" : {
      "method" : "loci.common.image.SimpleImageScaler",
      "version" : "Bio-Formats 0.6.10"
    },
    "axes" : [ {
      "name" : "t",
      "type" : "time"
    }, {
      "name" : "c",
      "type" : "channel"
    }, {
      "name" : "z",
      "type" : "space"
    }, {
      "name" : "y",
      "type" : "space"
    }, {
      "name" : "x",
      "type" : "space"
    } ],
    "name" : "11-33-57_PMT - PMT [FD6_FDRED] [01]_Time Time0028.tif",
    "datasets" : [ {
      "path" : "0",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 1.0, 1.0 ],
        "type" : "scale"
      } ]
    }, {
      "path" : "1",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 2.0, 2.0 ],
        "type" : "scale"
      } ]
    } ],
    "version" : "0.4"
  } ],
  "omero" : {
    "channels" : [ {
      "color" : "808080",
      "coefficient" : 1,
      "active" : true,
      "label" : "Channel 0",
      "window" : {
        "min" : 376.0,
        "max" : 15788.0,
        "start" : 376.0,
        "end" : 15788.0
      },
      "family" : "linear",
      "inverted" : false
    } ],
    "rdefs" : {
      "defaultT" : 0,
      "model" : "greyscale",
      "defaultZ" : 0
    }
  }
}

@sbesson
Member

sbesson commented Jun 29, 2023

Do I need to clone all of https://github.com/IDR/idr-metadata here?

For the sake of testing, you might just want to copy the single .pattern file you want to test directly. Otherwise, yes, you need to clone the whole repository unless @francesw wants to look into extracting idr0026 into a standalone Git repository.

Just confirming...

The inconsistency in the output is very concerning. Have you tried after fully deactivating Conda, not just your environment?

@will-moore
Member Author

Have you tried after fully deactivating Conda, not just your environment?

No. How do you do that?

@sbesson
Member

sbesson commented Jun 29, 2023

conda deactivate

@will-moore
Member Author

I already did that. How's that different from deactivating your environment?

I tried on a different machine...
Cloned idr-metadata and moved it to /uod/idr/metadata

cd /uod/idr/metadata/
sudo -Es git clone git@github.com:IDR/idr-metadata.git
cd ../
sudo mv metadata/idr-metadata ./
sudo rm metadata   # symlink to /data/idr-metadata
sudo mv idr-metadata metadata

Then tried...

cd /data/idr0026/
[wmoore@pilot-zarr2-dev idr0026]$ /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw -p /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern 3.49.6-3.140922_11-33-57.01.ome.zarr

$ cat 3.49.6-3.140922_11-33-57.01.ome.zarr/0/.zattrs
{
  "multiscales" : [ {
    "metadata" : {
      "method" : "loci.common.image.SimpleImageScaler",
      "version" : "Bio-Formats 0.6.10"
    },
    "axes" : [ {
      "name" : "t",
      "type" : "time"
    }, {
      "name" : "c",
      "type" : "channel"
    }, {
      "name" : "z",
      "type" : "space"
    }, {
      "name" : "y",
      "type" : "space"
    }, {
      "name" : "x",
      "type" : "space"
    } ],
    "name" : "11-33-57_PMT - PMT [FD6_FDRED] [01]_Time Time0028.tif",
    "datasets" : [ {
      "path" : "0",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 1.0, 1.0 ],
        "type" : "scale"
      } ]
    }, {
      "path" : "1",
      "coordinateTransformations" : [ {
        "scale" : [ 1.0, 1.0, 1.0, 2.0, 2.0 ],
        "type" : "scale"
      } ]
    } ],
    "version" : "0.4"
  } ],
  "omero" : {
    "channels" : [ {
      "color" : "808080",
      "coefficient" : 1,
      "active" : true,
      "label" : "Channel 0",
      "window" : {
        "min" : 376.0,
        "max" : 15788.0,
        "start" : 376.0,
        "end" : 15788.0
      },
      "family" : "linear",
      "inverted" : false
    } ],
    "rdefs" : {
      "defaultT" : 0,
      "model" : "greyscale",
      "defaultZ" : 0
    }
  }
}

WAT!?

@sbesson
Member

sbesson commented Jun 29, 2023

@will-moore I think I found the source of the issue. Can you try one more test with your last configuration, running sudo /opt/bioformats2raw/bioformats2raw-0.6.0-24/bin/bioformats2raw -p /uod/idr/metadata/idr0026-weigelin-immunotherapy/patterns/3.49.6-3.140922_11-33-57.01.pattern 3.49.6-3.140922_11-33-57.01.ome.zarr (note the sudo at the beginning of the command)?

@will-moore
Member Author

will-moore commented Jun 30, 2023

The above removal of contents of the bucket ran very slowly and has only resulted in the removal of a handful of zarr filesets out of the 111 originally there.

Since we want to delete ALL the filesets uploaded, it's probably quicker to delete the bucket and recreate it.

ran

./mc rb --force uk1s3/idr0026

This seemed to hang/time-out and doesn't seem to have had any effect:

$ ./mc ls uk1s3/idr0026/zarr | wc
     95     475    6719

Reverted to running the rm again in a screen

./mc rm --force --recursive uk1s3/idr0026/zarr

@will-moore
Member Author

@sbesson - Seems that the memo issue is something it would be good to fix (or at least warn) to prevent others suffering the pain above! I can create an issue somewhere, but where?

@sbesson
Copy link
Member

sbesson commented Jun 30, 2023

From my side, the immediate candidates are:

Possibly the outstanding action would be to retest a similar scenario using bioformats2raw 0.7.0, a multi-channel pattern dataset and identify whether it's IDR specific. /cc @melissalinkert

@will-moore
Member Author

Ah - apologies @sbesson: I just realised you meant that there is probably just 1 issue (not 4) but it needs testing to determine where the issue lies!

@sbesson
Copy link
Member

sbesson commented Jun 30, 2023

Retested with a simpler version of the pattern file with 2 timepoints compatible with upstream Bio-Formats

cat 3.49.6-3.140922_11-33-57.00.pattern 
/uod/idr/filesets/idr0026-weigelin-immunotherapy/20170222-symlinks/PNAS_2015/treatment start day 3/mouse 49/day 6-3/time lapse/140922_11-33-57/11-33-57_PMT - PMT [<BD2_GREEN,BD8_DEEPR,BD7_RED,FD6_FDRED>] [00]_Time Time<0000-0001>.tif
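
To make it concrete which files such a pattern refers to, here is a small sketch that expands the two block forms used in this pattern file: a comma-separated list (`<A,B>`) and a zero-padded numeric range (`<0000-0001>`). It is an illustration only, not the Bio-Formats implementation, and any other forms the real pattern syntax allows are not handled. The function names are hypothetical, and the sample pattern is the file-name part of the path above.

```python
import itertools
import re

BLOCK = re.compile(r"<([^<>]+)>")


def expand_block(body: str) -> list:
    """Expand one <...> body: 'A,B' is a list, '0000-0001' a zero-padded range."""
    match = re.fullmatch(r"(\d+)-(\d+)", body)
    if match:
        first, last = match.groups()
        width = len(first)
        return [str(n).zfill(width) for n in range(int(first), int(last) + 1)]
    return body.split(",")


def expand_pattern(pattern: str) -> list:
    """Return every file name the pattern stands for, first block varying slowest."""
    # re.split with one capture group alternates literal text and block bodies.
    parts = BLOCK.split(pattern)
    choices = [
        expand_block(part) if i % 2 else [part]
        for i, part in enumerate(parts)
    ]
    return ["".join(combo) for combo in itertools.product(*choices)]


PATTERN = (
    "11-33-57_PMT - PMT [<BD2_GREEN,BD8_DEEPR,BD7_RED,FD6_FDRED>] "
    "[00]_Time Time<0000-0001>.tif"
)
for name in expand_pattern(PATTERN):
    print(name)
```

Four channel names times two timepoints gives eight TIFF files, which is why a correct conversion of this pattern should report four channels.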

Placed a copy of this pattern file under a patterns directory owned by a different user and executed the following two commands:

/opt/bioformats2raw/bioformats2raw-0.6.1/bin/bioformats2raw 3.49.6-3.140922_11-33-57.00.pattern 3.49.6-3.140922_11-33-57.00.zarr
 /opt/bioformats2raw/bioformats2raw-0.6.1/bin/bioformats2raw patterns/3.49.6-3.140922_11-33-57.00.pattern 3.49.6-3.140922_11-33-57.00_2.zarr

The .zattrs are identical between both conversions and contain omero metadata for the four channels specified in the pattern file:

 (base) [sbesson@pilot-zarr2-dev tmp]$ diff 3.49.6-3.140922_11-33-57.00.zarr/0/.zattrs 3.49.6-3.140922_11-33-57.00_2.zarr/0/.zattrs 
(base) [sbesson@pilot-zarr2-dev tmp]$ tail -n 50 3.49.6-3.140922_11-33-57.00.zarr/0/.zattrs
      },
      "family" : "linear",
      "inverted" : false
    }, {
      "color" : "00FF00",
      "coefficient" : 1,
      "active" : true,
      "label" : "BD8_DEEPR",
      "window" : {
        "min" : 401.0,
        "max" : 16188.0,
        "start" : 401.0,
        "end" : 16188.0
      },
      "family" : "linear",
      "inverted" : false
    }, {
      "color" : "0000FF",
      "coefficient" : 1,
      "active" : true,
      "label" : "BD7_RED",
      "window" : {
        "min" : 801.0,
        "max" : 8055.0,
        "start" : 801.0,
        "end" : 8055.0
      },
      "family" : "linear",
      "inverted" : false
    }, {
      "color" : "FF0000",
      "coefficient" : 1,
      "active" : false,
      "label" : "FD6_FDRED",
      "window" : {
        "min" : 250.0,
        "max" : 10867.0,
        "start" : 250.0,
        "end" : 10867.0
      },
      "family" : "linear",
      "inverted" : false
    } ],
    "rdefs" : {
      "defaultT" : 0,
      "model" : "color",
      "defaultZ" : 13
    }
  }
}
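
The empty `diff` output above shows the two files are textually identical. When comparing conversions whose window values differ, as between the runs earlier in this thread, a comparison restricted to the channel metadata can be more informative. A minimal sketch, assuming the `.zattrs` content has already been loaded into dictionaries; the helper names are hypothetical and the sample data is trimmed to the three channels fully visible in the `tail` output.

```python
def channel_summary(zattrs: dict) -> list:
    """Reduce omero.channels to (label, color, active) tuples for comparison."""
    channels = zattrs.get("omero", {}).get("channels", [])
    return [(c.get("label"), c.get("color"), c.get("active")) for c in channels]


def same_channels(a: dict, b: dict) -> bool:
    """True when both documents describe the same channels in the same order."""
    return channel_summary(a) == channel_summary(b)


# Trimmed sample data: three channels from the output above ...
GOOD = {"omero": {"channels": [
    {"label": "BD8_DEEPR", "color": "00FF00", "active": True},
    {"label": "BD7_RED", "color": "0000FF", "active": True},
    {"label": "FD6_FDRED", "color": "FF0000", "active": False},
]}}
# ... and the single default channel seen in the faulty conversions.
BAD = {"omero": {"channels": [
    {"label": "Channel 0", "color": "808080", "active": True},
]}}

print(same_channels(GOOD, GOOD))
print(same_channels(GOOD, BAD))
```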

Based on the above, I am leaning towards options 1 and 2, i.e. it's an IDR/bioformats-specific issue, which will probably be classified as wontfix since one of the aims of the ongoing conversion work is to get rid of this fork entirely.

@will-moore will-moore removed the bug label Jun 30, 2023
@will-moore
Member Author

Started creating zips in a Screen

cd /data/idr0026
for i in */; do zip -r "${i%/}.zip" "$i"; done

@will-moore
Member Author

will-moore commented Jul 1, 2023

With all the previous Filesets deleted from s3, uploaded just a couple of different new ones to test...

(base) [wmoore@pilot-zarr1-dev ~]$ ./mc cp -r /data/idr0026/3.49.6-3.140922_11-33-57.00.ome.zarr uk1s3/idr0026/zarr/3.49.6-3.140922_11-33-57.00.ome.zarr
...zarr/OME/METADATA.ome.xml: 2.78 GiB / 2.78 GiB ━━━━━━━━━━━━━━━ 61.69 MiB/s 46s
(base) [wmoore@pilot-zarr1-dev ~]$ ./mc cp -r /data/idr0026/7.56.10-3.140926_14-52-18.03.ome.zarr uk1s3/idr0026/zarr/7.56.10-3.140926_14-52-18.03.ome.zarr
...zarr/OME/METADATA.ome.xml: 7.69 GiB / 7.69 GiB ━━━━━━━━━━━━━━━ 73.85 MiB/s 1m46s

Ooops - got an extra directory in there, but the images look good:

https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/idr0026/zarr/7.56.10-3.140926_14-52-18.03.ome.zarr/7.56.10-3.140926_14-52-18.03.ome.zarr/0/

@will-moore
Member Author

Uploading zips to BioStudies...

(base) [wmoore@pilot-zarr1-dev bin]$ ./ascp -P33001 -i ../etc/asperaweb_id_dsa.openssh -d /data/idr0026/idr0026 bsaspera_w@hx-fasp-1.ebi.ac.uk:5f/136e8d-xxxxxxxxx

@will-moore will-moore moved this from upload some data to s3 and test to BioStudies Submission in NGFF conversion Jul 1, 2023
@will-moore will-moore assigned will-moore and unassigned sbesson Jul 1, 2023
@will-moore
Member Author

(base) [wmoore@pilot-zarr1-dev data]$ sudo rm -rf idr0026/

@will-moore will-moore assigned francesw and unassigned will-moore Jul 12, 2023
@francesw francesw changed the title idr0026-weigelin-immunotherapy to NGFF idr0026-weigelin-immunotherapy S-BIAD860 Aug 24, 2023
@francesw francesw removed their assignment Aug 24, 2023
@francesw francesw moved this from BioStudies Submission to Data on Embassy s3 in NGFF conversion Aug 24, 2023
@will-moore
Member Author

Currently we have 20 out of 111 Filesets "viewable" at https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/pages/S-BIAD860.html...

idr0026/3.66.6-3.141020_15-41-29.02.ome.zarr,S-BIAD860/04219d38-3c9a-4ed7-97ba-65e8538b1e73,23273
idr0026/3.67.9-6.141023_12-39-26.04.ome.zarr,S-BIAD860/1506a279-9c9d-4fcc-b5ff-a89bacb80c11,23335
idr0026/3.66.6-3.141020_17-15-27.04.ome.zarr,S-BIAD860/1d535c04-916e-47a7-857f-f731aa1f1951,23280
idr0026/3.65.6-3.141020_15-39-00.02.ome.zarr,S-BIAD860/1e0d94df-af47-432e-917f-48687290f336,23377
idr0026/3.65.6-3.141020_17-15-07.04.ome.zarr,S-BIAD860/2e2d2806-53df-4c35-a9be-25c7ca53699d,23384
idr0026/3.66.9-6.141020_15-41-29.01.ome.zarr,S-BIAD860/2f3e36a6-05d8-4a60-9f4e-d8b87e5d8fdf,23302
idr0026/3.66.9-6.141020_15-41-29.00.ome.zarr,S-BIAD860/3b8e0297-c95c-4460-adf8-75a29bfc132b,23301
idr0026/3.66.9-6.141020_15-41-29.03.ome.zarr,S-BIAD860/487f0bdd-a020-4cff-bfcb-887edd21c9ca,23304
idr0026/3.66.6-3.141020_15-41-29.04.ome.zarr,S-BIAD860/4dab8ca2-3511-43c0-a0e9-9ec1a87aabb6,23275
idr0026/3.66.9-6.141023_15-49-01.00.ome.zarr,S-BIAD860/519ad2f4-0f5a-4ad4-ac6f-5535573f11bf,23311
idr0026/3.66.6-3.141020_15-41-29.03.ome.zarr,S-BIAD860/5a578c22-3dac-456a-ac08-b240c85c7b8a,23274
idr0026/7.51.10-3.140926_10-43-58.00.ome.zarr,S-BIAD860/7b7cc2ee-5dfd-445d-a0b4-4f58448486d0,23415
idr0026/3.65.6-3.141020_15-39-00.04.ome.zarr,S-BIAD860/7ee9776a-95ec-4861-950e-c6f0884ef27b,23379
idr0026/7.48.10-3.140926_12-18-43.00.ome.zarr,S-BIAD860/9640c08d-8cba-4e32-a32d-f593b230fadf,23445
idr0026/7.48.10-3.140926_12-18-43.02.ome.zarr,S-BIAD860/a1b618b9-4e99-4c91-95d5-fbcf45f44109,23447
idr0026/3.65.9-6.141023_15-45-09.03.ome.zarr,S-BIAD860/aa0dece8-179b-4f72-9468-df0ad91a1c20,23408
idr0026/3.66.6-3.141020_15-41-29.01.ome.zarr,S-BIAD860/aef6ffa0-5360-49f2-aa89-1f52b924cc3a,23272
idr0026/3.64.9-6.141023_12-21-30.02.ome.zarr,S-BIAD860/cc6b7eac-c829-463f-aa52-14007014da5b,23397
idr0026/7.51.10-3.140926_10-43-58.03.ome.zarr,S-BIAD860/d6a19971-d7f4-47e0-beb1-77788de12d93,23418
idr0026/7.51.10-3.140926_10-43-58.02.ome.zarr,S-BIAD860/dd0be90a-ff66-410c-86b8-63d3fb6faedb,23417
for r in $(cat idr0026.csv); do
  biapath=$(echo $r | cut -d',' -f2)
  uuid=$(echo $biapath | cut -d'/' -f2)
  fsid=$(echo $r | cut -d',' -f3)
  omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$fsid.sql"
done

...Found prefix demo_2/2017-04/13 // 07-17-06.573 for fileset 23418
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2017-04/13/07-17-06.573
Creating dir at /data/OMERO/ManagedRepository/demo_2/2017-04/13/07-17-06.573_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2017-04/13/07-17-06.573_mkngff/d6a19971-d7f4-47e0-beb1-77788de12d93.zarr -> /bia-integrator-data/S-BIAD860/d6a19971-d7f4-47e0-beb1-77788de12d93/d6a19971-d7f4-47e0-beb1-77788de12d93.zarr
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2017-04/13 // 07-06-10.670 for fileset 23417
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2017-04/13/07-06-10.670
Creating dir at /data/OMERO/ManagedRepository/demo_2/2017-04/13/07-06-10.670_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2017-04/13/07-06-10.670_mkngff/dd0be90a-ff66-410c-86b8-63d3fb6faedb.zarr -> /bia-integrator-data/S-BIAD860/dd0be90a-ff66-410c-86b8-63d3fb6faedb/dd0be90a-ff66-410c-86b8-63d3fb6faedb.zarr
for r in $(cat idr0026.csv); do
  fsid=$(echo $r | cut -d',' -f3)
  psql -U omero -d idr -h $DBHOST -f "$fsid.sql"
done

...
BEGIN
 mkngff_fileset 
----------------
        5287479
(1 row)
COMMIT
BEGIN
 mkngff_fileset 
----------------
        5287480
(1 row)
COMMIT
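
The first loop in the comment above splits each CSV row into the BioImage Archive path, its UUID and the Fileset ID, then builds the path passed to `omero mkngff sql`. The same field handling can be sketched in Python; the helper name `mkngff_target` is hypothetical and the row is copied from the listing above.

```python
import csv
import io

ROW = (
    "idr0026/7.51.10-3.140926_10-43-58.03.ome.zarr,"
    "S-BIAD860/d6a19971-d7f4-47e0-beb1-77788de12d93,23418"
)


def mkngff_target(row: list) -> tuple:
    """Return (fileset_id, zarr_path) for one row of idr0026.csv."""
    _zarr_name, bia_path, fileset_id = row
    uuid = bia_path.split("/")[1]  # second path component, as cut -d'/' -f2
    return int(fileset_id), f"/bia-integrator-data/{bia_path}/{uuid}.zarr"


for row in csv.reader(io.StringIO(ROW)):
    print(mkngff_target(row))
```

The resulting path matches the symlink target reported for fileset 23418 in the mkngff output above.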

@will-moore
Member Author

All good (missing thumbnails in the screenshot are for images not included in the 20 updated by mkngff above):

Screenshot 2023-08-29 at 19 23 50

@will-moore will-moore moved this from Data on Embassy s3 to create new Filesets in idr-next in NGFF conversion Sep 4, 2023
@will-moore
Member Author

will-moore commented Sep 12, 2023

Testing on idr-testing:omeroreadwrite...

Updated to today's OMEZarrReader.jar (only on omeroreadwrite server - not proxies).

Use all 111 Images in idr0026.csv - see IDR/idr-utils@003b3a3

Started mkngff at 10:37...

@will-moore
Member Author

mkngff just finished (nearly 12:00).
Apply the SQL and view an image on just the readwrite server with ssh -A idr-testing.openmicroscopy.org -L 1080:omeroreadwrite:80
E.g. http://localhost:1080/webclient/?show=image-3261651

$ grep -A 2 "saved memo" /opt/omero/server/OMERO.server/var/log/Blitz-0.log | grep -A 2 "13-14-13.681_mkngff"
2023-09-12 10:59:40,189 DEBUG [                   loci.formats.Memoizer] (l.Server-2) saved memo file: /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/2017-04/12/13-14-13.681_mkngff/3e8c077e-5612-4ae1-a385-cfb5fb507822.zarr/OME/.METADATA.ome.xml.bfmemo (39334 bytes)
2023-09-12 10:59:40,189 DEBUG [                   loci.formats.Memoizer] (l.Server-2) start[1694516354319] time[25869] tag[loci.formats.Memoizer.setId]
2023-09-12 10:59:40,189 INFO  [                ome.io.nio.PixelsService] (l.Server-2) Creating BfPixelBuffer: /data/OMERO/ManagedRepository/demo_2/2017-04/12/13-14-13.681_mkngff/3e8c077e-5612-4ae1-a385-cfb5fb507822.zarr/OME/METADATA.ome.xml Series: 0

25869ms is 26 secs for setId
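
Reading the `time[...]` value out of the log by eye works for one image; for many filesets a small parser may be handier. This sketch assumes the log line layout shown above (a `time[<ms>] tag[<name>]` pair); the function name is hypothetical.

```python
import re

LOG_LINE = (
    "2023-09-12 10:59:40,189 DEBUG [                   loci.formats.Memoizer] "
    "(l.Server-2) start[1694516354319] time[25869] tag[loci.formats.Memoizer.setId]"
)

TIMING = re.compile(r"time\[(\d+)\] tag\[([^\]]+)\]")


def parse_timing(line: str):
    """Return (tag, seconds) from a timing log line, or None if there is none."""
    match = TIMING.search(line)
    if match is None:
        return None
    millis, tag = match.groups()
    return tag, int(millis) / 1000


print(parse_timing(LOG_LINE))
```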

@will-moore will-moore moved this from check_pixels to pixels validated in NGFF conversion Nov 28, 2023
@imagesc-bot

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/file-format-to-store-images-using-ngff-coverter/98320/10
