ArgumentError: Unable to determine chunksize of non-range views.
Hello,
I am trying to work out whether I am doing something wrong here. This piece of code worked a couple of months ago and no longer does. I tried multiple datasets/cubes and get the same error each time.
Thanks!
using YAXArrays
using Zarr
using DimensionalData
using Dates
path="gs://cmip6/CMIP6/ScenarioMIP/DKRZ/MPI-ESM1-2-HR/ssp585/r1i1p1f1/3hr/tas/gn/v20190710/"
store = zopen(path, consolidated=true)
ds = open_dataset(store)
ds.tas[Ti=Where(x-> Dates.monthday(x) != (2,29))]
ERROR: ArgumentError: Unable to determine chunksize of non-range views.
Stacktrace:
[1] eachchunk_view(::DiskArrays.Chunked{DiskArrays.ChunkRead{…}}, vv::SubArray{Float32, 3, ZArray{…}, Tuple{…}, false})
@ DiskArrays ~/.julia/packages/DiskArrays/6JA8Z/src/subarray.jl:29
[2] eachchunk(a::DiskArrays.SubDiskArray{Float32, 3, ZArray{…}, Tuple{…}, false})
@ DiskArrays ~/.julia/packages/DiskArrays/6JA8Z/src/subarray.jl:25
[3] rebuild(A::YAXArray{…}, data::DiskArrays.SubDiskArray{…}, dims::Tuple{…}, refdims::Tuple{}, name::DimensionalData.NoName, metadata::Dict{…})
@ YAXArrays.Cubes ~/.julia/packages/YAXArrays/b5XBB/src/Cubes/Cubes.jl:201
[4] rebuild
@ ~/.julia/packages/DimensionalData/GaADx/src/array/array.jl:85 [inlined]
[5] rebuildsliced
@ ~/.julia/packages/DimensionalData/GaADx/src/array/array.jl:100 [inlined]
[6] rebuildsliced
@ ~/.julia/packages/DimensionalData/GaADx/src/array/array.jl:99 [inlined]
[7] view
@ ~/.julia/packages/DimensionalData/GaADx/src/array/indexing.jl:125 [inlined]
[8] _dim_view
@ ~/.julia/packages/DimensionalData/GaADx/src/array/indexing.jl:110 [inlined]
[9] #view#110
@ ~/.julia/packages/DimensionalData/GaADx/src/array/indexing.jl:81 [inlined]
[10] getindex(::YAXArray{Float32, 3, ZArray{…}, Tuple{…}, Dict{…}}; kwargs::@Kwargs{Ti::Where{…}})
@ YAXArrays.Cubes ~/.julia/packages/YAXArrays/b5XBB/src/Cubes/Cubes.jl:488
[11] top-level scope
@ REPL[33]:1
Some type information was truncated. Use `show(err)` to see complete types.
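One way to see why this selection ends up as a non-range view is to look at the indices the predicate keeps. Below is a minimal sketch using only Dates and a made-up 3-hourly time axis (not the actual CMIP6 one). It assumes the Where selector resolves to a list of integer indices; the names times and keep are illustrative only.

```julia
using Dates

# Hypothetical 3-hourly time axis spanning a leap day (not the real dataset axis).
times = collect(DateTime(2016, 2, 28):Hour(3):DateTime(2016, 3, 1, 21))

# Same predicate as in the failing call above.
keep = findall(t -> Dates.monthday(t) != (2, 29), times)

# The kept indices have a gap where Feb 29 was, so they cannot be
# expressed as a single contiguous range.
is_contiguous = keep == collect(first(keep):last(keep))
println(length(times), " time steps, ", length(keep), " kept, contiguous: ", is_contiguous)
```

Dropping Feb 29 therefore behaves like indexing with a vector rather than with a range, which is the case the error message complains about.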
What versions of the packages are you on?
Edit: see the messages below this one for the relevant information.
I am on the latest versions of all packages (DimensionalData, YAXArrays, DiskArrays). I also tried older versions, from around spring 2024, but I still haven't found where it breaks.
I will try to look closely at the date when the code worked (i.e. JuliaDataCubes/YAXArrays.jl#357)
Here are some configurations I have tried so far:
# latest
[0703355e] DimensionalData v0.27.6
[3c3547ce] DiskArrays v0.4.4
[c21b50f5] YAXArrays v0.5.10

⌅ [0703355e] DimensionalData v0.26.8
⌃ [3c3547ce] DiskArrays v0.4.3
⌃ [c21b50f5] YAXArrays v0.5.5

⌅ [0703355e] DimensionalData v0.26.8
[3c3547ce] DiskArrays v0.4.4
⌃ [c21b50f5] YAXArrays v0.5.5

[0703355e] DimensionalData v0.27.6
⌃ [3c3547ce] DiskArrays v0.4.2
⌃ [c21b50f5] YAXArrays v0.5.6

[0703355e] DimensionalData v0.27.6
⌃ [3c3547ce] DiskArrays v0.4.2
⌃ [c21b50f5] YAXArrays v0.5.7

[0703355e] DimensionalData v0.27.6
⌃ [3c3547ce] DiskArrays v0.4.2
[c21b50f5] YAXArrays v0.5.10

[0703355e] DimensionalData v0.27.6
⌃ [3c3547ce] DiskArrays v0.4.3
[c21b50f5] YAXArrays v0.5.10
I found versions where it works!
ds.tas[Ti=Where(x-> Dates.monthday(x) != (2,29))]
384×192×251120 YAXArray{Float32,3} with dimensions:
Dim{:lon} Sampled{Float64} 0.0:0.9375:359.0625 ForwardOrdered Regular Points,
Dim{:lat} Sampled{Float64} Float64[-89.28422753251364, -88.35700351866494, …, 88.35700351866494, 89.28422753251364] ForwardOrdered Irregular Points,
Ti Sampled{DateTime} DateTime[2015-01-01T03:00:00, …, 2101-01-01T00:00:00] ForwardOrdered Irregular Points
units: K
name: tas
Total size: 68.97 GB
Versions are:
⌅ [0703355e] DimensionalData v0.25.8
⌅ [3c3547ce] DiskArrays v0.3.23
⌃ [c21b50f5] YAXArrays v0.5.3
Due to Zarr and DimensionalData requirements in the MWE, I am unable to install DiskArrays@v0.4.0 to see whether that is the version that breaks the code or whether the break comes from YAXArrays going to v0.5.4.
Zarr was causing the requirement incompatibilities. I tried with a NetCDF file and the following versions, and I get the error. So the break comes either from DiskArrays going to v0.4.0 or from YAXArrays going to v0.5.4.
Env with error:
⌅ [0703355e] DimensionalData v0.26.8
⌃ [3c3547ce] DiskArrays v0.4.0
[30363a11] NetCDF v0.12.0
⌃ [c21b50f5] YAXArrays v0.5.4
Most likely DiskArrays 0.4; it's a major reworking of some indexing internals. @meggart will know.
This comes from changes to eachchunk. I managed to reduce this to a DiskArrays problem on DiskArrays master:
using DiskArrays
using DiskArrays.TestTypes
julia> a = TestTypes.ChunkedDiskArray(rand(100,100),(10,10))
100×100 ChunkedDiskArray{Float64, 2, Matrix{Float64}}
Chunked: (
[10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
[10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
)
julia> v = view(a, [1,2,4],:);
julia> eachchunk(v)
ERROR: ArgumentError: Unable to determine chunksize of non-range views.
Stacktrace:
[1] eachchunk_view(::DiskArrays.Chunked{…}, vv::SubArray{…})
@ DiskArrays ~/.julia/dev/DiskArrays/src/subarray.jl:29
[2] eachchunk(a::DiskArrays.SubDiskArray{Float64, 2, ChunkedDiskArray{…}, Tuple{…}, false})
@ DiskArrays ~/.julia/dev/DiskArrays/src/subarray.jl:25
[3] top-level scope
@ REPL[17]:1
Some type information was truncated. Use `show(err)` to see complete types.
I am not sure why we have this check there, or whether we could instead forward this case to the unchunked access of the data.
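For contrast, here is a minimal sketch on the same test array: a view taken with a range along the first axis versus one taken with an index vector. Whether the second call throws depends on the DiskArrays version installed, so the error is caught rather than assumed.

```julia
using DiskArrays
using DiskArrays.TestTypes

a = TestTypes.ChunkedDiskArray(rand(100, 100), (10, 10))

# A view defined by ranges: the chunk layout can be derived from the parent.
range_view = view(a, 1:4, :)
println("range view: ", length(eachchunk(range_view)), " chunk(s)")

# A view defined by an index vector: this is the case from the report above.
vector_view = view(a, [1, 2, 4], :)
try
    eachchunk(vector_view)
    println("vector view: eachchunk succeeded on this DiskArrays version")
catch err
    err isa ArgumentError || rethrow()
    println("vector view: ", err.msg)
end
```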
This might already be solved in #181
Great, will keep an eye on latest updates from DiskArrays.jl and report back when it is updated. Thanks!
Is there a way I can test the #181 commit? I tried adding the commit (add DiskArrays#5ef1432d4e925cf550c1cfdd3e083eca80db1fe9 and add DiskArrays#5ef1432), but it didn't work. Perhaps that is because the commit is from another repo? (I have never pinned a package to a specific commit.)
I'd like to test the commit to see if it is resolved. Thanks!
It's in the fix-index-issue branch of https://github.com/ConnectedSystems/DiskArrays.jl
You probably have to manually clone it?
Ah! And then do a dev? I'll try to see how hard it is (I guess I would need to dev YAXArrays too and change the [deps] section).
No, you can just add it and it will work:
] add https://github.com/ConnectedSystems/DiskArrays.jl#fix-index-issue
Or, of course, git clone and then dev.
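The same install can also be scripted through the Pkg API instead of the REPL prompt. This is a sketch, assuming the branch still exists on that fork. The temporary environment is an addition here (not something suggested in the thread) so that the regular project is left untouched while testing.

```julia
using Pkg

# Work in a throwaway environment so the regular project is not modified.
Pkg.activate(temp=true)

# Functional equivalent of `] add <url>#fix-index-issue`.
Pkg.add(url="https://github.com/ConnectedSystems/DiskArrays.jl", rev="fix-index-issue")

# Show which DiskArrays source ended up in the environment.
Pkg.status("DiskArrays")
```

The other packages from the MWE (YAXArrays, Zarr, DimensionalData) would then need to be added to the same environment before re-running it.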
It works! 😄
ds.tas[Ti=Where(x-> Dates.monthday(x) != (2,29))]
╭────────────────────────────────────╮
│ 384×192×251120 YAXArray{Float32,3} │
├────────────────────────────────────┴─────────────────────────────────────────────────────── dims ┐
↓ lon Sampled{Float64} 0.0:0.9375:359.0625 ForwardOrdered Regular Points,
→ lat Sampled{Float64} [-89.28422753251364, -88.35700351866494, …, 88.35700351866494, 89.28422753251364] ForwardOrdered Irregular Points,
↗ Ti Sampled{DateTime} [2015-01-01T03:00:00, …, 2101-01-01T00:00:00] ForwardOrdered Irregular Points
├──────────────────────────────────────────────────────────────────────────────────────── metadata ┤
Dict{String, Any} with 10 entries:
"units" => "K"
"history" => "2019-07-21T06:26:13Z altered by CMOR: Treated scalar dimension: 'height'. 201…
"name" => "tas"
"cell_methods" => "area: mean time: point"
"cell_measures" => "area: areacella"
"long_name" => "Near-Surface Air Temperature"
"coordinates" => "height"
"standard_name" => "air_temperature"
"_FillValue" => 1.0f20
"comment" => "near-surface (usually, 2 meter) air temperature"
├─────────────────────────────────────────────────────────────────────────────────────── file size ┤
file size: 68.97 GB
[0703355e] DimensionalData v0.27.6
[3c3547ce] DiskArrays v0.4.5 `https://github.com/ConnectedSystems/DiskArrays.jl#fix-index-issue`
[30363a11] NetCDF v0.12.0
[c21b50f5] YAXArrays v0.5.10
[0a941bbe] Zarr v0.9.4
@rafaqz Note that if I ran dev DiskArrays after add ...ConnectedSystems..., the dev version reverted back to 0.4.4. Simply adding the version from the ConnectedSystems repo was sufficient to test the commit. Thanks again!
@rafaqz I have a side question about this working example (the MWE provided in this thread: JuliaDataCubes/YAXArrays.jl#357, reproduced below). I am trying to rebuild the array (i.e. update the dimension values), but I have trouble understanding how to do it. Any pointer would be helpful!
using YAXArrays
using Zarr
using DimensionalData
using Dates
using CFTime  # provides DateTimeNoLeap and CFTime.reinterpret
path="gs://cmip6/CMIP6/ScenarioMIP/DKRZ/MPI-ESM1-2-HR/ssp585/r1i1p1f1/3hr/tas/gn/v20190710/"
store = zopen(path, consolidated=true)
ds = open_dataset(store)
# Select only the entries that do not fall on Feb 29th.
ds_subset = ds.tas[Ti=Where(x-> Dates.monthday(x) != (2,29))]
# Extract the time vector of the subset.
date_vec = lookup(ds_subset, Ti)
# Reinterpret it as a no-leap calendar time vector.
datevec_noleap = CFTime.reinterpret.(DateTimeNoLeap, date_vec)
Then I'd like to rebuild the dimension :Ti with datevec_noleap. For example, I can rebuild the dimension itself with
rebuild(:Ti, datevec_noleap)
newdim = Ti [DateTimeNoLeap(2001-01-01T00:00:00), …, DateTimeNoLeap(2010-12-31T00:00:00)]
However, I still haven't been able to assign this newly rebuilt dim to ds_subset.
DD is mostly straight functional code... You have to rebuild constructed objects and assign them to a new variable.
In this case, set is a rebuild helper. So you can do new_array = set(some_dd_array, Ti => datevec_noleap) and set will figure out how to rebuild it correctly.
You can put pretty much any dimension or lookup property after the =>, as there is no ambiguity as to what you mean to do.
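A minimal sketch of that functional style on a small in-memory array, assuming a plain DimArray and ordinary DateTime values in place of the YAXArray and the no-leap calendar from the question (the names A, B, old_times and new_times are illustrative):

```julia
using DimensionalData
using Dates

# Small stand-in array with a daily time dimension.
old_times = DateTime(2001, 1, 1):Day(1):DateTime(2001, 1, 4)
A = DimArray(rand(4), (Ti(old_times),))

# Replacement time values of the same length.
new_times = collect(old_times) .+ Year(1)

# set returns a rebuilt array; A itself is not modified.
B = set(A, Ti => new_times)

println(lookup(A, Ti))
println(lookup(B, Ti))
```

The result has to be bound to a new name (or re-bound to the old one), since nothing is updated in place.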
Nice, will give a try monday morning 😄
Thanks a lot!
It works nicely, thanks!
It's in ConnectedSystems/DiskArrays.jl fix-index-issue branch
You probably have to manually clone it?
You can also check out the PR branch locally from a normally dev'ed DiskArrays by following this tutorial:
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally