"SYSTEM: show(lasterr) caused an error" when running `DINCAE.reconstruct_points`

Closed this issue 2 years ago · 5 comments

Describe the bug

Trying to run DINCAE on CPR observations.

Stack trace

Stacktrace:
  [1] throw_boundserror(A::NCDatasets.CFVariable{Union{Missing, Float64}, 1, NCDatasets.Variable{Float64, 1, NCDataset{Nothing}}, NCDatasets.Attributes{NCDataset{Nothing}}, NamedTuple{(:fillvalue, :missing_values, :scale_factor, :add_offset, :calendar, :time_origin, :time_factor), Tuple{Float64, Tuple{Float64}, Nothing, Nothing, Nothing, Nothing, Nothing}}}, I::Tuple{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}})
    @ Base ./abstractarray.jl:703
  [2] checkbounds
    @ ./abstractarray.jl:668 [inlined]
  [3] _getindex
    @ ./multidimensional.jl:874 [inlined]
  [4] getindex(A::NCDatasets.CFVariable{Union{Missing, Float64}, 1, NCDatasets.Variable{Float64, 1, NCDataset{Nothing}}, NCDatasets.Attributes{NCDataset{Nothing}}, NamedTuple{(:fillvalue, :missing_values, :scale_factor, :add_offset, :calendar, :time_origin, :time_factor), Tuple{Float64, Tuple{Float64}, Nothing, Nothing, Nothing, Nothing, Nothing}}}, I::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64})
    @ Base ./abstractarray.jl:1241
  [5] loadragged(ncvar::NCDatasets.CFVariable{Union{Missing, Float64}, 1, NCDatasets.Variable{Float64, 1, NCDataset{Nothing}}, NCDatasets.Attributes{NCDataset{Nothing}}, NamedTuple{(:fillvalue, :missing_values, :scale_factor, :add_offset, :calendar, :time_origin, :time_factor), Tuple{Float64, Tuple{Float64}, Nothing, Nothing, Nothing, Nothing, Nothing}}}, index::Colon)
    @ NCDatasets ~/.julia/packages/NCDatasets/gTGnf/src/variable.jl:152
  [6] (::DINCAE.var"#131#132"{String})(ds::NCDataset{Nothing})
    @ DINCAE ~/.julia/packages/DINCAE/OlSY0/src/points.jl:442
  [7] NCDataset(f::DINCAE.var"#131#132"{String}, args::String; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ NCDatasets ~/.julia/packages/NCDatasets/gTGnf/src/dataset.jl:241
  [8] NCDataset
    @ ~/.julia/packages/NCDatasets/gTGnf/src/dataset.jl:238 [inlined]
  [9] loaddata
    @ ~/.julia/packages/DINCAE/OlSY0/src/points.jl:441 [inlined]
 [10] reconstruct_points(T::Type, Atype::Type, filename::String, varname::String, grid::Tuple{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, fnames_rec::Vector{String}; epochs::Int64, batch_size::Int64, truth_uncertain::Bool, enc_nfilter_internal::StepRange{Int64, Int64}, skipconnections::UnitRange{Int64}, clip_grad::Float64, regularization_L1_beta::Int64, regularization_L2_beta::Float64, save_epochs::StepRange{Int64, Int64}, upsampling_method::Symbol, probability_skip_for_training::Float64, jitter_std_pos::Tuple{Float32, Float32}, ntime_win::Int64, learning_rate::Float64, learning_rate_decay_epoch::Int64, min_std_err::Float64, loss_weights_refine::Tuple{Float64}, auxdata_files::Vector{NamedTuple{(:filename, :varname, :errvarname), Tuple{String, String, String}}}, savesnapshot::Bool)
    @ DINCAE ~/.julia/packages/DINCAE/OlSY0/src/points.jl:545
 [11] top-level scope

So the problem arises at the reading step, in the function `DINCAE.loaddata(filename, varname)`, which calls `loadragged`:
https://github.com/gher-uliege/DINCAE.jl/blob/main/src/points.jl#L440-L457

Question

Why does the input have to be written in the contiguous ragged array representation?

To Reproduce

Please provide a minimal code example which reproduces the behavior (bug, performance regression, ...).

Environment

operating system: Ubuntu

Input file

netcdf CPRdata {
dimensions:
	obs = 250021 ;
	depth = 1 ;
variables:
	double time(obs) ;
		time:_CoordinateAxisType = "Time" ;
		string time:actual_range = "1958-01-01", "2020-12-23" ;
		time:axis = "T" ;
		time:calendar = "Gregorian" ;
		time:ioos_category = "Time" ;
		time:long_name = "Valid Time GMT" ;
		time:standard_name = "time" ;
		time:time_origin = "01-JAN-1900 00:00:00" ;
		time:units = "days since 1900-01-01T00:00:00Z" ;
	double latitude(obs) ;
		latitude:_CoordinateAxisType = "Lat" ;
		latitude:_FillValue = -999. ;
		latitude:actual_range = 28.045f, 90.f ;
		latitude:axis = "Y" ;
		latitude:ioos_category = "Location" ;
		latitude:latitude_reference_datum = "geographical coordinates, WGS84 projection" ;
		latitude:long_name = "Latitude" ;
		latitude:missing_value = -999. ;
		latitude:standard_name = "latitude" ;
		latitude:units = "degrees_north" ;
		latitude:valid_max = 90. ;
		latitude:valid_min = -90. ;
	double longitude(obs) ;
		longitude:_CoordinateAxisType = "Lon" ;
		longitude:_FillValue = -999. ;
		longitude:actual_range = -15.4244, 180.002 ;
		longitude:axis = "X" ;
		longitude:ioos_category = "Location" ;
		longitude:long_name = "Longitude" ;
		longitude:longitude_reference_datum = "geographical coordinates, WGS84 projection" ;
		longitude:missing_value = -999. ;
		longitude:standard_name = "longitude" ;
		longitude:units = "degrees_east" ;
		longitude:valid_max = 180. ;
		longitude:valid_min = -180. ;
	double Calanus_Finmarchicus(obs) ;
		Calanus_Finmarchicus:_FillValue = -999. ;
		Calanus_Finmarchicus:actual_range = 0., 5012. ;
		Calanus_Finmarchicus:coordinates = "time" ;
		Calanus_Finmarchicus:long_name = "Abunance of Calanus Finmarchicus" ;
		Calanus_Finmarchicus:sdn_parameter_urn = "SDN:P01::Z302M01Z" ;
		Calanus_Finmarchicus:sdn_parameter_name = "Abundance of Calanus finmarchicus (ITIS: 85272: WoRMS 104464) per unit volume of the water body by optical microscopy" ;
		Calanus_Finmarchicus:sdn_uom_urn = "SDN:P06::UCPL" ;
		Calanus_Finmarchicus:sdn_uom_name = "Individuals per cubic meter" ;
		Calanus_Finmarchicus:AphiaID = "104464" ;
		Calanus_Finmarchicus:missing_value = -999. ;
		Calanus_Finmarchicus:units = "Individuals per cubic meter" ;
		Calanus_Finmarchicus:sample_dimension = "obs" ;
...
}

Answer 1 · 2023-03-14T11:15:21.000Z

Should we move this to https://github.com/gher-uliege/DINCAE.jl ?

Why does the input have to be written in the contiguous ragged array representation?

For altimetry, we have a vector of tracks (a vector of vectors), and when we split the data (training vs. test) we do not split individual tracks. So it is good to keep the information about which data points belong to the same track.

Maybe for the CPRs this is relevant too (where tracks are campaigns).
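
To illustrate the idea (this is only a minimal sketch, not DINCAE's actual code), the snippet below rebuilds a vector of tracks from a flat observation vector plus the number of observations per track, then splits whole tracks between training and test. The helper name `split_ragged`, the variable names and the sample values are all hypothetical.

```julia
# Hypothetical helper: cut a flat vector into features (tracks) using the
# number of observations per feature, as in the CF contiguous ragged array.
function split_ragged(data::AbstractVector, rowsize::AbstractVector{<:Integer})
    @assert sum(rowsize) == length(data) "row sizes must add up to the number of observations"
    stop = cumsum(rowsize)          # last index of each feature
    start = stop .- rowsize .+ 1    # first index of each feature
    return [data[i:j] for (i, j) in zip(start, stop)]
end

# Made-up example: 7 observations grouped into 3 tracks
sla = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
rowsize = [3, 2, 2]
tracks = split_ragged(sla, rowsize)

# Split at the track level, so that no track ends up in both sets
ntrain = 2
train_tracks = tracks[1:ntrain]
test_tracks = tracks[ntrain+1:end]
```

The row sizes have to be integers, since they are turned into index ranges. The stack trace in the report shows the variable being indexed with a `Float64` range, which would be consistent with the row sizes being read from a floating-point variable, although the thread does not confirm this.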

Answer 2 · 2023-03-14T12:35:34.000Z

Yes, sure, I posted it in the wrong repository!
It should be in DINCAE.jl.

Answer 3 · 2023-03-14T13:47:26.000Z

Is there an example of such a file, so that I can use ncgen on it?

Answer 4 · 2023-03-15T09:21:35.000Z

The docstring of `DINCAE.reconstruct_points` contains the CDL header (as printed by `ncdump -h`) of the altimetry test case.

https://github.com/gher-uliege/DINCAE.jl/blob/main/src/points.jl#L492-L508

And this info:

The file should contain the variables lon (longitude), lat (latitude), dtime (time of measurement) and id (numeric identifier, only used by post processing scripts) and dates (time instance of the gridded field). The file should be in the contiguous ragged array representation as specified by the CF convention allowing to group data points into "features" (e.g. tracks for altimetry). Every feature can also contain a single data point.

Should we add something to make this clearer? The data variable is called `sla` in my case, but this can be adapted: for `Calanus_Finmarchicus`, the `varname` parameter of `DINCAE.reconstruct_points` would be `"Calanus_Finmarchicus"`.

Answer 5 · 2023-03-15T09:33:30.000Z

OK, thanks!
I had seen the docstring but wasn't sure about the ragged array format.
That's clear now.

Instead of track I will use the sampleID, which seems to indicate the cruise tracks.
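
As a rough sketch of how that could work, the per-feature sizes needed for the ragged array representation can be derived from a per-observation ID by counting runs of identical values. This assumes that all observations sharing a sampleID are stored next to each other; the helper `rowsizes_from_ids` and the IDs below are made up for illustration.

```julia
# Hypothetical helper: run-length encode a vector of per-observation IDs.
# Assumes all observations sharing an ID are stored next to each other.
function rowsizes_from_ids(ids::AbstractVector)
    rowsize = Int[]
    isempty(ids) && return rowsize
    n = 1                           # length of the current run
    for k in 2:length(ids)
        if ids[k] == ids[k-1]
            n += 1
        else
            push!(rowsize, n)       # close the previous run
            n = 1
        end
    end
    push!(rowsize, n)               # close the last run
    return rowsize
end

sample_id = ["A1", "A1", "A1", "B7", "B7", "C2"]   # made-up IDs
rowsize = rowsizes_from_ids(sample_id)
```

If the observations of one sampleID are not contiguous in the file, they would first need to be reordered (for example with `sortperm` on the IDs), applying the same permutation to the time, position and data variables.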