gher-uliege/DINCAE.jl

regularization_L2_beta error

EhsanMehdipour opened this issue · 4 comments

Hi,

When I initialize DINCAE with regularization_L2_beta = 0.001, I receive the following error.

ERROR: LoadError: MethodError: no method matching abs2(::CuArray{Float32, 4, CUDA.Mem.DeviceBuffer})

Closest candidates are:
  abs2(!Matched::Complex)
   @ Base complex.jl:281
  abs2(!Matched::ForwardDiff.Dual{T}) where T
   @ ForwardDiff ~/.julia/packages/ForwardDiff/PcZ48/src/dual.jl:238
  abs2(!Matched::DualNumbers.Dual)
   @ DualNumbers ~/.julia/packages/DualNumbers/5knFX/src/dual.jl:204
  ...

Stacktrace:
  [1] MappingRF
    @ ./reduce.jl:95 [inlined]
  [2] _foldl_impl(op::Base.MappingRF{typeof(abs2), Base.BottomRF{typeof(Base.add_sum)}}, init::Base._InitialValue, itr::Zygote.Params{Zygote.Buffer{Any, Vector{Any}}})
    @ Base ./reduce.jl:58
  [3] foldl_impl
    @ ./reduce.jl:48 [inlined]
  [4] mapfoldl_impl(f::typeof(abs2), op::typeof(Base.add_sum), nt::Base._InitialValue, itr::Zygote.Params{Zygote.Buffer{Any, Vector{Any}}})
    @ Base ./reduce.jl:44
  [5] mapfoldl(f::Function, op::Function, itr::Zygote.Params{Zygote.Buffer{Any, Vector{Any}}}; init::Base._InitialValue)
    @ Base ./reduce.jl:170
  [6] mapfoldl
    @ ./reduce.jl:170 [inlined]
  [7] #mapreduce#292
    @ ./reduce.jl:302 [inlined]
  [8] mapreduce
    @ ./reduce.jl:302 [inlined]
  [9] #sum#295
    @ ./reduce.jl:530 [inlined]
 [10] sum(f::Function, a::Zygote.Params{Zygote.Buffer{Any, Vector{Any}}})
    @ Base ./reduce.jl:530
 [11] loss_function(model::DINCAE.StepModel{DINCAE.var"#52#56"{Float64}, DINCAE.var"#53#57"{Bool, Int64, Int64}}, xin::CuArray{Float32, 4, CUDA.Mem.DeviceBuffer}, xtrue::CuArray{Float32, 4, CUDA.Mem.DeviceBuffer})
    @ DINCAE ~/DINCAE/DINCAE.jl/src/model.jl:220
 [12] reconstruct(Atype::Type, data_all::Vector{Vector{NamedTuple{(:filename, :varname, :obs_err_std, :jitter_std, :isoutput), Tuple{String, String, Int64, Float64, Bool}}}}, fnames_rec::Vector{String}; epochs::Int64, batch_size::Int64, truth_uncertain::Bool, enc_nfilter_internal::Vector{Int64}, skipconnections::UnitRange{Int64}, clip_grad::Float64, regularization_L1_beta::Int64, regularization_L2_beta::Float64, save_epochs::StepRange{Int64, Int64}, is3D::Bool, upsampling_method::Symbol, ntime_win::Int64, learning_rate::Float64, learning_rate_decay_epoch::Float64, min_std_err::Float64, loss_weights_refine::Tuple{Float64, Float64}, cycle_periods::Tuple{Float64}, output_ndims::Int64, direction_obs::Nothing, remove_mean::Bool, paramfile::Nothing, laplacian_penalty::Int64, laplacian_error_penalty::Int64)
    @ DINCAE ~/DINCAE/DINCAE.jl/src/model.jl:490
 [13] top-level scope
    @ ~/DINCAE/python/DINCAE/8_3.jl:106
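Judging from frames [1]–[11], the loss function seems to call `sum(abs2, …)` directly on a `Zygote.Params` collection. `Params` iterates over the parameter *arrays*, so `abs2` receives whole `CuArray`s instead of numbers. A minimal CPU-only sketch of that failure mode (plain `Vector`s stand in for `Zygote.Params` here, and the nested-sum form is only an assumed workaround, not necessarily the actual fix in DINCAE):

```julia
# Parameter collections like Zygote.Params iterate over arrays;
# plain Vectors stand in for the CuArrays from the stacktrace.
params = [[1.0, 2.0], [3.0]]

# sum(abs2, params) calls abs2 on each *array* and throws, just like
# abs2(::CuArray{Float32, 4, ...}) in the error above:
failed = try
    sum(abs2, params)
    false
catch e
    e isa MethodError   # no method matching abs2(::Vector{Float64})
end
println(failed)  # true

# Summing abs2 over the elements of each array works:
l2 = sum(w -> sum(abs2, w), params)
println(l2)  # 1 + 4 + 9 = 14.0
```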

Can you provide a (minimal) reproducible example?
I just tried the example below (with CUDA 5.2.0 and Flux 0.14.13), but I did not get the error:

Maybe this error occurs when you combine different options? Feel free to adapt the example below to whatever is necessary to trigger the error.

using DINCAE
using Base.Iterators
using Random
using NCDatasets
using CUDA

const F = Float32
Atype = CuArray{F}

filename = "avhrr_sub_add_clouds_n10.nc"

if !isfile(filename)
    download("https://dox.ulg.ac.be/index.php/s/2yFgNMkpsGumVSM/download", filename)
end


data = [
   (filename = filename,
    varname = "SST",
    obs_err_std = 1,
    jitter_std = 0.05,
    isoutput = true,
   )
]
data_test = data;
data_all = [data,data_test]

epochs = 3
batch_size = 5
save_each = 10
skipconnections = [1,2]
enc_nfilter_internal = round.(Int,32 * 2 .^ (0:3))
clip_grad = 5.0
save_epochs = [epochs]
ntime_win = 3
upsampling_method = :nearest

fnames_rec = [tempname()]
paramfile = tempname()

losses = DINCAE.reconstruct(
    Atype,data_all,fnames_rec;
    epochs = epochs,
    batch_size = batch_size,
    enc_nfilter_internal = enc_nfilter_internal,
    clip_grad = clip_grad,
    save_epochs = save_epochs,
    upsampling_method = upsampling_method,
    ntime_win = ntime_win,
    paramfile = paramfile,
    regularization_L2_beta = 0.001,
    )

Output for me:

julia> include("/home/abarth/.julia/dev/DINCAE/test/test_DINCAE_SST_1.jl");
[ Info: Number of threads: 1
SST data shape: 112×112×10 data range: (13.575001f0, 17.775002f0)
SST data shape: 112×112×10 data range: (13.575001f0, 17.775002f0)
[ Info: Output variables:  ["SST"]
[ Info: Input size:        112×112×10×5
[ Info: Input sum:         -9574.162
[ Info: Number of filters in encoder: [10, 32, 64, 128, 256]
[ Info: Number of filters in decoder: [2, 32, 64, 128, 256]
[ Info: Gamma:             10.0
[ Info: Number of filters: [10, 32, 64, 128, 256]
skip connections at level 4
skip connections at level 3
skip connections at level 2
[ Info: using device:      gpu
[ Info: Output size:       112×112×2×5
[ Info: Output range:      (-1.6730540579839301, 2.0068249099692506)
[ Info: Output sum:        51437.69955057635
[ Info: Initial loss:      1.206571102595678
epoch:     1 loss 0.6725
epoch:     2 loss 3.8611
epoch:     3 loss -0.6723
Save output 3
  1.291923 seconds (2.55 M allocations: 179.634 MiB, 7.65% gc time)
  59.881056 seconds (136.24 M allocations: 7.765 GiB, 6.52% gc time, 0.01% compilation time)

Thanks!

Hi,

I changed the value of loss_weights_refine from (1.,) to (0.3,0.7) and hit the same error.
Is there an incompatibility between refinement and regularization?

The hyperparameters I am using for the test case:

epochs = 3
batch_size = 5
skipconnections = [1,2]
enc_nfilter_internal = round.(Int,32 * 2 .^ (0:3))
regularization_L2_beta = 0.001
ntime_win = 3
upsampling_method = :nearest
loss_weights_refine = (0.3,0.7) ## With refinement
# loss_weights_refine = (1.,) ## without refinement
save_epochs = [epochs]
truth_uncertain = true
remove_mean=false

Thanks, I can now reproduce the error and have committed a fix. Does it also work for you?

Thank you, it is working now.