`evaluate` errors
vboussange opened this issue · 3 comments
Describe the bug
Hey folks, I am new to MLJ.jl and love your work!
However, it seems that the function `evaluate` does not work properly, and errors when given more than one model.
To Reproduce
using Distributions
using LinearAlgebra
using DataFrames
using UnPack
using MLJ
## Generating synthetic data
n_features = 5
datasize = 1500
T = Float64
TI = Int64
# covariance matrix, needs to be symmetric
Σ = rand(T, n_features, n_features) * 2 .- 1
Σ = Σ*Σ'
μ = randn(T, n_features)
X = DataFrame(rand(MvNormal(μ, Σ), datasize)',:auto)
a = randn(T, n_features)
y = TI[]
for i in 1:datasize
    Xi = X[i,:] |> Vector
    push!(y, rand(Poisson(exp.(a' * Xi))))
end
# building a GLM
LinearRegressor = MLJ.@load LinearCountRegressor pkg=GLM
linearregressor = LinearRegressor()
mach = machine(linearregressor, X, y)
fit!(mach)
# works
# building a neural network regressor
using Flux, MLJFlux
mutable struct MyBuilder <: MLJFlux.Builder
    nhidden # number of neurons in hidden layers
    σ1 # hidden layers activation function
    σ2 # output activation function
end
function MLJFlux.build(nn::MyBuilder, rng, n_in, n_out)
    init = Flux.glorot_uniform(rng)
    @unpack nhidden, σ1, σ2 = nn
    return Chain(Dense(n_in, nhidden, σ1, init=init),
                 BatchNorm(nhidden),
                 Dense(nhidden, nhidden, σ1, init=init),
                 BatchNorm(nhidden),
                 Dense(nhidden, n_out, σ1, init=init),
                 σ2)
end
NN = MLJ.@load NeuralNetworkRegressor pkg=MLJFlux
nnflux = NN(builder = MyBuilder(64, relu, softplus),
            batch_size=100,
            epochs=200,
            loss = Flux.poisson_loss)
mach = machine(nnflux, X, y)
fit!(mach) # works
# comparing multiple models
mymodels = [nnflux, linearregressor]
multi_model = TunedModel(models=mymodels,
                         resampling = CV(nfolds=3),
                         measure = rms,
                         check_measure = false)
e = MLJ.evaluate(multi_model, X, y, resampling = CV(nfolds=2),
                 measure=rms,
                 verbosity=6,
                 # acceleration = CPUThreads()
                 )
#=
┌ Error: Problem fitting the machine machine(DeterministicTunedModel(model = NeuralNetworkRegressor(builder = MyBuilder(nhidden = 64, …), …), …), …).
└ @ MLJBase ~/.julia/packages/MLJBase/fEiP2/src/machines.jl:682
[ Info: Running type checks...
[ Info: Type checks okay.
ERROR: MethodError: no method matching abs(::LocationScale{Int64, Discrete, Poisson{Float64}})
=#
Expected behavior
The above code works for `mymodels = [nnflux]` and `mymodels = [linearregressor]`, but does not seem to work when both models are provided.
Versions
[992eb4ea] CondaPkg v0.2.22
[a93c6f00] DataFrames v1.6.1
[31c24e10] Distributions v0.25.102
[587475ba] Flux v0.14.6
[add582a8] MLJ v0.20.1
[094fc8d1] MLJFlux v0.4.0
[caf8df21] MLJGLMInterface v0.3.5
[6099a3de] PythonCall v0.9.14
[274fc56d] PythonPlot v1.0.3
[3a884ed6] UnPack v1.0.2
@vboussange Thanks for putting MLJ through its paces and for the positive feedback.
The issue here is that you are using `TunedModel` to wrap two models with different prediction types. One predicts probability distributions, while the other predicts point values:
julia> prediction_type(LinearRegressor)
:probabilistic
julia> prediction_type(NN)
:deterministic
This should be disallowed, but isn't, and we can see the wrapped model decides, without any good reason, to be `:deterministic`:
julia> prediction_type(multi_model)
:deterministic
So training the `TunedModel` tries to apply `rms` directly to probabilistic (Poisson) distributions and so fails.
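Since the mix is not rejected automatically, a guard along the following lines could catch it before the models are wrapped. This is only a sketch: `assert_same_prediction_type` is a hypothetical helper, not part of MLJ, and the built-in constant regressors merely stand in for the GLM and Flux models so that the snippet runs with MLJ alone.

```julia
using MLJ

# Hypothetical helper (not part of MLJ): error if the models mix prediction types.
function assert_same_prediction_type(models)
    kinds = unique(prediction_type.(models))
    length(kinds) == 1 ||
        throw(ArgumentError("models mix prediction types: $kinds"))
    return first(kinds)
end

# Built-in stand-ins: one probabilistic model and one deterministic model.
probabilistic_model = ConstantRegressor()
deterministic_model = DeterministicConstantRegressor()

# Passes, because every model in the list is deterministic.
kind = assert_same_prediction_type([deterministic_model])

# Would throw an ArgumentError, because the prediction types differ:
# assert_same_prediction_type([probabilistic_model, deterministic_model])
```

Calling such a check on `mymodels` before constructing the `TunedModel` would surface the problem with a readable message rather than a `MethodError` deep inside training.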
Below is a workaround. The changes I've made are marked A and B:
A: forces predictions of the linear model to be deterministic by applying `mean` to the probabilistic predictions
B: makes sure that NN receives `Continuous` data, instead of `Count` data, to suppress the scitype warning you have been getting (not a critical correction)
using Distributions
using LinearAlgebra
using DataFrames
using UnPack
using MLJ
## Generating synthetic data
n_features = 5
datasize = 1500
T = Float64
TI = Int64
# covariance matrix, needs to be symmetric
Σ = rand(T, n_features, n_features) * 2 .- 1
Σ = Σ*Σ'
μ = randn(T, n_features)
X = DataFrame(rand(MvNormal(μ, Σ), datasize)',:auto);
a = randn(T, n_features)
y = TI[]
for i in 1:datasize
    Xi = X[i,:] |> Vector
    push!(y, rand(Poisson(exp.(a' * Xi))))
end
# building a GLM
LinearRegressor = MLJ.@load LinearCountRegressor pkg=GLM
linearregressor = LinearRegressor()
linearregressor = linearregressor |> (y -> mean.(y)) # <--------- A
mach = machine(linearregressor, X, y)
fit!(mach)
# works
# building a neural network regressor
using Flux, MLJFlux
mutable struct MyBuilder <: MLJFlux.Builder
    nhidden # number of neurons in hidden layers
    σ1 # hidden layers activation function
    σ2 # output activation function
end
function MLJFlux.build(nn::MyBuilder, rng, n_in, n_out)
    init = Flux.glorot_uniform(rng)
    @unpack nhidden, σ1, σ2 = nn
    return Chain(Dense(n_in, nhidden, σ1, init=init),
                 BatchNorm(nhidden),
                 Dense(nhidden, nhidden, σ1, init=init),
                 BatchNorm(nhidden),
                 Dense(nhidden, n_out, σ1, init=init),
                 σ2)
end
NN = MLJ.@load NeuralNetworkRegressor pkg=MLJFlux
nnflux = NN(builder = MyBuilder(64, relu, softplus),
            batch_size=100,
            epochs=200,
            loss = Flux.poisson_loss)
nnflux = ContinuousEncoder() |> nnflux # <------------ B
mach = machine(nnflux, X, y)
fit!(mach) # works
# comparing multiple models
mymodels = [nnflux, linearregressor]
multi_model = TunedModel(
    models=mymodels,
    resampling = CV(nfolds=3),
    measure = rms,
    check_measure = false,
)
e = MLJ.evaluate(multi_model, X, y, resampling = CV(nfolds=2),
                 measure=rms,
                 verbosity=6,
                 # acceleration = CPUThreads()
                 )
# PerformanceEvaluation object with these fields:
# model, measure, operation, measurement, per_fold,
# per_observation, fitted_params_per_fold,
# report_per_fold, train_test_rows, resampling, repeats
# Extract:
# ┌────────────────────────┬───────────┬─────────────┬─────────┬──────────────┐
# │ measure │ operation │ measurement │ 1.96*SE │ per_fold │
# ├────────────────────────┼───────────┼─────────────┼─────────┼──────────────┤
# │ RootMeanSquaredError() │ predict │ 30.2 │ 34.5 │ [39.9, 15.0] │
# └────────────────────────┴───────────┴─────────────┴─────────┴──────────────┘
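To see in isolation why applying `mean` (change A) is enough for `rms`, here is a minimal illustration that assumes only Distributions.jl. The two Poisson objects and the observed counts are made-up stand-ins, not values from the run above, and the error is computed by hand rather than through MLJ's `rms`.

```julia
using Distributions

# Made-up stand-ins for the probabilistic predictions of a count regressor.
yhat_prob = [Poisson(2.0), Poisson(5.5)]

# Broadcasting `mean` turns each distribution into a point prediction,
# which is the kind of input a deterministic measure expects.
yhat_point = mean.(yhat_prob)

# Hand-rolled root-mean-squared error against made-up observed counts.
y_obs = [3, 5]
rmse = sqrt(sum(abs2, yhat_point .- y_obs) / length(y_obs))
```

The same subtraction has no meaningful result for `yhat_prob` itself, which is why the unwrapped linear model cannot be scored with a deterministic measure.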
Closing in favour of JuliaAI/MLJTuning.jl#200
Sweet, thanks for the inputs!