Increasing max_depth causes memory leak
john-waczak opened this issue · 4 comments
I have been able to train an EvoTreeRegressor with the default parameters successfully. When I try to increase the max_depth
parameter beyond 10 suddenly my memory usage spikes and Julia dies.
Here's a snippet from the REPL:
julia> evo = EvoTreeRegressor(max_depth=15, rng=42)
EvoTreeRegressor(
loss = EvoTrees.Linear(),
nrounds = 10,
λ = 0.0,
γ = 0.0,
η = 0.1,
max_depth = 15,
min_weight = 1.0,
rowsample = 1.0,
colsample = 1.0,
nbins = 64,
α = 0.5,
metric = :mse,
rng = MersenneTwister(42),
device = "cpu")
julia> mach = machine(evo, Xtrain, CDOM_train)
Machine{EvoTreeRegressor{Float64,…},…} trained 0 times; caches data
args:
1: Source @710 ⏎ `Table{AbstractVector{Continuous}}`
2: Source @134 ⏎ `AbstractVector{Continuous}`
julia> fit!(mach, verbosity=2)
[ Info: Training Machine{EvoTreeRegressor{Float64,…},…}.
Process julia killed
@john-waczak Thanks for reporting! Good to know about this.
A complete minimum working example might speed up resolution, ideally without the MLJ wrapper.
Okay, here's a MWE. I get a crash when running the following on an Ubuntu 21.04 machine with 16 GB RAM and a 4-core i7-7700HQ @ 2.80 GHz:
using EvoTrees
# Simple Regression Demo
n=2000;
X = 2*(rand(n,2) .- 0.5);
y = X[:,1].^5 + X[:,2].^4 - X[:,1].^4 - X[:,2].^3
size(X)
size(y)
# train for first time with default settings
params1 = EvoTreeRegressor()
model = fit_evotree(params1, X, y)
# train with increased max_depth
# this causes julia to crash
params2 = EvoTreeRegressor(max_depth=20)
model = fit_evotree(params2, X, y)
Here's the output of Pkg.status:
(evoTree_bug) pkg> status
Status `~/gitRepos/evoTree_bug/Project.toml`
[f6006082] EvoTrees v0.8.4
Thanks for reporting!
From what I can tell, this isn't an issue per se or a memory leak, but rather a consequence of design choices geared toward fitting speed, which result in significant memory pre-allocations. Specifically, histograms are pre-allocated for each tree node, and at a depth of 20 there are over 500K such nodes. What looks like a memory leak is actually a long pre-allocation process.
However, in a gradient boosted model each tree acts as a weak learner, and as such I'm not aware of situations where a depth much greater than 10 is of any value. Typically, a depth in the 3-8 range performs best. Let me know if you are in a situation where greater depth is needed. I'm afraid, though, that a significantly different design, potentially a less efficient one, would be needed to support such scenarios.
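To make the scale concrete, here is a back-of-envelope sketch of how node count and histogram memory grow with depth. The per-node layout assumed here (one histogram of `nbins` bins per feature, three Float64 accumulators per bin) is an illustration, not EvoTrees' actual internal layout:

```julia
# Internal (split) nodes of a full binary tree of the given depth.
split_nodes(depth) = 2^(depth - 1) - 1

# Rough histogram pre-allocation size in bytes, under the assumed layout:
# nbins bins per feature per node, 3 Float64 accumulators per bin.
function hist_bytes(depth; nbins=64, nfeatures=2, accumulators=3)
    split_nodes(depth) * nbins * nfeatures * accumulators * sizeof(Float64)
end

split_nodes(10)          # 511 nodes
split_nodes(20)          # 524_287 nodes -- the "over 500K" mentioned above
hist_bytes(10) / 2^20    # on the order of a few MiB at depth 10
hist_bytes(20) / 2^30    # on the order of GiB at depth 20, before any data
```

The point is the exponential factor: every extra level of depth roughly doubles the number of nodes, so the jump from depth 10 to depth 20 multiplies the pre-allocation by about a thousand.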
@jeremiedb Thanks for your reply! That makes a lot of sense. I think I should be more than fine with a smaller max_depth. I was just trying some hyper-parameter variation to see what would happen, and noticed the script kept dying once max_depth got past 10 or so.
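For anyone landing here while doing a similar sweep, a sketch of varying max_depth over the safer 3-8 range suggested above, reusing the synthetic data from the MWE (assumes the v0.8-era API shown in this thread, i.e. `fit_evotree(params, X, y)` and `predict(model, X)`):

```julia
using EvoTrees, Statistics

# Same synthetic regression data as the MWE above.
n = 2000
X = 2 .* (rand(n, 2) .- 0.5)
y = X[:, 1].^5 .+ X[:, 2].^4 .- X[:, 1].^4 .- X[:, 2].^3

# Sweep max_depth over the range where boosted trees typically work well.
for depth in 3:8
    params = EvoTreeRegressor(max_depth=depth)
    model = fit_evotree(params, X, y)
    mse = mean((predict(model, X) .- y).^2)
    println("max_depth = $depth  train MSE = $mse")
end
```

Each of these depths keeps the histogram pre-allocation small, so memory usage stays flat across the sweep.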