Memory leakage upon repeated training
CasBex opened this issue · 1 comment
Hi, I've been building some random forest regressors lately and I've noticed high memory usage during hyperparameter tuning. It turns out there is some memory leakage in the package: for some reason Julia does not reclaim the trees once they become unreachable.
Below is an MWE: after each call to run_forests the forest becomes unreachable and its memory should be reclaimed, but that doesn't happen and memory usage keeps increasing. When running the second loop, however, memory usage stays constant.
using DecisionTree

function run_forests(features, labels)
    forest = build_forest(labels, features)
    labels .+= apply_forest(forest, features)
    labels ./= 2
end

function run_something_else(features, labels)
    C = repeat(features, inner=(2,2))
    labels ./= vec(sum(C, dims=2))[1:length(labels)]
end

const features = rand(10_000, 10)
const labels = sum(features, dims=2) |> vec

# notice memory consumption increases every couple of iterations
for i = 1:1_000
    run_forests(features, labels)
    @info "Iteration $i current memory used" Sys.maxrss()
end

# notice memory consumption does not increase every couple of iterations
for i = 1:1_000
    run_something_else(features, labels)
    @info "Iteration $i current memory used" Sys.maxrss()
end
Any idea what might cause this?
Yes, I can confirm this on Julia 1.10, aarch64 Apple Darwin. During execution, the memory usage reported by htop slowly increases each time I call include("tmp.jl").
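For what it's worth, one way to tell whether the memory is merely uncollected or genuinely leaked is to force a full garbage collection between iterations and watch whether the resident set size plateaus. A minimal sketch, reusing run_forests, features, and labels from the MWE above; the GC.gc(true) call and the iteration count are my additions, not part of the original report:

for i = 1:100
    run_forests(features, labels)
    GC.gc(true)  # force a full collection; unreachable trees should be freed here
    @info "Iteration $i maxrss after full GC" Sys.maxrss()
end

If maxrss keeps climbing even with forced full collections, the retained memory is not being freed by the GC at all, which points at a real leak rather than lazy collection.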
Any idea what might cause this?
Could be a (multi-threading) leak somewhere in Julia: https://github.com/search?q=repo%3AJuliaLang%2Fjulia+memory&type=issues&p=2
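If multi-threading is the suspect, a quick check (my suggestion, not something from this thread) would be to rerun the MWE once in a session started with julia --threads=1 and once with the default thread count, and compare how maxrss grows. A sketch of that comparison, again reusing run_forests, features, and labels from the MWE; the iteration count is arbitrary:

# Run once in a session started with `julia --threads=1`, then again with the
# default thread count, and compare the reported growth.
@info "Threads in use" Threads.nthreads()
before = Sys.maxrss()
for i = 1:200
    run_forests(features, labels)
end
@info "maxrss growth over 200 iterations" growth_bytes = Sys.maxrss() - before

If the growth disappears (or shrinks drastically) with a single thread, that would support a threading-related leak; if it is unchanged, the problem more likely lies in how the trees themselves are allocated or retained.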