Caret object: Inconsistent grid creation with documentation
Rek27 opened this issue · 3 comments
Problem: According to the documentation, the optimal tree depth is 4-10, and on CPU the depth hyperparameter can be any integer up to 16. The problem shows up in the function in catboost.caret that generates the tuning grid. In random search, depth is sampled from 1 to tuneLength, so with tuneLength > 16 some sampled depths can exceed 16, and the metric (Accuracy in my case) comes back as NaN.
catboost.caret
...
$grid
function (x, y, len = 5, search = "grid")
{
    if (search == "grid") {
        grid <- expand.grid(depth = c(2, 4, 6), learning_rate = exp(-(0:len)),
            iterations = 100, l2_leaf_reg = 1e-06, rsm = 0.9,
            border_count = 255)
    }
    else {
        grid <- data.frame(depth = sample.int(len, len, replace = TRUE),
            learning_rate = runif(len, min = 1e-06, max = 1),
            iterations = rep(100, len), l2_leaf_reg = sample(c(0.1,
                0.001, 1e-06), len, replace = TRUE), rsm = sample(c(1,
                0.9, 0.8, 0.7), len, replace = TRUE), border_count = sample(c(255),
                len, replace = TRUE))
    }
    return(grid)
}
...
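To make the dependence on tuneLength concrete, here is a small check that repeats only the depth draw from the random-search branch above. The seed and the value 30 are arbitrary choices for illustration; len stands in for tuneLength.

```r
# Repeats only the depth draw from the random-search branch of $grid.
set.seed(1)   # arbitrary seed, only so the draw is repeatable
len <- 30     # stands in for a tuneLength above 16 (illustrative value)

# Same call as in the printout: values are drawn from 1..len,
# so the upper bound grows with tuneLength instead of stopping at 16.
depth <- sample.int(len, len, replace = TRUE)

range(depth)      # smallest and largest sampled depth
sum(depth > 16)   # number of candidates above the documented CPU limit
```

Any candidate counted by the last line would be passed to CatBoost with a depth the documentation does not allow on CPU.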
Shouldn't depth be capped at 16 at most, rather than depend on tuneLength?
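A minimal sketch of what such a cap could look like, assuming the only change needed is the upper bound of the depth draw. The names capped_grid and max_depth are hypothetical, and this is not necessarily how the maintainers would fix it; every other column is copied unchanged from the printout above.

```r
# Hypothetical variant of the $grid function; not the upstream implementation.
max_depth <- 16L  # CPU limit for depth, as quoted from the documentation

capped_grid <- function(x, y, len = 5, search = "grid") {
  if (search == "grid") {
    # Grid branch is unchanged: depth is already fixed to 2, 4, 6.
    grid <- expand.grid(depth = c(2, 4, 6), learning_rate = exp(-(0:len)),
                        iterations = 100, l2_leaf_reg = 1e-06, rsm = 0.9,
                        border_count = 255)
  } else {
    grid <- data.frame(
      # Only change: draw from 1..max_depth instead of 1..len.
      depth = sample.int(max_depth, len, replace = TRUE),
      learning_rate = runif(len, min = 1e-06, max = 1),
      iterations = rep(100, len),
      l2_leaf_reg = sample(c(0.1, 0.001, 1e-06), len, replace = TRUE),
      rsm = sample(c(1, 0.9, 0.8, 0.7), len, replace = TRUE),
      border_count = sample(c(255), len, replace = TRUE)
    )
  }
  grid
}

# Usage: even with a large len, no sampled depth goes past the cap.
g <- capped_grid(NULL, NULL, len = 50, search = "random")
stopifnot(max(g$depth) <= max_depth)
```

Assuming catboost.caret is an ordinary list, as the printout suggests, a workaround until a fix lands could be to copy it, replace its grid element with a function like this one, and pass the copy as method to caret's train.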
catboost version: 1.2.2
Operating System: Windows 10 x64
CPU: AMD Ryzen 5 PRO 5650U
GPU: not using
Many thanks for paying attention!
The fix is on its way to github.com/catboost ...
Fixed in 2b3a9ea
My apologies for mixing up issues 2609 <-> 2606 :-\