Caret object: Inconsistent grid creation with documentation

Question

Rek27 opened this issue 2 months ago · 3 comments

Problem: According to the documentation, the tree depth hyperparameter should optimally be between 4 and 10, and on CPU it can be any integer up to 16. The problem shows up in the function in catboost.caret that generates the tuning grid. For random search, depth is drawn with sample.int(len, len, replace = TRUE), so its range depends on tuneLength. If someone runs a random search with tuneLength > 16, depth values above 16 can be drawn, and they will get NaN as the metric value (in my case Accuracy).

catboost.caret

...
$grid
function (x, y, len = 5, search = "grid")
{
    if (search == "grid") {
        grid <- expand.grid(depth = c(2, 4, 6), learning_rate = exp(-(0:len)),
            iterations = 100, l2_leaf_reg = 1e-06, rsm = 0.9,
            border_count = 255)
    }
    else {
        grid <- data.frame(depth = sample.int(len, len, replace = TRUE),
            learning_rate = runif(len, min = 1e-06, max = 1),
            iterations = rep(100, len), l2_leaf_reg = sample(c(0.1,
                0.001, 1e-06), len, replace = TRUE), rsm = sample(c(1,
                0.9, 0.8, 0.7), len, replace = TRUE), border_count = sample(c(255),
                len, replace = TRUE))
    }
    return(grid)
}
...

Shouldn't the sampled depth be capped at 16 at most, rather than depending on tuneLength?
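
To make the suggestion concrete, here is a minimal sketch of what a capped random-search branch could look like. This is only an illustration, not the fix that was actually committed upstream: the function name capped_random_grid and the max_depth argument are hypothetical, and every column other than depth is copied unchanged from the listing above.

```r
# Hypothetical helper (not part of catboost): random-search grid whose depth
# is drawn from 1..max_depth instead of 1..len.
capped_random_grid <- function(len, max_depth = 16L) {
  data.frame(
    # The only change: the depth range no longer depends on tuneLength (len).
    depth = sample.int(max_depth, len, replace = TRUE),
    # The remaining columns are taken as-is from the original listing.
    learning_rate = runif(len, min = 1e-06, max = 1),
    iterations = rep(100, len),
    l2_leaf_reg = sample(c(0.1, 0.001, 1e-06), len, replace = TRUE),
    rsm = sample(c(1, 0.9, 0.8, 0.7), len, replace = TRUE),
    border_count = sample(c(255), len, replace = TRUE)
  )
}

# Even with len > 16, every sampled depth stays within 1..16.
set.seed(42)
grid <- capped_random_grid(50)
range(grid$depth)
```

Until a fixed version is installed, a grid built this way could presumably be passed to caret::train through its tuneGrid argument, which bypasses the built-in grid function.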

catboost version: 1.2.2
Operating System: Windows 10 x64
CPU: AMD Ryzen 5 PRO 5650U
GPU: not using

Answer 1 · 2024-03-14T15:14:42.000Z

Many thanks for paying attention!
The fix is on its way to github.com/catboost ...

Answer 2 · 2024-03-14T17:33:19.000Z

Fixed in 2b3a9ea

Answer 3 · 2024-03-15T04:35:31.000Z

My apologies for mixing up issues 2609 <-> 2606 :-\