catboost/benchmarks

'data' is numpy array of floating point numerical type, it means no categorical features, but 'cat_features' parameter specifies nonzero number of categorical features

iyliamjd opened this issue · 1 comments

Hi everyone, it's me again. I ran this code and got the error below:
pool = Pool(data, label, cat_features=cat_cols)

The error:
_catboost.CatBoostError: 'data' is numpy array of floating point numerical type, it means no categorical features, but 'cat_features' parameter specifies nonzero number of categorical features

Does anyone know what is happening? I didn't change any of the code but still got the error, maybe because of my train and test files. But I don't know what structure the train and test files should have.

If I'm not mistaken, you should be filing issues here:
https://github.com/catboost/catboost/issues
I would recommend creating new ones there, because we check that place all the time.

About this issue: the problem you're facing is that the columns you list in cat_features contain floating point numbers, which is not allowed. Here's an explanation of why it is forbidden:
https://catboost.ai/docs/concepts/faq.html#why-float-and-nan-values-are-forbidden-for-cat-features

We are planning to allow it for Python, though; it's one of the open problems for new contributors:
https://github.com/catboost/catboost/blob/master/open_problems/open_problems.md

So what you need to do is convert those columns to integers or to strings.
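
A minimal sketch of that conversion, in case it helps. It assumes your categorical columns hold whole-number codes stored as floats (e.g. 2.0) with no NaN values, and that cat_cols is a list of column indices. The helper name prepare_features and the sample array are made up for illustration; they are not part of CatBoost or the benchmark code.

```python
import numpy as np
import pandas as pd


def prepare_features(data, cat_cols):
    """Return a DataFrame whose cat_cols columns hold strings.

    Hypothetical helper, not part of CatBoost. Assumes the categorical
    columns contain whole-number codes stored as floats and no NaN.
    """
    df = pd.DataFrame(data)
    for col in cat_cols:
        # 2.0 -> 2 -> "2"; going through int avoids labels like "2.0"
        df[col] = df[col].astype(np.int64).astype(str)
    return df


# Made-up sample: column 0 is numeric, columns 1 and 2 are categorical codes
data = np.array([[1.5, 0.0, 3.0],
                 [2.5, 1.0, 4.0]])
cat_cols = [1, 2]

features = prepare_features(data, cat_cols)
print(features.dtypes)

# With catboost installed, the Pool can then be built as in the question:
# from catboost import Pool
# pool = Pool(features, label, cat_features=cat_cols)
```

The numeric columns stay as floats and only the columns in cat_cols are changed. If your categorical columns contain NaN, the integer cast will fail, so you would need to replace the missing values with a placeholder category first.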