serengil/chefboost

'Series' object has no attribute 'Decision'

Markus-Go opened this issue · 2 comments

When running the golf example:

import pandas as pd
from chefboost import Chefboost as chef
df = pd.read_csv("data/golf.txt")
config = {'algorithm': 'C4.5'}
model = chef.fit(df, config = config, target_label = 'Decision')

I get the following error:

[INFO]:  10 CPU cores will be allocated in parallel running
C4.5  tree is going to be built...

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_28440\547795482.py in ?()
     10 import pandas as pd
     11 
     12 df = pd.read_csv("data/golf.txt")
     13 config = {'algorithm': 'C4.5'}
---> 14 model = chef.fit(df, config = config, target_label = 'Decision')

C:\Lib\site-packages\chefboost\Chefboost.py in ?(df, config, target_label, validation_df)
    209                 if enableParallelism == True:
    210                         json_file = "outputs/rules/rules.json"
    211                         functions.createFile(json_file, "[\n")
    212 
--> 213 		trees = Training.buildDecisionTree(df, root = root, file = file, config = config
    214                                 , dataset_features = dataset_features
    215 				, parent_level = 0, leaf_id = 0, parents = 'root', validation_df = validation_df, main_process_id = process_id)
    216 

C:\Lib\site-packages\chefboost\\chefboost\training\Training.py in ?(df, root, file, config, dataset_features, parent_level, leaf_id, parents, tree_id, validation_df, main_process_id)
    432                 pivot = pd.DataFrame(subdataset.Decision.value_counts()).reset_index()
    433                 pivot = pivot.rename(columns = {"Decision": "Instances","index": "Decision"})
    434                 pivot = pivot.sort_values(by = ["Instances"], ascending = False).reset_index()
    435 
--> 436                 else_decision = "return '%s'" % (pivot.iloc[0].Decision)
    437 
    438                 if enableParallelism != True:
    439                         functions.storeRule(file,(functions.formatRule(root), "else:"))

C:\Lib\site-packages\chefboost\Lib\site-packages\pandas\core\generic.py in ?(self, name)
   5985             and name not in self._accessors
   5986             and self._info_axis._can_hold_identifiers_and_holds_name(name)
   5987         ):
   5988             return self[name]
-> 5989         return object.__getattribute__(self, name)

AttributeError: 'Series' object has no attribute 'Decision'

I think someone ran into the same issue on Stack Overflow.

I have the same issue. The fit() function fails even on the "golf.txt" dataset used in the tutorial.

I believe this was caused by a change in pandas that affects the column names you get from value_counts() followed by reset_index().
I created a hotfix that works with pandas 2.1.3 in #36.
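
For anyone hitting this before a fix lands: as far as I can tell, since pandas 2.0 `Series.value_counts()` names its result `count` and puts the original column name on the index, so the `rename` on line 433 of the traceback no longer produces a `Decision` column. Below is a minimal sketch of building the same pivot with explicit column names, so it should not depend on the pandas version. `decision_counts` is a hypothetical helper name, not part of chefboost, the sample data is made up for illustration, and this is not necessarily what the hotfix in #36 does.

```python
import pandas as pd


def decision_counts(df: pd.DataFrame, target: str = "Decision") -> pd.DataFrame:
    """Count target values; return columns [target, "Instances"], most frequent first.

    Hypothetical helper for illustration only (not a chefboost function).
    """
    counts = df[target].value_counts()
    # Name the index and the values column explicitly instead of relying on
    # the defaults, which differ between pandas 1.x and 2.x.
    pivot = counts.rename_axis(target).reset_index(name="Instances")
    return pivot.sort_values(by="Instances", ascending=False).reset_index(drop=True)


# Made-up sample standing in for a subset of the golf data.
sample = pd.DataFrame({"Decision": ["Yes", "No", "Yes", "Yes", "No"]})
pivot = decision_counts(sample)

# The attribute access that failed in the traceback now resolves.
else_decision = "return '%s'" % (pivot.iloc[0].Decision)
```

Naming both columns explicitly avoids the `rename` step entirely, which is the part that breaks when the default names change.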