'Series' object has no attribute 'Decision'
Markus-Go opened this issue · 2 comments
Markus-Go commented
When running the golf example:
import pandas as pd
from chefboost import Chefboost as chef

df = pd.read_csv("data/golf.txt")
config = {'algorithm': 'C4.5'}
model = chef.fit(df, config = config, target_label = 'Decision')
I get the following error:
[INFO]: 10 CPU cores will be allocated in parallel running
C4.5 tree is going to be built...
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_28440\547795482.py in ?()
10 import pandas as pd
11
12 df = pd.read_csv("data/golf.txt")
13 config = {'algorithm': 'C4.5'}
---> 14 model = chef.fit(df, config = config, target_label = 'Decision')
C:\Lib\site-packages\chefboost\Chefboost.py in ?(df, config, target_label, validation_df)
209 if enableParallelism == True:
210 json_file = "outputs/rules/rules.json"
211 functions.createFile(json_file, "[\n")
212
--> 213 trees = Training.buildDecisionTree(df, root = root, file = file, config = config
214 , dataset_features = dataset_features
215 , parent_level = 0, leaf_id = 0, parents = 'root', validation_df = validation_df, main_process_id = process_id)
216
C:\Lib\site-packages\chefboost\\chefboost\training\Training.py in ?(df, root, file, config, dataset_features, parent_level, leaf_id, parents, tree_id, validation_df, main_process_id)
432 pivot = pd.DataFrame(subdataset.Decision.value_counts()).reset_index()
433 pivot = pivot.rename(columns = {"Decision": "Instances","index": "Decision"})
434 pivot = pivot.sort_values(by = ["Instances"], ascending = False).reset_index()
435
--> 436 else_decision = "return '%s'" % (pivot.iloc[0].Decision)
437
438 if enableParallelism != True:
439 functions.storeRule(file,(functions.formatRule(root), "else:"))
C:\Lib\site-packages\chefboost\Lib\site-packages\pandas\core\generic.py in ?(self, name)
5985 and name not in self._accessors
5986 and self._info_axis._can_hold_identifiers_and_holds_name(name)
5987 ):
5988 return self[name]
-> 5989 return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'Decision'
I think someone ran into the same issue on Stack Overflow.
nj2nu commented
I have the same issue. In addition, fit() fails even on the "golf.txt" dataset used in the tutorial.
JannisBush commented
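I believe this was caused by a change in pandas that affects what reset_index returns here. Since pandas 2.0, value_counts() names its result "count" and keeps the original column name ("Decision") on the index, so after reset_index() the columns are "Decision" and "count" rather than "index" and "Decision". The rename in Training.py (line 433 in the traceback) then turns "Decision" into "Instances" and finds no "index" column to rename, so the pivot has no "Decision" column and pivot.iloc[0].Decision raises the AttributeError.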
I created a hotfix that works with pandas 2.1.3 in #36
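For anyone who wants to see the idea in isolation, here is a minimal sketch of one way to build the same pivot without relying on the column names that reset_index happens to produce. This is not necessarily what #36 does. The helper name build_pivot and the toy data frame are made up for illustration; only the "Decision" and "Instances" column names come from the traceback above.

```python
import pandas as pd


def build_pivot(subdataset: pd.DataFrame, target: str = "Decision") -> pd.DataFrame:
    """Hypothetical helper: count the classes in `target`, most frequent first.

    The column names are set explicitly, so the result does not depend on
    how a given pandas version names the output of value_counts().
    """
    counts = subdataset[target].value_counts()
    # Name the index and the values ourselves instead of renaming afterwards.
    pivot = counts.rename_axis("Decision").reset_index(name="Instances")
    pivot = pivot.sort_values(by="Instances", ascending=False).reset_index(drop=True)
    return pivot


# Toy data (assumption, not the real golf.txt): 3 x "Yes", 2 x "No".
df = pd.DataFrame(
    {
        "Outlook": ["Sunny", "Overcast", "Rain", "Sunny", "Rain"],
        "Decision": ["No", "Yes", "Yes", "No", "Yes"],
    }
)

pivot = build_pivot(df)
else_decision = "return '%s'" % (pivot.iloc[0].Decision)
print(else_decision)  # return 'Yes'
```

Setting the names through rename_axis and reset_index(name=...) should give the same two columns on both pandas 1.x and 2.x, which is why the attribute access on the first row works again.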