could not determine class_counts_ from previously fitted classifier
ggous opened this issue · 5 comments
Describe the bug
When running the visualizer, I receive this warning:
yellowbrick/classifier/base.py:232: YellowbrickWarning: could not determine class_counts_ from previously fitted classifier
which results in the classifier being fit again.
To Reproduce
from yellowbrick.classifier import ClassificationReport, ConfusionMatrix
from sklearn import datasets
from sklearn.model_selection import train_test_split
import xgboost as xgb
import matplotlib.pyplot as plt

X, y = datasets.load_iris(return_X_y=True)
x_train, x_val, y_train, y_val = train_test_split(
    X,
    y,
    stratify=y,
    test_size=0.2)

model = xgb.XGBClassifier(objective='multi:softprob',
                          num_class=3,
                          use_label_encoder=False,
                          enable_categorical=False,
                          n_estimators=10)
model.fit(x_train,
          y_train,
          early_stopping_rounds=10,
          eval_set=[(x_train, y_train), (x_val, y_val)])

fig, ax = plt.subplots()
visualizer = ClassificationReport(model,
                                  is_fitted=True)
visualizer.score(x_val, y_val)
visualizer.show()
Dataset
from sklearn import datasets
Expected behavior
Since we declare is_fitted=True, it should not fit again.
Traceback
/home/ggous/miniconda3/envs/sklearn/lib/python3.9/site-packages/yellowbrick/classifier/base.py:232: YellowbrickWarning: could not determine class_counts_ from previously fitted classifier
warnings.warn(
Desktop (please complete the following information):
- OS: Linux Mint
- Python Version: 3.9.12 (miniconda)
- Yellowbrick Version: 1.5
Additional context
I think fitting again can sometimes take too much time.
@ggous thank you for using Yellowbrick and for reporting the issue that you found to us! I hope that you're finding Yellowbrick useful.
In order to use the ClassificationReport, the model needs the class_counts_ learned attribute. This appears in most scikit-learn classifiers. I believe the xgb package adds learned attributes if it understands it's in a scikit-learn context. I am not really sure why the attribute is missing after you fit the model; I don't use the xgb package very often.
Could you try directly adding class_counts_ to the model before creating the visualizer to see if that helps things?
@lwgray do you have experience using xgb? If so, perhaps you could comment on this issue?
Hi bbengfort.
I can't find a class_counts_ attribute in xgb.
I am not sure how to add it before creating the visualizer.
Also, is there a way to only save the visualization without showing it?
I am using visualizer.show(outpath='./file.png'), but I only want to save the resulting plot, not display it.
This code addresses both concerns, but @bbengfort maybe you have a better trick to stop showing the plot. I feel like we answered this question before.
from yellowbrick.classifier import ClassificationReport, ConfusionMatrix
from sklearn import datasets
from sklearn.model_selection import train_test_split
import xgboost as xgb
import matplotlib.pyplot as plt

X, y = datasets.load_iris(return_X_y=True)
x_train, x_val, y_train, y_val = train_test_split(X, y, stratify=y, test_size=0.2)

model = xgb.XGBClassifier(objective='multi:softprob',
                          num_class=3,
                          use_label_encoder=False,
                          enable_categorical=False,
                          n_estimators=10)
model.fit(x_train,
          y_train,
          early_stopping_rounds=10,
          eval_set=[(x_train, y_train), (x_val, y_val)])

# Specify class counts on the model
model.class_counts_ = 3

fig, ax = plt.subplots()
visualizer = ClassificationReport(model, is_fitted=True)
visualizer.score(x_val, y_val)

# Clear Figure works but @bbengfort might have a better approach
visualizer.show('test.png', clear_figure=True);
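One caveat on `model.class_counts_ = 3`: it stores the number of classes, whereas the attribute name suggests one sample count per class. As far as I can tell, Yellowbrick derives this attribute from the training labels when it fits the model itself, so the following is only a minimal sketch, under that assumption, of building per-class counts by hand. `per_class_counts` and `FittedModelStandIn` are hypothetical names used so the snippet runs without xgboost; in the script above the counts would be attached to `model` instead.

```python
from collections import Counter


def per_class_counts(labels):
    """Return the number of samples for each class, ordered by class label."""
    counts = Counter(labels)
    return [counts[label] for label in sorted(counts)]


class FittedModelStandIn:
    """Hypothetical placeholder for an already fitted classifier."""


# Toy labels standing in for y_train (three classes, as in iris).
y_train = [0, 1, 2, 2, 1, 0, 0, 2, 1, 0]

model = FittedModelStandIn()
# Assumption: a per-class count sequence is what the attribute is meant to hold.
model.class_counts_ = per_class_counts(y_train)
print(model.class_counts_)
```

With NumPy labels, `np.unique(y_train, return_counts=True)[1]` returns the same counts as an array.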
@lwgray thank you for adding those suggestions!
@ggous if you're in a Jupyter notebook, this StackOverflow post has some suggestions for preventing the image from being rendered. Otherwise clear_figure, as @lwgray mentioned, is probably your best bet.
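Outside a notebook, another option that may be worth trying is a non-interactive matplotlib backend, so figures are only ever written to disk. The sketch below uses plain matplotlib with a placeholder plot rather than a Yellowbrick visualizer; `save_without_showing` is a hypothetical helper, and note that Yellowbrick's show() also finalizes the plot (titles, labels), which this does not do.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend: figures render to files only
import matplotlib.pyplot as plt


def save_without_showing(fig, outpath):
    """Write the figure to outpath, then close it so it is never displayed."""
    fig.savefig(outpath)
    plt.close(fig)


fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])  # placeholder content instead of a report
save_without_showing(fig, "file.png")
```

Closing the figure after saving also releases its memory, which helps when many reports are generated in a loop.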