Learning curve visualizer for CatBoost AutoML using Pipelines
dbrami opened this issue · 2 comments
Describe the issue
I'm getting `TypeError: ContribEstimator.__init__() got an unexpected keyword argument 'memory'` while trying to plot a learning curve.
I emailed the mailing list first, but it doesn't support code formatting, so the question was unreadable there (much like my markdown here :/ ).
@DistrictDataLabs/team-oz-maintainers
The following code works:

```python
from yellowbrick.classifier import ROCAUC
from yellowbrick.contrib.wrapper import wrap

# pipeline, X_train, y_train, X_test, y_test come from my FLAML workflow
model = wrap(pipeline)

visualizer = ROCAUC(model)
visualizer.fit(X_train, y_train)
visualizer.score(X_test, y_test)
visualizer.show()
```
But the following does not:

```python
from yellowbrick.model_selection import LearningCurve

# Create the learning curve visualizer
#cv = StratifiedKFold(n_splits=12)
sizes = np.linspace(0.3, 1.0, 10)

# Instantiate the classification model and visualizer
#model = MultinomialNB()
visualizer = LearningCurve(
    model, scoring='f1_weighted', train_sizes=sizes)

visualizer.fit(X, y)        # Fit the data to the visualizer
visualizer.show()           # Finalize and render the figure
```

with `visualizer.fit()` (and therefore `show()`) raising the following:

```
---------------------------------------------------------------------------
Empty Traceback (most recent call last)
File ~/miniconda3/envs/ML/lib/python3.10/site-packages/joblib/parallel.py:862, in Parallel.dispatch_one_batch(self, iterator)
861 try:
--> 862 tasks = self._ready_batches.get(block=False)
863 except queue.Empty:
864 # slice the iterator n_jobs * batchsize items at a time. If the
865 # slice returns less than that, then the current batchsize puts
(...)
868 # accordingly to distribute evenly the last items between all
869 # workers.
File ~/miniconda3/envs/ML/lib/python3.10/queue.py:168, in Queue.get(self, block, timeout)
167 if not self._qsize():
--> 168 raise Empty
169 elif timeout is None:
Empty:
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
Cell In [49], line 1
----> 1 visualizer.fit(X, y) # Fit the data to the visualizer
2 visualizer.show()
File ~/miniconda3/envs/ML/lib/python3.10/site-packages/yellowbrick/model_selection/learning_curve.py:249, in LearningCurve.fit(self, X, y)
233 sklc_kwargs = {
234 key: self.get_params()[key]
235 for key in (
(...)
245 )
246 }
248 # compute the learning curve and store the scores on the estimator
--> 249 curve = sk_learning_curve(self.estimator, X, y, **sklc_kwargs)
250 self.train_sizes_, self.train_scores_, self.test_scores_ = curve
252 # compute the mean and standard deviation of the training data
File ~/miniconda3/envs/ML/lib/python3.10/site-packages/sklearn/model_selection/_validation.py:1558, in learning_curve(estimator, X, y, groups, train_sizes, cv, scoring, exploit_incremental_learning, n_jobs, pre_dispatch, verbose, shuffle, random_state, error_score, return_times, fit_params)
1555 for n_train_samples in train_sizes_abs:
1556 train_test_proportions.append((train[:n_train_samples], test))
-> 1558 results = parallel(
1559 delayed(_fit_and_score)(
1560 clone(estimator),
1561 X,
1562 y,
1563 scorer,
1564 train,
1565 test,
1566 verbose,
1567 parameters=None,
1568 fit_params=fit_params,
1569 return_train_score=True,
1570 error_score=error_score,
1571 return_times=return_times,
1572 )
1573 for train, test in train_test_proportions
1574 )
1575 results = _aggregate_score_dicts(results)
1576 train_scores = results["train_scores"].reshape(-1, n_unique_ticks).T
File ~/miniconda3/envs/ML/lib/python3.10/site-packages/joblib/parallel.py:1085, in Parallel.__call__(self, iterable)
1076 try:
1077 # Only set self._iterating to True if at least a batch
1078 # was dispatched. In particular this covers the edge
(...)
1082 # was very quick and its callback already dispatched all the
1083 # remaining jobs.
1084 self._iterating = False
-> 1085 if self.dispatch_one_batch(iterator):
1086 self._iterating = self._original_iterator is not None
1088 while self.dispatch_one_batch(iterator):
File ~/miniconda3/envs/ML/lib/python3.10/site-packages/joblib/parallel.py:873, in Parallel.dispatch_one_batch(self, iterator)
870 n_jobs = self._cached_effective_n_jobs
871 big_batch_size = batch_size * n_jobs
--> 873 islice = list(itertools.islice(iterator, big_batch_size))
874 if len(islice) == 0:
875 return False
File ~/miniconda3/envs/ML/lib/python3.10/site-packages/sklearn/model_selection/_validation.py:1560, in <genexpr>(.0)
1555 for n_train_samples in train_sizes_abs:
1556 train_test_proportions.append((train[:n_train_samples], test))
1558 results = parallel(
1559 delayed(_fit_and_score)(
-> 1560 clone(estimator),
1561 X,
1562 y,
1563 scorer,
1564 train,
1565 test,
1566 verbose,
1567 parameters=None,
1568 fit_params=fit_params,
1569 return_train_score=True,
1570 error_score=error_score,
1571 return_times=return_times,
1572 )
1573 for train, test in train_test_proportions
1574 )
1575 results = _aggregate_score_dicts(results)
1576 train_scores = results["train_scores"].reshape(-1, n_unique_ticks).T
File ~/miniconda3/envs/ML/lib/python3.10/site-packages/sklearn/base.py:88, in clone(estimator, safe)
86 for name, param in new_object_params.items():
87 new_object_params[name] = clone(param, safe=False)
---> 88 new_object = klass(**new_object_params)
89 params_set = new_object.get_params(deep=False)
91 # quick sanity check of the parameters of the clone
TypeError: ContribEstimator.__init__() got an unexpected keyword argument 'memory'
```
My model is a scikit-learn pipeline generated by FLAML. I'm doing multi-class classification and the best estimator it found is CatBoost.
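Digging into the traceback, the failure seems to come from `sklearn.base.clone()`, which rebuilds an estimator as `klass(**estimator.get_params(deep=False))`. The `wrap()` wrapper appears to delegate `get_params()` to the underlying `Pipeline`, so the pipeline's parameters (`memory`, `steps`, ...) get passed to `ContribEstimator.__init__()`, which does not accept them. A minimal sketch of that reading, assuming a plain sklearn `Pipeline` behaves the same way as my FLAML one:

```python
# Hedged sketch: show why sklearn's clone() chokes on a wrapped Pipeline.
# Assumes the yellowbrick wrapper delegates get_params() to the wrapped object.
from sklearn.base import clone
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline

from yellowbrick.contrib.wrapper import wrap

wrapped = wrap(Pipeline([("clf", GaussianNB())]))

# get_params() is answered by the Pipeline, so it includes 'memory' and 'steps'
print(wrapped.get_params(deep=False))

# clone() then calls ContribEstimator(**those_params) and raises the same TypeError
clone(wrapped)
```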
Hello @dbrami and thank you for reaching out!
To help you, we ask that you provide:

- A reproducible example of the visualizer error, with Python code that we can run locally to reproduce it. For instance, you might use one of the yellowbrick datasets so that we have the same values for X and y as you. Please be sure to include all the required import statements (e.g. `import numpy as np`) and double-check the commented-out lines of code -- some of those lines seem like they should be uncommented. (A sketch of what such an example might look like is shown after this list.)
- Information about your operating system [e.g. Windows, macOS], your Python version [e.g. 2.7, 3.6, miniconda], and your Yellowbrick version [e.g. 0.7].
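For instance, a sketch along those lines (assuming the yellowbrick occupancy dataset and a plain sklearn Pipeline stand in for your FLAML/CatBoost pipeline -- we have not verified this reproduces your exact error) might look like:

```python
# A sketch only, not a verified reproduction: a plain sklearn Pipeline wrapped
# with wrap() stands in for the FLAML/CatBoost pipeline, and the yellowbrick
# occupancy dataset stands in for X and y.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

from yellowbrick.contrib.wrapper import wrap
from yellowbrick.datasets import load_occupancy
from yellowbrick.model_selection import LearningCurve

X, y = load_occupancy()

pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", GaussianNB()),
])
model = wrap(pipeline)  # same wrapper used in the working ROCAUC example

sizes = np.linspace(0.3, 1.0, 10)
visualizer = LearningCurve(model, scoring="f1_weighted", train_sizes=sizes)
visualizer.fit(X, y)   # expected to hit the same clone() TypeError
visualizer.show()
```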
Note - as for the markdown formatting - it looks like there were only 2 backticks (instead of the needed 3 backticks) in a few places, which were the cause of your rendering error. Hope this helps!
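For reference, a fenced code block on GitHub opens and closes with three backticks (the outer fence below uses four backticks only so the example itself renders):

````markdown
```python
visualizer.fit(X, y)
visualizer.show()
```
````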
Hi there - we're going to close this issue out as it has gone stale. Feel free to reopen if there are updates!