SKL2RiverClassifier partial_fit method
tcallies opened this issue · 1 comments
Versions
river version: 0.21.0
skmultiflow version: 0.5.3
Python version: 3.10.12
Operating system: Ubuntu 22.04
Describe the bug
A river model built by wrapping a scikit-multiflow classifier in river.compat.SKL2RiverClassifier passes a plain Python list to the classifier's partial_fit method. scikit-multiflow's ClassifierMixin-based classifiers expect a 2-dimensional numpy.ndarray, so the call fails with an indexing error.
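The report does not show which line inside skmultiflow fails, so the following is only an illustration of the general kind of failure: code written for 2-D arrays tends to use `.shape` and tuple indexing, which a nested list does not support. The variable names are made up for this sketch.

```python
import numpy as np

# One sample wrapped in an outer list, similar in shape to what the wrapper passes.
X_list = [[0.1, 0.2, 0.3]]

# Tuple indexing works on ndarrays but not on plain lists.
try:
    X_list[:, 0]
    list_error = None
except TypeError as exc:
    list_error = exc  # list indices must be integers or slices, not tuple

# Converting first gives the 2-D array that array-oriented code expects.
X_array = np.asarray(X_list)
n_rows, n_cols = X_array.shape
first_column = X_array[:, 0]
```

After conversion the single sample is a 1 x 3 array, so column slicing and shape unpacking both work.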
Steps/code to reproduce
import numpy as np
np.float = float  # Hack for skmultiflow and numpy compatibility
from river.datasets import Elec2
from river.compat import SKL2RiverClassifier
from skmultiflow.lazy import SAMKNNClassifier
data = Elec2()
model = SKL2RiverClassifier(SAMKNNClassifier(), classes=[True, False])
for x, y in data:
    model.learn_one(x, y)
The np.float = float hack is unfortunately required for the skmultiflow import to work with recent numpy versions, but it is unrelated to the issue itself.
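A slightly more defensive variant of the same hack, as a sketch: it restores the alias only when the installed numpy no longer provides it (the `np.float` alias was deprecated in NumPy 1.20 and removed in 1.24). On older versions that still have the alias, the `hasattr` check may emit a DeprecationWarning.

```python
import numpy as np

# Restore the removed alias only if it is missing, so older numpy
# installations are left untouched.
if not hasattr(np, "float"):
    np.float = float

# Any later import that references np.float at import time can now resolve it.
alias_target = np.float
```

This has to run before the skmultiflow import, just like the one-line version in the reproduction script.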
Fix
This is easily fixed by calling np.asarray() on X in learn_one (numpy additionally needs to be imported in sklearn_to_river.py):
def learn_one(self, x, y):
    self.estimator.partial_fit(X=np.asarray([self._align_dict(x)]), y=[y], classes=self.classes)
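To try the idea without installing skmultiflow, here is a minimal self-contained sketch. `ArrayOnlyEstimator`, `align`, and `PatchedWrapper` are hypothetical stand-ins invented for this example; they are not river's or skmultiflow's actual classes, and `align` only approximates what a helper like `_align_dict` might do.

```python
import numpy as np


class ArrayOnlyEstimator:
    """Hypothetical estimator that, like many array-based ones, needs a 2-D ndarray."""

    def __init__(self):
        self.n_seen = 0

    def partial_fit(self, X, y, classes=None):
        n_rows, _ = X.shape  # raises AttributeError if X is a plain list
        self.n_seen += n_rows
        return self


def align(x, feature_names):
    """Hypothetical helper: turn a feature dict into a list in a fixed order."""
    return [x[name] for name in feature_names]


class PatchedWrapper:
    """Hypothetical wrapper applying the np.asarray() fix from the issue."""

    def __init__(self, estimator, classes, feature_names):
        self.estimator = estimator
        self.classes = classes
        self.feature_names = feature_names

    def learn_one(self, x, y):
        row = align(x, self.feature_names)
        # Wrap the single row in a list, then convert to a (1, n_features) array.
        self.estimator.partial_fit(X=np.asarray([row]), y=[y], classes=self.classes)
        return self


wrapper = PatchedWrapper(ArrayOnlyEstimator(), classes=[True, False], feature_names=["a", "b"])
wrapper.learn_one({"a": 1.0, "b": 2.0}, True)
wrapper.learn_one({"b": 0.5, "a": 3.0}, False)
```

Passing `[[1.0, 2.0]]` directly to `ArrayOnlyEstimator.partial_fit` fails on the `.shape` access, which mirrors the reported problem in spirit; with the conversion in place both samples are consumed.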
I mainly wanted to share this in case anyone else runs into it. I also don't know whether there are architectural or legacy reasons for passing a list here.
River isn't intended to support skmultiflow. But thanks for sharing the tip!