online-ml/river

SKL2RiverClassifier partial_fit method

tcallies opened this issue · 1 comments

Versions

river version: 0.21.0
skmultiflow version: 0.5.3
Python version: 3.10.12
Operating system: Ubuntu 22.03

Describe the bug

A river model created by wrapping an skmultiflow model with river.compat.SKL2RiverClassifier passes a plain Python list to the skmultiflow classifier's partial_fit method. skmultiflow classifiers (ClassifierMixin subclasses) expect 2-dimensional numpy.ndarrays, so this leads to an indexing error.
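
To illustrate the general class of error, here is a minimal sketch of how a list of lists and a 2-dimensional ndarray behave differently under array-style indexing. The exact skmultiflow line that fails is not shown in this issue, so the variable names and the specific index used here are assumptions for illustration only.

```python
import numpy as np

# One sample with two features, in the two forms discussed above.
as_list = [[0.5, 1.5]]          # plain list of lists
as_array = np.asarray(as_list)  # 2-dimensional ndarray, shape (1, 2)

# An ndarray supports tuple indexing and exposes .shape.
first_column = as_array[:, 0]
n_samples, n_features = as_array.shape

# The same tuple index on a list raises TypeError.
try:
    as_list[:, 0]
    list_error = None
except TypeError as exc:
    list_error = exc
```

Any estimator code that indexes X with a tuple or reads X.shape will therefore fail when handed the list form.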

Steps/code to reproduce

import numpy as np
np.float = float  # Hack for skmultiflow and NumPy compatibility

from river.datasets import Elec2
from river.compat import SKL2RiverClassifier
from skmultiflow.lazy import SAMKNNClassifier

data = Elec2()
model = SKL2RiverClassifier(SAMKNNClassifier(), classes=[True, False])
for x, y in data: 
    model.learn_one(x, y)

The np.float = float hack is unrelated to the issue, but it is unfortunately required for the skmultiflow import to work, because newer NumPy versions removed the np.float alias.
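
If the same script has to run on both older and newer NumPy versions, the hack above could be guarded so the alias is only restored when it is missing. This is a small sketch of that idea, not something skmultiflow or river documents.

```python
import numpy as np

# np.float was removed in NumPy 1.24; restore the alias only when it is absent.
# This must run before skmultiflow is imported.
if not hasattr(np, "float"):
    np.float = float
```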

Fix

This is easily fixed by calling np.asarray() on X in learn_one (this also requires importing numpy in sklearn_to_river.py):

    def learn_one(self, x, y):
        self.estimator.partial_fit(X=np.asarray([self._align_dict(x)]), y=[y], classes=self.classes)
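
For anyone who wants to check the idea without installing river or skmultiflow, here is a self-contained sketch. ArrayOnlyEstimator and ListToArrayWrapper are hypothetical stand-ins written for this example, and the _align_dict shown is a simplified assumption about how features are ordered, not river's actual implementation.

```python
import numpy as np


class ArrayOnlyEstimator:
    """Hypothetical stand-in for an estimator that needs a 2-D ndarray."""

    def __init__(self):
        self.n_seen = 0

    def partial_fit(self, X, y, classes=None):
        # A plain list has no .shape attribute, so this line needs an ndarray.
        n_samples, _ = X.shape
        self.n_seen += n_samples
        return self


class ListToArrayWrapper:
    """Hypothetical wrapper mirroring the proposed learn_one change."""

    def __init__(self, estimator, classes, feature_names):
        self.estimator = estimator
        self.classes = classes
        self.feature_names = feature_names

    def _align_dict(self, x):
        # Simplified assumption: fixed feature order, missing features become 0.
        return [x.get(name, 0) for name in self.feature_names]

    def learn_one(self, x, y):
        X = np.asarray([self._align_dict(x)])  # shape (1, n_features)
        self.estimator.partial_fit(X=X, y=[y], classes=self.classes)
        return self


model = ListToArrayWrapper(
    ArrayOnlyEstimator(), classes=[True, False], feature_names=["a", "b"]
)
model.learn_one({"a": 1.0, "b": 2.0}, True)
model.learn_one({"b": 3.0}, False)
```

Dropping the np.asarray() call in learn_one makes the stand-in estimator raise AttributeError on X.shape, which is the same kind of failure described in this issue.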

I mainly wanted to share this in case anyone else runs into it. I also don't know whether there are architectural or legacy reasons for passing a list here.

River isn't intended to support skmultiflow. But thanks for sharing the tip!