Have validate_input return reshaped input
tms-bananaquit opened this issue · 0 comments
Motivating example - this fails:
detector = KdqTreeStreaming(window_size=5)
for i, row in df.iterrows():
    detector.update(X=row[row.index != 'rain'], y_true=None, y_pred=None)
This does not:
detector = KdqTreeStreaming(window_size=5)
for i, row in df.iterrows():
    detector.update(X=np.array(row[row.index != 'rain']).reshape(1, -1), y_true=None, y_pred=None)
Probably it's because row is being treated as a column vector by default. It'd make sense if validate_input returned a reshaped array for use by the detector internally.
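To make the proposal concrete, here is a minimal sketch of what a reshaping validator might look like. The function below is a hypothetical stand-in written for this issue, not the library's current validate_input; its name, signature, and error message are assumptions. It treats a 1-D input as a single sample and returns a 2-D array.

```python
import numpy as np


def validate_input(X):
    """Hypothetical stand-in: return X as a 2-D array, one row per sample."""
    arr = np.asarray(X)  # no copy if X is already an ndarray
    if arr.ndim == 0:
        # a scalar becomes one sample with one feature
        arr = arr.reshape(1, 1)
    elif arr.ndim == 1:
        # a 1-D input is treated as one sample with n features
        arr = arr.reshape(1, -1)
    elif arr.ndim > 2:
        raise ValueError(f"expected at most 2 dimensions, got {arr.ndim}")
    return arr


row = np.array([0.1, 0.2, 0.3])
validated = validate_input(row)
print(validated.shape)
```

For a contiguous NumPy array, reshape returns a view rather than a copy, so in this sketch the returned array shares memory with the input.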
Annoyances:
- For drift detectors that could take large vectors (i.e., data drift), copying the data is wasteful.
- update and set_reference both call validate_input, but don't themselves return data, because that looks funny. For set_reference especially, copying the data is necessary. For this to look clean, validate_input will need to be called in the child classes instead of the parents.
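One possible shape for that restructuring, as a rough sketch only: StreamingDetectorBase, ToyDetector, and the simplified validate_input here are hypothetical names invented for illustration, and the update signature is reduced to X alone. The child validates once, keeps the reshaped array for its own use, and hands it to the parent, which only does bookkeeping.

```python
import numpy as np


def validate_input(X):
    """Hypothetical validator: 1-D input becomes a single 2-D row."""
    arr = np.asarray(X)
    return arr.reshape(1, -1) if arr.ndim == 1 else arr


class StreamingDetectorBase:
    """Hypothetical parent: bookkeeping only, no validation."""

    def __init__(self):
        self.samples_seen = 0
        self.reference = None

    def update(self, X):
        # assumes X has already been validated by the child
        self.samples_seen += X.shape[0]

    def set_reference(self, X):
        # copy so later changes to the caller's data don't alter the reference
        self.reference = np.array(X, copy=True)


class ToyDetector(StreamingDetectorBase):
    """Hypothetical child: validates, then reuses the reshaped array."""

    def __init__(self):
        super().__init__()
        self.last_batch = None

    def update(self, X):
        X = validate_input(X)  # validation happens in the child
        super().update(X)
        self.last_batch = X  # reshaped array is available to detector logic

    def set_reference(self, X):
        super().set_reference(validate_input(X))


detector = ToyDetector()
detector.update(np.array([1.0, 2.0, 3.0]))
print(detector.last_batch.shape, detector.samples_seen)
```

In this arrangement neither update nor set_reference has to return data, and the only copy is the one made in set_reference.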