mitre/menelaus

Have validate_input return reshaped input

tms-bananaquit opened this issue · 0 comments

Motivating example - this fails:

detector = KdqTreeStreaming(window_size=5)
for i, row in df.iterrows():
    detector.update(X=row[row.index != 'rain'], y_true=None, y_pred=None)

This does not:

detector = KdqTreeStreaming(window_size=5)
for i, row in df.iterrows():
    detector.update(X=np.array(row[row.index != 'rain']).reshape(1, -1), y_true=None, y_pred=None)

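This is probably because row, a pandas Series, arrives as a 1-D array and is treated as a column vector by default (many samples of one feature) rather than as a single sample with many features. It would make sense for validate_input to return a reshaped array for the detector to use internally.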
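A minimal sketch of what "return a reshaped array" could look like. The helper name validate_and_reshape is hypothetical and is not menelaus's actual validate_input. It also assumes that a 1-D input always means one sample with n features, which may not hold for univariate detectors.

```python
import numpy as np


def validate_and_reshape(X):
    """Hypothetical helper: return X as a 2-D (n_samples, n_features) array.

    Assumption: a 1-D input is a single sample, so it becomes shape (1, n).
    """
    arr = np.asarray(X)  # no copy when X is already an ndarray
    if arr.ndim == 0:
        arr = arr.reshape(1, 1)
    elif arr.ndim == 1:
        arr = arr.reshape(1, -1)  # a view for contiguous input, not a copy
    elif arr.ndim != 2:
        raise ValueError(f"expected at most 2 dimensions, got {arr.ndim}")
    return arr


row = np.array([0.1, 0.2, 0.3])      # stands in for one DataFrame row
reshaped = validate_and_reshape(row)
print(row.shape, reshaped.shape)
```

Since np.asarray and reshape avoid copying where they can, the caller would get a 2-D view of the same buffer for plain ndarray input. A pandas Series with mixed dtypes would still be converted to a new object array.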

Annoyances:

  • For drift detectors that may take large vectors (i.e., data drift detectors), copying the data is wasteful.
  • update and set_reference both call validate_input, but neither returns data itself, because that would look funny. For set_reference especially, copying the data is necessary. For this to look clean, validate_input would need to be called in the child classes rather than in the parents.
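One way the two points above might fit together, as a sketch only. BaseDetector, ToyDetector, and _validate_input are hypothetical names, not the library's classes. The child class calls the validator and keeps the returned array; only set_reference makes an explicit copy.

```python
import numpy as np


class BaseDetector:
    """Hypothetical parent: provides validation, does not call it itself."""

    def _validate_input(self, X):
        arr = np.asarray(X)
        if arr.ndim == 1:
            arr = arr.reshape(1, -1)  # assumption: 1-D means one sample
        if arr.ndim != 2:
            raise ValueError(f"expected 1-D or 2-D input, got {arr.ndim}-D")
        return arr


class ToyDetector(BaseDetector):
    """Hypothetical child: uses the validator's return value directly."""

    def __init__(self):
        self.reference = None
        self.n_seen = 0

    def set_reference(self, X):
        X = self._validate_input(X)
        # Explicit copy: the stored reference must not change if the
        # caller later mutates their own array.
        self.reference = X.copy()

    def update(self, X):
        X = self._validate_input(X)  # view where possible, no copy
        self.n_seen += X.shape[0]


detector = ToyDetector()
ref = np.zeros((4, 3))
detector.set_reference(ref)
ref[0, 0] = 1.0                           # does not affect the stored copy
detector.update(np.array([1.0, 2.0, 3.0]))  # 1-D row accepted as one sample
```

With this layout update avoids copying large inputs, while set_reference pays for a copy once.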