[Question] How to determine to which class each probability column belongs?
PGijsbers opened this issue · 2 comments
Is it possible to determine which class each probability column belongs to?
Consider the following example:
import numpy as np
import pandas as pd
from lightautoml.tasks import Task
from lightautoml.automl.presets.tabular_presets import TabularUtilizedAutoML

# 150 rows, 4 random features, and a target cycling through 'a', 'b', 'c'
x = np.random.random((150, 4))
y = np.asarray(list("abc") * 50).reshape(-1, 1)
# Note: stacking float and str arrays makes every column a string column
data = pd.DataFrame(np.hstack([x, y]), columns=["f1", "f2", "f3", "f4", "target"])
print(data.head())

# Fit a multiclass model, then predict on the feature columns only
task = Task("multiclass")
automl = TabularUtilizedAutoML(task=task, timeout=30)
automl.fit_predict(data, roles=dict(target="target"))
preds = automl.predict(data[["f1", "f2", "f3", "f4"]])
The resulting preds is a NumpyDataset with data shape (150, 3) representing features WeightedBlend_{0,1,2}. Nowhere in the metadata can I find whether the first column's probabilities correspond to class 'a' (or any other class). Am I missing something here?
As far as I can tell, the column order depends on the class order in the original training data. But I can't find this stated explicitly anywhere, nor can I find a programmatic way of retrieving the order of labels as used by lightautoml. I would expect e.g. a classes_ property, or feature names that reflect the class whose probability is predicted in each column.
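To illustrate what I am after, here is a minimal sketch of the labelling step I would like to be able to perform. It assumes a label-to-column mapping is available from somewhere. The `class_to_index` dict and both helper functions below are hypothetical names made up for this example and are not part of lightautoml; the mapping values are also invented, since the real order is exactly what I cannot retrieve.

```python
def label_probability_columns(probs, class_to_index):
    """Turn rows of positional probabilities into dicts keyed by class label.

    probs: iterable of rows (a list of lists or a 2-D NumPy array both work).
    class_to_index: hypothetical mapping {class label: column index}.
    """
    # Invert the mapping so a column position can be looked up directly.
    index_to_class = {idx: label for label, idx in class_to_index.items()}
    n_classes = len(index_to_class)
    labeled = []
    for row in probs:
        if len(row) != n_classes:
            raise ValueError("row width does not match the number of classes")
        labeled.append({index_to_class[i]: float(p) for i, p in enumerate(row)})
    return labeled


def predicted_labels(labeled_rows):
    """Pick the class label with the highest probability in each row."""
    return [max(row, key=row.get) for row in labeled_rows]


# Hypothetical mapping and made-up probabilities, for illustration only.
class_to_index = {"a": 0, "b": 1, "c": 2}
probs = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]

labeled = label_probability_columns(probs, class_to_index)
top = predicted_labels(labeled)
```

With such a mapping, the raw prediction array from the example above could be passed in place of `probs`. Without it, any labelling of the three columns is a guess.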
Thank you, sorry for the duplicate issue.