the shape of ERROR_TENSOR
zml24 opened this issue · 2 comments
zml24 commented
Hi, I read the file `error_tensor.npy` and found that the shape of `ERROR_TENSOR` is `(215, 4, 2, 8, 183)`.
By my count, the number of `standardizer` options is 2, the number of `dim_reducer` options is 8, and the number of `estimator` options is 183; 215 is presumably the number of datasets. So what does the 4 in the shape mean?
Here is the content of `classification.json`:
{
"imputer":
{"algorithms": ["SimpleImputer"],
"hyperparameters": {
"SimpleImputer": {"strategy": ["mean", "median", "most_frequent", "constant"]}
}},
"encoder":
{"algorithms": [null, "OneHotEncoder"],
"hyperparameters": {
"OneHotEncoder": {"handle_unknown": ["ignore"], "sparse": [0]}
}},
"standardizer":
{"algorithms": [null, "StandardScaler"],
"hyperparameters": {
"StandardScaler": {}
}},
"dim_reducer":
{"algorithms": [null, "PCA", "VarianceThreshold", "SelectKBest"],
"hyperparameters": {
"PCA": {"n_components": ["25%", "50%", "75%"]},
"VarianceThreshold": {},
"SelectKBest": {"k": ["25%", "50%", "75%"]}
}},
"estimator":
{"algorithms": ["KNN", "DT", "RF", "GBT", "AB", "lSVM", "Logit", "Perceptron", "GNB", "MLP", "ExtraTrees"],
"hyperparameters": {
"KNN": {"n_neighbors": [1, 3, 5, 7, 9, 11, 13, 15], "p": [1, 2]},
"DT": {"min_samples_split": [2,4,8,16,32,64,128,256,512,1024,0.01,0.001,0.0001,1e-05]},
"RF": {"min_samples_split": [2,4,8,16,32,64,128,256,512,1024,0.1,0.01,0.001,0.0001,1e-05], "criterion": ["gini", "entropy"]},
"GBT": {"learning_rate": [0.001,0.01,0.025,0.05,0.1,0.25,0.5], "max_depth": [3, 6], "max_features": [null, "log2"]},
"AB": {"n_estimators": [50, 100], "learning_rate": [1.0, 1.5, 2.0, 2.5, 3.0]},
"lSVM": {"C": [0.125,0.25,0.5,0.75,1,2,4,8,16]},
"Logit": {"C": [0.25,0.5,0.75,1,1.5,2,3,4], "solver": ["liblinear", "saga"], "penalty": ["l1", "l2"]},
"Perceptron": {},
"GNB": {},
"MLP": {"learning_rate_init": [0.0001,0.001,0.01], "learning_rate": ["adaptive"], "solver": ["sgd", "adam"], "alpha": [0.0001, 0.01]},
"ExtraTrees": {"min_samples_split": [2,4,8,16,32,64,128,256,512,1024,0.1,0.01,0.001,0.0001,1e-05], "criterion": ["gini", "entropy"]}
}}
}
chengrunyang commented
Hi, sorry for the confusion! This was an earlier version that did not include the encoder dimension. I will push a newer version in which the error tensor has size `(n_datasets, 4, 2, 2, 8, 183)`, which includes 4 data imputers, 2 encoders, 2 standardizers, 8 dimensionality reducers and 183 estimators.
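For anyone who wants to double-check these numbers, here is a minimal sketch that recounts the options per pipeline component from the `classification.json` quoted above. It rests on two assumptions that are not confirmed anywhere in this thread: that each algorithm contributes the full Cartesian product of its hyperparameter value lists, and that a `null` algorithm counts as a single "skip this step" option. The helper name `count_configurations` is made up for illustration and is not part of the repository.

```python
import json
from math import prod

# Search space copied from the classification.json quoted in this issue.
SEARCH_SPACE_JSON = """
{
"imputer":
  {"algorithms": ["SimpleImputer"],
   "hyperparameters": {
     "SimpleImputer": {"strategy": ["mean", "median", "most_frequent", "constant"]}
   }},
"encoder":
  {"algorithms": [null, "OneHotEncoder"],
   "hyperparameters": {
     "OneHotEncoder": {"handle_unknown": ["ignore"], "sparse": [0]}
   }},
"standardizer":
  {"algorithms": [null, "StandardScaler"],
   "hyperparameters": {
     "StandardScaler": {}
   }},
"dim_reducer":
  {"algorithms": [null, "PCA", "VarianceThreshold", "SelectKBest"],
   "hyperparameters": {
     "PCA": {"n_components": ["25%", "50%", "75%"]},
     "VarianceThreshold": {},
     "SelectKBest": {"k": ["25%", "50%", "75%"]}
   }},
"estimator":
  {"algorithms": ["KNN", "DT", "RF", "GBT", "AB", "lSVM", "Logit", "Perceptron", "GNB", "MLP", "ExtraTrees"],
   "hyperparameters": {
     "KNN": {"n_neighbors": [1, 3, 5, 7, 9, 11, 13, 15], "p": [1, 2]},
     "DT": {"min_samples_split": [2,4,8,16,32,64,128,256,512,1024,0.01,0.001,0.0001,1e-05]},
     "RF": {"min_samples_split": [2,4,8,16,32,64,128,256,512,1024,0.1,0.01,0.001,0.0001,1e-05], "criterion": ["gini", "entropy"]},
     "GBT": {"learning_rate": [0.001,0.01,0.025,0.05,0.1,0.25,0.5], "max_depth": [3, 6], "max_features": [null, "log2"]},
     "AB": {"n_estimators": [50, 100], "learning_rate": [1.0, 1.5, 2.0, 2.5, 3.0]},
     "lSVM": {"C": [0.125,0.25,0.5,0.75,1,2,4,8,16]},
     "Logit": {"C": [0.25,0.5,0.75,1,1.5,2,3,4], "solver": ["liblinear", "saga"], "penalty": ["l1", "l2"]},
     "Perceptron": {},
     "GNB": {},
     "MLP": {"learning_rate_init": [0.0001,0.001,0.01], "learning_rate": ["adaptive"], "solver": ["sgd", "adam"], "alpha": [0.0001, 0.01]},
     "ExtraTrees": {"min_samples_split": [2,4,8,16,32,64,128,256,512,1024,0.1,0.01,0.001,0.0001,1e-05], "criterion": ["gini", "entropy"]}
   }}
}
"""


def count_configurations(component):
    """Count algorithm/hyperparameter combinations for one pipeline component.

    Hypothetical helper: assumes a full grid per algorithm and that a null
    algorithm is one option.
    """
    total = 0
    for algorithm in component["algorithms"]:
        if algorithm is None:
            total += 1  # the "skip this step" choice
            continue
        grid = component["hyperparameters"].get(algorithm, {})
        # Product of the value-list lengths; prod() of nothing is 1,
        # so an algorithm with no hyperparameters counts once.
        total += prod(len(values) for values in grid.values())
    return total


search_space = json.loads(SEARCH_SPACE_JSON)
counts = {name: count_configurations(component)
          for name, component in search_space.items()}
print(counts)
# {'imputer': 4, 'encoder': 2, 'standardizer': 2, 'dim_reducer': 8, 'estimator': 183}
```

Under those assumptions the counts come out as 4, 2, 2, 8 and 183, which matches the trailing dimensions of the newer shape, and matches the older `(215, 4, 2, 8, 183)` shape once the encoder axis is dropped.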
zml24 commented
Thanks!