reproducibility
Ulitochka opened this issue · 3 comments
Ulitochka commented
Hello everyone.
While using the library, we encountered a couple of cases in which we get different predictions on the same data with the same model configuration.
The first case
We train the model with these settings:
from lightautoml.automl.presets.text_presets import TabularNLPAutoML
from lightautoml.tasks import Task

roles = {'target': 'label', 'text': ['text']}
task = Task('binary', metric='auc')

automl = TabularNLPAutoML(
    task=task,
    timeout=100000,
    general_params={'use_algos': ['nn', 'cb', 'lgb', 'linear_l2']},
    gpu_ids='0',
    reader_params={'n_jobs': 12},
    cpu_limit=13,
    text_params={'lang': 'ru'},
    nn_params={
        'lang': 'ru',
        'snap_params': {'k': 1, 'early_stopping': True, 'patience': 1, 'swa': False},
        'max_length': 256,
        'bs': 16,
        'bert_name': 'DeepPavlov/rubert-base-cased-conversational',
        'pooling': 'cls',
    },
    nn_pipeline_params={'text_features': 'bert'},
    autonlp_params={'model_name': 'random_lstm_bert'},
    gbm_pipeline_params={'text_features': 'embed'},  # tfidf embed
    linear_pipeline_params={'text_features': 'embed'},
    verbose=2,
)
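As a side note for anyone debugging this: if the non-determinism comes from unseeded random generators rather than from the data, pinning the global seeds before building and fitting the pipeline is a quick way to test that. This is only a minimal sketch, assuming the NN part of the pipeline relies on the standard Python/NumPy/PyTorch RNGs; the exact reproducibility options LightAutoML itself exposes may differ.

import random
import numpy as np
import torch

def set_global_seeds(seed: int = 42) -> None:
    # Pin every RNG the NN components are likely to touch.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trade some speed for deterministic cuDNN kernels.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_global_seeds(42)  # call before constructing TabularNLPAutoML and before each fit

Even with pinned seeds, some GPU kernels and multi-threaded data reading (n_jobs=12 here) are not bit-for-bit deterministic, so small run-to-run differences can remain.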
We predict on the test data and get result #1:
from sklearn.metrics import classification_report

def to_labels(pos_probs, threshold):
    # binarize predicted probabilities at the given threshold
    return (pos_probs >= threshold).astype('int')

test_pred = automl.predict(test_pd)
labels = to_labels(test_pred.data[:, 0], 0.5)
print(classification_report(test_pd[roles['target']].values, labels, digits=4))
We repeat the training and get result #2 on the same test data, but result #1 != result #2.
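To make the gap between the two runs concrete, the raw prediction arrays can also be compared directly instead of only via the classification reports. A small sketch, where pred_run1 and pred_run2 are hypothetical names for test_pred.data[:, 0] saved from the two runs:

import numpy as np

# pred_run1 / pred_run2: probabilities saved from the two training runs (hypothetical names)
diff = np.abs(pred_run1 - pred_run2)
print('max abs diff :', diff.max())
print('mean abs diff:', diff.mean())
print('labels flipped at 0.5:', ((pred_run1 >= 0.5) != (pred_run2 >= 0.5)).sum())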
Could you please tell us what might cause this behavior?
alexmryzhkov commented
Hi @Ulitochka,
To figure out why the results are not equal, could you please share the training logs of both models?
Alex
dev-rinchin commented
@Ulitochka do you train the model on the same machine in both cases?
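Differences in hardware or software between machines (GPU model, CUDA build, library versions) are a common source of such mismatches. A minimal sketch for collecting the relevant versions on each machine, assuming the usual packages are installed:

import sys
from importlib.metadata import version

import torch

print('python      :', sys.version.split()[0])
print('lightautoml :', version('lightautoml'))
print('torch       :', torch.__version__)
print('cuda        :', torch.version.cuda)
if torch.cuda.is_available():
    print('gpu         :', torch.cuda.get_device_name(0))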
github-actions commented
Stale issue message