No such file or directory: './gmm_test/GMM_100_500_0.pkl'
Closed this issue · 2 comments
I encountered this error when running gmm.py. I noticed that the model needed to be retrained with split100 beforehand to generate the necessary distance map. I would like to generate confidence levels for each CLEAN inference. What else is needed to do this after running gmm.py? Thanks
Hi kaden1670, the error No such file or directory: './gmm_test/GMM_100_500_0.pkl' is unrelated to the need to retrain a CLEAN model with split100. It occurs because there is no gmm_test folder at the same level as the gmm.py script. Adding one would resolve the error.
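For reference, here is a minimal sketch of creating the missing folder from Python before running the script. The helper name ensure_output_dir is hypothetical and not part of CLEAN; the only assumption is that the script writes its .pkl files into ./gmm_test relative to the directory it is run from.

```python
import os


def ensure_output_dir(path="./gmm_test"):
    """Create the output folder if it is missing (hypothetical helper, not part of CLEAN)."""
    # exist_ok=True makes the call safe to repeat when the folder already exists.
    os.makedirs(path, exist_ok=True)
    return os.path.isdir(path)


if __name__ == "__main__":
    # Run this from the directory that contains gmm.py.
    ensure_output_dir("./gmm_test")
```

Running mkdir gmm_test in a shell next to gmm.py should have the same effect.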
As for the second question, in the script, the default is to train 40 GMMs using the same sklearn parameters, and each of them can be used to infer the predicted probability for query samples. Please refer to the predict_proba() function in the official sklearn documentation.
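To illustrate the idea, here is a rough sketch of fitting a few sklearn GaussianMixture models and averaging predict_proba() over them to get a confidence value for query distances. This is not CLEAN's actual code: the synthetic distances, the helper names fit_gmm_ensemble and mean_confidence, the ensemble size of 3, and the choice to treat the component with the smaller mean as the "confident" one are all assumptions made for the example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic 1-D "distances" (assumption): a close cluster and a far cluster.
rng = np.random.default_rng(0)
train = np.concatenate(
    [rng.normal(2.0, 0.5, 200), rng.normal(8.0, 1.0, 200)]
).reshape(-1, 1)


def fit_gmm_ensemble(data, n_models=3, n_components=2):
    """Fit several GMMs that differ only in their random seed (hypothetical helper)."""
    return [
        GaussianMixture(n_components=n_components, random_state=seed).fit(data)
        for seed in range(n_models)
    ]


def mean_confidence(models, query):
    """Average, over the ensemble, the posterior of the smaller-mean component."""
    query = np.asarray(query, dtype=float).reshape(-1, 1)
    probs = []
    for gmm in models:
        # Component order is arbitrary, so locate the "close" component by its mean.
        close = int(np.argmin(gmm.means_.ravel()))
        probs.append(gmm.predict_proba(query)[:, close])
    return np.mean(probs, axis=0)


models = fit_gmm_ensemble(train)
confidence = mean_confidence(models, [2.0, 8.0])
print(confidence)  # one value per query distance
```

With the saved .pkl files, the same predict_proba() call would presumably be applied to each unpickled model, using the distance between the query and its predicted EC number as input.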
Thank you for the response, @canallee. How do I get the embedding distance between the query protein and the CLEAN-predicted EC number? I see that distances, neg_distances = get_dist(ec, train_data, report_metrics=True, pretrained=True, neg_target=100, negative=negative) is how you would do so, but passing in the filename of an input in place of train_data = 'split100' doesn't seem to work.