Prediction of new points when training with a precomputed Euclidean distance matrix
jayshah1397 opened this issue · 0 comments
jayshah1397 commented
Hi, I’m training the hdbscan clustering model on pre-computed Euclidean distances between points (by passing metric='precomputed'). But when I want to predict clusters and probabilities for new points (for example, a test set) using the approximate_predict() function, I cannot pass the raw feature values as they are.
For prediction, do I need to compute the Euclidean distance from every new point to each of the data points I used to train the model?
Would that be the right approach?
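To make the question concrete, here is a minimal sketch of the distance computation I have in mind. It uses only NumPy; the helper name cross_distances and the toy arrays are made up for illustration, and whether approximate_predict() can accept a matrix like this at all is exactly what I am unsure about.

```python
import numpy as np

def cross_distances(X_new, X_train):
    """Euclidean distance from every new point to every training point.

    Hypothetical helper for illustration; returns an array of shape
    (n_new, n_train).
    """
    X_new = np.asarray(X_new, dtype=float)
    X_train = np.asarray(X_train, dtype=float)
    # Broadcast to (n_new, n_train, n_features), then reduce over features.
    diff = X_new[:, None, :] - X_train[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Toy data standing in for the real training and test sets.
X_train = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
X_new = np.array([[0.0, 0.0], [3.0, 0.0]])

D_new = cross_distances(X_new, X_train)
print(D_new.shape)  # one row per new point, one column per training point
```

Each row of D_new would hold the distances from one new point to all of the training points, in the same order as the rows of the training distance matrix.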