scikit-learn-contrib/hdbscan

Predicting new points when the model was trained on a precomputed Euclidean distance matrix

jayshah1397 opened this issue · 0 comments

Hi, I’m training an hdbscan clustering model on precomputed Euclidean distances between points (by passing metric='precomputed'). But when I want to predict clusters and probabilities for new points (for example, a test set) using the approximate_predict() function, I cannot pass the raw feature values as they are.

For prediction, do I need to compute the Euclidean distance from every new point to each of the data points I used to train the model?
Would that be the right approach?
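
A minimal sketch of the computation being asked about: an (n_new, n_train) matrix holding the Euclidean distance from each new point to each training point. The helper name `cross_euclidean` and the toy arrays `X_train` and `X_new` are hypothetical and only for illustration. Whether approximate_predict() can consume such a matrix for a model fitted with metric='precomputed' is exactly the open question, so the sketch stops at building the matrix.

```python
import numpy as np


def cross_euclidean(X_new, X_train):
    """Return an (n_new, n_train) matrix of Euclidean distances."""
    X_new = np.asarray(X_new, dtype=float)
    X_train = np.asarray(X_train, dtype=float)
    # Broadcast to shape (n_new, n_train, n_features), then reduce over features.
    diff = X_new[:, None, :] - X_train[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Hypothetical toy data standing in for the real training and test sets.
X_train = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
X_new = np.array([[0.0, 0.0], [3.0, 0.0]])

# One row per new point, one column per training point.
D_new = cross_euclidean(X_new, X_train)
print(D_new.shape)
```

Note that this matrix is rectangular, unlike the square (n_train, n_train) matrix passed to fit() with metric='precomputed'.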